
OMOPCDMFeasibility

Documentation for OMOPCDMFeasibility.

OMOPCDMFeasibility._concept_col Method
julia
_concept_col(tblsym::Symbol) -> Symbol

Generates the concept column name for a given table symbol.

This internal helper constructs the concept column name from the table's naming convention. The person table is special-cased: it has no table-prefixed concept column, so gender_concept_id is returned.

Arguments

  • tblsym - The table symbol

Returns

  • Symbol - The concept column name for that table

Examples

julia
col = _concept_col(:condition_occurrence)
# Returns: :condition_concept_id

col = _concept_col(:person)
# Returns: :gender_concept_id
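
The naming convention is not spelled out above, so the following is only a minimal sketch of one rule that reproduces both documented results. It assumes the column name is the first underscore-separated token of the table name plus "_concept_id"; the function name concept_col_sketch is hypothetical and this is not the package's actual implementation.

```julia
# Hypothetical sketch; the real _concept_col may use a different rule.
function concept_col_sketch(tblsym::Symbol)::Symbol
    # The person table is special-cased, as documented above.
    tblsym === :person && return :gender_concept_id
    # Assumed convention: first underscore-separated token + "_concept_id".
    prefix = first(split(String(tblsym), '_'))
    return Symbol(prefix, "_concept_id")
end

concept_col_sketch(:condition_occurrence)  # :condition_concept_id
concept_col_sketch(:drug_exposure)         # :drug_concept_id
concept_col_sketch(:person)                # :gender_concept_id
```

Under this assumed rule, standard OMOP CDM tables such as drug_exposure and procedure_occurrence also map to their usual concept columns.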
OMOPCDMFeasibility._counter_reducer Method
julia
_counter_reducer(sub, funcs) -> Any

Applies a sequence of functions to a subject, reducing through function composition.

This internal helper function sequentially applies each function in the funcs vector to the result of the previous function, starting with sub.

Arguments

  • sub - Initial subject/input to transform

  • funcs - Vector of functions to apply sequentially

Returns

  • Any - Result after applying all functions

Examples

julia
result = _counter_reducer([1,2,3], [x -> x .* 2, sum])
# Equivalent to: sum([1,2,3] .* 2) = sum([2,4,6]) = 12
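
The sequential application described above is a left fold over the function list. As a rough sketch (the name counter_reducer_sketch is hypothetical, and the package may implement this differently), the same behaviour can be written with Base's foldl:

```julia
# Hypothetical sketch: thread `sub` through each function in `funcs`, left to right.
counter_reducer_sketch(sub, funcs) = foldl((acc, f) -> f(acc), funcs; init=sub)

result = counter_reducer_sketch([1, 2, 3], [x -> x .* 2, sum])
# result == 12, matching the documented example

# With an empty function list the subject is returned unchanged.
unchanged = counter_reducer_sketch([1, 2, 3], Function[])
```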
OMOPCDMFeasibility._create_cartesian_profile_table Method
julia
_create_cartesian_profile_table(df, cols, cohort_size, database_size, conn; schema="dbt_synthea_dev", dialect=:postgresql)

Create a Cartesian product profile table with all covariate combinations.

Arguments

  • df - DataFrame with demographic data

  • cols - Vector of column names to include in combinations

  • cohort_size - Total cohort size

  • database_size - Total database population size

  • conn - Database connection object

  • schema - Database schema name (default: "dbt_synthea_dev")

  • dialect - SQL dialect (default: :postgresql)

Returns

  • DataFrame - Table with all covariate combinations and statistics
OMOPCDMFeasibility._create_individual_profile_table Method
julia
_create_individual_profile_table(df, col, cohort_size, database_size, conn; schema="dbt_synthea_dev", dialect=:postgresql)

Create an individual profile table for a single covariate column.

Arguments

  • df - DataFrame with demographic data

  • col - Column name to profile

  • cohort_size - Total cohort size

  • database_size - Total database population size

  • conn - Database connection object

  • schema - Database schema name (default: "dbt_synthea_dev")

  • dialect - SQL dialect (default: :postgresql)

Returns

  • DataFrame - Profile table with covariate categories and statistics
OMOPCDMFeasibility._domain_id_to_table Method
julia
_domain_id_to_table(domain_id::String) -> Symbol

Maps OMOP domain_id strings to their corresponding database table symbols.

This function provides the mapping between OMOP domain classifications and the actual database tables where those concepts are stored. It includes special handling for person-related domains and falls back to a naming convention for unknown domains.

Arguments

  • domain_id - OMOP domain identifier string (e.g., "Condition", "Drug")

Returns

  • Symbol - Database table symbol (e.g., :condition_occurrence, :drug_exposure)

Examples

julia
table = _domain_id_to_table("Condition")
# Returns: :condition_occurrence

table = _domain_id_to_table("Gender") 
# Returns: :person

table = _domain_id_to_table("CustomDomain")
# Returns: :customdomain_occurrence
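
One way to picture the mapping and its fallback is a lookup table with a computed default. The sketch below is illustrative only: the table DOMAIN_TABLES_SKETCH and the function name are hypothetical, and only the three documented results above are taken from the source ("Drug" comes from the Returns description). The real function may cover more domains.

```julia
# Hypothetical lookup table; the package's actual mapping may contain more entries.
const DOMAIN_TABLES_SKETCH = Dict(
    "Condition" => :condition_occurrence,
    "Drug"      => :drug_exposure,
    "Gender"    => :person,   # person-related domains resolve to the person table
)

function domain_id_to_table_sketch(domain_id::AbstractString)::Symbol
    # Known domains come from the table; unknown ones fall back to
    # lowercase(domain_id) * "_occurrence", as in the "CustomDomain" example.
    get(DOMAIN_TABLES_SKETCH, domain_id) do
        Symbol(lowercase(domain_id), "_occurrence")
    end
end

domain_id_to_table_sketch("Condition")     # :condition_occurrence
domain_id_to_table_sketch("CustomDomain")  # :customdomain_occurrence
```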
OMOPCDMFeasibility._format_number Method
julia
_format_number(n) -> String

Formats a number into a human-readable string with appropriate scaling.

This utility function formats numbers using common abbreviations:

  • Numbers ≥ 1,000,000 are formatted as "X.XM" (millions)

  • Numbers ≥ 1,000 are formatted as "X.XK" (thousands)

  • Numbers < 1,000 are rounded to the nearest integer, with ties rounded up

Arguments

  • n - Number to format

Returns

  • String - Formatted number string

Examples

julia
_format_number(1234567)  # Returns: "1.2M"
_format_number(5432)     # Returns: "5.4K" 
_format_number(123)      # Returns: "123"
_format_number(0.5)      # Returns: "1"
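
The three scaling rules can be sketched in a few lines. This is an illustrative version under the rules stated above, not the package's code; the name format_number_sketch is hypothetical. Note that Julia's default round sends ties to the nearest even integer, so the sketch passes RoundNearestTiesUp explicitly to get "1" for 0.5.

```julia
# Hypothetical sketch of the documented formatting rules.
function format_number_sketch(n::Real)::String
    if n >= 1_000_000
        return string(round(n / 1_000_000; digits=1), "M")
    elseif n >= 1_000
        return string(round(n / 1_000; digits=1), "K")
    else
        # Default rounding is ties-to-even; ties-up is requested explicitly.
        return string(Int(round(float(n), RoundNearestTiesUp)))
    end
end

format_number_sketch(1234567)  # "1.2M"
format_number_sketch(5432)     # "5.4K"
format_number_sketch(0.5)      # "1"
```

The sketch does not handle boundary values that round up into the next unit (for example 999,950); whether the package does is not stated here.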
OMOPCDMFeasibility._funsql Method
julia
_funsql(conn; schema::String="main", dialect::Symbol=:postgresql) -> SQLConnection

Creates a FunSQL connection with database schema reflection.

This internal function sets up a FunSQL SQLConnection with the appropriate database dialect and schema reflection for query building. Use :postgresql for DuckDB and :sqlite for SQLite.

Arguments

  • conn - Raw database connection

Keyword Arguments

  • schema - Database schema name. Default: "main"

  • dialect - Database dialect. Default: :postgresql (for DuckDB compatibility)

Returns

  • SQLConnection - FunSQL connection object with reflected schema
OMOPCDMFeasibility._get_category_name Method
julia
_get_category_name(value, col, conn; schema="dbt_synthea_dev", dialect=:postgresql)

Get the human-readable category name for a covariate value.

Arguments

  • value - The value to convert (concept ID or string)

  • col - The column name

  • conn - Database connection object

  • schema - Database schema name (default: "dbt_synthea_dev")

  • dialect - SQL dialect (default: :postgresql)

Returns

  • String - Human-readable category name
OMOPCDMFeasibility._get_cohort_person_ids Method
julia
_get_cohort_person_ids(cohort_definition_id, cohort_df, conn; schema="dbt_synthea_dev")

Extract person IDs from either a cohort definition ID or a cohort DataFrame.

Arguments

  • cohort_definition_id - ID of the cohort definition in the cohort table (or nothing)

  • cohort_df - DataFrame containing cohort with person_id column (or nothing)

  • conn - Database connection object

  • schema - Database schema name (default: "dbt_synthea_dev")

Returns

  • Vector - Vector of unique person IDs

Notes

  • Provide exactly one of cohort_definition_id or cohort_df.

  • If both are provided, an error is thrown.
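
The "exactly one of" rule in the notes above can be sketched as a small validation step. The function name and the returned symbols are hypothetical. The notes only say that passing both throws an error; rejecting the case where neither is given, and the use of ErrorException, are assumptions of this sketch.

```julia
# Hypothetical sketch of the argument check; not the package's actual code.
function check_cohort_source_sketch(cohort_definition_id, cohort_df)
    has_id = !isnothing(cohort_definition_id)
    has_df = !isnothing(cohort_df)
    # Documented: providing both is an error.
    has_id && has_df && error("Provide cohort_definition_id or cohort_df, not both")
    # Assumption: providing neither is rejected as well.
    !has_id && !has_df && error("Provide one of cohort_definition_id or cohort_df")
    # Report which source the person IDs would be read from.
    return has_id ? :cohort_table : :dataframe
end

check_cohort_source_sketch(1, nothing)          # :cohort_table
check_cohort_source_sketch(nothing, [10, 11])   # :dataframe (any non-nothing value)
```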

OMOPCDMFeasibility._get_concept_name Method
julia
_get_concept_name(concept_id, conn; schema="main", dialect=:postgresql) -> String

Retrieves the human-readable name for a given OMOP concept ID.

Arguments

  • concept_id - OMOP CDM concept ID to look up

  • conn - Database connection using DBInterface

Keyword Arguments

  • schema - Database schema name. Default: "main"

  • dialect - Database dialect. Default: :postgresql (for DuckDB compatibility)

Returns

  • String - The concept name, or "Unknown" if the concept ID is not found

Examples

julia
name = _get_concept_name(8507, conn)
# Returns: "Male"

name = _get_concept_name(999999, conn) 
# Returns: "Unknown"
OMOPCDMFeasibility._get_concepts_by_domain Method
julia
_get_concepts_by_domain(concept_ids::Vector{<:Integer}, conn; schema="main", dialect=:postgresql) -> Dict{String, Vector{Int}}

Groups a list of OMOP concept IDs by their domain classification.

This function queries the concept table to determine which domain each concept belongs to (e.g., "Condition", "Drug", "Procedure") and returns them grouped by domain.

Arguments

  • concept_ids - Vector of OMOP concept IDs to classify

  • conn - Database connection using DBInterface

Keyword Arguments

  • schema - Database schema name. Default: "main"

  • dialect - Database dialect. Default: :postgresql (for DuckDB compatibility)

Returns

  • Dict{String, Vector{Int}} - Dictionary mapping domain names to vectors of concept IDs

Examples

julia
concepts = [201820, 192671, 1503297]
domains = _get_concepts_by_domain(concepts, conn)
# Returns: Dict("Condition" => [201820, 192671], "Drug" => [1503297])
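
The grouping step itself can be illustrated without a database. In the sketch below, an in-memory dictionary (domain_of, hypothetical) stands in for the concept table query; the function name is also hypothetical. Skipping IDs that have no known domain is an assumption, since the behaviour for missing concepts is not documented here.

```julia
# Hypothetical sketch: group concept IDs by domain using an in-memory lookup
# in place of the concept table query.
function group_by_domain_sketch(concept_ids::Vector{<:Integer},
                                domain_of::AbstractDict{<:Integer,<:AbstractString})
    grouped = Dict{String,Vector{Int}}()
    for id in concept_ids
        haskey(domain_of, id) || continue  # assumption: unknown IDs are skipped
        push!(get!(grouped, domain_of[id], Int[]), id)
    end
    return grouped
end

# Domain assignments taken from the documented example above.
domain_of = Dict(201820 => "Condition", 192671 => "Condition", 1503297 => "Drug")
domains = group_by_domain_sketch([201820, 192671, 1503297], domain_of)
# domains["Condition"] == [201820, 192671]; domains["Drug"] == [1503297]
```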
OMOPCDMFeasibility._get_database_total_patients Method
julia
_get_database_total_patients(conn; schema="dbt_synthea_dev")

Get the total number of patients in the database.

Arguments

  • conn - Database connection object

  • schema - Database schema name (default: "dbt_synthea_dev")

Returns

  • Int - Total count of people in the person table
OMOPCDMFeasibility._get_person_ids_from_cohort_table Method
julia
_get_person_ids_from_cohort_table(cohort_definition_id, conn; schema="dbt_synthea_dev")

Extract person IDs from the cohort table using a cohort definition ID.

Arguments

  • cohort_definition_id - ID of the cohort definition

  • conn - Database connection object

  • schema - Database schema name (default: "dbt_synthea_dev")

Returns

  • Vector - Vector of unique person IDs (subject_id from cohort table)
OMOPCDMFeasibility._get_person_ids_from_dataframe Method
julia
_get_person_ids_from_dataframe(cohort_df)

Extract person IDs from a cohort DataFrame.

Arguments

  • cohort_df - DataFrame containing cohort with person_id column

Returns

  • Vector - Vector of unique person IDs from the DataFrame
OMOPCDMFeasibility._resolve_table Method
julia
_resolve_table(fconn::SQLConnection, tblsym::Symbol) -> Table

Resolves a table symbol to its corresponding FunSQL table object.

This internal function looks up a table by name in the FunSQL catalog, performing case-insensitive matching.

Arguments

  • fconn - FunSQL SQLConnection object

  • tblsym - Table symbol to resolve

Returns

  • Table - FunSQL table object

Throws

  • ErrorException - If the table is not found in the catalog
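
Case-insensitive lookup with an ErrorException on a miss can be sketched as follows. A plain Dict keyed by Symbol stands in for the FunSQL catalog here; that substitution, and the name resolve_table_sketch, are assumptions made to keep the example free of dependencies.

```julia
# Hypothetical sketch: a Dict{Symbol,...} stands in for the FunSQL catalog.
function resolve_table_sketch(catalog::AbstractDict{Symbol}, tblsym::Symbol)
    wanted = lowercase(String(tblsym))
    for (name, tbl) in catalog
        # Compare names case-insensitively.
        lowercase(String(name)) == wanted && return tbl
    end
    # `error` throws an ErrorException, matching the documented failure mode.
    error("Table $(tblsym) not found in catalog")
end

catalog = Dict(:PERSON => "person table", :Concept => "concept table")
resolve_table_sketch(catalog, :person)   # "person table"
resolve_table_sketch(catalog, :CONCEPT)  # "concept table"
```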
OMOPCDMFeasibility._setup_domain_query Method
julia
_setup_domain_query(conn; domain::Symbol, schema::String="main", dialect::Symbol=:postgresql) -> NamedTuple

Sets up the necessary components for querying a specific domain table.

This internal function prepares all the components needed to query a domain-specific table including the FunSQL connection, resolved table objects, and appropriate concept column name.

Arguments

  • conn - Database connection

Keyword Arguments

  • domain - Domain table symbol (e.g., :condition_occurrence)

  • schema - Database schema name. Default: "main"

  • dialect - Database dialect. Default: :postgresql (for DuckDB compatibility)

Returns

  • NamedTuple - Contains fconn, tbl, concept_table, and concept_col components

Examples

julia
setup = _setup_domain_query(conn; domain=:condition_occurrence)
# Returns: (fconn=..., tbl=..., concept_table=..., concept_col=:condition_concept_id)
OMOPCDMFeasibility.analyze_concept_distribution Method
julia
analyze_concept_distribution(
    conn;
    concept_set::Vector{<:Integer},
    covariate_funcs::AbstractVector{<:Function} = Function[],
    schema::String = "main",
    dialect::Symbol = :postgresql
)

Analyzes the distribution of medical concepts across patient demographics by automatically detecting domains.

Arguments

  • conn - Database connection using DBInterface

  • concept_set - Vector of OMOP concept IDs to analyze (required keyword argument); the element type must be a subtype of Integer

Keyword Arguments

  • covariate_funcs - Vector of OMOPCDMCohortCreator functions for demographic stratification. Default: Function[]

  • schema - Database schema name. Default: "main"

  • dialect - Database dialect. Default: :postgresql (for DuckDB compatibility)

Returns

  • DataFrame - Summary statistics with columns for concept information, domain, covariate values, and patient counts (count)

Examples

julia
using OMOPCDMCohortCreator: GetPatientGender, GetPatientAgeGroup

# Basic concept summary with automatic domain detection
df = analyze_concept_distribution(conn; concept_set=[31967, 4059650])

# With demographic breakdown
df = analyze_concept_distribution(
    conn;
    concept_set=[31967, 4059650],
    covariate_funcs=[GetPatientGender, GetPatientAgeGroup]
)
OMOPCDMFeasibility.create_cartesian_profiles Method
julia
create_cartesian_profiles(;
    cohort_definition_id::Union{Int, Nothing} = nothing,
    cohort_df::Union{DataFrame, Nothing} = nothing,
    conn,
    covariate_funcs::AbstractVector{<:Function},
    schema::String = "dbt_synthea_dev",
    dialect::Symbol = :postgresql
)

Creates Cartesian product demographic profiles for a cohort by analyzing all combinations of covariates.

This function generates a single DataFrame containing all possible combinations of demographic covariates (e.g., gender × race × age_group), providing comprehensive cross-tabulated statistics for detailed post-cohort feasibility analysis. Column order matches the input covariate_funcs order, and results are sorted by covariate values for interpretable output.

Arguments

  • conn - Database connection using DBInterface

  • covariate_funcs - Vector of covariate functions from OMOPCDMCohortCreator (must contain at least 2 functions)

Keyword Arguments

  • cohort_definition_id - ID of the cohort definition in the cohort table (or nothing). Either this or cohort_df must be provided

  • cohort_df - DataFrame containing cohort with person_id column (or nothing). Either this or cohort_definition_id must be provided

  • schema - Database schema name. Default: "dbt_synthea_dev"

  • dialect - Database dialect. Default: :postgresql (for DuckDB compatibility)

Returns

  • DataFrame - Cross-tabulated profile table with all covariate combinations and statistics

Examples

julia
using OMOPCDMCohortCreator: GetPatientAgeGroup, GetPatientGender, GetPatientRace

cartesian_profiles = create_cartesian_profiles(
    cohort_df = my_cohort_df,
    conn = conn,
    covariate_funcs = [GetPatientAgeGroup, GetPatientGender, GetPatientRace]
)
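
To make the "all combinations" idea concrete, the sketch below cross-tabulates a tiny in-memory cohort with Iterators.product. Everything in it is hypothetical (the function name, the row format, the output fields), and it is not the package's implementation. In particular, listing combinations with a zero count is an assumption; the text above does not say whether empty combinations appear in the real output.

```julia
# Hypothetical sketch: cross-tabulate covariates over rows of NamedTuples.
function cartesian_profile_sketch(rows, cols)
    # Distinct, sorted levels for each covariate, in the order given by `cols`.
    levels = [sort(unique([r[c] for r in rows])) for c in cols]
    cohort_size = length(rows)
    profile = []
    for combo in Iterators.product(levels...)
        n = count(r -> all(r[c] == v for (c, v) in zip(cols, combo)), rows)
        push!(profile, (combo=combo, count=n, cohort_pct=100 * n / cohort_size))
    end
    # Sort by covariate values so the output reads in a stable order.
    return sort!(profile; by = p -> p.combo)
end

rows = [
    (gender="F", age_group="20-29"),
    (gender="F", age_group="20-29"),
    (gender="M", age_group="30-39"),
]
profile = cartesian_profile_sketch(rows, (:gender, :age_group))
# 2 genders x 2 age groups give 4 rows; ("F", "20-29") has count 2
```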
OMOPCDMFeasibility.create_individual_profiles Method
julia
create_individual_profiles(;
    cohort_definition_id::Union{Int, Nothing} = nothing,
    cohort_df::Union{DataFrame, Nothing} = nothing,
    conn,
    covariate_funcs::AbstractVector{<:Function},
    schema::String = "dbt_synthea_dev",
    dialect::Symbol = :postgresql
)

Creates individual demographic profile tables for a cohort by analyzing each covariate separately.

This function generates separate DataFrames for each demographic covariate (e.g., gender, race, age group), providing detailed statistics including cohort and database-level percentages for post-cohort feasibility analysis. Results are sorted alphabetically by covariate values for consistent, readable output.

Arguments

  • conn - Database connection using DBInterface

  • covariate_funcs - Vector of covariate functions from OMOPCDMCohortCreator (e.g., GetPatientGender, GetPatientRace)

Keyword Arguments

  • cohort_definition_id - ID of the cohort definition in the cohort table (or nothing). Either this or cohort_df must be provided

  • cohort_df - DataFrame containing cohort with person_id column (or nothing). Either this or cohort_definition_id must be provided

  • schema - Database schema name. Default: "dbt_synthea_dev"

  • dialect - Database dialect. Default: :postgresql (for DuckDB compatibility)

Returns

  • NamedTuple - Named tuple with keys corresponding to covariate names, each containing a DataFrame with covariate categories and statistics

Examples

julia
using OMOPCDMCohortCreator: GetPatientGender, GetPatientRace, GetPatientAgeGroup

individual_profiles = create_individual_profiles(
    cohort_df = my_cohort_df,
    conn = conn,
    covariate_funcs = [GetPatientGender, GetPatientRace, GetPatientAgeGroup]
)
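
The per-covariate statistics can be pictured with a small in-memory example. The sketch below is illustrative only: the function name and the field names (category, count, cohort_pct, database_pct) are hypothetical, and it assumes the database-level percentage is the category count divided by the total database population, which is not confirmed by the text above.

```julia
# Hypothetical sketch: profile one covariate from a vector of category values.
function individual_profile_sketch(values, database_size)
    cohort_size = length(values)
    counts = Dict{eltype(values),Int}()
    for v in values
        counts[v] = get(counts, v, 0) + 1
    end
    # One row per category, sorted alphabetically by category value.
    return [(category=k,
             count=counts[k],
             cohort_pct=100 * counts[k] / cohort_size,
             database_pct=100 * counts[k] / database_size)  # assumed definition
            for k in sort(collect(keys(counts)))]
end

gender_profile = individual_profile_sketch(["Male", "Female", "Female", "Female"], 200)
# First row: category "Female", count 3, cohort_pct 75.0, database_pct 1.5
```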
OMOPCDMFeasibility.generate_domain_breakdown Method
julia
generate_domain_breakdown(
    conn;
    concept_set::Vector{<:Integer},
    covariate_funcs::AbstractVector{<:Function} = Function[],
    schema::String = "main",
    dialect::Symbol = :postgresql,
    raw_values::Bool = false
)

Generates a detailed breakdown of feasibility metrics by medical domain.

This function provides domain-specific statistics showing concepts, patients, records, and coverage for each medical domain in the concept set. This is useful for understanding which domains contribute most to study feasibility.

Arguments

  • conn - Database connection using DBInterface

  • concept_set - Vector of OMOP concept IDs to analyze (required keyword argument); the element type must be a subtype of Integer

Keyword Arguments

  • covariate_funcs - Vector of OMOPCDMCohortCreator functions for demographic analysis. Default: Function[]

  • schema - Database schema name. Default: "main"

  • dialect - Database dialect. Default: :postgresql (for DuckDB compatibility)

  • raw_values - If true, returns raw numerical values; if false, returns formatted strings. Default: false

Returns

  • DataFrame - Domain-specific metrics with columns: metric, value, interpretation, and domain

Examples

julia
# Get formatted breakdown (default)
breakdown = generate_domain_breakdown(conn; concept_set=[31967, 4059650])

# Get raw numerical values for calculations
breakdown_raw = generate_domain_breakdown(conn; concept_set=[31967, 4059650], raw_values=true)
OMOPCDMFeasibility.generate_summary Method
julia
generate_summary(
    conn;
    concept_set::Vector{<:Integer},
    covariate_funcs::AbstractVector{<:Function} = Function[],
    schema::String = "main",
    dialect::Symbol = :postgresql,
    raw_values::Bool = false
)

Generates a summary of feasibility metrics for the given concept set.

This function provides high-level summary statistics including total patients, eligible patients, total records, and population coverage metrics. This is useful for getting a quick overview of study feasibility without detailed domain breakdowns.

Arguments

  • conn - Database connection using DBInterface

  • concept_set - Vector of OMOP concept IDs to analyze (required keyword argument); the element type must be a subtype of Integer

Keyword Arguments

  • covariate_funcs - Vector of OMOPCDMCohortCreator functions for demographic analysis. Default: Function[]

  • schema - Database schema name. Default: "main"

  • dialect - Database dialect. Default: :postgresql (for DuckDB compatibility)

  • raw_values - If true, returns raw numerical values; if false, returns formatted strings. Default: false

Returns

  • DataFrame - Summary metrics with columns: metric, value, interpretation, and domain

Examples

julia
# Get formatted summary (default)
summary = generate_summary(conn; concept_set=[31967, 4059650])

# Get raw numerical values for calculations
summary_raw = generate_summary(conn; concept_set=[31967, 4059650], raw_values=true)