Package Examples
This page gives focused, runnable examples of each JuliaHealth package used in the Patient-Level Prediction workflow. Each section corresponds to one step in the pipeline - from initializing a study to querying the database.
In this tutorial, you will:
Initialize an observational study scaffold with HealthBase.jl
Download cohort definitions and concept sets using OHDSIAPI.jl
Translate cohort JSON to SQL expressions using OHDSICohortExpressions.jl
Execute translated SQL over OMOP CDM with FunSQL.jl and DBInterface.jl
Prerequisites
First, install the required packages into your Julia environment:
import Pkg
Pkg.add(
[
"DataFrames",
"DBInterface",
"DuckDB",
"FunSQL",
"HealthBase",
"OHDSIAPI",
"OHDSICohortExpressions"
]
)Then load everything needed for the examples below:
using DataFrames
import DBInterface:
connect,
execute
import DuckDB:
DB
import FunSQL:
reflect,
render
import OHDSIAPI:
download_cohort_definition,
download_concept_set
import OHDSICohortExpressions:
translate1. Initialize a Study with HealthBase.jl
HealthBase.jl provides the scaffolding for a reproducible observational health study. It creates a standardized directory layout and activates a dedicated Julia environment for your project.
Initialize the study:
using HealthBase
import HealthBase: cohortsdir
# Creates a new project directory using the observational template
# and activates a dedicated Julia environment named after the study
initialize_study("hypertension_to_pneumonia_plp", "Kosuri Lakshmi Indu"; template = :observational)This creates:
hypertension_to_pneumonia_plp/
├-- cohorts/ ← cohort JSON definitions land here
├-- results/
└-- study.tomlcohortsdir() returns the absolute path to the cohorts/ subfolder - used by both the download and translate steps below.
2. Download Cohort Definitions with OHDSIAPI.jl
OHDSIAPI.jl connects to any OHDSI ATLAS instance and downloads phenotype definitions as JSON files that OHDSICohortExpressions.jl can consume.
Download a single cohort definition by its ATLAS ID:
# Returns the local path of the downloaded JSON file
cohort_path = download_cohort_definition(1792865; output_dir = cohortsdir())To download multiple cohort definitions at once with verbose progress:
# Cohort IDs from the ATLAS demo server:
# 1792865 -> Hypertension (target cohort)
# 1790632 -> Pneumonia (outcome cohort)
cohort_ids = [1792865, 1790632]
download_cohort_definition(cohort_ids; progress_bar = true, verbose = true, output_dir = cohortsdir())You can also download the associated OMOP concept sets - the specific clinical codes (SNOMED, RxNorm, etc.) that define each phenotype. This is useful for auditing exactly which diagnoses, drugs, or procedures are included in a cohort definition:
download_concept_set(cohort_ids; deflate = true, output_dir = cohortsdir())Each concept set JSON lists the individual concept IDs and their names, so you can verify that the phenotype captures what you expect.
Tip: Visit atlas-demo.ohdsi.org/#/cohortdefinitions to explore available phenotype definitions and find the ATLAS cohort ID for any condition of interest. You can also browse concept sets directly at atlas-demo.ohdsi.org/#/conceptsets.
3. Translate Cohort JSON to SQL with OHDSICohortExpressions.jl
OHDSICohortExpressions.jl converts an ATLAS cohort JSON file into a FunSQL.jl query expression. No R, Python, or ATLAS WebAPI connection is needed at this stage - everything runs locally in Julia.
using HealthBase: cohortsdir
# Path to the downloaded target cohort JSON
cohort_expression = cohortsdir("1792865.json")
# Translate to a FunSQL expression
# cohort_definition_id must match the ID you will INSERT into the cohort table
fun_sql = translate(cohort_expression; cohort_definition_id = 1)Repeat for the outcome cohort (cohort_definition_id = 2):
outcome_expression = cohortsdir("1790632.json")
fun_sql_outcome = translate(outcome_expression; cohort_definition_id = 2)4. Run SQL Against OMOP CDM with FunSQL.jl and DBInterface.jl
FunSQL.jl provides type-safe, composable SQL query construction. DBInterface.jl gives a unified interface for connecting to and querying any supported database.
Connect to DuckDB
DuckDB is an embedded analytical database - no server, no setup, just a file.
const CONNECTION = connect(DB, "/path/to/omop_cdm.duckdb")
const SCHEMA = "dbt_synthea_dev"
const DIALECT = :duckdbReflect, Render, and Execute
# Read the live schema so FunSQL knows what tables and columns exist
catalog = reflect(CONNECTION; schema = SCHEMA, dialect = DIALECT)
# Render the FunSQL expression to a SQL string
sql = render(catalog, fun_sql)
# Insert the target cohort population into the cohort table
execute(
CONNECTION,
"""
INSERT INTO $SCHEMA.cohort
SELECT * FROM ($sql) AS foo;
"""
)Verify the Cohort Was Populated
After inserting, query the cohort table to confirm row counts:
df = execute(CONNECTION, "SELECT COUNT(*) FROM $SCHEMA.cohort WHERE cohort_definition_id = 1;") |> DataFrame
println(df)Expected output:
1×1 DataFrame Row │ count_star() │ Int64 -----┼-------------- 1 │ 269607
Repeat for the outcome cohort and verify:
sql_outcome = render(catalog, fun_sql_outcome)
execute(
CONNECTION,
"""
INSERT INTO $SCHEMA.cohort
SELECT * FROM ($sql_outcome) AS foo;
"""
)
df2 = execute(CONNECTION, "SELECT COUNT(*) FROM $SCHEMA.cohort WHERE cohort_definition_id = 2;") |> DataFrame
println(df2)Expected output:
1×1 DataFrame Row │ count_star() │ Int64 -----┼-------------- 1 │ 13461
Ad-hoc Queries with FunSQL
FunSQL is also useful for building exploratory queries in a composable, type-safe way - for example, peeking at the person table to inspect raw patient data:
using FunSQL: From, Select, Limit
query = From(:person) |>
Select(:person_id, :year_of_birth, :gender_concept_id) |>
Limit(10)
result = execute(CONNECTION, render(catalog, query))
df = DataFrame(result)
println(df)