Observational Health Study

This workflow guide demonstrates how to initialize and run an observational health study using HealthBase.jl.
By the end of this tutorial, you will be able to:

  • Initialize a new observational health study project using HealthBase.jl
  • Download phenotype definitions and concept sets from OHDSI ATLAS via WebAPI
  • Translate OHDSI cohort definitions to SQL using OHDSICohortExpressions.jl
  • Execute translated SQL against an OMOP CDM v5.3 database using FunSQL.jl

1. Setup and Study Initialization

First, ensure that the required packages are installed in your global Julia environment:

import Pkg
Pkg.add(
  [
    "DrWatson",
    "HealthBase"
  ]
)

Note: The global environment is the default Julia package environment shared across projects. To learn more about environments, see the Pkg documentation.

Then, load the packages:

using DrWatson
using HealthBase

import HealthBase:
  cohortsdir

Initialize a new observational health study:

initialize_study("sample_study", "Jenna Reps"; template = :observational)

This command creates a new directory called sample_study using the :observational template. Then, it activates a new Julia environment named sample_study.
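Before installing more packages, it can be worth confirming which environment is active, since Pkg.add writes to whichever project is currently active. The following is a minimal sketch using only Base and the Pkg standard library; it makes no assumptions about HealthBase.jl beyond what is described above.

```julia
import Pkg

# Path of the project file that Pkg commands currently target
# (`nothing` if no project is active).
active = Base.active_project()
println("Active project: ", active)

# List the packages recorded in that environment.
Pkg.status()
```

If the printed path does not point inside the sample_study directory, the packages in the next step would be added to a different environment than the study's.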

After initializing the study directory and Julia environment, install the remaining required packages:

Pkg.add(
  [
    "DataFrames",
    "Downloads",
    "DBInterface",
    "DuckDB",
    "FunSQL",
    "OHDSIAPI",
    "OHDSICohortExpressions"
  ]
)

Now, load and import all necessary packages and functions:

using DataFrames
using Downloads

import DBInterface:
  connect,
  execute
import DuckDB:
  DB
import FunSQL:
  reflect,
  render
import OHDSIAPI:
  download_cohort_definition,
  download_concept_set
import OHDSICohortExpressions:
  translate

2. Download OHDSI Cohort Definitions

OHDSIAPI.jl is a Julia interface to various OHDSI WebAPI services. We can use it to access OHDSI ATLAS, OHDSI's web-based tool for defining phenotypes and analyses.

Here, we can download a single cohort definition using its ATLAS ID:

cohort_path = download_cohort_definition(1793014; output_dir=cohortsdir())

Tip: To download multiple cohort definitions with more verbose output:

cohort_ids = [1793014, 1792956]
download_cohort_definition(cohort_ids; progress_bar=true, verbose=true)

You can also download associated concept sets:

download_concept_set(cohort_ids; deflate=true, output_dir=cohortsdir())
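The translation step that follows reads a file named after the cohort ID from the cohorts directory, so a quick check that the expected JSON files exist can save a confusing error later. The sketch below uses only Base functions; list_json_files is a hypothetical helper written for this guide, not part of OHDSIAPI.jl, and it is demonstrated on a temporary directory so that it runs on its own.

```julia
# Hypothetical helper (not part of OHDSIAPI.jl):
# return the names of the `.json` files found in `dir`.
function list_json_files(dir::AbstractString)
    isdir(dir) || return String[]
    return sort(filter(endswith(".json"), readdir(dir)))
end

# Demonstration on a throwaway directory.
demo_dir = mktempdir()
touch(joinpath(demo_dir, "1793014.json"))
touch(joinpath(demo_dir, "notes.txt"))
println(list_json_files(demo_dir))
```

In the study project, the same helper could be called as list_json_files(cohortsdir()).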

3. Translate Cohort Definitions to SQL

Now, we can use OHDSICohortExpressions.jl to convert this cohort definition into SQL.

# Read the downloaded cohort definition as a JSON string
cohort_expression = read(cohortsdir("1793014.json"), String)

fun_sql = translate(
    cohort_expression;
    cohort_definition_id = 1
)

4. Download Synthetic Database

For this guide, we will use a synthetic OMOP CDM v5.3 database from Eunomia. We will download it as follows:

# TODO: Add download URL
url = ""
db_path = datadir("exp_raw", "omop_cdm.db")
Downloads.download(url, db_path)
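Because the URL above is still unset, calling Downloads.download as written would fail. One way to make this step safer to re-run is a small guard that reuses an existing file and refuses to download from an empty URL. This is a sketch only; fetch_once is a hypothetical helper name, not part of Downloads or HealthBase.jl.

```julia
using Downloads

# Hypothetical helper: download `url` to `path` unless the file already exists.
function fetch_once(url::AbstractString, path::AbstractString)
    isfile(path) && return path   # reuse an earlier download
    isempty(url) && error("No download URL set; fill in `url` first")
    mkpath(dirname(path))         # make sure the target directory exists
    return Downloads.download(url, path)
end

# Demonstration without network access: an existing file is returned as-is.
existing = tempname()
touch(existing)
println(fetch_once("", existing) == existing)
```

Once a URL is available, the call in the study would be fetch_once(url, db_path).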

5. Execute the Cohort on a Database

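Create a database connection and configure the schema and dialect:

const CONNECTION = connect(DB, datadir("exp_raw", "omop_cdm.db"))
# "main" is DuckDB's default schema; adjust if the CDM tables live elsewhere
const SCHEMA = "main"
# Match the SQL dialect to the DuckDB connection
const DIALECT = :duckdb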

Reflect database catalog and render SQL:

catalog = reflect(CONNECTION; schema=SCHEMA, dialect=DIALECT)
sql = render(catalog, fun_sql)
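The INSERT in the next step assumes that a cohort table already exists in the database. Whether the downloaded database includes one is not stated here, so the sketch below shows one way to create it if it is missing. The four columns follow the OMOP CDM cohort table layout; ensure_cohort_table is a hypothetical helper name, and the demonstration runs against an in-memory DuckDB database rather than the study database.

```julia
import DBInterface
import DuckDB

# Hypothetical helper: create the OMOP CDM cohort table if it is absent.
function ensure_cohort_table(conn)
    DBInterface.execute(conn, """
        CREATE TABLE IF NOT EXISTS cohort (
            cohort_definition_id INTEGER NOT NULL,
            subject_id           INTEGER NOT NULL,
            cohort_start_date    DATE    NOT NULL,
            cohort_end_date      DATE    NOT NULL
        )
    """)
    return nothing
end

# Demonstration on an in-memory database.
demo_conn = DBInterface.connect(DuckDB.DB, ":memory:")
ensure_cohort_table(demo_conn)

# The table now accepts rows in the cohort layout.
DBInterface.execute(
    demo_conn,
    "INSERT INTO cohort VALUES (1, 42, DATE '2020-01-01', DATE '2020-12-31')"
)
DBInterface.close!(demo_conn)
```

In the study, the helper would be called as ensure_cohort_table(CONNECTION) before the INSERT. Because the statement uses IF NOT EXISTS, calling it on a database that already has the table leaves the table unchanged.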

Execute cohort query and insert results into cohort table:

execute(
  CONNECTION,
  """
  INSERT INTO cohort
  SELECT * FROM ($sql) AS foo;
  """
)

Query results into a DataFrame:

df = execute(CONNECTION, "SELECT COUNT(*) FROM cohort;") |> DataFrame

Display DataFrame:

println(df)

Summary

This workflow demonstrates how to run an observational health study using tools from the JuliaHealth ecosystem:

  • Initialize a project with a standardized structure using HealthBase.jl
  • Download cohort and concept set definitions from OHDSI ATLAS using OHDSIAPI.jl
  • Convert JSON cohort logic to SQL using OHDSICohortExpressions.jl
  • Execute SQL queries on an OMOP CDM v5.3 database using FunSQL.jl and DuckDB