


ICD_GEMs.jl is a Julia package that translates ICD-9 codes into ICD-10 and vice versa via the General Equivalence Mappings (GEMs) of the International Classification of Diseases (ICD).



The ICD provides a common language for the classification of diseases, injuries and causes of death, and for the standardised reporting and monitoring of health conditions. It is designed to map health conditions to corresponding generic categories together with specific variations, assigning each a designated code up to six characters long. These data form the basis of comparison and sharing between health providers, regions and countries, and over periods of time.

In addition to this essential core function, the ICD can also inform a wide range of related activities. It is used for health insurance reimbursement; in national health programme management; by data collection specialists and researchers; for tracking progress in global health; and to determine the allocation of health resources. Patient quality and safety documentation is also heavily informed by the ICD.

The ICD system is designed to promote international comparability in the collection, processing, classification, and presentation of health statistics, and health information in general. Currently, 117 countries report causes of death to WHO. Seventy per cent of the world’s health resources are allocated based on ICD data. Current uses include cancer registration and pharmacovigilance, and more than 20,000 scientific articles cite ICD-10.

For further details about the ICD, please see the References.


The General Equivalence Mappings (GEMs) are the product of a coordinated effort spanning several years and involving the National Center for Health Statistics (NCHS), the Centers for Medicare and Medicaid Services (CMS), the American Health Information Management Association (AHIMA), the American Hospital Association, and 3M Health Information Systems. They provide a temporary mechanism to link ICD-9 codes to ICD-10 and vice versa.

According to the CMS:

The purpose of the GEMs is to create a useful, practical, code to code translation reference dictionary for both code sets, and to offer acceptable translation alternatives wherever possible. For each code set, it endeavors to answer this question: Taking the complete meaning of a code (defined as: all correctly coded conditions or procedures that would be classified to a code based on the code title, all associated tabular instructional notes, and all index references that refer to a code) as a single unit, what are the most appropriate translation(s) to the other code set?

For further details on how the GEMs work, please see the References.


Press ] in the Julia REPL to enter package mode, then run:

pkg> add ICD_GEMs


Let us showcase the features of the package.

First, we import the necessary packages:

using ICD_GEMs

The GEMs converting ICD-10 codes into ICD-9 and vice versa have already been downloaded and are exported both as DataFrames (from DataFrames.jl) and as OrderedDicts (from DataStructures.jl):

  • GEM_I10_I9_dataframe: GEM from ICD-10 to ICD-9 represented as a dataframe;
  • GEM_I10_I9_dictionary: GEM from ICD-10 to ICD-9 represented as a dictionary;
  • GEM_I9_I10_dataframe: GEM from ICD-9 to ICD-10 represented as a dataframe;
  • GEM_I9_I10_dictionary: GEM from ICD-9 to ICD-10 represented as a dictionary.

These are all GEM structs consisting of two fields:

  • data: The actual GEM, which may either be a dataframe or a dictionary;
  • direction: A string, either "I10_I9" or "I9_I10".

The package can be used with custom GEMs (or "applied mappings") as long as they are wrapped inside a GEM struct whose data field is either a dataframe or a dictionary with exactly the same format as the exported ones above. Should the CDC release new GEMs, they can be loaded with the exported functions that read CDC-formatted .txt files:

path      = "path/to/txt"
direction = "I10_I9" # If the GEM translates from ICD-10 to ICD-9, otherwise "I9_I10"
gem       = get_GEM_dataframe_from_cdc_gem_txt(path, direction) # Or get_GEM_dictionary_from_cdc_gem_txt(path, direction)

Finally, let us show how to translate ICD-10 codes into ICD-9 for, as an example, neoplasms:

ICD_10_neoplasms = "C00-D48" # Equivalent to explicitly specifying all codes from C00.XX to D48.XX
ICD_9_neoplasms  = execute_applied_mapping(GEM_I10_I9_dictionary, [ICD_10_neoplasms])
932-element Vector{String}:

And back:

ICD_10_neoplasms_back = execute_applied_mapping(GEM_I9_I10_dictionary, ICD_9_neoplasms)
1186-element Vector{String}:

Are these the same as the codes we started from, namely all ICD-10 codes from the first that starts with C00 to the last that starts with D48? These would be:

get_ICD_code_range("C00-D48", "ICD-10") # Specify a code range and the revision it belongs to
1622-element Vector{String}:

No: the full range contains more codes (1622) than the round trip returns (1186). Why? Because the mapping between the two revisions is neither injective nor surjective (see the official documentation).
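To make the point concrete, here is a minimal, self-contained sketch of a round trip through a mapping that is neither injective nor surjective. The codes "A1", "X1", etc. and the `translate` helper are made up for illustration; they are not real ICD codes and not part of ICD_GEMs.jl.

```julia
# Toy forward and backward mappings. The codes are hypothetical and are
# NOT taken from the real GEMs.
forward = Dict(
    "A1" => ["X1"],
    "A2" => ["X1"],     # two sources share one target: not injective
    "A3" => String[],   # this source has no translation at all
)
backward = Dict(
    "X1" => ["A1"],     # the backward map returns only one of the two sources
    "X2" => ["A4"],     # "X2" is never produced by the forward map
)

# Hypothetical helper: translate a list of codes and collect the unique
# targets in sorted order. Codes missing from the mapping are skipped.
function translate(mapping, codes)
    targets = [get(mapping, c, String[]) for c in codes]
    return sort(unique(reduce(vcat, targets; init = String[])))
end

original   = ["A1", "A2", "A3"]
there      = translate(forward, original)   # ["X1"]
back_again = translate(backward, there)     # ["A1"]

println(length(original), " codes in, ", length(back_again), " codes back")
```

Three codes go in and only one comes back, which mirrors (on a much smaller scale) the gap between the full range and the round-trip result.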

More complex translations, involving both single codes and ranges with arbitrary precision on both ends, can be performed:

execute_applied_mapping(GEM_I10_I9_dictionary, ["I60-I661", "I670", "I672-I679"])

All codes must be specified without punctuation (no dot before the decimal digits), and the translation utilities return them in the same form.
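If your codes come from a source that writes them with a dot, they can be normalized before translation. The snippet below is a small sketch using only Base Julia; `strip_dot` is a hypothetical helper name, not a function exported by ICD_GEMs.jl.

```julia
# Hypothetical helper (not part of ICD_GEMs.jl): remove the dot from a code
# so that, for example, "I67.0" becomes "I670".
strip_dot(code::AbstractString) = replace(code, "." => "")

dotted   = ["C00.1", "I67.0", "D48"]
undotted = strip_dot.(dotted)   # ["C001", "I670", "D48"]

println(undotted)
```

Codes that contain no dot pass through unchanged, so the helper can be applied to mixed input.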

How to Contribute

If you wish to change or add some functionality, please file an issue.

How to Cite

If you use this package in your work, please cite this repository using the metadata in CITATION.bib.