Using UMLS Concepts with MeSH
The Medical Subject Headings (MeSH) terms returned by a PubMed search can be analyzed further by mapping them to Unified Medical Language System (UMLS) concepts and by filtering them by concept.
For both mapping MeSH to UMLS Concepts and filtering MeSH by concept, the following backends are supported:
- MySQL
- SQLite
- DataFrames
Set Up
using SQLite
using MySQL
using BioMedQuery.DBUtils
using BioMedQuery.Processes
using BioServices.UMLS
using BioMedQuery.PubMed
using DataFrames
Credentials are read from environment variables (e.g., set in your ~/.julia/config/startup.jl, or in .juliarc.jl on Julia versions before 0.7)
umls_user = ENV["UMLS_USER"];
umls_pswd = ENV["UMLS_PSSWD"];
email = ""; # Only needed if you want to contact NCBI with inquiries
search_term = """(obesity[MeSH Major Topic]) AND ("2010"[Date - Publication] : "2012"[Date - Publication])""";
umls_concept = "Disease or Syndrome";
max_articles = 5;
verbose = true;
results_dir = ".";
"."
MySQL
Map Medical Subject Headings (MeSH) to UMLS
This example shows the typical workflow for populating the mesh2umls database table, which relates every MeSH term in the input database to its associated UMLS concepts.
Note: this example reuses the MySQL DB from the PubMed Search and Save example.
Create MySQL DB connection
host = "127.0.0.1";
mysql_usr = "root";
mysql_pswd = "";
dbname = "pubmed_obesity_2010_2012";
db_mysql = MySQL.connect(host, mysql_usr, mysql_pswd, db = dbname);
MySQL Connection
------------
Host: 127.0.0.1
Port: 3306
User: root
DB: pubmed_obesity_2010_2012
Map MeSH to UMLS
@time map_mesh_to_umls_async!(db_mysql, umls_user, umls_pswd; append_results=false, timeout=3);
┌ Warning: `getindex(df::DataFrame, col_ind::ColumnIndex)` is deprecated, use `df[!, col_ind]` instead.
│ caller = #map_mesh_to_umls_async!#7(::Int64, ::Bool, ::Bool, ::Int64, ::Function, ::MySQL.Connection, ::String, ::String) at pubmed_mesh_to_umls_map.jl:44
└ @ BioMedQuery.Processes ~/build/bcbi/BioMedQuery.jl/src/Processes/pubmed_mesh_to_umls_map.jl:44
----------Matching MESH to UMLS-----------
["Adult", "Aged", "Aged, 80 and over", "Analysis of Variance", "Body Weight", "C-Reactive Protein", "Child", "Cross-Sectional Studies", "Fatigue", "Female", "Fibromyalgia", "Germany", "Health Status", "Humans", "Japan", "Male", "Middle Aged", "Nutrition Surveys", "Obesity", "Pain", "Pain Measurement", "Physical Fitness", "Prognosis", "Quality of Life", "Surveys and Questionnaires", "Reference Values", "Risk Factors", "ROC Curve", "Severity of Illness Index", "Sports", "Television", "Thyrotropin", "Biomarkers", "Weight Gain", "Exercise", "Body Mass Index", "Incidence", "Prevalence", "Logistic Models", "Odds Ratio", "Case-Control Studies", "Age Distribution", "Sex Distribution", "Sleep Apnea, Obstructive", "Metabolic Syndrome", "Overweight", "Waist Circumference", "Young Adult", "Obesity, Abdominal", "Republic of Korea", "Sedentary Behavior", "Pediatric Obesity"]
[ Info: UTS: Requesting new TGT
[ Info: Descriptor 1 out of 52: Adult
[ Info: Descriptor 31 out of 52: Television
[ Info: Descriptor 36 out of 52: Body Mass Index
[ Info: Descriptor 35 out of 52: Exercise
[ Info: Descriptor 32 out of 52: Thyrotropin
[ Info: Descriptor 30 out of 52: Sports
[ Info: Descriptor 29 out of 52: Severity of Illness Index
[ Info: Descriptor 34 out of 52: Weight Gain
[ Info: Descriptor 40 out of 52: Odds Ratio
[ Info: Descriptor 44 out of 52: Sleep Apnea, Obstructive
[ Info: Descriptor 39 out of 52: Logistic Models
[ Info: Descriptor 43 out of 52: Sex Distribution
[ Info: Descriptor 42 out of 52: Age Distribution
[ Info: Descriptor 37 out of 52: Incidence
[ Info: Descriptor 33 out of 52: Biomarkers
[ Info: Descriptor 41 out of 52: Case-Control Studies
[ Info: Descriptor 9 out of 52: Fatigue
[ Info: Descriptor 4 out of 52: Analysis of Variance
[ Info: Descriptor 3 out of 52: Aged, 80 and over
[ Info: Descriptor 6 out of 52: C-Reactive Protein
[ Info: Descriptor 7 out of 52: Child
[ Info: Descriptor 5 out of 52: Body Weight
[ Info: Descriptor 46 out of 52: Overweight
[ Info: Descriptor 2 out of 52: Aged
[ Info: Descriptor 15 out of 52: Japan
[ Info: Descriptor 10 out of 52: Female
[ Info: Descriptor 12 out of 52: Germany
[ Info: Descriptor 14 out of 52: Humans
[ Info: Descriptor 8 out of 52: Cross-Sectional Studies
[ Info: Descriptor 21 out of 52: Pain Measurement
[ Info: Descriptor 11 out of 52: Fibromyalgia
[ Info: Descriptor 13 out of 52: Health Status
[ Info: Descriptor 18 out of 52: Nutrition Surveys
[ Info: Descriptor 17 out of 52: Middle Aged
[ Info: Descriptor 20 out of 52: Pain
[ Info: Descriptor 19 out of 52: Obesity
[ Info: Descriptor 28 out of 52: ROC Curve
[ Info: Descriptor 23 out of 52: Prognosis
[ Info: Descriptor 25 out of 52: Surveys and Questionnaires
[ Info: Descriptor 16 out of 52: Male
[ Info: Descriptor 26 out of 52: Reference Values
[ Info: Descriptor 22 out of 52: Physical Fitness
[ Info: Descriptor 27 out of 52: Risk Factors
[ Info: Descriptor 38 out of 52: Prevalence
[ Info: Descriptor 24 out of 52: Quality of Life
[ Info: Descriptor 50 out of 52: Republic of Korea
[ Info: Descriptor 49 out of 52: Obesity, Abdominal
[ Info: Descriptor 51 out of 52: Sedentary Behavior
[ Info: Descriptor 47 out of 52: Waist Circumference
[ Info: Descriptor 45 out of 52: Metabolic Syndrome
[ Info: Descriptor 48 out of 52: Young Adult
[ Info: Descriptor 52 out of 52: Pediatric Obesity
[ Info: Descriptor 51 out of 52: Sedentary Behavior
11.464593 seconds (8.18 M allocations: 407.230 MiB, 2.39% gc time)
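The log above reports descriptors out of numerical order, which is what one would expect when lookups run as concurrent tasks and each finishes whenever its response arrives. The sketch below illustrates that general pattern with Julia's `@sync` and `@async`. `fake_lookup` is a hypothetical stand-in for a UMLS request; it is not part of BioMedQuery, whose internals may differ.

```julia
# Hypothetical stand-in for a remote lookup; sleeps to mimic network latency.
function fake_lookup(term::String)
    sleep(0.01 * (length(term) % 3))
    return uppercase(term)
end

terms = ["Adult", "Aged", "Obesity", "Pain"]
results = Dict{String,String}()
completion_order = String[]

# @sync blocks until every task started with @async inside it has finished.
@sync for term in terms
    @async begin
        results[term] = fake_lookup(term)
        push!(completion_order, term)  # order depends on latency, not on input order
    end
end
```

All terms end up in `results`, but `completion_order` need not match the order of `terms`.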
Explore the output table
db_query(db_mysql, "SELECT * FROM mesh2umls")
Row | mesh (String) | umls (String) |
---|---|---|
1 | Adult | Age Group |
2 | Age Distribution | Quantitative Concept |
3 | Aged | Organism Attribute |
4 | Aged, 80 and over | Age Group |
5 | Analysis of Variance | Quantitative Concept |
6 | Biomarkers | Clinical Attribute |
7 | Body Mass Index | Diagnostic Procedure |
8 | Body Weight | Organism Attribute |
9 | C-Reactive Protein | Amino Acid, Peptide, or Protein |
10 | C-Reactive Protein | Immunologic Factor |
11 | Case-Control Studies | Research Activity |
12 | Child | Age Group |
13 | Cross-Sectional Studies | Research Activity |
14 | Exercise | Daily or Recreational Activity |
15 | Fatigue | Sign or Symptom |
16 | Female | Population Group |
17 | Fibromyalgia | Disease or Syndrome |
18 | Germany | Geographic Area |
19 | Health Status | Qualitative Concept |
20 | Humans | Human |
21 | Incidence | Quantitative Concept |
22 | Japan | Geographic Area |
23 | Logistic Models | Intellectual Product |
24 | Logistic Models | Quantitative Concept |
25 | Male | Organism Attribute |
26 | Metabolic Syndrome | Disease or Syndrome |
27 | Middle Aged | Age Group |
28 | Nutrition Surveys | Research Activity |
29 | Obesity | Disease or Syndrome |
30 | Obesity, Abdominal | Finding |
31 | Odds Ratio | Quantitative Concept |
32 | Overweight | Finding |
33 | Pain | Food |
34 | Pain Measurement | Diagnostic Procedure |
35 | Pediatric Obesity | Disease or Syndrome |
36 | Physical Fitness | Idea or Concept |
37 | Prevalence | Quantitative Concept |
38 | Prognosis | Health Care Activity |
39 | Quality of Life | Idea or Concept |
40 | Reference Values | Quantitative Concept |
41 | Republic of Korea | Geographic Area |
42 | Risk Factors | Finding |
43 | ROC Curve | Quantitative Concept |
44 | Sedentary Behavior | Finding |
45 | Severity of Illness Index | Quantitative Concept |
46 | Sex Distribution | Quantitative Concept |
47 | Sleep Apnea, Obstructive | Disease or Syndrome |
48 | Sports | Daily or Recreational Activity |
49 | Surveys and Questionnaires | Research Activity |
50 | Television | Manufactured Object |
51 | Thyrotropin | Amino Acid, Peptide, or Protein |
52 | Thyrotropin | Hormone |
53 | Thyrotropin | Pharmacologic Substance |
54 | Waist Circumference | Clinical Attribute |
55 | Weight Gain | Finding |
56 | Young Adult | Age Group |
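Some descriptors appear more than once in the table (for example, C-Reactive Protein and Thyrotropin), so the mapping from MeSH term to UMLS concept is one-to-many. As a rough illustration in plain Julia, using a few rows copied from the table above and illustrative variable names, the rows can be grouped by descriptor or filtered by concept:

```julia
# A few (mesh, umls) rows copied from the table above.
rows = [
    (mesh = "C-Reactive Protein", umls = "Amino Acid, Peptide, or Protein"),
    (mesh = "C-Reactive Protein", umls = "Immunologic Factor"),
    (mesh = "Fibromyalgia", umls = "Disease or Syndrome"),
    (mesh = "Obesity", umls = "Disease or Syndrome"),
    (mesh = "Overweight", umls = "Finding"),
]

# Group the concepts by descriptor (one descriptor may have several concepts).
concepts_by_mesh = Dict{String,Vector{String}}()
for r in rows
    push!(get!(concepts_by_mesh, r.mesh, String[]), r.umls)
end

# Descriptors associated with a single concept of interest.
disease_terms = sort(unique(r.mesh for r in rows if r.umls == "Disease or Syndrome"))
```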
Filtering MeSH terms by UMLS concept
Getting the descriptor to index dictionary and the occurrence matrix
@time labels2ind, occur = umls_semantic_occurrences(db_mysql, umls_concept);
(Dict("Obesity"=>1,"Pediatric Obesity"=>2,"Sleep Apnea, Obstructive"=>3,"Metabolic Syndrome"=>4,"Fibromyalgia"=>5),
[1, 1] = 1.0
[5, 1] = 1.0
[1, 2] = 1.0
[3, 2] = 1.0
[2, 3] = 1.0
[1, 4] = 1.0
[4, 5] = 1.0)
Descriptor to Index Dictionary
labels2ind
Dict{String,Int64} with 5 entries:
"Obesity" => 1
"Pediatric Obesity" => 2
"Sleep Apnea, Obstructive" => 3
"Metabolic Syndrome" => 4
"Fibromyalgia" => 5
Output Data Matrix
Matrix(occur)
5×5 Array{Float64,2}:
1.0 1.0 0.0 1.0 0.0
0.0 0.0 1.0 0.0 0.0
0.0 1.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 1.0
1.0 0.0 0.0 0.0 0.0
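To read this output, it helps to rebuild it by hand. The sketch below assumes that each row corresponds to a descriptor (indexed by `labels2ind`) and each column to an article; that reading is an interpretation of the output above and should be checked against the package documentation. The entries are copied from the printed sparse matrix, and the helper names are illustrative.

```julia
using SparseArrays

labels2ind = Dict("Obesity" => 1, "Pediatric Obesity" => 2,
                  "Sleep Apnea, Obstructive" => 3, "Metabolic Syndrome" => 4,
                  "Fibromyalgia" => 5)

# Nonzero entries [row, column] copied from the output above.
row_idx = [1, 5, 1, 3, 2, 1, 4]
col_idx = [1, 1, 2, 2, 3, 4, 5]
occur = sparse(row_idx, col_idx, ones(length(row_idx)), 5, 5)

# Invert the dictionary to look up a descriptor by its row index.
ind2label = Dict(i => label for (label, i) in labels2ind)

# Assumed reading: a row sum is the number of articles tagged with that descriptor.
article_counts = Dict(ind2label[i] => Int(sum(occur[i, :])) for i in 1:size(occur, 1))

# Descriptors present in article (column) j.
terms_in_article(j) = sort([ind2label[i] for i in findall(!iszero, occur[:, j])])
```

Under that assumption, `article_counts["Obesity"]` counts the articles indexed with Obesity, and `terms_in_article(1)` lists the matching descriptors of the first article.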
SQLite
This example shows the typical workflow for populating the mesh2umls database table, which relates every MeSH term in the input database to its associated UMLS concepts.
Note: this example reuses the SQLite DB from the PubMed Search and Save example.
Create SQLite DB connection
db_path = "$(results_dir)/pubmed_obesity_2010_2012.db";
db_sqlite = SQLite.DB(db_path);
Getting 5 articles, starting at index 0
------ESearch--------
------EFetch--------
------Save to database--------
Saving 5 articles to database
┌ Warning: `getindex(df::DataFrame, col_ind::ColumnIndex)` is deprecated, use `df[!, col_ind]` instead.
│ caller = select_columns at sqlite_db_utils.jl:9 [inlined]
└ @ Core ~/build/bcbi/BioMedQuery.jl/src/DBUtils/sqlite_db_utils.jl:9
┌ Warning: `getindex(df::DataFrame, col_ind::ColumnIndex)` is deprecated, use `df[!, col_ind]` instead.
│ caller = insert_row!(::SQLite.DB, ::String, ::Dict{Symbol,Any}, ::Bool) at sqlite_db_utils.jl:59
└ @ BioMedQuery.DBUtils ~/build/bcbi/BioMedQuery.jl/src/DBUtils/sqlite_db_utils.jl:59
Finished searching, total number of articles: 5
Map MeSH to UMLS
@time map_mesh_to_umls_async!(db_sqlite, umls_user, umls_pswd; append_results=false, timeout=3);
┌ Warning: `getindex(df::DataFrame, col_ind::ColumnIndex)` is deprecated, use `df[!, col_ind]` instead.
│ caller = #map_mesh_to_umls_async!#7(::Int64, ::Bool, ::Bool, ::Int64, ::Function, ::SQLite.DB, ::String, ::String) at pubmed_mesh_to_umls_map.jl:44
└ @ BioMedQuery.Processes ~/build/bcbi/BioMedQuery.jl/src/Processes/pubmed_mesh_to_umls_map.jl:44
----------Matching MESH to UMLS-----------
Union{Missing, String}["Reference Values", "Republic of Korea", "ROC Curve", "Fatigue", "Obesity", "Risk Factors", "Logistic Models", "Severity of Illness Index", "Male", "Case-Control Studies", "Analysis of Variance", "Sedentary Behavior", "Prevalence", "Quality of Life", "Odds Ratio", "Exercise", "Body Mass Index", "Aged", "Child", "Sex Distribution", "Adult", "Germany", "Sports", "Thyrotropin", "Pediatric Obesity", "Humans", "Japan", "Cross-Sectional Studies", "Weight Gain", "Middle Aged", "Surveys and Questionnaires", "Health Status", "Young Adult", "Incidence", "Prognosis", "Body Weight", "Pain Measurement", "Waist Circumference", "Metabolic Syndrome", "Pain", "Nutrition Surveys", "Fibromyalgia", "Sleep Apnea, Obstructive", "Television", "Age Distribution", "Overweight", "Physical Fitness", "Female", "Biomarkers", "Obesity, Abdominal", "C-Reactive Protein", "Aged, 80 and over"]
[ Info: UTS: Reading TGT from file
[ Info: Descriptor 5 out of 52: Obesity
[ Info: Descriptor 4 out of 52: Fatigue
[ Info: Descriptor 8 out of 52: Severity of Illness Index
[ Info: Descriptor 1 out of 52: Reference Values
[ Info: Descriptor 9 out of 52: Male
[ Info: Descriptor 6 out of 52: Risk Factors
[ Info: Descriptor 12 out of 52: Sedentary Behavior
[ Info: Descriptor 10 out of 52: Case-Control Studies
[ Info: Descriptor 16 out of 52: Exercise
[ Info: Descriptor 11 out of 52: Analysis of Variance
[ Info: Descriptor 13 out of 52: Prevalence
[ Info: Descriptor 14 out of 52: Quality of Life
[ Info: Descriptor 15 out of 52: Odds Ratio
[ Info: Descriptor 24 out of 52: Thyrotropin
[ Info: Descriptor 23 out of 52: Sports
[ Info: Descriptor 22 out of 52: Germany
[ Info: Descriptor 26 out of 52: Humans
[ Info: Descriptor 20 out of 52: Sex Distribution
[ Info: Descriptor 17 out of 52: Body Mass Index
[ Info: Descriptor 21 out of 52: Adult
[ Info: Descriptor 19 out of 52: Child
[ Info: Descriptor 25 out of 52: Pediatric Obesity
[ Info: Descriptor 18 out of 52: Aged
[ Info: Descriptor 27 out of 52: Japan
[ Info: Descriptor 28 out of 52: Cross-Sectional Studies
[ Info: Descriptor 30 out of 52: Middle Aged
[ Info: Descriptor 29 out of 52: Weight Gain
[ Info: Descriptor 33 out of 52: Young Adult
[ Info: Descriptor 34 out of 52: Incidence
[ Info: Descriptor 31 out of 52: Surveys and Questionnaires
[ Info: Descriptor 39 out of 52: Metabolic Syndrome
[ Info: Descriptor 43 out of 52: Sleep Apnea, Obstructive
[ Info: Descriptor 41 out of 52: Nutrition Surveys
[ Info: Descriptor 38 out of 52: Waist Circumference
[ Info: Descriptor 44 out of 52: Television
[ Info: Descriptor 46 out of 52: Overweight
[ Info: Descriptor 37 out of 52: Pain Measurement
[ Info: Descriptor 48 out of 52: Female
[ Info: Descriptor 47 out of 52: Physical Fitness
[ Info: Descriptor 45 out of 52: Age Distribution
[ Info: Descriptor 3 out of 52: ROC Curve
[ Info: Descriptor 42 out of 52: Fibromyalgia
[ Info: Descriptor 2 out of 52: Republic of Korea
[ Info: Descriptor 50 out of 52: Obesity, Abdominal
[ Info: Descriptor 7 out of 52: Logistic Models
[ Info: Descriptor 49 out of 52: Biomarkers
┌ Warning: `getindex(df::DataFrame, col_ind::ColumnIndex)` is deprecated, use `df[!, col_ind]` instead.
│ caller = insert_row!(::SQLite.DB, ::String, ::Dict{Symbol,String}, ::Bool) at sqlite_db_utils.jl:59
└ @ BioMedQuery.DBUtils ~/build/bcbi/BioMedQuery.jl/src/DBUtils/sqlite_db_utils.jl:59
[ Info: Descriptor 51 out of 52: C-Reactive Protein
[ Info: Descriptor 40 out of 52: Pain
[ Info: Descriptor 36 out of 52: Body Weight
[ Info: Descriptor 35 out of 52: Prognosis
[ Info: Descriptor 32 out of 52: Health Status
[ Info: Descriptor 52 out of 52: Aged, 80 and over
[ Info: Descriptor 51 out of 52: C-Reactive Protein
2.894164 seconds (2.26 M allocations: 110.942 MiB, 2.26% gc time)
Explore the output table
db_query(db_sqlite, "SELECT * FROM mesh2umls;")
Row | mesh (String⍰) | umls (String⍰) |
---|---|---|
1 | Reference Values | Quantitative Concept |
2 | Severity of Illness Index | Quantitative Concept |
3 | Obesity | Disease or Syndrome |
4 | Fatigue | Sign or Symptom |
5 | Sedentary Behavior | Finding |
6 | Male | Organism Attribute |
7 | Prevalence | Quantitative Concept |
8 | Case-Control Studies | Research Activity |
9 | Risk Factors | Finding |
10 | Analysis of Variance | Quantitative Concept |
11 | Odds Ratio | Quantitative Concept |
12 | Thyrotropin | Amino Acid, Peptide, or Protein |
13 | Thyrotropin | Hormone |
14 | Thyrotropin | Pharmacologic Substance |
15 | Exercise | Daily or Recreational Activity |
16 | Quality of Life | Idea or Concept |
17 | Body Mass Index | Diagnostic Procedure |
18 | Sex Distribution | Quantitative Concept |
19 | Humans | Human |
20 | Pediatric Obesity | Disease or Syndrome |
21 | Germany | Geographic Area |
22 | Child | Age Group |
23 | Sports | Daily or Recreational Activity |
24 | Adult | Age Group |
25 | Cross-Sectional Studies | Research Activity |
26 | Middle Aged | Age Group |
27 | Japan | Geographic Area |
28 | Aged | Organism Attribute |
29 | Surveys and Questionnaires | Research Activity |
30 | Incidence | Quantitative Concept |
31 | Nutrition Surveys | Research Activity |
32 | Weight Gain | Finding |
33 | Sleep Apnea, Obstructive | Disease or Syndrome |
34 | Physical Fitness | Idea or Concept |
35 | Pain Measurement | Diagnostic Procedure |
36 | Young Adult | Age Group |
37 | Age Distribution | Quantitative Concept |
38 | Fibromyalgia | Disease or Syndrome |
39 | Metabolic Syndrome | Disease or Syndrome |
40 | Television | Manufactured Object |
41 | Prognosis | Health Care Activity |
42 | Body Weight | Organism Attribute |
43 | Logistic Models | Intellectual Product |
44 | Logistic Models | Quantitative Concept |
45 | Female | Population Group |
46 | Waist Circumference | Clinical Attribute |
47 | Obesity, Abdominal | Finding |
48 | C-Reactive Protein | Immunologic Factor |
49 | C-Reactive Protein | Amino Acid, Peptide, or Protein |
50 | Republic of Korea | Geographic Area |
51 | Pain | Food |
52 | Health Status | Qualitative Concept |
53 | Overweight | Finding |
54 | Biomarkers | Clinical Attribute |
55 | ROC Curve | Quantitative Concept |
56 | Aged, 80 and over | Age Group |
Filtering MeSH terms by UMLS concept
Getting the descriptor to index dictionary and the occurrence matrix
@time labels2ind, occur = umls_semantic_occurrences(db_sqlite, umls_concept);
(Dict("Obesity"=>1,"Pediatric Obesity"=>2,"Sleep Apnea, Obstructive"=>3,"Metabolic Syndrome"=>4,"Fibromyalgia"=>5),
[1, 1] = 1.0
[5, 1] = 1.0
[1, 2] = 1.0
[3, 2] = 1.0
[2, 3] = 1.0
[1, 4] = 1.0
[4, 5] = 1.0)
Descriptor to Index Dictionary
labels2ind
Dict{String,Int64} with 5 entries:
"Obesity" => 1
"Pediatric Obesity" => 2
"Sleep Apnea, Obstructive" => 3
"Metabolic Syndrome" => 4
"Fibromyalgia" => 5
Output Data Matrix
Matrix(occur)
5×5 Array{Float64,2}:
1.0 1.0 0.0 1.0 0.0
0.0 0.0 1.0 0.0 0.0
0.0 1.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 1.0
1.0 0.0 0.0 0.0 0.0
DataFrames
This example shows the typical workflow for building a MeSH to UMLS map as a DataFrame, which relates every MeSH term in the input DataFrame to its associated UMLS concepts.
Get the articles (same as example in PubMed Search and Parse)
dfs = Processes.pubmed_search_and_parse(email, search_term, max_articles, verbose)
Dict{String,DataFrames.DataFrame} with 8 entries:
"basic" => 5×13 DataFrames.DataFrame. Omitted printing of 9 col…
"mesh_desc" => 52×2 DataFrames.DataFrame…
"mesh_qual" => 9×2 DataFrames.DataFrame…
"pub_type" => 10×3 DataFrames.DataFrame…
"abstract_full" => 5×2 DataFrames.DataFrame. Omitted printing of 1 colu…
"author_ref" => 35×8 DataFrames.DataFrame. Omitted printing of 3 col…
"mesh_heading" => 78×5 DataFrames.DataFrame…
"abstract_structured" => 4×4 DataFrames.DataFrame. Omitted printing of 1 colu…
Map MeSH to UMLS and explore the output table
@time res = map_mesh_to_umls_async(dfs["mesh_desc"], umls_user, umls_pswd)
Row | descriptor (String) | concept (String) |
---|---|---|
1 | Adult | Age Group |
2 | Age Distribution | Quantitative Concept |
3 | Aged | Organism Attribute |
4 | Aged, 80 and over | Age Group |
5 | Analysis of Variance | Quantitative Concept |
6 | Biomarkers | Clinical Attribute |
7 | Body Mass Index | Diagnostic Procedure |
8 | Body Weight | Organism Attribute |
9 | C-Reactive Protein | Amino Acid, Peptide, or Protein |
10 | C-Reactive Protein | Immunologic Factor |
11 | Case-Control Studies | Research Activity |
12 | Child | Age Group |
13 | Cross-Sectional Studies | Research Activity |
14 | Exercise | Daily or Recreational Activity |
15 | Fatigue | Sign or Symptom |
16 | Female | Population Group |
17 | Fibromyalgia | Disease or Syndrome |
18 | Germany | Geographic Area |
19 | Health Status | Qualitative Concept |
20 | Humans | Human |
21 | Incidence | Quantitative Concept |
22 | Japan | Geographic Area |
23 | Logistic Models | Intellectual Product |
24 | Logistic Models | Quantitative Concept |
25 | Male | Organism Attribute |
26 | Metabolic Syndrome | Disease or Syndrome |
27 | Middle Aged | Age Group |
28 | Nutrition Surveys | Research Activity |
29 | Obesity | Disease or Syndrome |
30 | Obesity, Abdominal | Finding |
31 | Odds Ratio | Quantitative Concept |
32 | Overweight | Finding |
33 | Pain | Food |
34 | Pain Measurement | Diagnostic Procedure |
35 | Pediatric Obesity | Disease or Syndrome |
36 | Physical Fitness | Idea or Concept |
37 | Prevalence | Quantitative Concept |
38 | Prognosis | Health Care Activity |
39 | Quality of Life | Idea or Concept |
40 | ROC Curve | Quantitative Concept |
41 | Reference Values | Quantitative Concept |
42 | Republic of Korea | Geographic Area |
43 | Risk Factors | Finding |
44 | Sedentary Behavior | Finding |
45 | Severity of Illness Index | Quantitative Concept |
46 | Sex Distribution | Quantitative Concept |
47 | Sleep Apnea, Obstructive | Disease or Syndrome |
48 | Sports | Daily or Recreational Activity |
49 | Surveys and Questionnaires | Research Activity |
50 | Television | Manufactured Object |
51 | Thyrotropin | Amino Acid, Peptide, or Protein |
52 | Thyrotropin | Hormone |
53 | Thyrotropin | Pharmacologic Substance |
54 | Waist Circumference | Clinical Attribute |
55 | Weight Gain | Finding |
56 | Young Adult | Age Group |
Getting the descriptor to index dictionary and the occurrence matrix
@time labels2ind, occur = umls_semantic_occurrences(dfs, res, umls_concept);
(Dict("Obesity"=>1,"Pediatric Obesity"=>2,"Sleep Apnea, Obstructive"=>3,"Metabolic Syndrome"=>4,"Fibromyalgia"=>5),
[4, 1] = 1.0
[1, 2] = 1.0
[2, 3] = 1.0
[1, 4] = 1.0
[3, 4] = 1.0
[1, 5] = 1.0
[5, 5] = 1.0)
Descriptor to Index Dictionary
labels2ind
Dict{String,Int64} with 5 entries:
"Obesity" => 1
"Pediatric Obesity" => 2
"Sleep Apnea, Obstructive" => 3
"Metabolic Syndrome" => 4
"Fibromyalgia" => 5
Output Data Matrix
Matrix(occur)
5×5 Array{Float64,2}:
0.0 1.0 0.0 1.0 1.0
0.0 0.0 1.0 0.0 0.0
0.0 0.0 0.0 1.0 0.0
1.0 0.0 0.0 0.0 0.0
0.0 0.0 0.0 0.0 1.0
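This matrix is not identical to the one printed in the MySQL and SQLite examples, although `labels2ind` is the same. Comparing the printed entries suggests that the two differ only in the order of their columns. The sketch below checks this with entries copied from the outputs on this page; the helper names are illustrative, and treating columns as articles is an assumption.

```julia
using SparseArrays

# Nonzero entries copied from the MySQL/SQLite output and from the DataFrames output.
db_occur = sparse([1, 5, 1, 3, 2, 1, 4], [1, 1, 2, 2, 3, 4, 5], ones(7), 5, 5)
df_occur = sparse([4, 1, 2, 1, 3, 1, 5], [1, 2, 3, 4, 4, 5, 5], ones(7), 5, 5)

# Row sums: how many columns contain each descriptor.
row_counts(m) = [sum(m[i, :]) for i in 1:size(m, 1)]

# Columns as dense vectors, sorted so that column order no longer matters.
sorted_columns(m) = sort([Vector(m[:, j]) for j in 1:size(m, 2)])

same_counts = row_counts(db_occur) == row_counts(df_occur)
same_up_to_column_order = sorted_columns(db_occur) == sorted_columns(df_occur)
```

Both checks hold for the outputs shown here, so any analysis that depends on a specific column position should not assume the same ordering across backends.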
This page was generated using Literate.jl.