V1.0
If you would like to explore or use the mappings, please see our dashboard, which includes mapping statistics, interactive plots and tables, and links to download the latest release.
New Data Sources:
- The original set of ontologies has been extended; see the Ontologies section below for more information
- The National Library of Medicine's Unified Medical Language System (UMLS) MRCONSO and MRSTY files. Using these data requires an NLM UMLS license agreement
Featured Functionality:
- To improve our mapping pipeline, we have created a Python-based version of Juan Banda's OHDSI Ananke
Downloaded Resource Information:
The specific ontologies used for this release of OMOP2OBO, including class and axiom counts, are shown in the table below. All ontologies were downloaded and processed on 09/14/20.
Ontology | Classes | Definitions | Labels | Synonyms | DbXRefs |
---|---|---|---|---|---|
Cell Line Ontology (CL) | 2,238 | 1,859 | 2,238 | 2,124 | 1,376 |
Chemical Entities of Biological Interest (CHEBI) | 126,169 | 48,824 | 126,169 | 269,798 | 231,247 |
Human Phenotype Ontology (HPO) | 15,247 | 12,468 | 15,247 | 19,860 | 19,569 |
Mondo Disease Ontology (MONDO) | 22,288 | 15,271 | 22,288 | 98,181 | 159,918 |
NCBITaxon Organism Taxonomy (NCBITaxon) | 2,241,110 | 0 | 2,241,110 | 263,571 | 18,426 |
Protein Ontology (PRO) | 215,624 | 215,598 | 215,624 | 590,190 | 195,671 |
Uber-Anatomy Ontology (UBERON) | 13,898 | 11,026 | 13,898 | 36,771 | 51,322 |
Vaccine Ontology (VO) | 5,783 | 1,231 | 5,789 | 6 | 0 |
A chi-square test of independence with Yates's correction was run to determine whether the amount of metadata available differed by ontology. The omnibus test revealed a significant association between metadata type and ontology (χ²(14) = 2,664,853.817, p < 0.0001). Post-hoc tests, run with a Bonferroni adjustment to correct for multiple comparisons, confirmed that all ontologies had significantly different distributions of metadata (all ps < 0.0001).
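For readers who want to see the shape of such an analysis, here is a minimal pure-Python sketch of a Pearson chi-square test of independence on an ontology-by-metadata contingency table, followed by a Bonferroni-adjusted threshold for pairwise post-hoc comparisons. Several things are assumptions and not taken from the release notes: the choice of the three metadata columns (Definitions, Synonyms, DbXRefs, picked only because 8 ontologies by 3 columns gives the reported 14 degrees of freedom), the family-wise level of 0.05, and the helper name `chi_square_statistic`. The sketch omits the continuity correction and is not expected to reproduce the reported statistic.

```python
from itertools import combinations

# (Definitions, Synonyms, DbXRefs) counts copied from the table above.
# ASSUMPTION: the release notes do not say which metadata columns were tested.
COUNTS = {
    "CL": (1859, 2124, 1376),
    "CHEBI": (48824, 269798, 231247),
    "HPO": (12468, 19860, 19569),
    "MONDO": (15271, 98181, 159918),
    "NCBITaxon": (0, 263571, 18426),
    "PRO": (215598, 590190, 195671),
    "UBERON": (11026, 36771, 51322),
    "VO": (1231, 6, 0),
}


def chi_square_statistic(table):
    """Return (Pearson chi-square statistic, degrees of freedom) for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            if expected > 0:  # skip cells whose row or column is entirely empty
                statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Omnibus test over all eight ontologies.
omnibus_stat, omnibus_dof = chi_square_statistic(list(COUNTS.values()))
print(f"omnibus: chi2({omnibus_dof}) = {omnibus_stat:,.3f}")

# Pairwise post-hoc comparisons with a Bonferroni-adjusted threshold.
pairs = list(combinations(COUNTS, 2))
adjusted_alpha = 0.05 / len(pairs)  # ASSUMPTION: family-wise alpha of 0.05
for first, second in pairs:
    stat, dof = chi_square_statistic([COUNTS[first], COUNTS[second]])
    print(f"{first} vs {second}: chi2({dof}) = {stat:,.3f}")
print(f"compare each pairwise p-value against alpha = {adjusted_alpha:.5f}")
```

Turning each statistic into a p-value requires the chi-square survival function, for example `scipy.stats.chi2.sf(stat, dof)` if SciPy is available.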
This section provides an overview of the clinical data available for mapping. To create the mappings, clinical data was pulled in two waves from an OMOP (v5.0) PEDSnet (v3.0)-normalized instance of Children's Hospital of Colorado data (#15-0445).
Wiki Page: Conditions
SQL Queries
- Condition Concepts Used in Practice (GitHub Gist)
- Standard SNOMED-CT Condition Concepts (GitHub Gist)
Data was pulled in two waves. The first wave returned all condition concept ids used at least once in practice (n=29,129). The second wave returned standard SNOMED-CT concepts (n=109,719). Once the 29,129 concepts used in practice were removed, 80,590 standard SNOMED-CT concepts that had not been used in practice remained.
CONCEPT LEVEL | CODES | LABELS | SYNONYMS | VOCABULARIES |
---|---|---|---|---|
Concepts Used in Practice | ||||
Concept | 29,129 | 29,129 | 86,630 | SNOMED-CT |
Ancestor | 1,421,104 | 1,389,525 | N/A | SNOMED-CT Cohort MedDRA |
Standard SNOMED-CT Concepts Not Used in Practice | ||||
Concept | 80,590 | 80,590 | 194,264 | SNOMED-CT |
Ancestor | 3,458,072 | 3,393,343 | N/A | SNOMED-CT Cohort MedDRA |
Wiki Page: Drug Exposure Ingredients
SQL Queries
- Drug Exposure Ingredients Used in Practice (GitHub Gist)
- Standard RxNorm Drug Ingredients Concepts (GitHub Gist)
Data was pulled in two waves. A total of 56,200 drug-ingredient concepts were eligible for mapping (51,941 drugs; 11,807 ingredients). The first wave returned all drug concepts used at least once in practice (9,175 drugs; 1,697 ingredients). The second wave returned all standard RxNorm concepts (42,766 drugs; 10,110 ingredients).
DATA TYPE | CONCEPT LEVEL | CODES | LABELS | SYNONYMS | VOCABULARIES |
---|---|---|---|---|---|
Concepts Used in Practice | |||||
Drugs | Concept | 9,175 | 9,154 | 19,496 | RxNorm |
Drugs | Ancestor | 140,937 | 77,135 | N/A | SPL Cohort ATC NDFRT RxNorm VA Class CVX |
Ingredients | Concept | 1,697 | 1,696 | 1,868 | RxNorm SPL |
Ingredients | Ancestor | 1,697 | 1,696 | N/A | RxNorm SPL |
Standard RxNorm Concepts Not Used in Practice | |||||
Drugs | Concept | 42,766 | 42,640 | 52,688 | RxNorm |
Drugs | Ancestor | 68,343 | 64,212 | N/A | SPL Cohort ATC NDFRT RxNorm VA Class CVX |
Ingredients | Concept | 10,110 | 10,110 | 11,235 | RxNorm |
Ingredients | Ancestor | 10,578 | 10,578 | N/A | RxNorm |
Wiki Page: Measurements
SQL Queries
- CHCO Measurements Used in Practice (GitHub Gist)
- Standard LOINC2HPO Concepts Not Used In Practice (GitHub Gist)
Data was pulled in two waves. The first wave of data was pulled from CHCO and contained only those concepts that were used at least once in clinical practice. This set contained a total of 1,606 LOINC concepts or 4,425 lab test results (more information on how lab test results were identified is provided below). The initial set of CHCO data was supplemented by adding the latest LOINC2HPO annotations. The current annotation set (annotations.tsv; last updated 06/07/2020) was downloaded from the develop branch of the LOINC2HPO GitHub repository on 08/12/2020. Of the 3,119 unique codes obtained from LOINC2HPO (7,421 unique results), 631 overlapped with the OMOP measurement terms retrieved from CHCO and were excluded. An additional 11 concepts were excluded because they were deprecated. This set of terms was further processed to remove terms with duplicate result types (n=19 concepts). The final set of processed terms included 2,477 unique LOINC concepts or 6,844 lab test results.
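The exclusion steps described above amount to two set differences. The following is a minimal sketch under stated assumptions: the function name `filter_supplemental_codes` and the toy codes are hypothetical, and the later step that removes terms with duplicate result types is not modeled. The closing check only confirms that the reported code counts are arithmetically consistent (3,119 − 631 − 11 = 2,477).

```python
def filter_supplemental_codes(loinc2hpo_codes, chco_codes, deprecated_codes):
    """Drop LOINC2HPO codes already retrieved from CHCO, then drop deprecated codes."""
    not_in_chco = set(loinc2hpo_codes) - set(chco_codes)
    return not_in_chco - set(deprecated_codes)


# Toy example with made-up identifiers (not real LOINC codes from the data).
loinc2hpo = {"A-1", "A-2", "A-3", "A-4"}
chco = {"A-1", "Z-9"}   # "A-1" overlaps with the first wave
deprecated = {"A-4"}    # "A-4" is marked deprecated
remaining = filter_supplemental_codes(loinc2hpo, chco, deprecated)
print(sorted(remaining))

# Consistency check on the counts reported in the text.
reported_final = 3_119 - 631 - 11
print(reported_final)
```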
Identifying LOINC Scale and Result Type
- All lab test scale types (i.e., ordinal, nominal, quantitative, qualitative, narrative, doc, and panel) were initially eligible to be mapped. The scale type of each lab test was identified by parsing the concept synonym field for the presence of any of the scale types listed above.
- Result type was identified using a two-step approach. First, we analyzed the reference ranges available in the patient data. If at least one numeric result was reported, the result type was recorded as Normal/Low/High, and if a positive or negative result was reported, it was recorded as Positive/Negative. Then, for all lab tests without a reference range in the data, the result type was obtained by parsing the concept synonym field. For all tests with an ordinal scale type, if the keywords "presence" or "screen" were identified, the result type was reported as Positive/Negative. All tests with a quantitative scale type were given the result type Normal/Low/High. All other scale types were annotated with Unknown Result Type.
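The scale-type and result-type rules above can be sketched as two small functions. This is an illustrative reading, not the project's actual code: the function names, the `observed_results` argument, and the literal keyword matching are assumptions. It also assumes that a test whose patient data contain neither numeric nor positive/negative results falls through to the synonym-based step, and that ordinal tests without the "presence" or "screen" keywords receive Unknown Result Type.

```python
import re

# Scale type keywords, taken literally from the list above.
SCALE_TYPES = ("ordinal", "nominal", "quantitative", "qualitative",
               "narrative", "doc", "panel")


def find_scale_type(synonyms):
    """Return the first scale type found as a whole word in the synonym text, else None."""
    text = synonyms.lower()
    for scale in SCALE_TYPES:
        if re.search(r"\b" + scale + r"\b", text):
            return scale
    return None


def _is_number(value):
    """True for int/float values, excluding booleans."""
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def find_result_type(scale_type, synonyms, observed_results=()):
    """Apply the two-step rule: patient data first, then the synonym field."""
    # Step 1: use results reported in the patient data, when there are any.
    if any(_is_number(result) for result in observed_results):
        return "Normal/Low/High"
    if any(str(result).strip().lower() in ("positive", "negative")
           for result in observed_results):
        return "Positive/Negative"
    # Step 2: fall back to the scale type and synonym keywords.
    text = synonyms.lower()
    if scale_type == "ordinal" and ("presence" in text or "screen" in text):
        return "Positive/Negative"
    if scale_type == "quantitative":
        return "Normal/Low/High"
    return "Unknown Result Type"


# Hypothetical synonym string, for illustration only.
example_synonyms = "Example analyte; Ordinal; Screen; Serum"
scale = find_scale_type(example_synonyms)
print(scale, "->", find_result_type(scale, example_synonyms))
```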
CONCEPT LEVEL | CODES | LABELS | SYNONYMS | VOCABULARIES |
---|---|---|---|---|
Concepts Used in Practice | ||||
Concept | 1,606 | 1,606 | 41,981 | LOINC PEDSnet |
Ancestor | 20,781 | 21,191 | N/A | LOINC |
LOINC2HPO Concepts Not Used in Practice | ||||
Concept | 2,477 | 2,477 | 73,612 | LOINC |
Ancestor | 23,457 | 24,306 | N/A | LOINC |
Accuracy
Validation work was performed to demonstrate the accuracy of the OMOP2OBO mappings. This work was specifically designed to verify the accuracy of manually constructed mappings (i.e., mappings that were not created from automatic alignment of existing database cross-references or exact string mappings). A subset of the most difficult manual and manual constructor mappings was randomly selected and verified by members of the clinical team shown below. Please see the Accuracy Wiki for additional information.
Consistency
Validation work was performed to demonstrate the logical consistency of the OMOP2OBO mappings. For additional information on how we created semantic representations of the OMOP2OBO mappings, see this wiki page. Please see the Consistency Wiki for additional information.
Generalizability
Validation work was aimed at evaluating and characterizing the generalizability, or coverage, of the OMOP vocabulary terms included in the OMOP2OBO mapping set relative to the OMOP vocabulary terms used in 24 Observational Health Data Sciences and Informatics (OHDSI) Concept Prevalence study sites. Please see the Generalizability Wiki for additional information.
This project is licensed under MIT - see the LICENSE.md file for details. If you intend to use any of the information on this Wiki, please provide the appropriate attribution by citing this repository:
@misc{callahan_tj_2020_4247939,
author = {Callahan, TJ et al.},
title = {OMOP2OBO},
month = jun,
year = 2021,
doi = {10.5281/zenodo.4247939},
url = {https://doi.org/10.5281/zenodo.4247939}
}