Computational LOINC (in OWL).
- Python 3.11
- Clone repo: `git clone https://github.com/loinc/comp-loinc.git`
- Set up virtual environment & activate:
  - `python -m venv venv`
  - `source venv/bin/activate`
- Install Poetry: `pip install poetry`
- Install dependencies: `poetry install`
- Unzip the downloaded inputs into the root directory of the repo.
  - a. Core developers: Download the latest `*-build-sources.zip` from Google Drive.
  - b. Everyone else: Download releases from each source:
- LOINC
- SNOMED
- LOINC-SNOMED Ontology
- LOINC Tree
- From this app, select from the "Hierarchy" menu at the top of the page. There are 7 options. When you select an option, select 'Export'. Extract the CSVs from each zip, and put them into a single folder, using the following names: `class.csv`, `component.csv`, `document.csv`, `method.csv`, `panel.csv`, `system.csv`, `component_by_system.csv`.
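As a quick sanity check (not part of the official tooling, just an illustrative helper), a short script can verify that the export folder contains all seven expected CSVs before proceeding:

```python
from pathlib import Path

# The seven CSV exports expected from the LOINC Tree web app.
EXPECTED = {
    "class.csv", "component.csv", "document.csv", "method.csv",
    "panel.csv", "system.csv", "component_by_system.csv",
}

def missing_tree_csvs(folder: str) -> set[str]:
    """Return the names of any expected LOINC Tree CSVs absent from `folder`."""
    present = {p.name for p in Path(folder).glob("*.csv")}
    return EXPECTED - present
```

Running `missing_tree_csvs("path/to/tree-folder")` returns an empty set when the folder is complete.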
- General instructions: Ensure that these 4 sources are unzipped to the locations shown in `comploinc_config.yaml`, or update the config to match your locations.
- data/ - Static input files that don't need to be downloaded.
- logs/ - Logs
- owl-files/ - Place the default build output files in this directory, then open `comploinc.owl` in Protégé to view what is considered the default content of CompLOINC (still a work in progress).
- src/comp_loinc/ - Uses a `loinclib` `networkx` graph to generate ontological outputs.
  - builds/ - LinkML schema
  - datamodel/ - Generated Python LinkML datamodel
  - schema/ - LinkML source schema
  - cli.py - Command line interface
  - loinc_builder_steps.py - LOINC builder steps
  - module.py - Instantiates and processes builder modules.
  - runtime.py - Manages the runtime environment; allows sharing of data between modules.
  - snomed_builder_steps.py - SNOMED builder steps
- src/loinclib/ - Uses inputs from LOINC and other sources to create a `networkx` graph.
  - config.py - Configuration
  - graph.py - `networkx` graph operations
  - loinc_loader.py - Loads LOINC release data
  - loinc_schema.py - Schema for LOINC
  - loinc_snomed_loader.py - Loads SNOMED-LOINC Ontology data
  - loinc_snomed_schema.py - Schema for SNOMED-LOINC Ontology
  - loinc_tree_loader.py - Loads LOINC web app hierarchical data
  - loinc_tree_schema.py - Schema for LOINC web app hierarchical data
  - snomed_loader.py - Loads SNOMED release data
  - snomed_schema_v2.py - Schema for SNOMED release data
- tests/ - Tests
- comploinc_config.yaml - Configuration (discussed further below)
If you just want to run a build of the default artefacts/options, use the command `comploinc build`.

Output of `comploinc --help`:
Options:

| Arg usage | Description |
|---|---|
| `--work-dir PATH` | CompLOINC work directory; defaults to the current working directory. [default: (dynamic)] |
| `--config-file PATH` | Configuration file name. Defaults to "comploinc_config.yaml". [default: comploinc_config.yaml] |
| `-o, --out-dir PATH` | The output folder name. Defaults to "output". [default: output] |
| `--install-completion` | Install completion for the current shell. |
| `--show-completion` | Show completion for the current shell, to copy it or customize the installation. |
Commands:

`build`: Performs a build from a build file, as opposed to the "builder" command, which takes build steps.

Usage: `comploinc build [OPTIONS] [BUILD_NAME]`

Arguments:
- `[BUILD_NAME]` - The build name or a path to a build file. The "default" build will build all outputs. [default: default]
See: `comploinc_config.yaml`

If following the setup exactly, this configuration will not need to be modified.

If there are errors related to `torch` while running CompLOINC, or `nlp_taxonomification.py` specifically, try changing the `torch` version to 2.1.0 in `pyproject.toml`.
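For example, under Poetry's dependency section the pin might look like the following (an illustrative fragment; match the format of the existing `torch` entry in your `pyproject.toml`):

```toml
[tool.poetry.dependencies]
# Pinned to work around torch-related errors (see note above).
torch = "2.1.0"
```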
CompLOINC has some functionality for providing curator feedback on some of the inputs, which can be used to inform what content will or will not be included in the ontology.
NLP on dangling parts: `matches.sssom.tsv`

This file is the result of the semantic similarity process, which matches dangling part terms (those with no parent or child) against those in the hierarchy to try to identify a good parent for them. For each dangling part, only the top match is included. Confidence is shown in the `similarity_score` column.

This file adheres to the SSSOM standard. There are columns `subject_id`, `subject_label`, `object_id`, and `object_label`. The subjects are the dangling part terms, and the objects are the non-dangling part terms already in the hierarchy.

So where does curator input come into play for this file? There is a `curator_approved` column. If its value is set to True (case insensitive) for a given row, the match will be included in the ontology. If it is set to False (case insensitive), the match will not be included. If it is empty, or some value other than true/false is present, then that column will be ignored and inclusion will be decided by the confidence threshold. The default threshold is 0.5, and it can be configured in `comploinc_config.yaml`.
Details

- `robot`
- Files in `output/build-default/fast-run/`
  - Can populate via: `comploinc --fast-run build default`
- To run tests: `python -m unittest discover`
- Create a new `YYYY-MM-DD_comploinc-build-sources.zip` in the Google Drive folder. Ensure it has the correct structure (folder names and files at the right paths).
- Make the link public: In the Google Drive folder, right-click the file, select "Share", and click "Share". At the bottom, under "General Access", click the left dropdown and select "Anyone with the link", then click "Copy link".
- Update `DL_LINK_ID` in GitHub: Go to the page for updating it. Paste the link from the previous step into the box, and click "Update secret."