CompLOINC

Computational LOINC (in OWL).

Setup

Prerequisites

  1. Python 3.11

Installation

  1. Clone repo: git clone https://github.com/loinc/comp-loinc.git
  2. Set up virtual environment and activate: python -m venv venv && source venv/bin/activate
  3. Install Poetry: pip install poetry
  4. Install dependencies: poetry install
  5. Unzip downloaded inputs into the root directory of the repo.
  • a. Core developers: Download latest *-build-sources.zip from Google Drive.
  • b. Everyone else: Download releases from each source:
    • LOINC
    • SNOMED
    • LOINC-SNOMED Ontology
    • LOINC Tree
      • From this app, open the "Hierarchy" menu at the top of the page, which has 7 options. For each option, select it and then select 'Export'. Extract the CSVs from each zip and put them into a single folder, using the following names: class.csv, component.csv, document.csv, method.csv, panel.csv, system.csv, component_by_system.csv.
    • General instructions: Ensure that these 4 sources are unzipped to the locations shown in comploinc_config.yaml, or update the config to match your locations.
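
Before running a build, it can help to confirm that the LOINC Tree exports landed under the expected names. The following is a minimal sketch, not part of CompLOINC itself: the helper name missing_tree_files and the folder name loinc_tree are hypothetical, and the real location is whatever comploinc_config.yaml points to.

```python
from pathlib import Path

# File names taken from the LOINC Tree export instructions above.
EXPECTED_TREE_FILES = (
    "class.csv",
    "component.csv",
    "document.csv",
    "method.csv",
    "panel.csv",
    "system.csv",
    "component_by_system.csv",
)


def missing_tree_files(folder: Path) -> list[str]:
    """Return the expected CSV names that are not present in `folder`."""
    return [name for name in EXPECTED_TREE_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    # Hypothetical location; use the path configured in comploinc_config.yaml.
    tree_dir = Path("loinc_tree")
    missing = missing_tree_files(tree_dir)
    if missing:
        print(f"Missing from {tree_dir}: {', '.join(missing)}")
    else:
        print(f"All {len(EXPECTED_TREE_FILES)} LOINC Tree CSVs found in {tree_dir}")
```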

Repository Structure

Usage

If you just want to run a build of default artefacts / options, use the command comploinc build.

Command reference

comploinc --help:

Options:

Arg                   Usage  Description
--work-dir            PATH   CompLOINC work directory, defaults to current work directory. [default: (dynamic)]
--config-file         PATH   Configuration file name. Defaults to "comploinc_config.yaml" [default: comploinc_config.yaml]
-o, --out-dir         PATH   The output folder name. Defaults to "output". [default: output]
--install-completion         Install completion for the current shell.
--show-completion            Show completion for the current shell, to copy it or customize the installation.

Commands:

  • build: Performs a build from a build file as opposed to the "builder"...
  • builder ...

build

Usage: comploinc build [OPTIONS] [BUILD_NAME]

Performs a build from a build file as opposed to the "builder" command which takes build steps.

Arguments:

  • [BUILD_NAME] The build name or a path to a build file. The "default" build will build all outputs. [default: default]
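
For scripted builds, the options and arguments documented above can be assembled into a command line programmatically. This is only a sketch: the function comploinc_build_cmd is a hypothetical helper, and it assumes the top-level options are placed before the build subcommand. It uses only the flags listed in this command reference.

```python
import shlex
from typing import Optional


def comploinc_build_cmd(
    build_name: str = "default",
    work_dir: Optional[str] = None,
    config_file: Optional[str] = None,
    out_dir: Optional[str] = None,
) -> list[str]:
    """Assemble an argument list for `comploinc build` from the documented options."""
    cmd = ["comploinc"]
    # Assumption: top-level options come before the subcommand name.
    if work_dir is not None:
        cmd += ["--work-dir", work_dir]
    if config_file is not None:
        cmd += ["--config-file", config_file]
    if out_dir is not None:
        cmd += ["--out-dir", out_dir]
    cmd += ["build", build_name]
    return cmd


if __name__ == "__main__":
    # Print the command rather than running it, so the sketch has no side effects.
    print(shlex.join(comploinc_build_cmd(out_dir="output")))
```

The returned list can be passed to subprocess.run(cmd, check=True) from an environment where comploinc is installed.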

Configuration

See: comploinc_config.yaml

If you followed the setup exactly, this configuration will not need to be modified.

Troubleshooting

If there are errors related to torch while running CompLOINC or nlp_taxonomification.py specifically, try changing the torch version to 2.1.0 in pyproject.toml.

Curation

CompLOINC has some functionality for providing curator feedback on some of the inputs, which can be used to inform what content will or will not be included in the ontology.

NLP on dangling parts: matches.sssom.tsv

This file is the result of the semantic similarity process, which matches dangling part terms (those with no parent or child) against terms in the hierarchy to try to identify a good parent for them. For each dangling part, only the top match is included. Confidence is shown in the similarity_score column.

This file adheres to the SSSOM standard. There are columns subject_id, subject_label, object_id, and object_label. The subjects are the dangling part terms, and the objects are the non-dangling part terms already in the hierarchy.
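
To inspect the file outside of CompLOINC, a minimal reader could look like the sketch below. It assumes the file is tab-separated and may start with the "#"-prefixed metadata lines that the SSSOM TSV format allows. The function name read_sssom_rows and the sample identifiers are hypothetical and are not real LOINC part codes.

```python
import csv
import io
from typing import Iterable, Iterator


def read_sssom_rows(lines: Iterable[str]) -> Iterator[dict[str, str]]:
    """Yield one dict per mapping row, skipping '#'-prefixed metadata lines."""
    data_lines = (line for line in lines if not line.startswith("#"))
    yield from csv.DictReader(data_lines, delimiter="\t")


# Hypothetical sample content; the identifiers are placeholders.
SAMPLE = (
    "# mapping_set_id: https://example.org/hypothetical\n"
    "subject_id\tsubject_label\tobject_id\tobject_label\tsimilarity_score\n"
    "EX:1\tdangling part A\tEX:2\thierarchy part B\t0.83\n"
)

if __name__ == "__main__":
    for row in read_sssom_rows(io.StringIO(SAMPLE)):
        # Subject is the dangling part; object is the candidate parent in the hierarchy.
        print(row["subject_label"], "->", row["object_label"], row["similarity_score"])
```

For a file on disk, an open file handle can be passed in place of the io.StringIO object.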

So where does curator input come into play for this file? There is a curator_approved column. If its value is set to True (case insensitive) for a given row, the match will be included in the ontology. If it is set to False (case insensitive), the match will not be included. If it is empty, or holds some value other than true/false, the column is ignored for that row and inclusion is decided by the confidence threshold instead. The default threshold is 0.5, and it can be configured in comploinc_config.yaml.
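
The decision rule described above can be sketched as follows. This is an illustration and not CompLOINC's actual code: the function name include_match is hypothetical, and treating a score exactly equal to the threshold as included is an assumption, since the text does not say whether the comparison is strict.

```python
from typing import Optional

DEFAULT_THRESHOLD = 0.5  # default stated above; configurable in comploinc_config.yaml


def include_match(
    curator_approved: Optional[str],
    similarity_score: float,
    threshold: float = DEFAULT_THRESHOLD,
) -> bool:
    """Decide whether a match row should be included in the ontology."""
    flag = (curator_approved or "").strip().lower()
    if flag == "true":
        return True  # curator explicitly approved, regardless of score
    if flag == "false":
        return False  # curator explicitly rejected, regardless of score
    # Empty or unrecognized value: fall back to the confidence threshold.
    # Assumption: a score equal to the threshold counts as included.
    return similarity_score >= threshold


if __name__ == "__main__":
    print(include_match("TRUE", 0.10))   # curator override wins over a low score
    print(include_match("False", 0.99))  # curator override wins over a high score
    print(include_match("", 0.70))       # no curator input, score above threshold
```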

Statistics

Statistics page

Developer docs

Details

Tests

Tests: prerequisites

  1. robot
  2. Files in output/build-default/fast-run/
  • Can populate via comploinc --fast-run build default
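
The two prerequisites above can be checked before running the tests. A minimal sketch, assuming that robot is expected to be available on PATH under that name and that the script is run from the repository root. The helper unmet_test_prerequisites is hypothetical and is not shipped with the repository.

```python
import shutil
from pathlib import Path

# Output location named in the prerequisites above, relative to the repo root.
FAST_RUN_DIR = Path("output/build-default/fast-run")


def unmet_test_prerequisites(
    repo_root: Path = Path("."), robot_cmd: str = "robot"
) -> list[str]:
    """Return a human-readable list of test prerequisites that are not met."""
    problems = []
    # Assumption: the tests invoke `robot` by name, so it must be on PATH.
    if shutil.which(robot_cmd) is None:
        problems.append(f"'{robot_cmd}' not found on PATH")
    fast_run = repo_root / FAST_RUN_DIR
    if not fast_run.is_dir() or not any(fast_run.iterdir()):
        problems.append(
            f"no files in {fast_run}; try: comploinc --fast-run build default"
        )
    return problems


if __name__ == "__main__":
    issues = unmet_test_prerequisites()
    print("\n".join(issues) if issues else "Test prerequisites look satisfied")
```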

Tests: Running

python -m unittest discover

Standard operating procedures (SOPs)

Setting up new/updated inputs/sources

  1. Create a new YYYY-MM-DD_comploinc-build-sources.zip in the Google Drive folder. Ensure it has the correct structure (folder names and files at the right paths).
  2. Make the link public: In the Google Drive folder, right-click the file, select "Share", and click "Share." At the bottom, under "General Access", click the left dropdown and select "Anyone with the link." Click "Copy link".
  3. Update DL_LINK_ID in GitHub: Go to the page for updating it. Paste the link from the previous step into the box, and click "Update secret."