SLUVisLab/xray-screws

Installation

gdcm requirement for pydicom

At the time of writing, pydicom needs the gdcm package to read and decompress the provided DICOM files. The easiest way to install it is through conda:

conda install conda-forge::gdcm

See https://github.com/conda-forge/gdcm-feedstock for the feedstock repository's documentation.
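To confirm that the install worked, a quick check like the one below can help. It is a minimal sketch, not part of this repository; it assumes the conda package exposes a Python module named gdcm, and the helper name module_available is hypothetical.

```python
import importlib.util


def module_available(name: str) -> bool:
    """Return True if the current interpreter can locate the top-level module `name`."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Assumption: the conda-forge gdcm package installs an importable module called "gdcm".
    print("gdcm available:", module_available("gdcm"))
```

Run it inside the activated conda environment so that the check uses the same interpreter as the scripts.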

Python Environment

Activate the project's conda environment (screws_env in the commands below) and install the dependencies:

conda activate screws_env
pip install -r requirements.txt

Scripts

extract_DICOM_data.py

Explores DICOM files and extracts metadata in human-readable JSON format. Useful for understanding DICOM structure and verifying metadata fields.

python scripts/extract_DICOM_data.py \
  --patient-dir /path/to/DICOM/study_type/patient_id \
  --depth 2
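The README does not say what --depth controls. Assuming it limits how many directory levels below the patient directory are explored, a depth-limited walk could be sketched as follows. The function name walk_to_depth is hypothetical and is not taken from the script.

```python
import os
from pathlib import Path
from typing import Iterator


def walk_to_depth(root: Path, max_depth: int) -> Iterator[Path]:
    """Yield files under `root`, descending at most `max_depth` directory levels.

    Files directly inside `root` are at depth 0.
    """
    root = Path(root)
    for dirpath, dirnames, filenames in os.walk(root):
        depth = len(Path(dirpath).relative_to(root).parts)
        if depth >= max_depth:
            # Clearing dirnames in place tells os.walk not to descend further.
            dirnames[:] = []
        for name in sorted(filenames):
            yield Path(dirpath) / name
```

With max_depth=2, files in the patient directory and in folders up to two levels below it are visited; anything deeper is skipped.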

match_dicom_metadata_to_images.py

Matches DICOM metadata to annotated JPG images using image similarity (SSIM). This is the primary script for pairing metadata with extracted images when the original extraction order is unknown.

Key features:

  • Uses Structural Similarity Index Measure (SSIM) to compare image content
  • Handles count mismatches (e.g., 5 DICOMs to 4 JPGs)
  • Configurable confidence threshold (default: 0.5)
  • Creates placeholder JSON files for low-confidence or failed matches
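The features above can be illustrated with a small sketch. It is not the script's implementation: global_ssim is a simplified single-window SSIM computed over whole images given as flat lists of grayscale pixel values (library implementations such as scikit-image's structural_similarity use sliding windows), and match_greedy is an assumed greedy pairing strategy. Both function names are hypothetical.

```python
from statistics import fmean
from typing import Dict, List, Optional, Sequence, Tuple


def global_ssim(x: Sequence[float], y: Sequence[float], data_range: float = 255.0) -> float:
    """Single-window SSIM between two equally sized grayscale images (flat pixel lists)."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("images must be non-empty and the same size")
    c1 = (0.01 * data_range) ** 2  # stabilizes the luminance term
    c2 = (0.03 * data_range) ** 2  # stabilizes the contrast/structure term
    mx, my = fmean(x), fmean(y)
    vx = fmean([(a - mx) ** 2 for a in x])
    vy = fmean([(b - my) ** 2 for b in y])
    cov = fmean([(a - mx) * (b - my) for a, b in zip(x, y)])
    return ((2 * mx * my + c1) * (2 * cov + c2)) / ((mx * mx + my * my + c1) * (vx + vy + c2))


def match_greedy(
    scores: List[List[float]], threshold: float = 0.5
) -> Dict[int, Optional[Tuple[int, float]]]:
    """Pair images with DICOMs, best score first.

    scores[i][j] is the similarity of image i to DICOM j. Each image and each
    DICOM is used at most once. Images whose best remaining score falls below
    `threshold` map to None (a candidate for a placeholder file).
    """
    pairs = sorted(
        ((s, i, j) for i, row in enumerate(scores) for j, s in enumerate(row)),
        reverse=True,
    )
    result: Dict[int, Optional[Tuple[int, float]]] = {i: None for i in range(len(scores))}
    used_dicoms = set()
    for s, i, j in pairs:
        if s < threshold:
            break  # every remaining pair is below the confidence threshold
        if result[i] is not None or j in used_dicoms:
            continue
        result[i] = (j, s)
        used_dicoms.add(j)
    return result
```

Because the score matrix does not have to be square, a count mismatch such as 5 DICOMs to 4 JPGs simply leaves one DICOM unused. Raising the threshold trades fewer wrong matches for more images left unmatched.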

Dry-run mode (recommended first):

python scripts/match_dicom_metadata_to_images.py \
  --images-dir /path/to/full_images_with_masks_batch_1 \
  --dicom-dir /path/to/DICOM \
  --dry-run

Actual run:

python scripts/match_dicom_metadata_to_images.py \
  --images-dir /path/to/full_images_with_masks_batch_1 \
  --dicom-dir /path/to/DICOM

Debug mode (detailed similarity scores):

python scripts/match_dicom_metadata_to_images.py \
  --images-dir /path/to/full_images_with_masks_batch_1 \
  --dicom-dir /path/to/DICOM \
  --debug \
  --dry-run

Custom confidence threshold:

python scripts/match_dicom_metadata_to_images.py \
  --images-dir /path/to/full_images_with_masks_batch_1 \
  --dicom-dir /path/to/DICOM \
  --confidence-threshold 0.7

create_metadata_summary_csv.py

Generates a CSV summary report of all images and their metadata matching status. Run this after match_dicom_metadata_to_images.py to review results.

Output columns:

  • manufacturer, patient_number, filename, relative_file_path
  • view_position (AP, LATERAL, etc.)
  • similarity_score (0.0-1.0 confidence)
  • error (yes/no)
  • error_type (no_match, low_confidence, dicom_not_found, etc.)
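For orientation, the column layout above can be reproduced with Python's csv module. This is a sketch of the file shape only, not the script's code; the helper name write_summary is hypothetical and the values in the example row are made up for illustration.

```python
import csv
import io
from typing import Dict, Iterable, TextIO

# Column names as listed in this README.
COLUMNS = [
    "manufacturer", "patient_number", "filename", "relative_file_path",
    "view_position", "similarity_score", "error", "error_type",
]


def write_summary(rows: Iterable[Dict[str, object]], out: TextIO) -> None:
    """Write a header line followed by one CSV line per row dictionary."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS, restval="")
    writer.writeheader()
    for row in rows:
        writer.writerow(row)


# Hypothetical example row; none of these values come from real data.
example_row = {
    "manufacturer": "ExampleVendor",
    "patient_number": "001",
    "filename": "image_001.jpg",
    "relative_file_path": "ExampleVendor/001/image_001.jpg",
    "view_position": "AP",
    "similarity_score": 0.42,
    "error": "yes",
    "error_type": "low_confidence",
}

buffer = io.StringIO()
write_summary([example_row], buffer)
print(buffer.getvalue())
```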

Usage:

python scripts/create_metadata_summary_csv.py \
  --images-dir /path/to/full_images_with_masks_batch_1

Custom output filename:

python scripts/create_metadata_summary_csv.py \
  --images-dir /path/to/full_images_with_masks_batch_1 \
  --output metadata_report.csv

Workflow

  1. Explore DICOM structure (optional):

    python scripts/extract_DICOM_data.py --patient-dir /path/to/DICOM/study/patient --depth 2
  2. Match metadata to images:

    # First, run in dry-run mode to preview
    python scripts/match_dicom_metadata_to_images.py \
      --images-dir /path/to/full_images_with_masks_batch_1 \
      --dicom-dir /path/to/DICOM \
      --dry-run
    
    # Then run actual matching to create JSON files
    python scripts/match_dicom_metadata_to_images.py \
      --images-dir /path/to/full_images_with_masks_batch_1 \
      --dicom-dir /path/to/DICOM
  3. Generate summary report:

    python scripts/create_metadata_summary_csv.py \
      --images-dir /path/to/full_images_with_masks_batch_1
  4. Review results:

    • Open metadata_summary.csv in Excel/Google Sheets
    • Filter by error="yes" to find images needing manual review
    • Sort by similarity_score to prioritize low-confidence matches
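The review step can also be done without a spreadsheet. The sketch below assumes the CSV uses the column names listed under Output columns and that a missing or non-numeric similarity_score should sort first; the function names are hypothetical.

```python
import csv
from typing import Dict, Iterable, List


def load_rows(path: str) -> List[Dict[str, str]]:
    """Read the summary CSV into a list of row dictionaries."""
    with open(path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))


def _score(row: Dict[str, str]) -> float:
    """Parse similarity_score, treating blank or invalid values as 0.0 (an assumption)."""
    try:
        return float(row.get("similarity_score") or 0.0)
    except ValueError:
        return 0.0


def rows_needing_review(rows: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep rows with error == "yes", ordered from lowest to highest similarity_score."""
    flagged = [r for r in rows if (r.get("error") or "").strip().lower() == "yes"]
    return sorted(flagged, key=_score)


if __name__ == "__main__":
    import os

    # metadata_summary.csv is the filename mentioned in the review step above.
    if os.path.exists("metadata_summary.csv"):
        for row in rows_needing_review(load_rows("metadata_summary.csv")):
            print(row["relative_file_path"], row["similarity_score"], row["error_type"])
```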
