Denoise lung masks

Experiments with an autoencoder that reconstructs corrupted 3D lung masks. Masks are corrupted to resemble a failure to segment high-density pathologies.

Data

Segmentation masks are from

https://www.kaggle.com/sandorkonya/ct-lung-heart-trachea-segmentation

which are derived from

https://www.kaggle.com/competitions/osic-pulmonary-fibrosis-progression/overview

Run data/preprocess_osic_masks.py to unpack and preprocess the lung masks. It produces lung masks at both 5 mm and 2.5 mm resolution. If the data archive has not already been downloaded to the data/osic_fibrosis_masks/ directory, the script prints download instructions.

Note that the preprocessing will take some time.

A data-info file defining dataset splits is provided in osic_fibrosis_masks/data-info.csv. If you wish to run experiments with different data splits, either delete the file or change the path in preprocess_osic_masks.main.data_info.

Prepare module

The files in src/denoise_lung_masks are needed for the experiments. Either run reinstall_package.sh to install denoise_lung_masks as a Python package, or create a symlink to src/denoise_lung_masks in the experiment directory.
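The symlink option can be sketched in Python as follows. This is a minimal illustration, not part of the repository: the function name link_package is hypothetical, and the link name denoise_lung_masks inside the experiment directory is an assumption based on the package name.

```python
from pathlib import Path


def link_package(repo_root: Path, experiment_dir: Path) -> Path:
    """Create experiment_dir/denoise_lung_masks -> repo_root/src/denoise_lung_masks."""
    target = (repo_root / "src" / "denoise_lung_masks").resolve()
    link = experiment_dir / "denoise_lung_masks"
    if link.is_symlink() or link.exists():
        return link  # leave an existing link or directory untouched
    link.symlink_to(target, target_is_directory=True)
    return link


# Example call, assuming it is run from the repository root:
#   link_package(Path.cwd(), Path.cwd() / "experiments" / "denoising-autoencoder")
```

Installing the package with reinstall_package.sh avoids the need for a link in each experiment directory.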

Run experiments

Experiments are in experiments/.

Denoising autoencoder

Directory experiments/denoising-autoencoder.

Train an autoencoder to reconstruct a corrupted 3D lung mask. Masks are corrupted to resemble a failure to segment high-density pathologies. The autoencoder is fully convolutional with the following layers:

TODO: specify layers

There are three versions of the experiment.

Name       Description
Version 0  Use 5 mm isotropic resolution and always corrupt masks
Version 1  Use 2.5 mm isotropic resolution and always corrupt masks
Version 2  Use 2.5 mm isotropic resolution and corrupt 3/4 of the masks
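To make the difference between "always corrupt" and "corrupt 3/4 of the masks" concrete, here is a minimal sketch in plain Python. The corruption used by the repository is not described in this README, so the spherical hole below is only an assumed stand-in for a missed high-density region, and the names corrupt_mask and maybe_corrupt are hypothetical. A probability of 1.0 would correspond to versions 0 and 1, and 0.75 to version 2. Real code would operate on arrays rather than nested lists.

```python
import random


def corrupt_mask(mask, center, radius):
    """Return a copy of a 3D binary mask with a spherical region set to 0.

    mask is a nested list indexed as mask[z][y][x]; center and radius are in voxels.
    """
    cz, cy, cx = center
    corrupted = [[row[:] for row in plane] for plane in mask]
    for z, plane in enumerate(corrupted):
        for y, row in enumerate(plane):
            for x in range(len(row)):
                if (z - cz) ** 2 + (y - cy) ** 2 + (x - cx) ** 2 <= radius ** 2:
                    row[x] = 0
    return corrupted


def maybe_corrupt(mask, probability, rng, center, radius):
    """Corrupt the mask with the given probability, otherwise return it unchanged."""
    if rng.random() < probability:
        return corrupt_mask(mask, center, radius)
    return mask


# A 5x5x5 block of foreground voxels with a hole of radius 1 at the center.
mask = [[[1] * 5 for _ in range(5)] for _ in range(5)]
corrupted = corrupt_mask(mask, center=(2, 2, 2), radius=1)
print(sum(v for plane in corrupted for row in plane for v in row))  # 125 - 7 = 118
```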

Parameters for each version are stored in parameters.py. Adjust batch_size as needed; versions 1 and 2 require around 20MB of GPU RAM.

monai.losses.DiceLoss is used as the loss function for all experiments.
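For readers unfamiliar with the Dice loss, the underlying formula can be sketched in plain Python. This is not MONAI's implementation and does not use its API; the function name soft_dice_loss and the smoothing term eps are assumptions made for illustration.

```python
def soft_dice_loss(pred, target, eps=1e-5):
    """1 - 2 * sum(p * t) / (sum(p) + sum(t)) over flat sequences in [0, 1]."""
    intersection = sum(p * t for p, t in zip(pred, target))
    denominator = sum(pred) + sum(target)
    # eps keeps the ratio defined when both masks are empty.
    return 1.0 - (2.0 * intersection + eps) / (denominator + eps)


print(round(soft_dice_loss([1, 1, 0, 0], [1, 1, 0, 0]), 3))  # 0.0, perfect overlap
print(round(soft_dice_loss([1, 1, 0, 0], [1, 0, 0, 0]), 3))  # 0.333, partial overlap
print(round(soft_dice_loss([1, 1, 0, 0], [0, 0, 1, 1]), 3))  # 1.0, no overlap
```

The loss is 0 when prediction and target overlap perfectly and approaches 1 when they do not overlap at all, which makes it a natural fit for comparing a reconstructed mask with the uncorrupted one.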

Train

Train each version with

python train.py <version-number>

or all versions sequentially with

bash train_all.sh

The two models with the lowest validation loss are kept.
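Keeping only the best two checkpoints can be sketched as below. How train.py actually does this is not shown in this README (a training framework may handle it); the class name BestCheckpoints and the checkpoint file names are hypothetical.

```python
import heapq


class BestCheckpoints:
    """Track the k checkpoints with the lowest validation loss."""

    def __init__(self, k=2):
        self.k = k
        self._heap = []  # max-heap on loss via negated values: (-loss, path)

    def update(self, val_loss, path):
        """Register a checkpoint; return the path to discard, or None."""
        heapq.heappush(self._heap, (-val_loss, path))
        if len(self._heap) > self.k:
            _, dropped = heapq.heappop(self._heap)  # removes the highest loss
            return dropped
        return None

    def best(self):
        """Kept checkpoints as (loss, path), lowest loss first."""
        return sorted((-neg, path) for neg, path in self._heap)


tracker = BestCheckpoints(k=2)
for loss, path in [(0.40, "epoch1.pt"), (0.25, "epoch2.pt"), (0.30, "epoch3.pt")]:
    tracker.update(loss, path)
print(tracker.best())  # [(0.25, 'epoch2.pt'), (0.3, 'epoch3.pt')]
```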

Approximate runtime on an RTX 3090

Name       Approximate wall clock time
Version 0  11 min
Version 1  38 min
Version 2  40 min

Predict

Predict each version with

python predict.py <model-checkpoint> <outdir> <version-number> [--with-corruptions]

or all versions sequentially with

bash predict_all.sh

where you must manually set each v*_checkpoint variable to the desired checkpoint.

The --with-corruptions flag corrupts all samples before prediction.
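The command-line interface above maps naturally onto argparse. The sketch below only mirrors the usage line; how predict.py actually parses its arguments is not shown here, and the destination names and example paths are assumptions.

```python
import argparse


def build_parser():
    """Parser mirroring: predict.py <model-checkpoint> <outdir> <version-number> [--with-corruptions]."""
    parser = argparse.ArgumentParser(prog="predict.py")
    parser.add_argument("model_checkpoint", help="path to a trained model checkpoint")
    parser.add_argument("outdir", help="directory where predictions are written")
    parser.add_argument("version_number", type=int, choices=[0, 1, 2])
    parser.add_argument("--with-corruptions", action="store_true",
                        help="corrupt every sample before prediction")
    return parser


# Placeholder checkpoint and output paths, for illustration only.
args = build_parser().parse_args(["model.ckpt", "predictions/v2", "2", "--with-corruptions"])
print(args.version_number, args.with_corruptions)  # 2 True
```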

Analyse experiments

In experiments/analysis there are two tools for analysing the results:

  • view_predictions.py : visualize one or more predictions using napari.
  • estimate_volume.py : estimates volume for one or more scans based on denoised masks.
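Volume estimation from a binary mask usually amounts to counting foreground voxels and multiplying by the voxel volume. The sketch below illustrates that calculation in plain Python; it is not taken from estimate_volume.py, and the function name mask_volume_ml is hypothetical.

```python
def mask_volume_ml(mask, spacing_mm):
    """Volume of a binary mask in millilitres.

    mask is a nested list indexed as mask[z][y][x]; spacing_mm is the voxel
    size (z, y, x) in millimetres.
    """
    voxels = sum(1 for plane in mask for row in plane for v in row if v)
    sz, sy, sx = spacing_mm
    return voxels * sz * sy * sx / 1000.0  # 1 ml = 1000 mm^3


mask = [[[1] * 4 for _ in range(4)] for _ in range(4)]  # 64 foreground voxels
print(mask_volume_ml(mask, (2.5, 2.5, 2.5)))  # 64 * 15.625 / 1000 = 1.0
print(mask_volume_ml(mask, (5.0, 5.0, 5.0)))  # 64 * 125 / 1000 = 8.0
```

The same voxel count represents eight times more volume at 5 mm than at 2.5 mm isotropic resolution, so the spacing must match the version that produced the mask.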

There are also helper scripts to generate plots:

  • v0_estimate_volume.sh
  • v1_estimate_volume.sh
  • v2_estimate_volume.sh

These scripts assume the directory structure generated by predict_all.sh. Results are stored in experiments/denoising-autoencoder/results.
