This repository contains the official implementation of the Medical Imaging Meets NeurIPS workshop paper "Towards Generalist Models for Multimodal Clinical Diagnostics".
The environment is as follows (a sample install command is given after the list):
- python 3.7
- pytorch 1.12.1
- transformers 4.27.1
- scikit-learn 1.0.2
- tqdm
- Pillow (PIL)
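A minimal setup sketch, assuming a fresh conda environment (the environment name `gemini` is arbitrary, and you may need a different PyTorch build for your CUDA version):

```sh
conda create -n gemini python=3.7
conda activate gemini
pip install torch==1.12.1 transformers==4.27.1 scikit-learn==1.0.2 tqdm Pillow
```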
Please make sure you have credentialed access to our MMCaD dataset on PhysioNet. The dataset should be available for download soon.
Note that we do not explicitly upload the chest X-ray images used in MMCaD. To download the images, you will need credentialed access to MIMIC-CXR; then run the data/download_images.py script using your PhysioNet account details:
python download_images.py
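PhysioNet also supports credentialed downloads over HTTPS, so an alternative is to fetch the images directly with wget (shown here for MIMIC-CXR-JPG; adjust the project path and version to the files you need):

```sh
wget -r -N -c -np --user YOUR_PHYSIONET_USERNAME --ask-password \
  https://physionet.org/files/mimic-cxr-jpg/2.0.0/
```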
After downloading MMCaD, first prepare it for model input by running:
python prepare_data.py
We provide randomly split training, validation, and test sets with a ratio of 7:1:2 in data/train_idx.json, data/val_idx.json, and data/test_idx.json, respectively.
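For reference, a minimal sketch of reading these splits, assuming each JSON file holds a list of sample indices:

```python
import json

# Load the provided split indices.
with open("data/train_idx.json") as f:
    train_idx = json.load(f)
with open("data/val_idx.json") as f:
    val_idx = json.load(f)
with open("data/test_idx.json") as f:
    test_idx = json.load(f)

# The three lists should be roughly in a 7:1:2 ratio.
print(len(train_idx), len(val_idx), len(test_idx))
```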
Before training GeMini, you need to download the image and text feature extractors used to encode the image and text modalities. For the image feature extractor, we use ViT with patch size 16 and ViT with patch size 32. The text feature extractor is PubMedBERT.
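One way to fetch publicly available checkpoints is via Hugging Face; the exact checkpoints used in the paper may differ, so treat the model IDs below as assumptions:

```python
from transformers import AutoModel, AutoTokenizer, ViTModel

# Image feature extractors: ViT with patch sizes 16 and 32 (assumed model IDs).
vit_p16 = ViTModel.from_pretrained("google/vit-base-patch16-224-in21k")
vit_p32 = ViTModel.from_pretrained("google/vit-base-patch32-224-in21k")

# Text feature extractor: PubMedBERT (assumed model ID).
text_id = "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract-fulltext"
tokenizer = AutoTokenizer.from_pretrained(text_id)
pubmedbert = AutoModel.from_pretrained(text_id)
```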
Please configure the data and feature extractor paths in the run_train.sh script according to your download directories.
Once everything is set up, you can start training by running:
sh run_train.sh
Note that hyperparameters and training arguments are also specified in the run_train.sh script.
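For orientation, a hypothetical excerpt of what run_train.sh might contain; every variable and flag name below is illustrative, so match them to the actual script:

```sh
# Hypothetical excerpt -- adjust names and paths to the actual run_train.sh.
DATA_DIR=/path/to/MMCaD               # prepared MMCaD data
IMG_ENC=/path/to/vit_checkpoints      # ViT patch-16/32 image encoders
TXT_ENC=/path/to/pubmedbert           # PubMedBERT text encoder

python train.py \
  --data_dir $DATA_DIR \
  --image_encoder $IMG_ENC \
  --text_encoder $TXT_ENC \
  --batch_size 32 --lr 1e-4 --epochs 30   # hyperparameters are set here too
```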
Our pre-trained checkpoint for GeMini can be downloaded from link.
To evaluate GeMini on the test set, please update the data and model paths in the run_test.sh script accordingly and run:
sh run_test.sh
If you use the MMCaD dataset in your work, please consider citing the following two papers:
@inproceedings{fu2023towards,
  title={Towards Generalist Models for Multimodal Clinical Diagnostics},
  author={Fu, Yunxiang and Zhou, Hong-Yu and Yu, Yizhou},
  booktitle={Medical Imaging Meets NeurIPS Workshop},
  year={2023}
}

@article{zhou2023irene,
  title={A transformer-based representation-learning model with unified processing of multimodal input for clinical diagnostics},
  author={Zhou, Hong-Yu and Yu, Yizhou and Wang, Chengdi and Zhang, Shu and Gao, Yuanxu and Pan, Jia and Shao, Jun and Lu, Guangming and Zhang, Kang and Li, Weimin},
  journal={Nature Biomedical Engineering},
  doi={10.1038/s41551-023-01045-x},
  year={2023},
  publisher={Nature Publishing Group UK London}
}