This repository contains code to train the ConVIRT model on the MIMIC-CXR-JPG dataset and to fine-tune the pretrained image backbone for downstream multi-label image classification on the CheXpert dataset, in both centralized and federated learning setups.
## Install dependencies
```bash
# clone project
git clone https://github.com/tjdevWorks/ConVIRT-Federated
cd ConVIRT-Federated

# [OPTIONAL] create conda environment
conda create -n convirt_fed python=3.7
conda activate convirt_fed

# install requirements
pip install -r requirements.txt
```
## Pretraining

To pretrain the model with the default configuration:

```bash
python src/pretrain.py
```
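ConVIRT pretrains by pulling paired image and text (report) embeddings together with a bidirectional contrastive, InfoNCE-style objective. As a rough illustration of the image-to-text half of that loss, here is a minimal pure-Python sketch; it is not the repository's implementation, and the function names are chosen for this example only:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def info_nce(image_embs, text_embs, tau=0.1):
    """Image-to-text InfoNCE loss over a batch of paired embeddings.

    For each image i, the matching text i is the positive; all other
    texts in the batch act as negatives. tau is the temperature.
    """
    n = len(image_embs)
    losses = []
    for i in range(n):
        sims = [cosine(image_embs[i], t) / tau for t in text_embs]
        # log-sum-exp with max-subtraction for numerical stability
        m = max(sims)
        log_denom = m + math.log(sum(math.exp(s - m) for s in sims))
        losses.append(-(sims[i] - log_denom))
    return sum(losses) / n
```

The full ConVIRT objective averages this with the symmetric text-to-image term; well-aligned pairs drive the loss toward zero.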
## Fine-tuning

SLURM execution scripts for fine-tuning are available in `scripts/tejas/a100/`.

An example of fine-tuning in the centralized setup:

```bash
# To use the ConVIRT pretrained image backbone
python src/finetune_chexpert.py

# To use the ImageNet pretrained image backbone
python src/finetune_chexpert.py --config-name=finetune_chexpert_imagenet
```
## Federated learning

To run the federated learning setups, three data partitioning strategies are available in `configs/partitions/`: volume, class, and attribute.
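To give an idea of what class-based partitioning means, here is a toy sketch that deals class labels out to clients and returns each client's sample indices. This is illustrative only; the repository's actual strategies live in `configs/partitions/`, and `partition_by_class` is a hypothetical name:

```python
import random

def partition_by_class(labels, num_clients, exclusive=True, seed=0):
    """Toy class-based partitioner (illustrative, not the repo's logic).

    exclusive=True  -> each class's samples go to exactly one client
    exclusive=False -> every client additionally receives one shared class
    """
    rng = random.Random(seed)
    classes = sorted(set(labels))
    rng.shuffle(classes)
    # deal classes out to clients round-robin
    owned = {c: {classes[i] for i in range(c, len(classes), num_clients)}
             for c in range(num_clients)}
    if not exclusive and classes:
        shared = classes[0]
        for c in owned:
            owned[c].add(shared)
    return {c: [i for i, y in enumerate(labels) if y in owned[c]]
            for c in owned}
```

With `exclusive=True` the client datasets are disjoint by class, which makes the federated setting maximally non-IID; `exclusive=False` softens this by letting clients share a class.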
An example of running a federated learning experiment:

```bash
# Run a federated simulation on a single node with a GPU, using 4 clients
# for 100 rounds and the partitioning logic from "class.yaml"
python src/run_simulation.py --config-name=prod_simulation server_config.num_rounds=100 pool_size=4 partitions=class partitions.num_clients=4 partitions.exclusive=False partitions.equal_num_samples=False task_name='fed_chexpert_class' job_name=fed_class_100_4_False_False datamodule.batch_size=256
```
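Each round, the simulation server aggregates the clients' model updates. Assuming a FedAvg-style strategy (the standard choice for this kind of simulation; whether the repo uses exactly this variant is an assumption), the aggregation reduces to a sample-weighted average of client parameters:

```python
def fedavg(client_weights, client_sizes):
    """FedAvg aggregation: weighted average of client parameter vectors.

    client_weights: one flat list of parameters per client
    client_sizes:   number of local training samples per client,
                    used as the averaging weights
    """
    total = sum(client_sizes)
    dim = len(client_weights[0])
    return [
        sum(w[j] * n for w, n in zip(client_weights, client_sizes)) / total
        for j in range(dim)
    ]
```

Weighting by local sample count means clients that trained on more data pull the global model proportionally harder, which matters under the unequal-volume partitions above.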
You can override any parameter from the command line like this:

```bash
python src/finetune_chexpert.py trainer.max_epochs=20 datamodule.batch_size=64
```
## Citation

```bibtex
@article{DBLP:journals/corr/abs-2010-00747,
  author     = {Yuhao Zhang and
                Hang Jiang and
                Yasuhide Miura and
                Christopher D. Manning and
                Curtis P. Langlotz},
  title      = {Contrastive Learning of Medical Visual Representations from Paired
                Images and Text},
  journal    = {CoRR},
  volume     = {abs/2010.00747},
  year       = {2020},
  url        = {https://arxiv.org/abs/2010.00747},
  eprinttype = {arXiv},
  eprint     = {2010.00747},
  timestamp  = {Fri, 20 Nov 2020 14:04:05 +0100},
  biburl     = {https://dblp.org/rec/journals/corr/abs-2010-00747.bib},
  bibsource  = {dblp computer science bibliography, https://dblp.org}
}
```