Anees Ur Rehman Hashmi, Ibrahim Almakky, Mohammad Areeb Qazi, Santosh Sanjeev, Vijay Ram Papineni, Dwarikanath Mahapatra, Mohammad Yaqub
Abstract: Large-scale generative models have demonstrated impressive capacity in producing visually compelling images, with increasing applications in medical imaging. However, they continue to grapple with the challenge of image hallucination and the generation of anatomically inaccurate outputs. These limitations are mainly due to the sole reliance on textual inputs and the lack of spatial control over the generated images, hindering the potential usefulness of such models in real-life settings. We present XReal, a novel controllable diffusion model for generating realistic chest X-ray images through precise control over anatomy and pathology location. Our lightweight method can seamlessly integrate spatial control into a pre-trained text-to-image diffusion model without fine-tuning, retaining its existing knowledge while enhancing its generation capabilities. XReal outperforms state-of-the-art X-ray diffusion models on quantitative and qualitative metrics, showing gains of 13% in anatomy realism and 10% in pathology realism in an expert radiologist evaluation. Our model holds promise for advancing generative models in medical imaging, offering greater precision and adaptability while inviting further exploration in this evolving field.
- Clone the repository:
git clone https://github.com/BioMedIA-MBZUAI/XReal.git
- Python version required: 3.8.5
- Install the dependencies using the following command:
pip install -r requirements.txt
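If you use conda, the required Python version can be set up in a fresh environment before installing the dependencies (the environment name is illustrative):
conda create -n xreal python=3.8.5
conda activate xreal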
- Download the pre-trained models from here.
- Extract the downloaded models in the root directory.
- Clone the required repositories into the src directory:
mkdir src
cd src
- Clone the repositories required by the chexray-diffusion codebase (taming-transformers and CLIP):
git clone https://github.com/CompVis/taming-transformers.git
git clone https://github.com/openai/CLIP.git
mv ./CLIP ./clip
Please check the tutorial in this notebook.
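XReal's spatial control is driven by anatomy and pathology masks. As a purely illustrative sketch (the mask shape, resolution, and semantics here are assumptions, not the repository's actual input format), a binary pathology-location mask could be built with PyTorch as follows:

```python
import torch

# Empty 256x256 canvas; 1 marks where the pathology should appear.
mask = torch.zeros(1, 1, 256, 256)

# Hypothetical region in the lower-left lung field (bounds are illustrative only).
mask[:, :, 150:210, 40:110] = 1.0

print(f"{int(mask.sum().item())} pixels marked")
```

The tutorial notebook shows the actual conditioning interface used at inference time.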
- We used the MIMIC-CXR-JPG dataset to train our models. The dataset can be downloaded from here.
- Follow the data_preprocessing/prerprocess_mimic_cxr.ipynb notebook to pre-process the CSVs and images.
- Note that we train on images stored in .pt format with a size of 256x256; a conversion sketch is shown after this list. Adjust the image preprocessing according to your requirements.
- Set the CSV path in the config file before training.
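As a rough sketch of the conversion to .pt tensors (the paths, grayscale handling, and normalization are assumptions; the preprocessing notebook above is authoritative), a JPG can be resized to 256x256 and saved like this:

```python
import torch
from PIL import Image
from torchvision import transforms

# Hypothetical input/output paths; adapt to your MIMIC-CXR-JPG layout.
to_tensor = transforms.Compose([
    transforms.Resize((256, 256)),          # match the 256x256 training size
    transforms.Grayscale(num_output_channels=1),
    transforms.ToTensor(),                   # float tensor in [0, 1], shape (1, 256, 256)
])

image = Image.open("example_cxr.jpg")
torch.save(to_tensor(image), "example_cxr.pt")
```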
- All parameters are defined in the config.yml file.
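The key names below are hypothetical and only sketch the kind of entries such a config typically holds; refer to config.yml and the files under configs/ for the real schema:

```yaml
# Hypothetical keys for illustration; config.yml is authoritative.
data:
  csv_path: /path/to/mimic-cxr-train.csv   # set this before training
  image_size: 256                          # matches the 256x256 .pt images
training:
  batch_size: 16
  learning_rate: 1.0e-4
```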
- To train the XReal T2I backbone:
cd scripts
python train_t2i_backbone.py -b ../configs/xreal-diff-t2i.yml
- To train XReal (ours):
cd scripts
python train_vae.py -b ../configs/xreal-ae.yml
- To train ControlNet:
- First, create the ControlNet skeleton (see the note on arguments after these commands):
cd cnet
python tool_add_control.py
- Then train the ControlNet:
python train_cnet.py -b ../configs/xreal-diff-t2i_cnet.yml
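For reference, the original ControlNet implementation's tool_add_control.py expects the source checkpoint and the output checkpoint as positional arguments; if this repository's copy follows that convention, the call would look like the following (both paths are hypothetical):
python tool_add_control.py ../path/to/xreal_t2i.ckpt ../path/to/cnet_init.ckpt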
If you use XReal or our repository in your research, please cite our paper "XReal: Realistic Anatomy and Pathology-Aware X-ray Generation via Controllable Diffusion Model":
@article{hashmi2024xreal,
title={XReal: Realistic Anatomy and Pathology-Aware X-ray Generation via Controllable Diffusion Model},
author={Hashmi, Anees Ur Rehman and Almakky, Ibrahim and Qazi, Mohammad Areeb and Sanjeev, Santosh and Papineni, Vijay Ram and Mahapatra, Dwarikanath and Yaqub, Mohammad},
journal={arXiv preprint arXiv:2403.09240},
year={2024}
}
Our code is based on the following repositories:
- chexray-diffusion
- taming-transformers
- CLIP
- ControlNet