MG-3D

This is the implementation of MG-3D: Multi-Grained Knowledge-Enhanced Vision-Language Pre-training for 3D Medical Image Analysis (Medical Image Analysis, 2026).

Table of Contents

  • Requirements
  • Preparation
  • Pre-training
  • Acknowledgement
  • Citation

Requirements

Run the following command to install the required packages:

pip install -r requirements.txt

Preparation

You can download the CT-RATE and CTRG-Chest datasets used in this work from Hugging Face (https://huggingface.co/datasets/ibrahimhamamci/CT-RATE) and GitHub (https://github.com/tangyuhao2016/CTRG), respectively.

The project structure should be:

root:[.]
+--mg3d
| +--datasets
| +--datamodules
| +--metrics
| +--models
| +--config.py
| +--__init__.py
+--prepro
| +--glossary.py
| +--make_arrow.py
| +--prepro_finetuning_language_data.py
| +--prepro_finetuning_data.py
| +--prepro_finetuning_vision_data.py
| +--prepro_pretraining_data.py
+--data
| +--pretrain_arrows
| +--finetune_arrows
| +--finetune_vision_arrows
| +--finetune_language_arrows
+--run_scripts
| +--pretrain.sh
| +--finetune.sh
+--tools
| +--visualize_datasets.py
| +--convert_meter_weights.py
+--downstream
| +--ACDC
| +--cc-ccii
| +--Covid19_20
| +--CT-RATE
| +--CTRG
| +--Luna16
| +--MSD
| +--stoic2021
+--requirements.txt
+--README.md
+--main.py
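Before running any scripts, it can help to confirm the layout matches the tree above. The sketch below is a minimal sanity check; the `EXPECTED` entries are taken directly from the tree, and `missing_paths` is a hypothetical helper, not part of the repository:

```python
import os

# A representative subset of the paths from the project tree above.
EXPECTED = [
    "mg3d/datasets", "mg3d/models", "prepro", "data/pretrain_arrows",
    "run_scripts/pretrain.sh", "requirements.txt", "main.py",
]

def missing_paths(root="."):
    """Return the expected paths that are absent under `root`."""
    return [p for p in EXPECTED if not os.path.exists(os.path.join(root, p))]

if __name__ == "__main__":
    absent = missing_paths()
    print("Layout OK" if not absent else f"Missing: {absent}")
```

Run it from the project root; an empty result means the directories and entry-point files are where the scripts expect them.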

Pre-training

1. Pre-processing

Run the following command to pre-process the data:

python prepro/prepro_pretraining_data.py

to get the following arrow files:

root:[data]
+--pretrain_arrows
| +--clm_chest_ctrg_train.arrow
| +--clm_chest_ctrg_val.arrow
| +--clm_chest_ctrg_test.arrow
| +--clm_ct_rate_train.arrow
| +--clm_ct_rate_val.arrow
| +--clm_ct_rate_test.arrow

2. Pre-training

Now we can start pre-training the MG-3D model:

Single GPU:
bash run_scripts/pretrain.sh

Multiple GPUs:
bash run_scripts/pretrain_multi_gpus.sh

3. Pre-trained Models

We provide several pre-trained models for downstream tasks: 3D Swin-B-47K, 3D Swin-L-47K, 3D UNet-1.4K, and 3D nn-UNet-1.4K.
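A pre-trained checkpoint can be loaded into a downstream model with standard PyTorch calls. The sketch below assumes the checkpoint is a plain `state_dict`, possibly nested under a "state_dict" key as in PyTorch Lightning checkpoints; `load_pretrained` is a hypothetical helper, not the repository's own loading code:

```python
import torch

def load_pretrained(model, ckpt_path):
    """Load pre-trained weights into `model`, tolerating head mismatches."""
    state = torch.load(ckpt_path, map_location="cpu")
    # Lightning-style checkpoints nest the weights under "state_dict".
    state = state.get("state_dict", state)
    # strict=False skips keys absent from the downstream model (e.g. task heads).
    missing, unexpected = model.load_state_dict(state, strict=False)
    return missing, unexpected
```

Inspecting the returned `missing` and `unexpected` key lists is a quick way to catch prefix mismatches between the pre-training and fine-tuning model definitions.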

Acknowledgement

The code is based on PTUnifier, MONAI, CT-CLIP, and M2KT.

We thank the authors for their open-sourced code and encourage users to cite their works when applicable.

Citation

If you find this repo useful for your research, please consider citing the paper as follows:

@article{NI2026104027,
  title   = {MG-3D: Multi-Grained Knowledge-Enhanced Vision-Language Pre-training for 3D Medical Image Analysis},
  journal = {Medical Image Analysis},
  pages   = {104027},
  year    = {2026},
  issn    = {1361-8415},
  doi     = {10.1016/j.media.2026.104027},
  url     = {https://www.sciencedirect.com/science/article/pii/S1361841526000964},
  author  = {Xuefeng Ni and Linshan Wu and Jiaxin Zhuang and Qiong Wang and Mingxiang Wu and Varut Vardhanabhuti and Lihai Zhang and Hanyu Gao and Hao Chen}
}
