alibaba-damo-academy/fvlm

Fine-grained Vision-language Pre-training for Enhanced CT Image Understanding (ICLR 2025)

Data processing

  • Download the CT-RATE dataset into the data folder.

  • Download ImageNet pre-trained ViT weights from link, and the BiomedVLP-CXR-BERT-specialized text encoder from link, as used by CT-CLIP.

  • Download the decomposed anatomy-wise descriptions from our provided supplementary materials link, then process the CT volumes with the following commands.

    cd data
    python fix_data.py --split [train/valid]
    python generate_mask.py --split [train/valid]
    python resize.py --split [train/valid]
    python preprocess.py --split [train/valid]

    After processing, the data should be organized as follows:

    |-- BiomedVLP-CXR-BERT
    |-- data
    |   |-- train
    |   |-- valid
    |   |-- train_fixed
    |   |-- valid_fixed
    |   |-- train_mask
    |   |-- valid_mask
    |   |-- resized_train_images
    |   |-- resized_train_masks
    |   |-- resized_valid_images
    |   |-- resized_valid_masks
    |   |-- processed_train_images
    |   |-- processed_train_masks
    |   |-- processed_valid_images
    |   |-- processed_valid_masks
    |   |-- multi_abnormality_labels
    |   |-- desc_info.json
    |   |-- conc_info.json
    |-- mae_pretrain_vit_base.pth
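
The four preprocessing steps above must be run for both splits. A minimal wrapper (assuming the script interfaces shown above) could look like:

```shell
#!/bin/sh
# Run the full preprocessing pipeline for both the train and valid splits.
cd data
for split in train valid; do
    python fix_data.py --split "$split"
    python generate_mask.py --split "$split"
    python resize.py --split "$split"
    python preprocess.py --split "$split"
done
```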

Training

torchrun --nproc_per_node=4 train.py

Evaluation

torchrun --nproc_per_node=4 eval.py

Then, calculate the metrics from the generated CSV file:

python calc_metrics.py --csv_file res/xxx.csv
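
For reference, a minimal sketch of computing threshold-based classification metrics from a predictions CSV. The column names `label` and `pred` are assumptions for illustration; the repository's `calc_metrics.py` defines the actual CSV format and metric set.

```python
import csv
from io import StringIO

def compute_metrics(csv_text, threshold=0.5):
    """Compute accuracy, precision, and recall from a CSV with a
    ground-truth `label` column (0/1) and a predicted-probability
    `pred` column (column names are assumptions, not the repo's)."""
    rows = list(csv.DictReader(StringIO(csv_text)))
    tp = fp = tn = fn = 0
    for r in rows:
        y = int(r["label"])
        p = 1 if float(r["pred"]) >= threshold else 0
        if p == 1 and y == 1:
            tp += 1
        elif p == 1 and y == 0:
            fp += 1
        elif p == 0 and y == 0:
            tn += 1
        else:
            fn += 1
    n = len(rows)
    return {
        "accuracy": (tp + tn) / n,
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
    }
```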

Citation

If you find this repository useful, please cite:

@inproceedings{fvlm_iclr25,
  title={Large-scale and fine-grained vision-language pre-training for enhanced CT image understanding},
  author={Zhongyi Shui and Jianpeng Zhang and Weiwei Cao and Sinuo Wang and Ruizhe Guo and Le Lu and Lin Yang and Xianghua Ye and Tingbo Liang and Qi Zhang and Ling Zhang},
  booktitle={The Thirteenth International Conference on Learning Representations},
  year={2025}
}
