
[NeurIPS 2024] Code for Dual Prototype Evolving for Test-Time Generalization of Vision-Language Models

zhangce01/DPE-CLIP


[NeurIPS 2024] DPE-CLIP

License: MIT

👀Introduction

This repository contains the code for our NeurIPS 2024 paper Dual Prototype Evolving for Test-Time Generalization of Vision-Language Models. [Paper]

⏳Setup

1. Environment

We tested our codebase with PyTorch 2.1.1 and CUDA 12.1. Install the PyTorch and CUDA versions that match your computational resources, then install the remaining required packages by running pip install -r requirements.txt. Finally, install the info-nce-pytorch package by following the instructions at https://github.com/RElbers/info-nce-pytorch.
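To see whether a local install matches the tested versions, a small check along these lines may help. This is a minimal sketch and not part of the repository: the helper names matches_tested and report_environment are hypothetical, and the only values taken from this README are the version numbers 2.1.1 and 12.1. A mismatch does not necessarily mean the code will fail, since other PyTorch/CUDA combinations may work as well.

```python
# Versions this README reports testing with.
TESTED_TORCH = "2.1.1"
TESTED_CUDA = "12.1"


def matches_tested(torch_version, cuda_version):
    """Return True if the given versions equal the tested versions.

    Hypothetical helper for illustration; not part of the DPE-CLIP codebase.
    """
    # PyTorch builds often carry a local suffix such as "+cu121"; drop it.
    base = torch_version.split("+")[0]
    return base == TESTED_TORCH and cuda_version == TESTED_CUDA


def report_environment():
    """Print the installed versions. Requires PyTorch to be installed."""
    import torch

    installed_cuda = torch.version.cuda  # None for CPU-only builds
    ok = matches_tested(torch.__version__, installed_cuda)
    print(f"torch={torch.__version__} cuda={installed_cuda} matches_tested={ok}")
    return ok
```

Calling report_environment() after installation prints the detected versions and whether they equal the tested pair.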

2. Dataset

To set up the required datasets, follow the instructions in DATASETS.md, which covers the installation steps for the two benchmarks.

📦Usage

To run the code, execute any of the following four bash scripts:

Robustness to Natural Distribution Shifts

  • ResNet50: Run DPE on the OOD Benchmark using the ResNet-50 model:
bash ./scripts/run_ood_benchmark_rn50.sh 
  • ViT-B/16: Run DPE on the OOD Benchmark using the ViT-B/16 model:
bash ./scripts/run_ood_benchmark_vit.sh 

Cross-Datasets Generalization

  • ResNet50: Run DPE on the Cross-Domain Benchmark using the ResNet-50 model:
bash ./scripts/run_cd_benchmark_rn50.sh 
  • ViT-B/16: Run DPE on the Cross-Domain Benchmark using the ViT-B/16 model:
bash ./scripts/run_cd_benchmark_vit.sh 
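
To run several of the scripts above back to back, a small driver such as the following could be used. This is a sketch under stated assumptions: the script paths are the ones listed above, while the short keys ("ood", "cd", "rn50", "vit") and the functions build_commands and run_all are hypothetical names introduced only for this example. It also assumes the driver is started from the repository root.

```python
import subprocess

# Script paths as listed in this README; the dictionary keys are
# hypothetical labels introduced for this example only.
SCRIPTS = {
    ("ood", "rn50"): "./scripts/run_ood_benchmark_rn50.sh",
    ("ood", "vit"): "./scripts/run_ood_benchmark_vit.sh",
    ("cd", "rn50"): "./scripts/run_cd_benchmark_rn50.sh",
    ("cd", "vit"): "./scripts/run_cd_benchmark_vit.sh",
}


def build_commands(benchmarks=("ood", "cd"), backbones=("rn50", "vit")):
    """Return one ["bash", script] command per benchmark/backbone pair."""
    return [["bash", SCRIPTS[(b, m)]] for b in benchmarks for m in backbones]


def run_all(commands):
    """Run the commands in order, stopping at the first failing script."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

For example, run_all(build_commands(benchmarks=("ood",))) would run only the two OOD Benchmark scripts, one after the other.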

Arguments

In each bash script, you can modify the following arguments: (1) --datasets to specify the datasets, (2) --backbone to specify the backbone model (RN50 or ViT-B/16), and (3) --coop to enable the prompts learned by CoOp. We use wandb to track results; to disable this, omit the --wandb-log argument.
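
The way these flags combine can be illustrated with a stand-in parser. The sketch below is not the repository's actual argument handling: only the four flag names and the two backbone values come from this README, while the function name build_parser, the argument types, the required settings, and the help strings are assumptions made for illustration.

```python
import argparse


def build_parser():
    """Build an illustrative parser mirroring the flags named in the README.

    Assumption: types, defaults, and help texts are guesses; the real
    scripts may define these arguments differently.
    """
    parser = argparse.ArgumentParser(
        description="Illustrative stand-in, not the DPE-CLIP parser."
    )
    parser.add_argument("--datasets", required=True,
                        help="Datasets to evaluate on (format is an assumption).")
    parser.add_argument("--backbone", required=True,
                        choices=["RN50", "ViT-B/16"],
                        help="Backbone model.")
    parser.add_argument("--coop", action="store_true",
                        help="Enable the prompts learned by CoOp.")
    parser.add_argument("--wandb-log", action="store_true",
                        help="Track results with wandb; omit to disable.")
    return parser
```

With this sketch, parsing ["--datasets", "example", "--backbone", "RN50"] leaves both coop and wandb_log set to False, which matches the behavior described above: wandb tracking stays off unless --wandb-log is passed.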

🙏Acknowledgements

Our codebase is adapted from Tip-Adapter, CLIP, TDA, TPT, and CuPL. We thank the authors for releasing their code!

📧Contact

If you have any questions, please contact us at cezhang@cs.cmu.edu.

📌 BibTeX & Citation

If you find this code useful, please consider citing our work:

@article{zhang2024dual,
  title={Dual Prototype Evolving for Test-Time Generalization of Vision-Language Models},
  author={Zhang, Ce and Stepputtis, Simon and Sycara, Katia and Xie, Yaqi},
  journal={arXiv preprint arXiv:2410.12790},
  year={2024}
}
