This is the benchmark framework used in the survey paper A Survey on the Robustness of Computer Vision Models against Common Corruptions. It evaluates the corruption robustness of ImageNet-trained classifiers on four benchmark datasets: ImageNet-C, ImageNet-C-bar, ImageNet-P, and ImageNet-3DCC.
- The leaderboard of benchmark results can be accessed here.
- Installation
- Download the datasets (ImageNet-1k, ImageNet-C, ImageNet-C-bar, ImageNet-P, ImageNet-3DCC)
- Prerequisites: Python 3.9.12, cuda-11.7, cuda-11.x_cudnn-8.6
- You can create a virtual environment with conda and activate it before installing the dependencies:
conda create -n virtualenv python=3.9 anaconda
source activate virtualenv
conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
- Install other packages
pip install -r requirements.txt
- Clone this repository
git clone https://github.com/nis-research/CorruptionBenchCV.git
cd CorruptionBenchCV
- Evaluating pre-trained models from timm
python main.py --model resnet18 --dataset ImageNet_C_bar --image_size 224 --data_path /local/
Output: a csv file with the following structure, recording the accuracy and ECE per corruption and per severity. The overall results (e.g. clean accuracy, robust accuracy, relative robustness, relative mCE, mCE, mFP, and mT5D) are printed to the console. You can also use the csv file to compute the values of these metrics yourself.
| Corruption | Acc_s1 | Acc_s2 | Acc_s3 | Acc_s4 | Acc_s5 | ECE_s1 | ECE_s2 | ECE_s3 | ECE_s4 | ECE_s5 |
|---|---|---|---|---|---|---|---|---|---|---|
| blue_noise_sample | | | | | | | | | | |
| ...... | | | | | | | | | | |
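As a sketch of how the csv file can be post-processed, the snippet below computes robust accuracy and mCE from a small hypothetical csv in the format above (ECE columns omitted for brevity). The baseline error values here are illustrative placeholders, not the AlexNet numbers used in the printed results.

```python
import csv
import io

# Hypothetical example csv in the per-corruption, per-severity format above
# (values are made up for illustration).
csv_text = """Corruption,Acc_s1,Acc_s2,Acc_s3,Acc_s4,Acc_s5
gaussian_noise,0.60,0.52,0.44,0.30,0.18
motion_blur,0.65,0.58,0.50,0.40,0.28
"""

def mean_corruption_accuracy(text):
    """Robust accuracy: mean accuracy over all corruptions and severities."""
    reader = csv.DictReader(io.StringIO(text))
    accs = []
    for row in reader:
        accs.extend(float(row[f"Acc_s{s}"]) for s in range(1, 6))
    return sum(accs) / len(accs)

def mce(text, baseline_errors):
    """mCE: per-corruption error averaged over severities, normalised by a
    baseline model's error for the same corruption, then averaged."""
    reader = csv.DictReader(io.StringIO(text))
    ces = []
    for row in reader:
        err = sum(1.0 - float(row[f"Acc_s{s}"]) for s in range(1, 6)) / 5
        ces.append(err / baseline_errors[row["Corruption"]])
    return sum(ces) / len(ces)

# Placeholder baseline errors for the two example corruptions.
baseline = {"gaussian_noise": 0.886, "motion_blur": 0.786}
print(mean_corruption_accuracy(csv_text))
print(mce(csv_text, baseline))
```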
Notice:
- Use the `--model` parameter to select any pretrained model available in timm.
- Testing on all benchmark datasets requires a large amount of storage. We therefore suggest testing on the benchmark datasets one at a time.
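One way to script the one-at-a-time evaluation is a small driver that builds and runs the `main.py` command for each dataset in turn; this is a sketch, and the dataset names and data path are the same ones used in the example command above.

```python
import subprocess

# Dataset names as accepted by the --dataset flag of main.py.
DATASETS = ["ImageNet_C", "ImageNet_C_bar", "ImageNet_P", "ImageNet_3DCC"]

def build_command(model, dataset, data_path, image_size=224):
    """Assemble the main.py invocation for a single benchmark dataset."""
    return [
        "python", "main.py",
        "--model", model,
        "--dataset", dataset,
        "--image_size", str(image_size),
        "--data_path", data_path,
    ]

def run_all(model="resnet18", data_path="/local/"):
    """Evaluate each dataset in turn, so only one needs to be on disk."""
    for ds in DATASETS:
        subprocess.run(build_command(model, ds, data_path), check=True)
```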
- Visualizing robustness across different backbones: overall robustness and per-corruption robustness.
@misc{wang2023larger,
title={A Survey on the Robustness of Computer Vision Models against Common Corruptions},
author={Shunxin Wang and Raymond Veldhuis and Christoph Brune and Nicola Strisciuglio},
year={2023},
eprint={2305.06024},
archivePrefix={arXiv},
primaryClass={cs.CV}
}
We would like to thank Dan Hendrycks and his coauthors for their code implementing the evaluation metrics for ImageNet-P.