Abstract
The long-tailed distribution is a common phenomenon in the real world. Extracted large-scale image datasets inevitably exhibit the long-tailed property, and models trained on such imbalanced data can achieve high performance on the over-represented categories but struggle on the under-represented ones, leading to biased predictions and degraded performance. To address this challenge, we propose a novel de-biasing method named Inverse Image Frequency (IIF). IIF is a multiplicative margin adjustment transformation of the logits in the classification layer of a convolutional neural network. Our method achieves stronger performance than similar works and is especially useful for downstream tasks such as long-tailed instance segmentation, as it produces fewer false positive detections. Our extensive experiments show that IIF surpasses the state of the art on many long-tailed benchmarks such as ImageNet-LT, CIFAR-LT, Places-LT and LVIS, reaching 55.8% top-1 accuracy with ResNet50 on ImageNet-LT and 26.3% segmentation AP with MaskRCNN ResNet50 on LVIS.
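To make the idea of a multiplicative logit adjustment concrete, here is a minimal pure-Python sketch. It assumes a log inverse-frequency weight, `log(N / n_c)`, by analogy with inverse document frequency; the exact weighting variant used in this repository, and whether it is applied during training or only at inference, are not stated in this section. The class counts and logits are hypothetical.

```python
import math

def iif_weights(class_counts):
    """Per-class weight log(N / n_c); an assumed IIF variant, not necessarily the repo's."""
    total = sum(class_counts)
    return [math.log(total / n) for n in class_counts]

def apply_iif(logits, weights):
    """Multiplicative adjustment: scale each logit by its class weight."""
    return [w * z for w, z in zip(weights, logits)]

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Hypothetical 3-class problem: one frequent class and two rare ones.
counts = [900, 90, 10]
weights = iif_weights(counts)

# Hypothetical raw logits that slightly favour the frequent class.
logits = [2.0, 1.5, 1.4]
adjusted = apply_iif(logits, weights)

print("weights :", [round(w, 3) for w in weights])
print("before  :", [round(p, 3) for p in softmax(logits)])
print("after   :", [round(p, 3) for p in softmax(adjusted)])
```

In this toy case the rare classes receive larger weights, so the adjusted prediction moves away from the frequent class. Note that a multiplicative weight also pushes negative logits further down, which is one way it differs from an additive adjustment.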
- Training code.
- Evaluation code.
- LVIS v1.0, ImageNet-LT, Places-LT datasets.
- Provide classification checkpoint models.
- Provide instance segmentation checkpoint models.
- python==3.8.12
- torch==1.7.1
- torchvision==0.8.2
- mmdet==2.15.1
- lvis
- Tested on CUDA 10.0 and 10.1
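Once the environment is set up, the pinned versions above can be checked with a short script. This is a sketch using only the standard library (`importlib.metadata`, available from Python 3.8); the helper names `installed_versions` and `find_mismatches` are illustrative and not part of this repository.

```python
from importlib import metadata

# Pins copied from the requirements list above.
PINNED = {"torch": "1.7.1", "torchvision": "0.8.2", "mmdet": "2.15.1"}

def installed_versions(names):
    """Map each package name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions

def find_mismatches(pinned, installed):
    """Return {package: (expected, found)} for missing or mismatched packages."""
    mismatches = {}
    for name, expected in pinned.items():
        found = installed.get(name)
        # Builds may carry a local tag such as "+cu101"; compare the base version only.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches

if __name__ == "__main__":
    problems = find_mismatches(PINNED, installed_versions(PINNED))
    for name, (expected, found) in problems.items():
        print(f"{name}: expected {expected}, found {found}")
    if not problems:
        print("All pinned packages match.")
```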
- Create and activate a conda environment
conda create --name mmdet pytorch=1.7.1 -y
conda activate mmdet
- Install dependency packages
conda install torchvision -y
conda install pandas scipy -y
conda install opencv -y
pip install catalyst
pip install imgaug
pip install randaugment
- Install MMDetection
pip install openmim
mim install mmdet==2.15.1
- Clone this repo
git clone https://github.com/kostas1515/iif.git
cd iif
For COCO and LVIS datasets:
- Create a data directory, download the COCO 2017 datasets from https://cocodataset.org/#download (2017 Train images [118K/18GB], 2017 Val images [5K/1GB], 2017 Train/Val annotations [241MB]) and extract the zip files:
mkdir data
cd data
wget http://images.cocodataset.org/zips/train2017.zip
wget http://images.cocodataset.org/zips/val2017.zip
# Download and unzip the LVIS annotations
wget https://s3-us-west-2.amazonaws.com/dl.fbaipublicfiles.com/LVIS/lvis_v1_train.json.zip
wget https://s3-us-west-2.amazonaws.com/dl.fbaipublicfiles.com/LVIS/lvis_v1_val.json.zip
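The commands above download the archives but do not extract them. One way to unpack everything is a small standard-library helper like the sketch below; `extract_archives` is a hypothetical name, and it assumes every `.zip` file in the data directory should be extracted in place.

```python
import zipfile
from pathlib import Path

def extract_archives(data_dir):
    """Extract every .zip found in data_dir into data_dir.

    Returns the names of the archives that were processed, in sorted order.
    """
    data_dir = Path(data_dir)
    extracted = []
    for archive in sorted(data_dir.glob("*.zip")):
        with zipfile.ZipFile(archive) as zf:
            zf.extractall(data_dir)
        extracted.append(archive.name)
    return extracted
```

Calling `extract_archives(".")` from inside the data directory would unpack the COCO image archives and the LVIS annotation archives next to each other. The original `.zip` files are left in place and can be deleted afterwards to save disk space.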
- Modify mmdetection/configs/_base_/datasets/lvis_v1_instance.py so that the data_root variable points to the data directory created above, e.g., data_root = "<user_path>"
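Before launching training, it can help to confirm that the directory assigned to data_root actually contains the downloaded files. The sketch below is only a sanity check. The entry names in `EXPECTED_ENTRIES` are assumptions based on what the downloads above would produce once extracted, and the dataset config may expect the annotation files in a subdirectory, so adjust the list to match it.

```python
from pathlib import Path

# Hypothetical layout: edit to match the paths used in the dataset config.
EXPECTED_ENTRIES = [
    "train2017",
    "val2017",
    "lvis_v1_train.json",
    "lvis_v1_val.json",
]

def missing_entries(data_root, expected=EXPECTED_ENTRIES):
    """Return the expected files or directories that are absent under data_root."""
    root = Path(data_root)
    return [name for name in expected if not (root / name).exists()]
```

For example, `missing_entries("<user_path>")` returns an empty list when every expected entry is present, and otherwise lists the names that still need to be downloaded or moved.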
For ImageNet and Places-LT:
- Download the ImageNet_2014 and Places_365 datasets.
@article{alexandridis2023inverse,
title={Inverse Image Frequency for Long-tailed Image Recognition},
author={Alexandridis, Konstantinos Panagiotis and Luo, Shan and Nguyen, Anh and Deng, Jiankang and Zafeiriou, Stefanos},
journal={IEEE Transactions on Image Processing},
year={2023},
publisher={IEEE}
}