This is an official PyTorch implementation of Scene-Aware Label Graph Learning for Multi-Label Image Classification, ICCV 2023. [paper]
- Download the datasets and organize them as follows:
|datasets
|---- MSCOCO
|-------- annotations
|-------- train2014
|-------- val2014
|---- NUS-WIDE
|-------- Flickr
|-------- Groundtruth
|-------- ImageList
|-------- NUS_WID_Tags
|-------- Concepts81.txt
|---- VOC2007
|-------- Annotations
|-------- ImageSets
|-------- JPEGImages
|-------- SegmentationClass
|-------- SegmentationObject
- Preprocess using the following commands:
python scripts/mscoco.py
python scripts/nuswide.py
python scripts/voc2007.py
python embedding.py --data [mscoco, nuswide, voc2007]
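The two preparation steps above can be sketched as a small helper that first checks the directory tree and then builds the preprocessing commands. This is only an illustrative sketch: the helper names (`missing_dataset_paths`, `preprocessing_commands`, `run_preprocessing`) are hypothetical and not part of this repository, the `datasets` folder is assumed to sit in the repository root, and `--data` is assumed to accept one dataset name per invocation.

```python
import subprocess
import sys
from pathlib import Path

# Expected entries, copied from the directory tree above (relative to the datasets root).
EXPECTED_PATHS = [
    "MSCOCO/annotations",
    "MSCOCO/train2014",
    "MSCOCO/val2014",
    "NUS-WIDE/Flickr",
    "NUS-WIDE/Groundtruth",
    "NUS-WIDE/ImageList",
    "NUS-WIDE/NUS_WID_Tags",
    "NUS-WIDE/Concepts81.txt",
    "VOC2007/Annotations",
    "VOC2007/ImageSets",
    "VOC2007/JPEGImages",
    "VOC2007/SegmentationClass",
    "VOC2007/SegmentationObject",
]

# Dataset names taken from the bracketed list in the embedding command above.
DATASETS = ["mscoco", "nuswide", "voc2007"]


def missing_dataset_paths(root="datasets"):
    """Return the expected entries that do not exist under `root`."""
    root = Path(root)
    return [p for p in EXPECTED_PATHS if not (root / p).exists()]


def preprocessing_commands(python=sys.executable):
    """Build the three script commands, then one embedding command per dataset."""
    commands = [[python, f"scripts/{name}.py"] for name in DATASETS]
    for name in DATASETS:
        # Assumption: --data takes exactly one dataset name per run.
        commands.append([python, "embedding.py", "--data", name])
    return commands


def run_preprocessing(root="datasets"):
    """Check the layout, then run every command; stops at the first failure."""
    missing = missing_dataset_paths(root)
    if missing:
        raise FileNotFoundError("Missing dataset entries: " + ", ".join(missing))
    for cmd in preprocessing_commands():
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, check=True)
```

Calling `run_preprocessing()` from the repository root would run all six commands in order. If only one dataset has been downloaded, it is simpler to run the corresponding two commands by hand.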
torch >= 1.9.0
torchvision >= 0.10.0
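A quick way to confirm that the installed packages satisfy these minimums is sketched below. It is not part of this repository; the function names are hypothetical, and the comparison is deliberately simple: it looks only at the first three numeric release components and ignores local build suffixes such as `+cu111` and pre-release markers.

```python
import re
from importlib import metadata

# Minimum versions as listed above.
REQUIREMENTS = {"torch": "1.9.0", "torchvision": "0.10.0"}


def version_tuple(version):
    """Turn '1.13.1+cu117' into (1, 13, 1); missing components count as 0."""
    numbers = re.findall(r"\d+", version.split("+")[0])
    numbers = (numbers + ["0", "0", "0"])[:3]
    return tuple(int(n) for n in numbers)


def meets_minimum(installed, minimum):
    """Compare the leading numeric release components of two version strings."""
    return version_tuple(installed) >= version_tuple(minimum)


def check_requirements():
    """Return {package: (installed_version_or_None, ok)} for each requirement."""
    report = {}
    for package, minimum in REQUIREMENTS.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            report[package] = (None, False)
            continue
        report[package] = (installed, meets_minimum(installed, minimum))
    return report


if __name__ == "__main__":
    for package, (installed, ok) in check_requirements().items():
        status = "OK" if ok else "needs >= " + REQUIREMENTS[package]
        print(f"{package}: {installed or 'not installed'} ({status})")
```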
Pre-trained weights can be found on Google Drive. Download them and put them in the experiments
folder, then use the following commands to reproduce the results reported in the paper.
python evaluate.py --exp-dir experiments/salgl_resnet101_mscoco/exp3 # Microsoft COCO (448 x 448)
python evaluate.py --exp-dir experiments/salgl_resnet101_mscoco/exp6 # Microsoft COCO (576 x 576)
python evaluate.py --exp-dir experiments/salgl_resnet101_nuswide/exp2 # NUS-WIDE
python evaluate.py --exp-dir experiments/salgl_vit_large_patch16_224_mscoco/exp1 # Pascal VOC 2007
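To evaluate only the checkpoints that have actually been downloaded, the commands above could be wrapped as in the following sketch. The helper names are hypothetical and not part of this repository; the experiment directories are copied verbatim from the commands above, and the script is assumed to be run from the repository root.

```python
import subprocess
import sys
from pathlib import Path

# Experiment directories copied from the evaluation commands above.
EXPERIMENT_DIRS = [
    "experiments/salgl_resnet101_mscoco/exp3",
    "experiments/salgl_resnet101_mscoco/exp6",
    "experiments/salgl_resnet101_nuswide/exp2",
    "experiments/salgl_vit_large_patch16_224_mscoco/exp1",
]


def evaluation_commands(repo_root=".", python=sys.executable):
    """Return (commands, skipped): commands for existing directories, and the missing ones."""
    commands, skipped = [], []
    for exp_dir in EXPERIMENT_DIRS:
        if (Path(repo_root) / exp_dir).is_dir():
            commands.append([python, "evaluate.py", "--exp-dir", exp_dir])
        else:
            skipped.append(exp_dir)
    return commands, skipped


def run_evaluations(repo_root="."):
    """Run evaluate.py for every downloaded experiment; stops at the first failure."""
    commands, skipped = evaluation_commands(repo_root)
    for exp_dir in skipped:
        print("Skipping (not downloaded):", exp_dir)
    for cmd in commands:
        print("Running:", " ".join(cmd))
        subprocess.run(cmd, cwd=repo_root, check=True)
```

Calling `run_evaluations()` then evaluates each available experiment in turn and reports the ones that were skipped.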
To visualize the word cloud and the label co-occurrence probability heatmap of different scene categories, first download the COCO experiments (salgl_resnet101_mscoco), then run the following commands:
python labelcloud.py --exp-dir experiments/salgl_resnet101_mscoco/exp3
python heatmap.py --exp-dir experiments/salgl_resnet101_mscoco/exp3
@inproceedings{zhu2023scene,
title={Scene-aware label graph learning for multi-label image classification},
author={Zhu, Xuelin and Liu, Jian and Liu, Weijia and Ge, Jiawei and Liu, Bo and Cao, Jiuxin},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={1473--1482},
year={2023}
}