📢 Thanks for your interest in our work! This is the official implementation of our CVPR 2023 paper "Hierarchical Dense Correlation Distillation for Few-Shot Segmentation". We have also released the corresponding models.
Abstract: Few-shot semantic segmentation (FSS) aims to form class-agnostic models segmenting unseen classes with only a handful of annotations. Previous methods limited to the semantic feature and prototype representation suffer from coarse segmentation granularity and train-set overfitting. In this work, we design Hierarchically Decoupled Matching Network (HDMNet) mining pixel-level support correlation based on the transformer architecture. The self-attention modules are used to assist in establishing hierarchical dense features, as a means to accomplish the cascade matching between query and support features. Moreover, we propose a matching module to reduce train-set overfitting and introduce correlation distillation leveraging semantic correspondence from coarse resolution to boost fine-grained segmentation. Our method performs decently in experiments. We achieve 50.0% mIoU on COCO dataset one-shot setting and 56.0% on five-shot segmentation, respectively.
- python == 3.8.13
- torch == 1.12.1
- torchvision == 0.13.1
- cuda == 11.0
- mmcv-full == 1.6.1
- mmsegmentation == 0.27.0
Please refer to the guidelines in MMSegmentation v0.27.0.
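As a quick sanity check after installation, the pins listed above can be compared against what is actually installed. The following is a minimal sketch, not part of this repository: the helper names `installed_version` and `find_mismatches` are hypothetical, and the distribution names are assumed to match the names in the list above.

```python
from importlib import metadata

# Pins copied from the list above; the distribution names are assumptions.
EXPECTED = {
    "torch": "1.12.1",
    "torchvision": "0.13.1",
    "mmcv-full": "1.6.1",
    "mmsegmentation": "0.27.0",
}


def installed_version(name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(name)
    except metadata.PackageNotFoundError:
        return None


def find_mismatches(expected, lookup=installed_version):
    """Map each package whose version differs from its pin to (wanted, found)."""
    mismatches = {}
    for name, wanted in expected.items():
        found = lookup(name)
        # Ignore local build tags such as "+cu113" when comparing.
        base = found.split("+")[0] if found else None
        if base != wanted:
            mismatches[name] = (wanted, found)
    return mismatches


if __name__ == "__main__":
    for name, (wanted, found) in find_mismatches(EXPECTED).items():
        print(f"{name}: expected {wanted}, found {found}")
```

An empty result means the four Python packages match the pins. The Python and CUDA versions are not covered by this sketch.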
Please download the following datasets:
- PASCAL-5i: PASCAL VOC 2012 and SBD
- COCO-20i: COCO 2014
We follow PFENet for data list generation and have uploaded the data lists. You can download them directly and put them into the ./lists directory.
💥 Please ensure that you uncomment the data list generation sections and generate the base annotations when running the code for the first time. For more details, refer to util/get_mulway_base_data.py and util/dataset.py.
We follow the same procedure as BAM for the pre-trained backbones, placing them in the ./initmodel directory. We have also uploaded the complete trained models for the COCO dataset for your convenience. For the PASCAL dataset, you can retrain the models directly since the training time is less than 10 hours.
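Before training or testing, it can help to confirm that the directories this README refers to are in place. The snippet below is a minimal sketch under that assumption: `missing_dirs` is a hypothetical helper, not part of the repository, and it only checks the three directory names mentioned in this README (./lists, ./initmodel, ./config), not their contents.

```python
from pathlib import Path

# Directories named in this README; whether the code needs others is not checked.
REQUIRED_DIRS = ("lists", "initmodel", "config")


def missing_dirs(repo_root, required=REQUIRED_DIRS):
    """Return the required directories that do not exist under repo_root."""
    root = Path(repo_root)
    return [name for name in required if not (root / name).is_dir()]


if __name__ == "__main__":
    missing = missing_dirs(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("All expected directories are present.")
```

Run it from the repository root. Any name it reports still needs to be created or populated as described in this README.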
To reproduce the results reported in our paper, you can simply download the corresponding models and run the test script. However, we still highly recommend retraining the model. Please note that the experimental results may vary due to different environments and settings. We sometimes get mIoU results up to 1.0% higher than those reported in the paper. However, it is still acceptable to compare your results only with those reported in the paper. Good luck! 😄😄
- First, update the configurations in ./config for training or testing
- Train script
sh train.sh [exp_name] [dataset] [GPUs]
# Example (split0 | COCO dataset | 4 GPUs for training):
# sh train.sh split0 coco 4
- Test script
sh test.sh [exp_name] [dataset] [GPUs]
# Example (split0 | COCO dataset | 1 GPU for testing):
# sh test.sh split0 coco 1
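Both scripts above take the same three positional arguments, so launching several splits from Python is straightforward. The following is only a sketch: `build_command` is a hypothetical helper, and the set of accepted dataset names is an assumption ("coco" appears in the examples above, while "pascal" is a guess at the name used for PASCAL-5i).

```python
import shlex

# Dataset names are an assumption: "coco" appears in the examples above,
# "pascal" is a guess at the name used for PASCAL-5i.
KNOWN_DATASETS = ("pascal", "coco")


def build_command(script, exp_name, dataset, gpus):
    """Build the argument list for `sh <script> [exp_name] [dataset] [GPUs]`."""
    if script not in ("train.sh", "test.sh"):
        raise ValueError(f"unexpected script: {script}")
    if dataset not in KNOWN_DATASETS:
        raise ValueError(f"unexpected dataset: {dataset}")
    if int(gpus) < 1:
        raise ValueError("at least one GPU is required")
    return ["sh", script, exp_name, dataset, str(gpus)]


if __name__ == "__main__":
    cmd = build_command("train.sh", "split0", "coco", 4)
    # Print the command; pass cmd to subprocess.run(cmd, check=True) to launch it.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string avoids shell quoting issues if an experiment name contains unusual characters.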
This repository owes its existence to the exceptional contributions of other projects:
- SegFormer: https://github.com/NVlabs/SegFormer
- BAM: https://github.com/chunbolang/BAM
- PFENet: https://github.com/dvlab-research/PFENet
- PSPNet: https://github.com/hszhao/semseg
Many thanks to their invaluable contributions.
If you find our work and this repository useful, please consider giving a star ⭐ and a citation 📚.
@article{peng2023hierarchical,
title={Hierarchical Dense Correlation Distillation for Few-Shot Segmentation},
author={Peng, Bohao and Tian, Zhuotao and Wu, Xiaoyang and Wang, Chenyao and Liu, Shu and Su, Jingyong and Jia, Jiaya},
journal={arXiv preprint arXiv:2303.14652},
year={2023}
}