Source-free domain adaptive semantic segmentation aims to adapt a pre-trained source model to an unlabeled target domain without accessing the private source data. Previous methods usually fine-tune the entire network, which makes parameter tuning expensive. To avoid this problem, we propose to utilize visual prompt tuning for parameter-efficient adaptation. However, existing visual prompt tuning methods are unsuitable for source-free domain adaptive semantic segmentation for two reasons: (1) commonly used visual prompts, such as input tokens or pixel-level perturbations, cannot reliably learn informative knowledge beneficial for semantic segmentation; (2) visual prompts require sufficient labeled data to bridge the gap between the pre-trained model and downstream tasks. To alleviate these problems, we propose a universal unsupervised visual prompt tuning (Uni-UVPT) framework that is applicable to various transformer-based backbones. Specifically, we divide the source pre-trained backbone, whose parameters are frozen, into multiple stages and propose a lightweight prompt adapter that progressively encodes informative knowledge into prompts and enhances the generalization of target features between adjacent backbone stages. Cooperatively, a novel adaptive pseudo-label correction strategy with a multiscale consistency loss is designed to alleviate the negative effect of target samples with noisy pseudo labels and to improve the robustness of visual prompts to spatial perturbations.
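For intuition, below is a minimal PyTorch sketch of the core idea: the frozen, source pre-trained backbone is split into stages, and a small trainable prompt adapter refines the features passed between adjacent stages. All class and argument names are illustrative assumptions, not the actual Uni-UVPT implementation.

```python
# Conceptual sketch only: a frozen backbone interleaved with trainable prompt adapters.
import torch
import torch.nn as nn

class PromptAdapter(nn.Module):
    """Lightweight bottleneck that injects a learnable prompt into the
    intermediate token features of a frozen backbone stage."""
    def __init__(self, dim: int, hidden: int = 64):
        super().__init__()
        self.prompt = nn.Parameter(torch.zeros(1, 1, dim))  # learnable prompt
        self.down = nn.Linear(dim, hidden)
        self.up = nn.Linear(hidden, dim)
        self.act = nn.GELU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:     # x: (B, N, dim) tokens
        return x + self.up(self.act(self.down(x + self.prompt)))

class PromptedBackbone(nn.Module):
    """Frozen backbone stages with a prompt adapter between adjacent stages;
    only the adapters are updated during adaptation."""
    def __init__(self, stages: nn.ModuleList, dims: list):
        super().__init__()
        self.stages = stages
        for p in self.stages.parameters():
            p.requires_grad_(False)                          # keep the backbone frozen
        self.adapters = nn.ModuleList(PromptAdapter(d) for d in dims)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for stage, adapter in zip(self.stages, self.adapters):
            x = adapter(stage(x))                            # refine features between stages
        return x
```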
Please check your CUDA version and install the requirements with:
pip install -r requirements.txt
pip install mmcv-full==1.7.0 -f https://download.openmmlab.com/mmcv/dist/{cu_version}/torch1.10.0/index.html
git clone https://github.com/chengdazhi/Deformable-Convolution-V2-PyTorch.git
cd Deformable-Convolution-V2-PyTorch
sh install.sh
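After installation, a quick sanity check (assuming the packages above installed correctly) can confirm that the PyTorch, CUDA, and mmcv-full versions match:

```python
# Quick environment check: verify torch, CUDA, and mmcv-full versions.
import torch
import mmcv

print("torch:", torch.__version__, "| CUDA:", torch.version.cuda,
      "| GPU available:", torch.cuda.is_available())
print("mmcv:", mmcv.__version__)  # expected: 1.7.0
```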
Cityscapes: Please download leftImg8bit_trainvaltest.zip and gt_trainvaltest.zip from here and extract them to data/cityscapes.
GTA: Download all image and label packages from here and extract them to data/gta.
Synthia: Please download SYNTHIA-RAND-CITYSCAPES from here and extract it to data/synthia.
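As a quick check, you can verify that the extraction targets exist. The root folders below come from the instructions above; the commented subfolder names are the usual archive contents and may differ.

```python
# Verify that the dataset roots referenced above exist after extraction.
import os

expected = [
    "data/cityscapes",  # typically contains leftImg8bit/ and the ground-truth folder
    "data/gta",         # typically contains images/ and labels/
    "data/synthia",     # typically contains RGB/ and GT/
]
for root in expected:
    print(f"{root}: {'found' if os.path.isdir(root) else 'MISSING'}")
```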
Data Preprocessing: Finally, please run the following commands to convert the label IDs to the train IDs:
python tools/convert_datasets/gta.py data/gta --nproc 8
python tools/convert_datasets/cityscapes.py data/cityscapes --nproc 8
python tools/convert_datasets/synthia.py data/synthia/ --nproc 8
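For intuition, the conversion replaces dataset-specific label IDs with the compact train IDs used during training (0-18 for the 19 evaluated classes, 255 for ignored pixels). A few well-known Cityscapes mappings are shown below for illustration only; the scripts above handle the full mapping.

```python
# Illustrative subset of the Cityscapes label-ID -> train-ID mapping.
LABEL_ID_TO_TRAIN_ID = {
    7: 0,    # road
    8: 1,    # sidewalk
    11: 2,   # building
    26: 13,  # car
    # IDs that are not evaluated map to 255 and are ignored by the loss
}
```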
(1) Download models pre-trained on ImageNet-1K and put them in model/:
Swin-B
MiT-B5
(2) Then, a pre-training job can be launched as follows:
python pretrain.py <config_dir>
Please refer to launcher_pretrain.py for all pretraining jobs.
Alternatively, all source models can be downloaded from Baidu and Google Drive, and must be placed in model/.
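As an illustration of how several pretraining jobs could be scripted, here is a hypothetical sketch; the actual launcher_pretrain.py may differ, and the config paths below are made up.

```python
# Hypothetical example of launching multiple pretraining jobs sequentially.
import subprocess

configs = [
    "configs/pretrain/gta2cs_swin_b.py",  # hypothetical config path
    "configs/pretrain/gta2cs_mit_b5.py",  # hypothetical config path
]
for cfg in configs:
    subprocess.run(["python", "pretrain.py", cfg], check=True)
```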
Then we can generate pseudo labels using:
python generate_pseudo_label.py --config <config_dir> --checkpoint <source_model_dir> --pseudo_label_dir <cityscapes_dir>/pretrain/<source_model_name>/train/
Please refer to launcher_pseudo_label.py for all jobs.
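Conceptually, pseudo-label generation saves the source model's per-pixel predictions on the target training images, typically ignoring low-confidence pixels. A minimal sketch of that idea follows; it is not the repository's generate_pseudo_label.py, and the threshold is illustrative.

```python
# Conceptual sketch: confidence-thresholded pseudo labels from softmax outputs.
import torch

def make_pseudo_label(probs: torch.Tensor, threshold: float = 0.9) -> torch.Tensor:
    """probs: (C, H, W) per-pixel class probabilities from the source model.
    Returns an (H, W) label map; unreliable pixels are set to 255 (ignore index)."""
    conf, label = probs.max(dim=0)
    label[conf < threshold] = 255
    return label
```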
After all preparations, we can train the final model by running:
python train.py <config_dir>
Please refer to launcher_train.py for all training jobs.
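The multiscale consistency loss mentioned in the abstract encourages predictions to stay stable under rescaling. Below is a rough, hypothetical sketch of that idea; the actual loss used in the paper and configs may be defined differently.

```python
# Hypothetical sketch of a prediction-consistency loss across scales.
import torch
import torch.nn.functional as F

def multiscale_consistency(logits: torch.Tensor, scales=(0.5, 0.75)) -> torch.Tensor:
    """logits: (N, C, H, W) segmentation logits at full resolution."""
    target = F.softmax(logits, dim=1)
    loss = logits.new_zeros(())
    for s in scales:
        # Downsample and upsample the logits to mimic a rescaled view, then compare.
        small = F.interpolate(logits, scale_factor=s, mode="bilinear", align_corners=False)
        back = F.interpolate(small, size=logits.shape[-2:], mode="bilinear", align_corners=False)
        loss = loss + F.kl_div(F.log_softmax(back, dim=1), target, reduction="batchmean")
    return loss / len(scales)
```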
We have provided all checkpoints for fast evaluation. The checkpoints can be downloaded from Baidu and Google Drive, and must be placed in model/.
python test.py <config_dir> <checkpoint_dir> --eval mIoU
Please refer to launcher_test.py for all testing jobs.
Model | Pretraining | Backbone | GTA5 -> Cityscapes (mIoU, 19 classes) | Synthia -> Cityscapes (mIoU, 16 classes) | Synthia -> Cityscapes (mIoU, 13 classes) |
---|---|---|---|---|---|
Ours | Standard Single Source | Swin-B | 56.2 | 52.6 | 59.4 |
Ours | Standard Single Source | MiT-B5 | 54.2 | 52.6 | 59.3 |
Ours | Source-GtA | Swin-B | 56.9 | 53.8 | 60.4 |
Ours | Source-GtA | MiT-B5 | 56.1 | 53.8 | 60.1 |
If this codebase is useful to you, please cite our work:
@article{ma2023uniuvpt,
title={When Visual Prompt Tuning Meets Source-Free Domain Adaptive Semantic Segmentation},
author={Xinhong Ma and Yiming Wang and Hao Liu and Tianyu Guo and Yunhe Wang},
journal={Advances in Neural Information Processing Systems},
year={2023},
}