This is the official PyTorch implementation of our paper:
Tackling Background Distraction in Video Object Segmentation, ECCV 2022
Suhwan Cho, Heansung Lee, Minhyeok Lee, Chaewon Park, Sungjun Jang, Minjung Kim, Sangyoun Lee
Link: [ECCV] [arXiv]

You can also explore other related works at awesome-video-object-segmentation.
In semi-supervised VOS, one of the main challenges is the existence of background distractors that have a similar appearance to the target objects. As comparing visual properties is a fundamental technique, visual distractions can severely lower the reliability of a system. To suppress the negative influence of background distractions, we propose three novel strategies: 1) a spatio-temporally diversified template construction scheme to prepare various object properties for reliable and stable prediction; 2) a learnable distance-scoring function to consider the temporal consistency of a video; 3) swap-and-attach data augmentation to provide hard training samples showing severe occlusions.
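To make the third strategy more concrete, here is a minimal NumPy sketch of one possible reading of swap-and-attach augmentation: the foreground objects of two training samples are exchanged, and each pasted object acts as an occluder over the other sample's target. The function name `swap_and_attach`, its arguments, and the choice to treat the attached object as background are assumptions made for illustration; this is not the code used in this repository.

```python
import numpy as np


def swap_and_attach(img_a, mask_a, img_b, mask_b):
    """Illustrative sketch (hypothetical helper, not the repository's code).

    img_*:  (H, W, 3) arrays of the same shape.
    mask_*: (H, W) arrays, nonzero on the target object.
    Returns ((new_img_a, new_mask_a), (new_img_b, new_mask_b)).
    """
    obj_a = mask_a.astype(bool)
    obj_b = mask_b.astype(bool)

    # Attach B's object onto A's frame, and A's object onto B's frame.
    new_img_a = img_a.copy()
    new_img_a[obj_b] = img_b[obj_b]
    new_img_b = img_b.copy()
    new_img_b[obj_a] = img_a[obj_a]

    # Assumption: the attached object is an occluder (background), so any
    # target pixels it covers are removed from the target mask.
    new_mask_a = obj_a & ~obj_b
    new_mask_b = obj_b & ~obj_a
    return (new_img_a, new_mask_a), (new_img_b, new_mask_b)
```

Under this reading, each augmented sample shows its target partially hidden by a foreign object, which is the kind of severe occlusion the strategy is meant to provide.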
1. Download the datasets: COCO, DAVIS, YouTube-VOS.
2. Download our custom split for the YouTube-VOS training set.
Start TBD training with:
python run.py --train
Verify the following before running:
✅ Training dataset selection and configuration
✅ GPU availability and configuration
Run TBD with:
python run.py --test
Verify the following before running:
✅ Testing dataset selection
✅ GPU availability and configuration
✅ Pre-trained model path
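The checklist items above can be checked with a small pre-flight script before launching `run.py`. The sketch below is not part of this repository: the helper names `missing_paths` and `gpu_available` and every path in `required` are hypothetical placeholders to be replaced with your own locations.

```python
from pathlib import Path


def missing_paths(paths):
    """Return the entries of `paths` (name -> path) that do not exist on disk."""
    return {name: p for name, p in paths.items() if not Path(p).exists()}


def gpu_available():
    """Return True if PyTorch is installed and can see a CUDA device."""
    try:
        import torch
    except ImportError:
        return False
    return torch.cuda.is_available()


if __name__ == "__main__":
    # Hypothetical locations; adjust to match your own setup.
    required = {
        "test dataset": "path/to/dataset",
        "pre-trained model": "path/to/model",
    }
    for name, path in missing_paths(required).items():
        print(f"Missing {name}: {path}")
    print("GPU available:", gpu_available())
```

Running a check like this first surfaces a wrong dataset or model path immediately, rather than partway through a run.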
Pre-trained model (DAVIS)
Pre-trained model (YouTube-VOS)
Pre-computed results
Code and models are available for non-commercial research purposes only.
For questions or inquiries, feel free to contact:
E-mail: suhwanx@gmail.com