This is the official PyTorch implementation of our paper:
Treating Motion as Option to Reduce Motion Dependency in Unsupervised Video Object Segmentation, WACV 2023
Suhwan Cho, Minhyeok Lee, Seunghoon Lee, Chaewon Park, Donghyeong Kim, Sangyoun Lee
Link: [WACV] [arXiv]
Treating Motion as Option with Output Selection for Unsupervised Video Object Segmentation, arXiv 2023
Suhwan Cho, Minhyeok Lee, Jungho Lee, MyeongAh Cho, Sangyoun Lee
Link: [arXiv]

You can also explore other related works at awesome-video-object-segmentation.
In unsupervised VOS, most state-of-the-art methods leverage motion cues obtained from optical flow maps in addition to appearance cues. However, because they depend heavily on motion cues, which can be unreliable in some cases, their predictions are not stable. To overcome this limitation, we design a novel motion-as-option network that depends less on motion cues, along with a collaborative network learning strategy that fully leverages this property. Additionally, we propose an adaptive output selection algorithm to maximize the efficacy of the motion-as-option network at test time.
1. Download the datasets: DUTS, DAVIS, FBMS, YouTube-Objects, Long-Videos.
2. Estimate and save optical flow maps from the videos using RAFT.
3. Pre-processed versions of the datasets are also provided: DUTS, DAVIS, FBMS, YouTube-Objects, Long-Videos.
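After saving the flow maps, it can be worth confirming that every frame has a matching flow map before training. The sketch below is a hypothetical helper, not part of this repository. It assumes that frames and flow maps sit in two separate folders and share file stems (for example `00000.jpg` and `00000.png`); the actual layout expected by `run.py` may differ, so adjust the paths and extensions accordingly.

```python
from pathlib import Path

# Assumed image extensions; change these if your files use another format.
IMAGE_EXTS = {".jpg", ".jpeg", ".png"}


def find_missing_flow(frame_dir, flow_dir):
    """Return the sorted stems of frames that have no flow map with the same stem.

    Both arguments are hypothetical folder paths for a single video sequence.
    """
    frame_dir, flow_dir = Path(frame_dir), Path(flow_dir)
    frames = {p.stem for p in frame_dir.iterdir() if p.suffix.lower() in IMAGE_EXTS}
    flows = {p.stem for p in flow_dir.iterdir() if p.suffix.lower() in IMAGE_EXTS}
    return sorted(frames - flows)
```

Calling `find_missing_flow("frames/bear", "flow/bear")` (example paths only) returns an empty list when every frame is covered. Depending on how flow is computed between consecutive frames, the first or last frame of a sequence may legitimately have no flow map.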
Start TMO training with:
python run.py --train
Verify the following before running:
✅ Training dataset selection and configuration
✅ GPU availability and configuration
✅ Backbone network selection
Run TMO testing with:
python run.py --test
Verify the following before running:
✅ Testing dataset selection
✅ GPU availability and configuration
✅ Backbone network selection
✅ Adaptive output selection option
✅ Pre-trained model path
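Two of the checks above, GPU availability and the pre-trained model path, can be automated before calling `run.py`. The helper below is a hypothetical sketch and not part of this repository: the function name and the example path are assumptions, and the dataset, backbone, and output selection settings still need to be checked by hand.

```python
import importlib.util
from pathlib import Path


def preflight(model_path):
    """Return a list of human-readable problems; an empty list means the checks passed."""
    problems = []

    # The pre-trained weights file must exist at the configured location.
    if not Path(model_path).is_file():
        problems.append(f"pre-trained model not found: {model_path}")

    # Only import torch if it is installed, so the check itself never crashes.
    if importlib.util.find_spec("torch") is None:
        problems.append("PyTorch is not installed")
    else:
        import torch

        if not torch.cuda.is_available():
            problems.append("no CUDA device is visible to PyTorch")

    return problems


if __name__ == "__main__":
    # Example path only; point this at wherever the downloaded weights are stored.
    for problem in preflight("path/to/pretrained_model.pth"):
        print("WARNING:", problem)
```

Returning a list rather than raising on the first failure lets all problems be reported in one pass.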
Pre-trained model (ResNet-101)
Pre-trained model (MiT-b1)
Pre-computed results
Code and models are only available for non-commercial research purposes.
For questions or inquiries, feel free to contact:
E-mail: suhwanx@gmail.com