suhwan-cho/awesome-video-object-segmentation

Awesome Video Object Segmentation

A list of video object segmentation (VOS) papers.

Suggestions and requests are always welcome :)

Contents

1. Semi-Supervised VOS Papers

2. Unsupervised VOS Papers

3. Referring VOS Papers

4. Other Related Papers

Semi-Supervised VOS Papers

2024

  • [STMA] Spatial-Temporal Multi-level Association for Video Object Segmentation, ECCV [Paper] [arXiv] [Code]

  • [OneVOS] OneVOS: Unifying Video Object Segmentation with All-in-One Transformer Framework, ECCV [Paper] [arXiv] [Code]

  • [RMem] RMem: Restricted Memory Banks Improve Video Object Segmentation, CVPR [Paper] [arXiv] [Page]

  • [Point-VOS] Point-VOS: Pointing Up Video Object Segmentation, CVPR [Paper] [arXiv] [Page]

  • [Cutie] Putting the Object Back into Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [DeVOS] DeVOS: Flow-Guided Deformable Transformer for Video Object Segmentation, WACV [Paper]

2023

  • [TTT] Test-time Training for Matching-based Video Object Segmentation, NeurIPS [Paper] [Code]

  • [READMem] READMem: Robust Embedding Association for a Diverse Memory in Unconstrained Video Object Segmentation, BMVC [Paper] [arXiv] [Code]

  • [XMem++] XMem++: Production-level Video Segmentation From Few Annotated Frames, ICCV [Paper] [arXiv] [Code]

  • [SimVOS] Scalable Video Object Segmentation with Simplified Framework, ICCV [Paper] [arXiv] [Code]

  • [TMRN] Alignment Before Aggregation: Trajectory Memory Retrieval Network for Video Object Segmentation, ICCV [Paper]

  • [ISVOS] Look Before You Match: Instance Understanding Matters in Video Object Segmentation, CVPR [Paper] [arXiv]

  • [CorrLearn] Boosting Video Object Segmentation via Space-time Correspondence Learning, CVPR [Paper] [arXiv] [Code]

  • [MobileVOS] MobileVOS: Real-Time Video Object Segmentation Contrastive Learning meets Knowledge Distillation, CVPR [Paper] [arXiv]

  • [TSVOS] Two-shot Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [LLB] Learning to Learn Better for Video Object Segmentation, AAAI [Paper] [arXiv] [Code]

2022

  • [DeAOT] Decoupling Features in Hierarchical Propagation for Video Object Segmentation, NeurIPS [Paper] [arXiv] [Code]

  • [AOC] Towards Robust Video Object Segmentation with Adaptive Object Calibration, ACMMM [Paper] [arXiv] [Code]

  • [BATMAN] BATMAN: Bilateral Attention Transformer in Motion-Appearance Neighboring Space for Video Object Segmentation, ECCV [Paper] [arXiv]

  • [XMem] XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model, ECCV [Paper] [arXiv] [Code]

  • [QDMN] Learning Quality-aware Dynamic Memory for Video Object Segmentation, ECCV [Paper] [arXiv] [Code]

  • [TBD] Tackling Background Distraction in Video Object Segmentation, ECCV [Paper] [arXiv] [Code]

  • [GSFM] Global Spectral Filter Memory Network for Video Object Segmentation, ECCV [Paper] [arXiv] [Code]

  • [RDE-VOS] Recurrent Dynamic Embedding for Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [PCVOS] Per-Clip Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [CoVOS] Accelerating Video Object Segmentation with Compressed Video, CVPR [Paper] [arXiv] [Code]

  • [SWEM] SWEM: Towards Real-Time Video Object Segmentation with Sequential Weighted Expectation-Maximization, CVPR [Paper] [arXiv] [Code]

  • [RPCMVOS] Reliable Propagation-Correction Modulation for Video Object Segmentation, AAAI [Paper] [arXiv] [Code]

  • [SITVOS] Siamese Network with Interactive Transformer for Video Object Segmentation, AAAI [Paper] [arXiv]

  • [BMVOS] Pixel-Level Bijective Matching for Video Object Segmentation, WACV [Paper] [arXiv] [Code]

2021

  • [AOT] Associating Objects with Transformers for Video Object Segmentation, NeurIPS [Paper] [arXiv] [Code]

  • [STCN] Rethinking Space-Time Networks with Improved Memory Coverage for Efficient Video Object Segmentation, NeurIPS [Paper] [arXiv] [Code]

  • [JOINT] Joint Inductive and Transductive Learning for Video Object Segmentation, ICCV [Paper] [arXiv] [Code]

  • [HMMN] Hierarchical Memory Matching Network for Video Object Segmentation, ICCV [Paper] [arXiv] [Code]

  • [DMN-AOA] Video Object Segmentation with Dynamic Memory Networks and Adaptive Object Alignment, ICCV [Paper] [Code]

  • [RMNet] Efficient Regional Memory Network for Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [LCM] Learning Position and Target Consistency for Memory-Based Video Object Segmentation, CVPR [Paper] [arXiv]

  • [GIEL] Video Object Segmentation Using Global and Instance Embedding Learning, CVPR [Paper]

  • [SwiftNet] SwiftNet: Real-time Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [SSTVOS] SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [Reuse-VOS] Learning Dynamic Network Using a Reuse Gate Function in Semi-Supervised Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [STG-Net] Spatiotemporal Graph Neural Network Based Mask Reconstruction for Video Object Segmentation, AAAI [Paper] [arXiv]

  • [QMRA] Query-Memory Re-Aggregation for Weakly-Supervised Video Object Segmentation, AAAI [Paper]

2020

  • [STM-cycle] Delving into the Cyclic Mechanism in Semi-supervised Video Object Segmentation, NeurIPS [Paper] [arXiv] [Code]

  • [AFB-URR] Video Object Segmentation with Adaptive Feature Bank and Uncertain-Region Refinement, NeurIPS [Paper] [arXiv] [Code]

  • [e-OSVOS] Make One-Shot Video Object Segmentation Efficient Again, NeurIPS [Paper] [arXiv] [Code]

  • [LWL] Learning What to Learn for Video Object Segmentation, ECCV [Paper] [arXiv] [Code]

  • [EGMN] Video Object Segmentation with Episodic Graph Memory Networks, ECCV [Paper] [arXiv] [Code]

  • [CFBI] Collaborative Video Object Segmentation by Foreground-Background Integration, ECCV [Paper] [arXiv] [Code]

  • [GC] Fast Video Object Segmentation using the Global Context Module, ECCV [Paper] [arXiv]

  • [KMN] Kernelized Memory Network for Video Object Segmentation, ECCV [Paper] [arXiv]

  • [SAT] State-Aware Tracker for Real-Time Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [FRTM] Learning Fast and Robust Target Models for Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [TVOS] A Transductive Approach for Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [TAN-DTTM] Fast Video Object Segmentation With Temporal Aggregation Network and Dynamic Template Matching, CVPR [Paper] [arXiv]

  • [FTMU] Fast Template Matching and Update for Video Object Tracking and Segmentation, CVPR [Paper] [arXiv] [Code]

  • [DIPNet] DIPNet: Dynamic Identity Propagation Network for Video Object Segmentation, WACV [Paper]

2019

  • [DMM-Net] DMM-Net: Differentiable Mask-Matching Network for Video Object Segmentation, ICCV [Paper] [arXiv] [Code]

  • [AGSS-VOS] AGSS-VOS: Attention Guided Single-Shot Video Object Segmentation, ICCV [Paper] [Code]

  • [RANet] RANet: Ranking Attention Network for Fast Video Object Segmentation, ICCV [Paper] [arXiv] [Code]

  • [DTN] Fast Video Object Segmentation via Dynamic Targeting Network, ICCV [Paper]

  • [CapsuleVOS] CapsuleVOS: Semi-Supervised Video Object Segmentation Using Capsule Routing, ICCV [Paper] [arXiv] [Code]

  • [STM] Video Object Segmentation Using Space-Time Memory Networks, ICCV [Paper] [arXiv] [Code]

  • [MHP-VOS] MHP-VOS: Multiple Hypotheses Propagation for Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [STCNN] Spatiotemporal CNN for Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [RVOS] RVOS: End-To-End Recurrent Network for Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [A-GAME] A Generative Appearance Model for End-To-End Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [FEELVOS] FEELVOS: Fast End-To-End Embedding Learning for Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [SiamMask] Fast Online Object Tracking and Segmentation: A Unifying Approach, CVPR [Paper] [arXiv] [Code]

  • [TIS] Tukey-Inspired Video Object Segmentation, WACV [Paper] [arXiv] [Code]

2018

  • [S2S] YouTube-VOS: Sequence-to-Sequence Video Object Segmentation, ECCV [Paper] [arXiv] [Code]

  • [PReMVOS] PReMVOS: Proposal-generation, Refinement and Merging for Video Object Segmentation, ACCV [arXiv] [Code]

  • [OSMN] Efficient Video Object Segmentation via Network Modulation, CVPR [Paper] [arXiv] [Code]

  • [RGMP] Fast Video Object Segmentation by Reference-Guided Mask Propagation, CVPR [Paper] [Code]

  • [FAVOS] Fast and Accurate Online Video Object Segmentation via Tracking Parts, CVPR [Paper] [arXiv] [Code]

2017

Unsupervised VOS Papers

2024

  • [DPA] Dual Prototype Attention for Unsupervised Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [GSA-Net] Guided Slot Attention for Unsupervised Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [DATTT] Depth-aware Test-Time Training for Zero-shot Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [GFA] Generalizable Fourier Augmentation for Unsupervised Video Object Segmentation, AAAI [Paper]

2023

  • [SimulFlow] SimulFlow: Simultaneously Extracting Feature and Identifying Target for Unsupervised Video Object Segmentation, ACMMM [Paper] [arXiv]

  • [TGFormer] Temporally Efficient Gabor Transformer for Unsupervised Video Object Segmentation, ACMMM [Paper]

  • [Isomer] Isomer: Isomerous Transformer for Zero-Shot Video Object Segmentation, ICCV [Paper] [arXiv] [Code]

  • [OAST] Unsupervised Video Object Segmentation with Online Adversarial Self-Tuning, ICCV [Paper]

  • [PMN] Unsupervised Video Object Segmentation via Prototype Memory Network, WACV [Paper] [arXiv] [Code]

  • [TMO] Treating Motion as Option to Reduce Motion Dependency in Unsupervised Video Object Segmentation, WACV [Paper] [arXiv] [Code]

2022

  • [HFAN] Hierarchical Feature Alignment Network for Unsupervised Video Object Segmentation, ECCV [Paper] [arXiv] [Code]

  • [IMP] Iteratively Selecting an Easy Reference Frame Makes Unsupervised Video Object Segmentation Easier, AAAI [Paper] [arXiv]

  • [D2Conv3D] D2Conv3D: Dynamic Dilated Convolutions for Object Segmentation in Videos, WACV [Paper] [arXiv] [Code]

  • [CFAM] Video Salient Object Detection via Contrastive Features and Attention Modules, WACV [Paper] [arXiv]

2021

  • [FSNet] Full-Duplex Strategy for Video Object Segmentation, ICCV [Paper] [arXiv] [Code]

  • [TransportNet] Deep Transport Network for Unsupervised Video Object Segmentation, ICCV [Paper]

  • [AMC-Net] Learning Motion-Appearance Co-Attention for Zero-Shot Video Object Segmentation, ICCV [Paper] [Code]

  • [RTNet] Reciprocal Transformations for Unsupervised Video Object Segmentation, CVPR [Paper] [Code]

  • [F2Net] F2Net: Learning to Focus on the Foreground for Unsupervised Video Object Segmentation, AAAI [Paper] [arXiv] [Code]

  • [FrameSelect] Mask Selection and Propagation for Unsupervised Video Object Segmentation, WACV [Paper] [Code]

2020

  • [3DC-Seg] Making a Case for 3D Convolutions for Object Segmentation in Videos, BMVC [Paper] [arXiv] [Code]

  • [WCS-Net] Unsupervised Video Object Segmentation with Joint Hotspot Tracking, ECCV [Paper] [Code]

  • [DFNet] Learning Discriminative Feature with CRF for Unsupervised Video Object Segmentation, ECCV [Paper] [arXiv]

  • [MATNet] Motion-Attentive Transition for Zero-Shot Video Object Segmentation, AAAI [Paper] [arXiv] [Code]

  • [UnOVOST] UnOVOST: Unsupervised Offline Video Object Segmentation and Tracking, WACV [Paper] [arXiv] [Code]

  • [EpO-Net] EpO-Net: Exploiting Geometric Constraints on Dense Trajectories for Motion Saliency, WACV [Paper] [arXiv] [Code]

2019

  • [AD-Net] Anchor Diffusion for Unsupervised Video Object Segmentation, ICCV [Paper] [arXiv] [Code]

  • [AGNN] Zero-Shot Video Object Segmentation via Attentive Graph Neural Networks, ICCV [Paper] [arXiv] [Code]

  • [AGS] Learning Unsupervised Video Object Segmentation Through Visual Attention, CVPR [Paper] [Code]

  • [COSNet] See More, Know More: Unsupervised Video Object Segmentation With Co-Attention Siamese Networks, CVPR [Paper] [arXiv] [Code]

  • [SSAV] Shifting More Attention to Video Salient Object Detection, CVPR [Paper] [Code]

  • [MOTAdapt] Video Object Segmentation using Teacher-Student Adaptation in a Human Robot Interaction (HRI) Setting, ICRA [Paper] [arXiv] [Code]

2018

  • [PDB] Pyramid Dilated Deeper ConvLSTM for Video Salient Object Detection, ECCV [Paper] [Code]

Referring VOS Papers

2024

  • [VISA] VISA: Reasoning Video Object Segmentation via Large Language Models, ECCV [Paper] [arXiv] [Code]

  • [VD-IT] Exploring Pre-trained Text-to-Video Diffusion Models for Referring Video Object Segmentation, ECCV [Paper] [arXiv] [Code]

  • [ActionVOS] ActionVOS: Actions as Prompts for Video Object Segmentation, ECCV [Paper] [arXiv] [Code]

  • [LoSh] LoSh: Long-Short Text Joint Prediction Network for Referring Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [MUTR] Referred by Multi-Modality: A Unified Temporal Transformer for Video Object Segmentation, AAAI [Paper] [arXiv] [Code]

  • [TCE-RVOS] Temporal Context Enhanced Referring Video Object Segmentation, WACV [Paper] [Code]

2023

  • [SOC] SOC: Semantic-Assisted Object Cluster for Referring Video Object Segmentation, NeurIPS [Paper] [arXiv] [Page]

  • [HTML] HTML: Hybrid Temporal-scale Multimodal Learning Framework for Referring Video Object Segmentation, ICCV [Paper] [Page]

  • [OnlineRefer] OnlineRefer: A Simple Online Baseline for Referring Video Object Segmentation, ICCV [Paper] [arXiv] [Code]

  • [CMA] Learning Cross-Modal Affinity for Referring Video Object Segmentation Targeting Limited Samples, ICCV [Paper] [arXiv] [Code]

  • [R2VOS] Robust Referring Video Object Segmentation with Cyclic Structural Consensus, ICCV [Paper] [arXiv] [Code]

  • [SgMg] Spectrum-guided Multi-granularity Referring Video Object Segmentation, ICCV [Paper] [arXiv] [Code]

  • [TempCD] Temporal Collection and Distribution for Referring Video Object Segmentation, ICCV [Paper] [arXiv] [Code]

2022

  • [MANet] Multi-Attention Network for Compressed Video Referring Object Segmentation, ACMMM [Paper] [arXiv] [Code]

  • [MTTR] End-to-End Referring Video Object Segmentation with Multimodal Transformers, CVPR [Paper] [arXiv] [Code]

  • [ReferFormer] Language as Queries for Referring Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [LBDT] Language-Bridged Spatial-Temporal Interaction for Referring Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [MLRL] Multi-Level Representation Learning with Semantic Alignment for Referring Video Object Segmentation, CVPR [Paper]

  • [YOFO] You Only Infer Once: Cross-Modal Meta-Transfer for Referring Video Object Segmentation, AAAI [Paper]

2020

  • [URVOS] URVOS: Unified Referring Video Object Segmentation Network with a Large-Scale Benchmark, ECCV [Paper] [Code]

Other Related Papers

2024

  • [BA] Betrayed by Attention: A Simple yet Effective Approach for Self-supervised Video Object Segmentation, ECCV [Paper] [arXiv] [Code]

  • [LLE-VOS] Event-assisted Low-Light Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

  • [EVA-VOS] Learning the What and How of Annotation in Video Object Segmentation, WACV [Paper] [arXiv] [Code]

2023

  • [Training-Free-VOS] From ViT Features to Training-free Video Object Segmentation via Streaming-data Mixture Models, NeurIPS [Paper] [Code]

  • [DVSOD] DVSOD: RGB-D Video Salient Object Detection, NeurIPS [Paper] [arXiv] [Page]

  • [VOSPGD] Exploring the Adversarial Robustness of Video Object Segmentation via One-shot Adversarial Attacks, ACMMM [Paper]

  • [DEVA] Tracking Anything with Decoupled Video Segmentation, ICCV [Paper] [arXiv] [Code]

  • [Timetuning] Time Does Tell: Self-Supervised Time-Tuning of Dense Image Representations, ICCV [Paper] [arXiv] [Code]

  • [VOS-VFI] Video Object Segmentation-aware Video Frame Interpolation, ICCV [Paper] [Code]

  • [LVOS] LVOS: A Benchmark for Long-term Video Object Segmentation, ICCV [Paper] [arXiv] [Page]

  • [MOSE] MOSE: A New Dataset for Video Object Segmentation in Complex Scenes, ICCV [Paper] [arXiv] [Page]

  • [RCF] Bootstrapping Objectness from Videos by Relaxed Common Fate and Visual Grouping, CVPR [Paper] [arXiv] [Code]

  • [VOST] Breaking the “Object” in Video Object Segmentation, CVPR [Paper] [arXiv] [Page]

  • [InstMove] InstMove: Instance Motion for Object-centric Video Segmentation, CVPR [Paper] [arXiv] [Code]

  • [SSL-VOS] A Simple and Powerful Global Optimization for Unsupervised Video Object Segmentation, WACV [Paper] [arXiv] [Code]

  • [BURST] BURST: A Benchmark for Unifying Object Recognition, Segmentation and Tracking in Video, WACV [Paper] [arXiv] [Code]

2022

  • [EPIC-KITCHENS] EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations, NeurIPS [Paper] [arXiv] [Page]

  • [SaVos] Self-supervised Amodal Video Object Segmentation, NeurIPS [Paper] [arXiv]

  • [YouMVOS] YouMVOS: An Actor-centric Multi-shot Video Object Segmentation Dataset, CVPR [Paper] [Page]

  • [Wnet] Wnet: Audio-Guided Video Object Segmentation via Wavelet-Based Cross-Modal Denoising Networks, CVPR [Paper] [Code]

2021

  • [DUL] Dense Unsupervised Learning for Video Segmentation, NeurIPS [Paper] [arXiv] [Code]

  • [AMD] The Emergence of Objectness: Learning Zero-Shot Segmentation from Videos, NeurIPS [Paper] [arXiv] [Code]

  • [MotionGroup] Self-supervised Video Object Segmentation by Motion Grouping, ICCV [Paper] [arXiv] [Code]

  • [GMB] Generating Masks from Boxes by Mining Spatio-Temporal Consistencies in Videos, ICCV [Paper] [arXiv] [Code]

  • [DANet] Delving Deep Into Many-to-Many Attention for Few-Shot Video Object Segmentation, CVPR [Paper] [Code]

  • [IVOS-W] Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild, CVPR [Paper] [arXiv] [Code]

  • [GIS] Guided Interactive Video Object Segmentation Using Reliability-Based Attention Maps, CVPR [Paper] [arXiv] [Code]

  • [MiVOS] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion, CVPR [Paper] [arXiv] [Code]

  • [ContrastCorr] Contrastive Transformation for Self-supervised Correspondence Learning, AAAI [Paper] [arXiv] [Code]

  • [TAO-VOS] Reducing the Annotation Effort for Video Object Segmentation Datasets, WACV [Paper] [arXiv] [Page]

2020

  • [CRW] Space-Time Correspondence as a Contrastive Random Walk, NeurIPS [Paper] [arXiv] [Code]

  • [ODMS] Learning Object Depth from Camera Motion and Video Object Segmentation, ECCV [Paper] [arXiv] [Code]

  • [ScribbleBox] ScribbleBox: Interactive Annotation Framework for Video Object Segmentation, ECCV [Paper] [arXiv] [Page]

  • [ATNet] Interactive Video Object Segmentation Using Global and Local Transfer Modules, ECCV [Paper] [arXiv] [Code]

  • [MAST] MAST: A Memory-Augmented Self-Supervised Tracker, CVPR [Paper] [arXiv] [Code]

  • [MuG] Learning Video Object Segmentation From Unlabeled Videos, CVPR [Paper] [arXiv] [Code]

  • [MA-Net] Memory Aggregation Networks for Efficient Interactive Video Object Segmentation, CVPR [Paper] [arXiv] [Code]

2019

  • [TimeCycle] Learning Correspondence from the Cycle-Consistency of Time, CVPR [Paper] [arXiv] [Code]

  • [BubbleNets] BubbleNets: Learning to Select the Guidance Frame in Video Object Segmentation by Deep Sorting Frames, CVPR [Paper] [arXiv] [Code]

  • [IPNet] Fast User-Guided Video Object Segmentation by Interaction-And-Propagation Networks, CVPR [Paper] [arXiv] [Code]

2018

  • [YouTube-VOS] YouTube-VOS: A Large-Scale Video Object Segmentation Benchmark, preprint [arXiv] [Page]

2016

  • [DAVIS] A Benchmark Dataset and Evaluation Methodology for Video Object Segmentation, CVPR [Paper] [Page]

2014

  • [FBMS] Segmentation of Moving Objects by Long Term Video Analysis, TPAMI [Paper] [Page]

2012

  • [YouTube-Objects] Learning Object Class Detectors from Weakly Annotated Video, CVPR [Paper] [Page]
