[CVPR 2025] Video Narration as Vocabulary & Video as Long Document
Updated Mar 13, 2025 - Python
[ICCV 2023] UniVTG: Towards Unified Video-Language Temporal Grounding
An official implementation for "UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation"
[CVPR 2022] Official Implementation of ReferFormer
[CVPR 2023] All in One: Exploring Unified Video-Language Pre-training
[NeurIPS 2022] Egocentric Video-Language Pretraining
Align and Prompt: Video-and-Language Pre-training with Entity Prompts
A new multi-shot video understanding benchmark Shot2Story with comprehensive video summaries and detailed shot-level captions.
PyTorch code for "Language Models with Image Descriptors are Strong Few-Shot Video-Language Learners"
[CVPR 2021] Visual Semantic Role Labeling for Video Understanding (https://arxiv.org/abs/2104.00990)
The PyTorch implementation for "Video-Text Pre-training with Learned Regions"
PyTorch code for "Perceiver-VL: Efficient Vision-and-Language Modeling with Iterative Latent Attention" (WACV 2023)
[EMNLP 2024] A Video Chat Agent with Temporal Prior
Official implementation for the paper "Learning Grounded Vision-Language Representation for Versatile Understanding in Untrimmed Videos"
[ICCV 2023] The official PyTorch implementation of the paper: "Localizing Moments in Long Video Via Multimodal Guidance"
Code for CVPR 2023 paper "SViTT: Temporal Learning of Sparse Video-Text Transformers"
PyTorch version of DeCEMBERT: Learning from Noisy Instructional Videos via Dense Captions and Entropy Minimization (NAACL 2021)
ACM Multimedia 2023 (Oral) - RTQ: Rethinking Video-language Understanding Based on Image-text Model
[IJCV] VLG: General Video Recognition with Web Textual Knowledge (https://arxiv.org/abs/2212.01638)
Pressure Testing Large Video-Language Models (LVLM): multimodal retrieval from an LVLM at any video length to measure accuracy