Official repo for VGen: a holistic video generation ecosystem built on diffusion models
Updated Jan 10, 2025 (Python)
ICCV 2023-2025 Papers: Discover cutting-edge research from ICCV 2023-25, the leading computer vision conference. Stay updated on the latest in computer vision and deep learning, with code included. ⭐ support visual intelligence development!
CVPR 2023-2024 Papers: Dive into advanced research presented at the leading computer vision conference. Keep up to date with the latest developments in computer vision and deep learning. Code included. ⭐ support visual intelligence development!
ConsistI2V: Enhancing Visual Consistency for Image-to-Video Generation [TMLR 2024]
This repo contains the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation
Diverse Video Generation using a Gaussian Process Trigger
Generating Diverse Audio-Visual 360º Soundscapes for Sound Event Localization and Detection
This model synthesizes high-fidelity fashion videos with spontaneous, believable motion from a single input image.
SG2VID: Scene Graphs Enable Fine-Grained Control for Video Synthesis (MICCAI 2025 - ORAL)
LipGANs is a text-to-viseme GAN framework that generates realistic mouth movements directly from text, without requiring audio. It maps phonemes → visemes, predicts phoneme durations, and uses per-viseme 3D GANs to synthesize photorealistic frames that can be exported as PNG sequences, GIFs, or MP4 videos.
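The phoneme-to-viseme stage described above can be sketched as a small lookup-and-timing routine. This is an illustrative sketch only: the phoneme inventory, viseme classes, and durations below are hypothetical placeholders, not LipGANs' actual tables, and a real system would predict durations with a learned model rather than a fixed dictionary.

```python
# Illustrative sketch of a text-to-viseme pipeline stage.
# All mappings and durations are hypothetical placeholders.

# Many-to-one phoneme -> viseme mapping: several phonemes
# share the same visible mouth shape.
PHONEME_TO_VISEME = {
    "p": "bilabial", "b": "bilabial", "m": "bilabial",
    "f": "labiodental", "v": "labiodental",
    "aa": "open", "ae": "open",
    "iy": "spread", "ih": "spread",
    "uw": "rounded", "ow": "rounded",
}

# Placeholder per-phoneme durations in seconds.
DEFAULT_DURATION = {
    "p": 0.06, "b": 0.06, "m": 0.08,
    "aa": 0.14, "iy": 0.12, "uw": 0.13,
}

def phonemes_to_viseme_timeline(phonemes):
    """Map a phoneme sequence to (viseme, start, end) segments.

    Each segment tells the frame synthesizer which mouth shape
    to render and for how long.
    """
    timeline, t = [], 0.0
    for ph in phonemes:
        dur = DEFAULT_DURATION.get(ph, 0.10)  # fallback duration
        timeline.append((PHONEME_TO_VISEME[ph], round(t, 2), round(t + dur, 2)))
        t += dur
    return timeline

if __name__ == "__main__":
    # "ma": a bilabial closure followed by an open vowel shape.
    print(phonemes_to_viseme_timeline(["m", "aa"]))
    # -> [('bilabial', 0.0, 0.08), ('open', 0.08, 0.22)]
```

Per-viseme generators (one GAN per mouth shape, as the description suggests) would then consume each segment and emit the frames that are finally assembled into a PNG sequence, GIF, or MP4.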