ishandutta2007/Video-Generation-Landscape

Video Generation Landscape

A curated list of state-of-the-art video generation models, research, and tools.

🚀 State-of-the-Art Models (2024–2026)

The field has moved from experimental GANs to large Diffusion Transformer (DiT) models that can generate cinematic-quality video from text or images.

🏢 Commercial / Closed-Source

| Model | Developer | Best For | Key Features |
| --- | --- | --- | --- |
| Runway Gen-3 Alpha | Runway | Professional Control | Industry-leading motion brush, director mode, and character consistency. |
| Luma Dream Machine | Luma AI | Cinematic Realism | High-speed generation, realistic physics, and complex camera movements. |
| Kling AI | Kuaishou | Long-form Video | Supports videos up to 2 minutes, native 4K, and superior human movement. |
| OpenAI Sora | OpenAI | High Fidelity | 60-second clips with high physical consistency (limited public release). |
| Google Veo 3 | Google | Integration | Native 4K, integrated with Google Vids and Workspace. |
| Pika 1.5 | Pika Labs | Creative Effects | Specialized in "Pikaffects" (physics-defying creative transformations). |

🔓 Open-Source / Weights Available

| Model | Repository | Key Features | License |
| --- | --- | --- | --- |
| Wan2.1 | Wan-Video | Current SOTA (2025). Best-in-class prompt adherence. Runs on 8GB-14GB VRAM. | Apache 2.0 |
| HunyuanVideo | Tencent | Cinematic quality, strong Image-to-Video (I2V) capabilities. | Apache 2.0 |
| Mochi-1 | Genmo | High-fidelity motion (30fps) and strong physical realism. | Apache 2.0 |
| CogVideoX | Zhipu AI | Highly accessible; 2B/5B/v1.5 variants. | Apache 2.0 |
| SVD (Stable Video Diffusion) | Stability AI | The industry standard for high-quality Image-to-Video workflows. | Stability NC |
| LTX-Video | Lightricks | Optimized for real-time and efficient video generation. | Apache 2.0 |
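
VRAM figures like the ones in the table depend heavily on precision and offloading, so a rough estimate helps when picking a variant. The sketch below is only back-of-the-envelope arithmetic. It assumes the "2B" and "5B" in the CogVideoX variant names are parameter counts, and it counts model weights only. Activations, the text encoder, and the VAE add to this, so treat the result as a lower bound. The helper name `weight_memory_gib` is made up for illustration.

```python
def weight_memory_gib(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate memory needed for the model weights alone, in GiB.

    bytes_per_param: 2 for 16-bit (fp16/bf16), 1 for 8-bit quantization.
    """
    total_bytes = params_billions * 1e9 * bytes_per_param
    return total_bytes / 2**30


# Assumed parameter counts, read off the variant names in the table.
variants = [("CogVideoX-2B", 2.0), ("CogVideoX-5B", 5.0)]

for name, size in variants:
    fp16 = weight_memory_gib(size, bytes_per_param=2)
    int8 = weight_memory_gib(size, bytes_per_param=1)
    print(f"{name}: ~{fp16:.1f} GiB at 16-bit, ~{int8:.1f} GiB at 8-bit (weights only)")
```

If the estimate for weights alone already approaches the available VRAM, a smaller variant, quantization, or CPU offloading is probably needed.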

🛠️ Ecosystem & Tools

Most modern video generation workflows use node-based interfaces for fine-grained control.
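
To make "node-based" concrete: such a workflow can be thought of as a directed graph in which each node is one processing step and each edge is a dependency. The sketch below is a toy model of that idea, not the format of any particular tool. The node names are hypothetical, and the only library used is `graphlib` from the Python standard library (3.9+).

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: each key is a node, each value is the set of
# nodes whose outputs it needs before it can run.
workflow = {
    "load_model": set(),
    "encode_prompt": {"load_model"},
    "sample_latents": {"load_model", "encode_prompt"},
    "decode_frames": {"sample_latents"},
    "save_video": {"decode_frames"},
}


def execution_order(graph: dict) -> list:
    """Return an order in which every node runs after its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = execution_order(workflow)
print(" -> ".join(order))
```

Swapping one node (for example, a different sampler) leaves the rest of the graph untouched, which is the kind of control these interfaces are valued for.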


📄 Key Research Papers

  • Sora: Video generation models as world simulators (OpenAI, 2024)
  • HunyuanVideo: A Systematic Framework For Large Video Generative Models (Tencent, 2024)
  • CogVideoX: Text-to-Video Diffusion Models with An Expert Transformer (Zhipu AI, 2024)
  • Stable Video Diffusion: Scaling Latent Video Diffusion Models to Large Datasets (Stability AI, 2023)
  • Scalable Diffusion Models with Transformers (DiT) (Peebles & Xie, 2023)
  • AnimateDiff: Animate Your Personalized Text-to-Image Diffusion Models without Specific Tuning (Guo et al., 2023)

📊 Datasets

  • Panda-70M: 70M high-quality video-text pairs.
  • WebVid-10M: A large-scale dataset of short videos with captions.
  • HD-VILA-100M: High-resolution video-language dataset.

👥 Communities


🕰️ Historical Archive (2015–2022)

The models below represent the "Early Era" of video generation using GANs and VAEs.

Early News:

Early Samples:

| Samples | Code | Paper |
| --- | --- | --- |
| Memoji | -- | -- |
| VideoGAN | Code | Tinyvideo |
| Adversarial Video Gen | Code | 1511.05440 |
| Improved VideoGAN | Code | 1711.11453 |

Support:

If you want the good work to continue please support us on

About

🎬 Open-source AI video tools to generate text-to-video or image-to-video without monthly subscriptions or cloud restrictions. 🌟 Star if you like it! 🌟
