Stars
Stable Virtual Camera: Generative View Synthesis with Diffusion Models
A comprehensive list of Awesome Contrastive Learning Papers & Codes. Research areas include, but are not limited to: CV, NLP, Audio, Video, Multimodal, Graph, Language, etc.
A curated list of different papers and datasets in various areas of audio-visual processing
Official implementation for "RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers"
Codebase for the Paper: Learning Visual Styles from Audio-Visual Associations (ECCV 2022, in PyTorch)
[CVPR 2025] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis
Enjoy the magic of Diffusion models!
Wan: Open and Advanced Large-Scale Video Generative Models
Code Release for “Balanced Contrastive Learning for Long-Tailed Visual Recognition”
Class-Balanced Loss Based on Effective Number of Samples. CVPR 2019
Markdown syntax supports adding emoji: entering different shortcodes (characters enclosed by two colons) displays different emoji.
Implementation for "L-CoDer: Language-based Colorization with Color-object Decoupling Transformer"
Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer
Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)
The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.
Repository for training models for music source separation.
Model for CDX23 (Cinematic Sound Demixing) contest
Learning to cut end-to-end pretrained modules
Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning
Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration
Story-Based Retrieval with Contextual Embeddings. Largest freely available movie video dataset. [ACCV'20]
Reimplementation of Bandit for "Remastering Divide and Remaster: A Cinematic Audio Source Separation Dataset with Multilingual Support"
Official repo for the paper "CamI2V: Camera-Controlled Image-to-Video Diffusion Model"
These scripts are used to download the RealEstate10K dataset.
Implementation of CamTrol: Training-free Camera Control for Video Generation