Stars
Code and dataset for photorealistic Codec Avatars driven from audio
[CVPR 2024] 4K4D: Real-Time 4D View Synthesis at 4K Resolution
📖 A curated list of resources dedicated to avatar.
A curated list of audio-visual learning methods and datasets.
A curated list of different papers and datasets in various areas of audio-visual processing
🎓 Talking-face research papers, updated daily and now integrated with LLM analysis.
A minimal and universal controller for FLUX.1.
A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability
[SIGGRAPH 2024] InvertAvatar: Incremental GAN Inversion for Generalized Head Avatars
Official code for PseR: Pseudo-label Refinement for Point-Supervised Temporal Action Detection
[ACM MM 2024] WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition
Official code for "A Closer Look at Audio-Visual Segmentation"
Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple text input.
A paper list of some recent Transformer-based CV works.
A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.
detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
Advanced AI-Based Video Renovation UI Using EMA-VFI & Real-ESRGAN
Paper list for video enhancement, including video super-resolution, interpolation, denoising, deblurring and inpainting.
Papers for Video Anomaly Detection, with a collection of released code and a performance comparison.
Recent weakly supervised semantic segmentation papers
CVPR and NeurIPS poster examples and templates. May we have in-person poster sessions soon!
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
Generative Models by Stability AI
Finetune ModelScope's Text To Video model using Diffusers 🧨
The official implementation of DenoiseLoc: Boundary Denoising for Video Activity Localization, ICLR 2024