zhenyingfang

Code and dataset for photorealistic Codec Avatars driven from audio

Python 2,730 263 Updated Sep 15, 2024

[CVPR 2024] 4K4D: Real-Time 4D View Synthesis at 4K Resolution

Python 1,631 71 Updated Jun 7, 2024

📖 A curated list of resources dedicated to avatar.

Jupyter Notebook 58 5 Updated Nov 8, 2024

A curated list of audio-visual learning methods and datasets.

241 17 Updated Dec 3, 2024

A curated list of different papers and datasets in various areas of audio-visual processing

684 68 Updated Jan 30, 2024

🎓 Update Talking-Face Research Papers Daily, Now Integrated with LLM Analysis.

Python 180 16 Updated Jan 10, 2025

A minimal and universal controller for FLUX.1.

Python 1,054 65 Updated Jan 9, 2025

A Versatile Video-LLM for Long and Short Video Understanding with Superior Temporal Localization Ability

78 Updated Nov 28, 2024

[SIGGRAPH 2024] InvertAvatar: Incremental GAN Inversion for Generalized Head Avatars

Python 46 3 Updated Jul 22, 2024

Official code for PseR: Pseudo-label Refinement for Point-Supervised Temporal Action Detection

Python 3 Updated Nov 5, 2024

[ACM MM 2024] WeakSAM: Segment Anything Meets Weakly-supervised Instance-level Recognition

Python 45 2 Updated Sep 19, 2024
Python 43 4 Updated Jun 14, 2024

Official code for "A Closer Look at Audio-Visual Segmentation"

Python 111 18 Updated Aug 14, 2024

The best OSS video generation models

Python 2,666 271 Updated Jan 8, 2025

Allegro is a powerful text-to-video model that generates high-quality videos up to 6 seconds at 15 FPS and 720p resolution from simple text input.

Python 1,049 53 Updated Jan 2, 2025

A paper list of some recent Transformer-based CV works.

1,168 139 Updated Jan 10, 2025

A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.

376 15 Updated Jan 10, 2025

detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.

Python 2,069 215 Updated Aug 15, 2024

Advanced AI-Based Video Renovation UI Using EMA-VFI & Real-ESRGAN

Python 67 4 Updated Jan 10, 2025
Python 3 Updated Jun 12, 2024

Paper list for video enhancement, including video super-resolution, interpolation, denoising, deblurring and inpainting.

115 6 Updated Aug 28, 2024

Papers for Video Anomaly Detection, released codes collection, Performance Comparison.

605 103 Updated Sep 20, 2022

Recent weakly supervised semantic segmentation paper

284 22 Updated Oct 9, 2024

CVPR and NeurIPS poster examples and templates. May we have in-person poster session soon!

1,545 145 Updated May 9, 2023

Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation

Python 9,694 1,332 Updated Sep 14, 2024

EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning

Python 3,394 393 Updated Dec 10, 2024

Generative Models by Stability AI

Python 25,045 2,777 Updated Sep 4, 2024

Finetune ModelScope's Text To Video model using Diffusers 🧨

Python 675 107 Updated Dec 14, 2023

The official implementation of DenoiseLoc: Boundary Denoising for Video Activity Localization, ICLR 2024

Python 7 Updated Feb 28, 2024