Stable Virtual Camera: Generative View Synthesis with Diffusion Models

Python 851 47 Updated Mar 21, 2025

A comprehensive list of Awesome Contrastive Learning papers and code. Research areas include, but are not limited to: CV, NLP, Audio, Video, Multimodal, Graph, Language, etc.

413 42 Updated Sep 8, 2021

A curated list of different papers and datasets in various areas of audio-visual processing

700 68 Updated Jan 30, 2024

Official implementation for "RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers"

Python 480 52 Updated Mar 3, 2025

Codebase for the Paper: Learning Visual Styles from Audio-Visual Associations (ECCV 2022, in PyTorch)

Python 15 2 Updated Jan 26, 2023

[CVPR 2025] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

Python 1,236 155 Updated Mar 15, 2025

Enjoy the magic of Diffusion models!

Python 8,042 720 Updated Mar 21, 2025

Wan: Open and Advanced Large-Scale Video Generative Models

Python 8,924 952 Updated Mar 21, 2025

Code Release for “Balanced Contrastive Learning for Long-Tailed Visual Recognition”

Python 107 12 Updated Oct 31, 2022

Class-Balanced Loss Based on Effective Number of Samples. CVPR 2019

Python 609 69 Updated Aug 29, 2021

Markdown syntax supports adding emoji: typing different shortcodes (characters wrapped in two colons) displays different emoji.

245 73 Updated Aug 5, 2018

Implementation for "L-CoDer: Language-based Colorization with Color-object Decoupling Transformer"

Python 8 Updated Jan 20, 2024

Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer

Python 1,149 154 Updated Mar 13, 2025

Text- and image-to-video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 11,019 1,055 Updated Mar 22, 2025

talking-face video editing

Python 285 43 Updated Feb 27, 2025

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,621 127 Updated Aug 13, 2024
Python 78 11 Updated Aug 31, 2024

Repository for training models for music source separation.

Python 677 88 Updated Mar 18, 2025

Model for CDX23 (Cinematic Sound Demixing) contest

Python 40 6 Updated Jun 24, 2024

Learning to cut end-to-end pretrained modules

Python 30 3 Updated Jul 16, 2024

Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning

Python 248 23 Updated Feb 27, 2025

Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration

Python 1,560 122 Updated Jan 1, 2025

Scripts for downloading AudioSet

Jupyter Notebook 73 47 Updated Nov 7, 2017

Story-Based Retrieval with Contextual Embeddings. Largest freely available movie video dataset. [ACCV'20]

Python 175 28 Updated Sep 21, 2022

Reimplementation of Bandit for "Remastering Divide and Remaster: A Cinematic Audio Source Separation Dataset with Multilingual Support"

Python 28 1 Updated Jul 29, 2024

GLOMAP - Global Structure-from-Motion Revisited

C++ 1,680 119 Updated Mar 20, 2025

Life isn't easy, the handsome guy sighs (take good notes)

120 26 Updated Mar 20, 2025

Official repo for the paper "CamI2V: Camera-Controlled Image-to-Video Diffusion Model"

Python 113 7 Updated Mar 21, 2025

These scripts are used to download the RealEstate10K dataset.

Python 81 16 Updated Mar 22, 2024

Implementation of CamTrol: Training-free Camera Control for Video Generation

Python 9 2 Updated Sep 13, 2024