sheqian36

3 followers · 7 following

Lists (5)

Stars

Stability-AI / stable-virtual-camera

Stable Virtual Camera: Generative View Synthesis with Diffusion Models

Python 851 47 Updated Mar 21, 2025

coder-duibai / Contrastive-Learning-Papers-Codes

A comprehensive list of Awesome Contrastive Learning Papers & Codes. Research areas include, but are not limited to: CV, NLP, Audio, Video, Multimodal, Graph, Language, etc.

413 42 Updated Sep 8, 2021

krantiparida / awesome-audio-visual

A curated list of different papers and datasets in various areas of audio-visual processing

700 68 Updated Jan 30, 2024

thu-ml / RIFLEx

Official implementation for "RIFLEx: A Free Lunch for Length Extrapolation in Video Diffusion Transformers"

Python 480 52 Updated Mar 3, 2025

Tinglok / avstyle

Codebase for the Paper: Learning Visual Styles from Audio-Visual Associations (ECCV 2022, in PyTorch)

Python 15 2 Updated Jan 26, 2023

hkchengrex / MMAudio

[CVPR 2025] Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

Python 1,236 155 Updated Mar 15, 2025

modelscope / DiffSynth-Studio

Enjoy the magic of Diffusion models!

Python 8,042 720 Updated Mar 21, 2025

Wan-Video / Wan2.1

Wan: Open and Advanced Large-Scale Video Generative Models

Python 8,924 952 Updated Mar 21, 2025

FlamieZhu / Balanced-Contrastive-Learning

Code Release for “Balanced Contrastive Learning for Long-Tailed Visual Recognition”

Python 107 12 Updated Oct 31, 2022

richardaecn / class-balanced-loss

Class-Balanced Loss Based on Effective Number of Samples. CVPR 2019

Python 609 69 Updated Aug 29, 2021

zhouie / markdown-emoji

Markdown syntax supports adding emoji: entering different shortcodes (characters enclosed in two colons) displays different emoji.

245 73 Updated Aug 5, 2018

changzheng123 / L-CoDer

Implementation for "L-CoDer: Language-based Colorization with Color-object Decoupling Transformer"

Python 8 Updated Jan 20, 2024

fudan-generative-vision / hallo3

Hallo3: Highly Dynamic and Realistic Portrait Image Animation with Video Diffusion Transformer

Python 1,149 154 Updated Mar 13, 2025

THUDM / CogVideo

text and image to video generation: CogVideoX (2024) and CogVideo (ICLR 2023)

Python 11,019 1,055 Updated Mar 22, 2025

JOY-MM / JoyGen

talking-face video editing

Python 285 43 Updated Feb 27, 2025

QwenLM / Qwen2-Audio

The official repo of Qwen2-Audio chat & pretrained large audio language model proposed by Alibaba Cloud.

Python 1,621 127 Updated Aug 13, 2024

lzw108 / EmoLLMs

Python 78 11 Updated Aug 31, 2024

ZFTurbo / Music-Source-Separation-Training

Repository for training models for music source separation.

Python 677 88 Updated Mar 18, 2025

ZFTurbo / MVSEP-CDX23-Cinematic-Sound-Demixing

Model for CDX23 (Cinematic Sound Demixing) contest

Python 40 6 Updated Jun 24, 2024

PardoAlejo / MovieCuts

Learning to cut end-to-end pretrained modules

Python 30 3 Updated Jul 16, 2024

ZebangCheng / Emotion-LLaMA

Emotion-LLaMA: Multimodal Emotion Recognition and Reasoning with Instruction Tuning

Python 248 23 Updated Feb 27, 2025

lyuchenyang / Macaw-LLM

Macaw-LLM: Multi-Modal Language Modeling with Image, Video, Audio, and Text Integration

Python 1,560 122 Updated Jan 1, 2025

speedyseal / audiosetdl

Scripts for downloading AudioSet

Jupyter Notebook 73 47 Updated Nov 7, 2017

m-bain / CondensedMovies

Story-Based Retrieval with Contextual Embeddings. Largest freely available movie video dataset. [ACCV'20]

Python 175 28 Updated Sep 21, 2022

kwatcharasupat / bandit-v2

Reimplementation of Bandit for "Remastering Divide and Remaster: A Cinematic Audio Source Separation Dataset with Multilingual Support"

Python 28 1 Updated Jul 29, 2024

colmap / glomap

GLOMAP - Global Structure-from-Motion Revisited

C++ 1,680 119 Updated Mar 20, 2025

hanlin-cheng / slam-study-note

Life is hard, and the handsome guy sighs (take good notes)

120 26 Updated Mar 20, 2025

ZGCTroy / CamI2V

Official repo for the paper "CamI2V: Camera-Controlled Image-to-Video Diffusion Model"

Python 113 7 Updated Mar 21, 2025

cashiwamochi / RealEstate10K_Downloader

These scripts are used to download the RealEstate10K dataset.

Python 81 16 Updated Mar 22, 2024

LAARRRY / CamTrol

Implementation of CamTrol: Training-free Camera Control for Video Generation

Python 9 2 Updated Sep 13, 2024

Lists (5)

LLM

ML

multimodal

steamtools

Colorization
