wyh2000

Reference implementation for DPO (Direct Preference Optimization)

Python 2,035 164 Updated Aug 11, 2024

Train transformer language models with reinforcement learning.

Python 9,329 1,170 Updated Sep 21, 2024

PyTorch Re-Implementation of "The Sparsely-Gated Mixture-of-Experts Layer" by Noam Shazeer et al. https://arxiv.org/abs/1701.06538

Python 945 98 Updated Apr 19, 2024

Mixture-of-Experts for Large Vision-Language Models

Python 1,920 121 Updated May 15, 2024

FACodec: Speech Codec with Attribute Factorization used for NaturalSpeech 3

Python 152 9 Updated Apr 20, 2024

A curated list of different papers and datasets in various areas of audio-visual processing

654 70 Updated Jan 30, 2024

An unofficial PyTorch implementation of the audio LM VALL-E

Python 2,935 417 Updated May 10, 2023

Official code for Cotatron @ INTERSPEECH 2020

Python 212 32 Updated Jul 25, 2024

PPG-Based Voice Conversion

Python 326 72 Updated Jul 22, 2022

PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.

Python 6,520 1,204 Updated Aug 13, 2024

AudioLDM: Generate speech, sound effects, music and beyond, with text.

Python 2,392 221 Updated Jun 2, 2024

Official PyTorch implementation of "Conditional Generation of Audio from Video via Foley Analogies".

Python 68 4 Updated Dec 8, 2023

Tools to bulk download arxiv data

Python 117 18 Updated Oct 29, 2018

A PyTorch implementation of Google GEDLoss

Python 32 2 Updated Dec 9, 2020

Google Research

Jupyter Notebook 33,862 7,839 Updated Sep 20, 2024

PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation

Jupyter Notebook 4,665 620 Updated Aug 5, 2024

[CVPR 2021 Best Student Paper Honorable Mention, Oral] Official PyTorch code for ClipBERT, an efficient framework for end-to-end learning on image-text and video-text tasks.

Python 698 85 Updated Aug 8, 2023

This repository is an implementation of this article: https://arxiv.org/pdf/2107.03312.pdf

Python 343 52 Updated Apr 21, 2022

Efficient Image Captioning code in Torch, runs on GPU

Jupyter Notebook 5,499 1,260 Updated Nov 7, 2017

I decided to sync up this repo and self-critical.pytorch. (The old master is archived in the old master branch.)

Python 1,434 412 Updated Oct 5, 2023

Awesome Vision-Language Pretraining Papers

29 3 Updated Sep 12, 2024

Recent Advances in Vision and Language PreTrained Models (VL-PTMs)

1,137 102 Updated Aug 19, 2022

Paragraph-by-paragraph close reading of classic and new deep learning papers

26,339 2,401 Updated Aug 8, 2024

A Demo of Mandarin/Chinese TTS frontend

Python 275 125 Updated Apr 18, 2022

Facebook AI Research Sequence-to-Sequence Toolkit written in Python.

Python 30,196 6,376 Updated Sep 9, 2024

A repository compiled by the MLNLP community to help everyone avoid small mistakes in paper submissions. Paper Writing Tips

3,496 451 Updated May 29, 2022

Facial-Expression-Recognition in TensorFlow. Detects faces in video and recognizes the expression (emotion).

Python 628 188 Updated Nov 30, 2020