Starred repositories
This repo contains annotated research papers that I found really good and useful
xychelsea / rife
Forked from hzwer/ECCV2022-RIFE
RIFE: Real-Time Intermediate Flow Estimation for Video Frame Interpolation
Talking Head (3D): A JavaScript class for real-time lip-sync using Ready Player Me full-body 3D avatars.
VITS: Conditional Variational Autoencoder with Adversarial Learning for End-to-End Text-to-Speech
Elucidating the Design Space of Diffusion-Based Generative Models (EDM)
A data annotation pipeline to generate high-quality, large-scale speech datasets with machine pre-labeling and fully manual auditing.
[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild
Parsing gigabytes of JSON per second: used by Facebook/Meta Velox, the Node.js runtime, ClickHouse, WatermelonDB, Apache Doris, Milvus, StarRocks
PyTorch implementation of VQGAN (Taming Transformers for High-Resolution Image Synthesis) (https://arxiv.org/pdf/2012.09841.pdf)
A library for efficient similarity search and clustering of dense vectors.
A real-time video processing app written in C++ using OpenGL and FFmpeg
YourTTS: Towards Zero-Shot Multi-Speaker TTS and Zero-Shot Voice Conversion for everyone
Fast and memory-efficient exact attention
FastAPI project Template generator to make your life easier 🧬 🚀
Perceptual video quality assessment based on multi-method fusion.
State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism (SVS & TTS); AAAI 2022; Official code
Tools for handling speech data in machine learning projects.
FastAPI Best Practices and Conventions we used at our startup
Experimental LDM uses of Paella's architecture
PyTorch implementation of Diffusion Models (https://arxiv.org/pdf/2006.11239.pdf)
Erasing Concepts from Diffusion Models
Video-P2P: Video Editing with Cross-attention Control
PyTorch implementation of MaskGIT: Masked Generative Image Transformer (https://arxiv.org/pdf/2202.04200.pdf)
Code and documentation to train Stanford's Alpaca models, and generate the data.