Rich Zhao zRich

Rich Zhao in Shenzhen, China

22 followers · 36 following

Stars

hwjiang1510 / MegaSynth

Code for MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data

Python 129 2 Updated Dec 19, 2024

microsoft / TRELLIS

Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation".

Python 7,443 519 Updated Dec 27, 2024

Jixuan-Fan / Momentum-GS

Code for Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction

Python 117 5 Updated Jan 12, 2025

facebookresearch / audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…

Jupyter Notebook 21,426 2,221 Updated Jan 15, 2025

DS4SD / docling

Get your documents ready for gen AI

Python 19,886 1,074 Updated Feb 5, 2025

RVC-Boss / GPT-SoVITS

1 min voice data can also be used to train a good TTS model! (few shot voice cloning)

Python 39,854 4,472 Updated Jan 18, 2025

coqui-ai / TTS

🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production

Python 37,379 4,664 Updated Aug 16, 2024

aigc-apps / EasyAnimate

📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion

Python 1,889 146 Updated Jan 23, 2025

SWivid / F5-TTS

Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"

Python 9,370 1,251 Updated Feb 5, 2025

open-mmlab / Amphion

Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…

Python 8,419 644 Updated Feb 3, 2025

CyberAgentAILab / TANGO

Official implementation of the paper "TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio-Motion Embedding and Diffusion Interpolation"

Python 867 108 Updated Oct 29, 2024

johndpope / VASA-1-hack

Using Claude Sonnet 3.5 to forward (reverse) engineer code from VASA white paper - WIP - (this is for La Raza 🎷)

Python 270 34 Updated Nov 9, 2024

antgroup / echomimic

EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning

Python 3,505 396 Updated Dec 10, 2024

xyflow / xyflow

React Flow | Svelte Flow - Powerful open source libraries for building node-based UIs with React (https://reactflow.dev) or Svelte (https://svelteflow.dev). Ready out-of-the-box and infinitely cust…

TypeScript 27,410 1,778 Updated Feb 2, 2025