Stars
Code for MegaSynth: Scaling Up 3D Scene Reconstruction with Synthesized Data
Official repo for paper "Structured 3D Latents for Scalable and Versatile 3D Generation".
Code for Momentum-GS: Momentum Gaussian Self-Distillation for High-Quality Large Scene Reconstruction
Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable…
One minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
📺 An End-to-End Solution for High-Resolution and Long Video Generation Based on Transformer Diffusion
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
Official implementation of the paper "TANGO: Co-Speech Gesture Video Reenactment with Hierarchical Audio-Motion Embedding and Diffusion Interpolation"
Using Claude Sonnet 3.5 to forward (reverse) engineer code from VASA white paper - WIP - (this is for La Raza 🎷)
EchoMimic: Lifelike Audio-Driven Portrait Animations through Editable Landmark Conditioning
React Flow | Svelte Flow - Powerful open source libraries for building node-based UIs with React (https://reactflow.dev) or Svelte (https://svelteflow.dev). Ready out-of-the-box and infinitely cust…
Lexical is an extensible text editor framework that provides excellent reliability, accessibility and performance.
Storybook is the industry standard workshop for building, documenting, and testing UI components in isolation
OpenID Connect Relying Party and OAuth 2.0 Resource Server implementation in Lua for NGINX / OpenResty
A modular graph-based Retrieval-Augmented Generation (RAG) system
An open-source remote desktop application designed for self-hosting, as an alternative to TeamViewer.
SD.Next: All-in-one tool for AI image generation
Finetune Llama 3.3, DeepSeek-R1, Mistral, Phi-4 & Gemma 2 LLMs 2-5x faster with 70% less memory
The Single Sign-On Multi-Factor portal for web apps
My ComfyUI workflows collection
Agno is a lightweight framework for building multi-modal Agents
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
Open source platform for the machine learning lifecycle