AI
1 minute of voice data is enough to train a good TTS model! (few-shot voice cloning)
Instant voice cloning by MIT and MyShell. Audio foundation model.
A collection of hand-crafted extensions for your Kotlin projects.
MuseTalk: Real-Time High Quality Lip Synchronization with Latent Space Inpainting
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
A generative speech model for daily dialogue.
🎤 Microsoft speech synthesis tool, built with Electron + Vue + ElementPlus + Vite.
Generate high-definition short videos with one click using large AI models (LLMs).
Open-Sora: Democratizing Efficient Video Production for All
A machine learning-based video super resolution and frame interpolation framework. Est. Hack the Valley II, 2018.
Stable Diffusion web UI
A latent text-to-image diffusion model
A curated list of awesome Machine Learning frameworks, libraries and software.
The most powerful and modular diffusion model GUI, API and backend with a graph/nodes interface.
Clone a voice in 5 seconds to generate arbitrary speech in real-time
🚀 AI voice mimicry: clone your voice in 5 seconds to generate arbitrary speech in real-time
High-Resolution Image Synthesis with Latent Diffusion Models
Real-time face swap for PC streaming or video calls
Generative Models by Stability AI
Easily train a good VC model with voice data <= 10 mins!
Everything you need to build state-of-the-art foundation models, end-to-end.