A comprehensive ComfyUI integration for Microsoft's VibeVoice text-to-speech model, enabling high-quality single and multi-speaker voice synthesis directly within your ComfyUI workflows.
Updated Feb 18, 2026 - Python
A ComfyUI custom node integration for local multi-engine, multi-language Text-to-Speech and Voice Conversion. Supports RVC, Echo-TTS, Qwen3-TTS, Cozy Voice 3, Step Audio EditX, IndexTTS-2, Chatterbox (classic and multilingual), F5-TTS, Higgs Audio 2, and VibeVoice, with unlimited text length, SRT timing, character support, and many audio tools.
Draft to Take beta: local-first AI audio production studio powered by IndexTTS2, Docker, Qwen, OmniVoice, SFX, ambience, and music sidecars.
Free real-time AI Noise Gate VST3/AU plugin. Removes coughs, sneezes, and other artifacts from your live streams, podcasts, and videos.
Soundstorm is a cutting-edge AI-powered audio manipulation application designed to provide a rich yet simplified experience for sound designers, algorithmic composers, and experimental audio enthusiasts. From sample pack creation and algorithmic composition to AI text-to-audio and onscreen ChatGPT, Soundstorm is a sonic powerhouse.
Real-Time Deepfake Pipeline
Local-first CLI that turns Markdown scripts into multi-speaker podcast-style audio using Coqui XTTS v2.
Music Generation Using Deep Learning🎶🎵
Community list of AI tools for audio and music
AI Voice Agents: Exploring the Next Generation of Human-Machine Interaction! 🎙️🤖🎧
AudioInsight is a web application that processes audio, generates transcriptions, and allows users to ask questions about the related audio.
A local-first EPUB reader with high-fidelity neural text-to-speech, word-level synchronization, and Next.js/FastAPI/ONNX stack.
An approach to Andrej Karpathy's LLM challenge, as outlined here: https://twitter.com/karpathy/status/1760740503614836917
Maya Voice AI is an open-source project that demonstrates the Maya1 model, capable of generating realistic voice audio from text input with rich emotional and descriptive control. This repository provides a demo for text-to-speech synthesis using advanced language models and the SNAC codec, focusing on high-quality audio at 24kHz.
IntelliMix is an AI-powered web app for transforming and editing audio with ease. Create mashups with just one prompt, trim audio, batch process, and download media, all in one streamlined interface. Built with React, Flask, and integrated AI tools.
Professional Yocto BSP Layer for Dynamic Devices Edge Computing Platforms - AI Audio Processing, E-Ink Displays, Power Management, Wireless Connectivity, i.MX8MM/i.MX93 Support
AI Audio Framework 🎵
Open-source Chinese TTS workstation for humans, AI, and agents. CLI first, WebUI on the roadmap.
Local Windows app for Stable Audio Open 1.0 - 9-language UI, multilingual prompt translation, 217 presets across 15 categories, multi-variations, batch mode, game-ready WAV output. One-file installer.
High-performance KittenTTS API server with a built-in web UI, OpenAI-compatible routes, long-form text support, and optional CUDA acceleration.