High-quality and streaming Speech-to-Speech interactive agent in a single file. A streaming, full-duplex voice interaction prototype agent implemented in just one file!
Updated Nov 5, 2024 - Python
MooER: Moore-threads Open Omni model for speech-to-speech intERaction. MooER-omni includes a series of end-to-end speech interaction models along with training and inference code, covering (but not limited to) end-to-end speech interaction, end-to-end speech translation, and speech recognition.
A simple, high-quality voice conversion tool focused on ease of use and performance
An advanced speech-to-speech (S2S) voice assistant utilizing OpenAI’s Realtime API for ultra-low-latency, two-way audio streaming, real-time natural language understanding, and responsive, interactive dialogue through direct WebSocket communication.
An interactive voice-based chatbot with a visual avatar that runs locally (no internet needed)
Small AI assistant like Amazon Echo or Siri (not usable)
Turn any LLM into Jarvis
CtrlSpeak is a voice assistant activated with [Control]+Q, listening and responding only when you want.
LLaMA-Omni is a low-latency and high-quality end-to-end speech interaction model built upon Llama-3.1-8B-Instruct, aiming to achieve speech capabilities at the GPT-4o level.
A lightweight tool to quickly customize LLM chatbot workflow pipelines, such as Text-to-Text, Text-to-Speech, or Speech-to-Speech
If you've ever wanted to talk to your AI Waifu with quality characters and voices, try Soul of Waifu. Don't miss the opportunity to touch your dream!
Code for NeurIPS 2023 paper "DASpeech: Directed Acyclic Transformer for Fast and High-quality Speech-to-Speech Translation".
A comparison of E2E and Cascading S2ST systems on the CVSS-C Spanish to English dataset (CommonVoice 4.0)
Speech-to-speech translation in Python
3-month project on artificial intelligence in teams of 3 with Manon Duboscq and Léa Mariot
GPT-powered rubber duck debugger as a CS50 2023 final project.
Conversational speech chatbot utilizing OpenAI's GPTs and Microsoft Azure's Speech Services
Audio-to-Audio using microsoft/speecht5_vc from HuggingFace
Code for the INTERSPEECH 2023 paper "Learning When to Speak: Latency and Quality Trade-offs for Simultaneous Speech-to-Speech Translation with Offline Models"