OpenAI have a voice powered chat mode in their app and there's a noticeable dela... | Hacker News #559
Labels
AI-Chatbots
Topics related to advanced chatbot platforms integrating multiple AI models
Automation
Automate the things
New-Label
Choose this option if the existing labels are insufficient to describe the content accurately
TITLE: OpenAI have a voice powered chat mode in their app and there's a noticeable dela... | Hacker News
DESCRIPTION:
mike_hearn 5 hours ago | on: Groq runs Mixtral 8x7B-32k with 500 T/s
OpenAI have a voice-powered chat mode in their app, and there's a noticeable delay of a few seconds between finishing your sentence and the bot starting to speak.
I think the problem is that for realistic TTS you need quite a few tokens, because the prosody can be affected by tokens that come a fair bit further down the sentence. Consider the difference in pitch between:
"The war will be long and bloody"
vs
"The war will be long and bloody?"
So to begin TTS you need quite a lot of tokens, which in turn means you have to digest the prompt and run a whole bunch of forward passes before you can start rendering. And of course you have to keep up with the speed of regular speech, which OpenAI sometimes struggles with.
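To put rough numbers on that argument, here is a minimal back-of-the-envelope sketch. Nothing in it comes from OpenAI or Groq: the function names, the prompt size, the lookahead of 20 tokens, the prefill rate, and the speech rate of about 3 tokens per second are all assumptions chosen for illustration. Only the 500 T/s decode figure is taken from the thread title.

```python
def time_to_first_audio(prompt_tokens, lookahead_tokens, prefill_tps, decode_tps):
    """Seconds until TTS can start speaking.

    Hypothetical model: digest the whole prompt at prefill_tps, then decode
    lookahead_tokens at decode_tps so the TTS has enough context for prosody.
    """
    prefill_seconds = prompt_tokens / prefill_tps
    lookahead_seconds = lookahead_tokens / decode_tps
    return prefill_seconds + lookahead_seconds


def keeps_up_with_speech(decode_tps, speech_tps):
    """True if tokens are generated at least as fast as they are spoken."""
    return decode_tps >= speech_tps


if __name__ == "__main__":
    # All inputs below are illustrative assumptions, not measurements.
    PROMPT_TOKENS = 1000      # assumed conversation context
    LOOKAHEAD_TOKENS = 20     # assumed tokens needed before prosody is stable
    SPEECH_TPS = 3.0          # assumed speaking rate in tokens per second

    for label, prefill_tps, decode_tps in [
        ("slow decoder (assumed 20 T/s)", 1000.0, 20.0),
        ("fast decoder (500 T/s, as in the thread title)", 1000.0, 500.0),
    ]:
        delay = time_to_first_audio(
            PROMPT_TOKENS, LOOKAHEAD_TOKENS, prefill_tps, decode_tps
        )
        ok = keeps_up_with_speech(decode_tps, SPEECH_TPS)
        print(f"{label}: {delay:.2f}s to first audio, keeps up: {ok}")
```

Under these assumptions the lookahead term shrinks as decode speed rises, while the prompt-processing term stays fixed, so faster decoding alone only removes part of the delay.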
That said, the gap isn't huge. Many apps won't need it. Some use cases where low latency might matter:
Undoubtedly there will be a lot more, though. When you give people performance, they find ways to use it.
URL: Hacker News
Suggested labels
{'label-name': 'real-time processing', 'label-description': 'Refers to tasks or systems that operate instantaneously, such as voice chat applications with minimal delays.', 'confidence': 51.49}