Realtime pipeline attempts to load voice "kokoro" with kokoro-TTS #8413

@christian-drescher

Thanks for your effort in implementing OpenAI's Realtime API over WebSocket. However, the TTS part of the realtime pipeline fails.

LocalAI version:
5ac50c9
localai/localai:master-gpu-nvidia-cuda-13

Describe the bug
Following the configuration proposed in the documentation, which uses kokoro as the TTS system, LocalAI's realtime pipeline fails to generate speech because it cannot find a voice named "kokoro". Instead, it emits an error event. I suspect the realtime pipeline calls kokoro-TTS with the voice set to "kokoro" (i.e., the name of the TTS system) rather than the voice from the LocalAI config for the kokoro model (voice: af_heart). Kokoro then attempts to download model weights for a voice named "kokoro", which does not exist.

To Reproduce
Connect to the Realtime endpoint via WebSocket, transmit audio, and wait for the TTS stage of the pipeline to run.

Expected behavior
The realtime pipeline either loads the voice configured through session.update or falls back to the default voice from the TTS configuration.
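For illustration, a client could set the voice explicitly with a session.update event. The sketch below only builds the JSON payload. The field layout (`session.voice`) follows the OpenAI Realtime beta event shape and is an assumption here: I have not verified that LocalAI reads the voice from this field. The type and function names are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sessionSettings holds the subset of session fields used in this sketch.
// The "voice" field name is assumed from the OpenAI Realtime beta schema.
type sessionSettings struct {
	Voice string `json:"voice,omitempty"`
}

// sessionUpdate is a hypothetical client-side type for a session.update event.
type sessionUpdate struct {
	Type    string          `json:"type"`
	Session sessionSettings `json:"session"`
}

// buildSessionUpdate returns the JSON payload that requests the given voice.
func buildSessionUpdate(voice string) ([]byte, error) {
	return json.Marshal(sessionUpdate{
		Type:    "session.update",
		Session: sessionSettings{Voice: voice},
	})
}

func main() {
	payload, err := buildSessionUpdate("af_heart")
	if err != nil {
		panic(err)
	}
	// The payload would be sent as a text frame over the WebSocket connection.
	fmt.Println(string(payload))
}
```

If the server honors this event, the TTS stage should be called with af_heart rather than the backend name.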

Logs
tts_error TTS generation failed: error during TTS: Unexpected err=EntryNotFoundError('404 Client Error. (Request ID: Root=REDACTED)\n\nEntry Not Found for url: https://huggingface.co/hexgrad/Kokoro-82M/resolve/main/voices/kokoro.pt.'), type(err)=<class 'huggingface_hub.errors.EntryNotFoundError'>

Additional context
I suspect line 990 in core/http/endpoints/openai/realtime.go triggers the error by passing session.Voice as a parameter to session.ModelInterface.TTS while session.Voice is set to kokoro (the name of the TTS backend). This could be a logic error, i.e., session.Voice confuses the TTS backend with the TTS voice.
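A minimal sketch of the fallback I would expect, not the actual LocalAI code: `ttsConfig` and `resolveVoice` are hypothetical names, and treating a session voice equal to the model name as "not set" is my assumption about a reasonable fix. The voice af_bella is used only as an example of an explicitly requested voice.

```go
package main

import "fmt"

// ttsConfig is a hypothetical stand-in for the TTS model configuration.
type ttsConfig struct {
	Name  string // model/backend name, e.g. "kokoro"
	Voice string // default voice from the model config, e.g. "af_heart"
}

// resolveVoice picks the voice to hand to the TTS backend.
// An empty session voice, or one that merely repeats the TTS model name,
// is treated as unset and replaced by the configured default voice.
func resolveVoice(sessionVoice string, cfg ttsConfig) string {
	if sessionVoice == "" || sessionVoice == cfg.Name {
		return cfg.Voice
	}
	return sessionVoice
}

func main() {
	cfg := ttsConfig{Name: "kokoro", Voice: "af_heart"}

	fmt.Println(resolveVoice("kokoro", cfg))   // backend name: falls back to af_heart
	fmt.Println(resolveVoice("", cfg))         // unset: falls back to af_heart
	fmt.Println(resolveVoice("af_bella", cfg)) // explicit voice is kept
}
```

With a check like this in front of the TTS call, the backend name would never be forwarded as a voice, so kokoro would not request voices/kokoro.pt.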
