2 changes: 1 addition & 1 deletion plugins/elevenlabs/src/tts.ts

🔴 isFinal not checked as is_final, causing stream to never complete for non-legacy voices

The PR fixes contextId to also check context_id (snake_case) for non-legacy ElevenLabs voices, but the same snake_case issue is not addressed for data.isFinal on line 554. If the non-legacy API returns is_final (snake_case, consistent with returning context_id), the if (data.isFinal) check will always be falsy.

Root cause and impact

When data.isFinal is never truthy:

  • stream.markDone() is never called, so #streamDone stays false
  • ctx.waiter.resolve() is never called, so the waiterPromise in Promise.all at plugins/elevenlabs/src/tts.ts:1080 never resolves
  • #cleanupContext(contextId!) is never called, leaking context data
  • The audioProcessTask loop at plugins/elevenlabs/src/tts.ts:1041-1063 spins indefinitely because #streamDone is never set to true

Audio may still play, because data.audio is processed before the isFinal check, but the stream never terminates properly: the Promise.all hangs, resources leak, and the synthesize stream never completes.

The fix should mirror the contextId fix:

if (data.isFinal ?? data.is_final) {

(Refers to line 554)
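
A minimal sketch of the suggested check, pulled out into a helper so it can be exercised on its own. The `WsMessage` type and the `isFinalMessage` name are hypothetical and not part of the plugin; the snake_case `is_final` field is the assumption raised in this comment, not confirmed API behaviour.

```typescript
// Hypothetical message shape: a loose record, as the plugin code appears to treat `data`.
type WsMessage = Record<string, unknown>;

// Returns true when the message marks the end of a context under either casing.
// Assumption: non-legacy voices may send `is_final` instead of `isFinal`.
function isFinalMessage(data: WsMessage): boolean {
  // `??` only falls through to is_final when isFinal is null or undefined.
  return Boolean(data.isFinal ?? data.is_final);
}
```

With a helper like this, the check on line 554 would read `if (isFinalMessage(data)) {`, so markDone, the waiter resolution, and the context cleanup all run for both payload styles.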

@adriablancafort should we also handle snake_case for other fields to keep things consistent? Or is this only an issue for context_id?
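
One way to keep this consistent, sketched under the assumption that only top-level keys differ in casing, is to normalize each message once before any field is read. `snakeToCamel` and `normalizeKeys` are hypothetical helpers, not existing plugin or ElevenLabs SDK functions.

```typescript
// Converts a snake_case key such as "normalized_alignment" to "normalizedAlignment".
function snakeToCamel(key: string): string {
  return key.replace(/_([a-z0-9])/g, (_match: string, ch: string) => ch.toUpperCase());
}

// Returns a shallow copy of the message with top-level keys in camelCase.
// Nested objects are left untouched.
function normalizeKeys(data: Record<string, unknown>): Record<string, unknown> {
  const out: Record<string, unknown> = {};
  for (const [key, value] of Object.entries(data)) {
    const camel = snakeToCamel(key);
    // If the message carries both casings, keep the explicit camelCase value.
    if (camel !== key && data[camel] !== undefined) continue;
    out[camel] = value;
  }
  return out;
}
```

Calling `normalizeKeys(result.value)` once at the top of the receive loop would let the existing `data.contextId`, `data.isFinal`, and `data.normalizedAlignment` reads stay unchanged, at the cost of one shallow copy per message.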

🔴 normalizedAlignment not checked as normalized_alignment for non-legacy voices

Following the same pattern as the contextId/context_id fix, data.normalizedAlignment on line 488 may be returned as normalized_alignment by non-legacy ElevenLabs voices. When the user has preferredAlignment: 'normalized' (the default per line 695), the alignment data will be undefined and no timed word transcripts will be generated.

Root cause and impact

At plugins/elevenlabs/src/tts.ts:486-489:

const alignment =
  this.#opts.preferredAlignment === 'normalized'
    ? (data.normalizedAlignment as Record<string, unknown>)
    : (data.alignment as Record<string, unknown>);

Since preferredAlignment defaults to 'normalized' (plugins/elevenlabs/src/tts.ts:695), non-legacy voices that return normalized_alignment instead of normalizedAlignment will have alignment resolve to undefined. This means the entire alignment processing block at lines 491-546 is skipped, and no timed word transcripts are produced. While audio still plays, transcript synchronization features (word timing) will silently fail.

(Refers to lines 488-489)
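
A possible shape for the alignment lookup, mirroring the contextId fix. This is a sketch only: `selectAlignment` is a hypothetical helper, and the `normalized_alignment` field name is the assumption made in this comment rather than confirmed API behaviour.

```typescript
type Alignment = Record<string, unknown>;

// Picks the alignment payload for the configured preference.
// Assumption: non-legacy voices may send `normalized_alignment` in snake_case.
function selectAlignment(
  data: Record<string, unknown>,
  preferredAlignment: string,
): Alignment | undefined {
  const raw =
    preferredAlignment === 'normalized'
      ? (data.normalizedAlignment ?? data.normalized_alignment)
      : data.alignment;
  return raw as Alignment | undefined;
}
```

When neither casing is present the result is still undefined, so the existing guard around the alignment processing block would keep working as before.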

@@ -458,7 +458,7 @@ class Connection {
 if (result.done || this.#closed) break;

 const data = result.value;
-const contextId = data.contextId as string | undefined;
+const contextId = (data.contextId ?? data.context_id) as string | undefined;
 const ctx = contextId ? this.#contextData.get(contextId) : undefined;

 if (data.error) {