@tschellenbach tschellenbach commented Nov 26, 2025

  • ElevenLabs Scribe v2: prevent Turn Complete from running before the transcription is complete.
  • Update the transcript buffer to handle transcripts that change (i.e., the prediction of what was said changes for words in the past).
  • Test eager turn taking in Deepgram.
  • Deepgram Flux: start-of-turn support (fixes barge-in).
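
The second item (past words being revised) can be illustrated with a minimal sketch. This is not the project's `TranscriptBuffer`; the class name `TranscriptBufferSketch` and the methods `add_partial`, `add_final`, and `reset` are hypothetical, chosen only to show one way a later partial can overwrite an earlier prediction until a final transcript closes the segment.

```python
from typing import List


class TranscriptBufferSketch:
    """Illustrative only; names and behavior are assumptions, not the real API."""

    def __init__(self) -> None:
        self._segments: List[str] = []
        self._pending = False  # True while the last segment is still a partial

    def add_partial(self, text: str) -> None:
        text = text.strip()
        if not text:
            return
        if self._pending:
            # The provider revised its prediction: replace, do not append.
            self._segments[-1] = text
        else:
            self._segments.append(text)
            self._pending = True

    def add_final(self, text: str) -> None:
        text = text.strip()
        if not text:
            return
        if self._pending:
            self._segments[-1] = text  # the final text supersedes the partial
        else:
            self._segments.append(text)
        self._pending = False

    @property
    def segments(self) -> List[str]:
        return list(self._segments)  # a copy, so callers cannot mutate state

    @property
    def text(self) -> str:
        return " ".join(self._segments)

    def reset(self) -> None:
        self._segments.clear()
        self._pending = False

    def __len__(self) -> int:
        return len(self._segments)

    def __bool__(self) -> bool:
        return bool(self._segments)


buf = TranscriptBufferSketch()
buf.add_partial("I want")
buf.add_partial("I won't go")   # earlier words changed
buf.add_final("I won't go.")
print(buf.text)  # I won't go.
```

Keeping a single "pending" slot means a revised partial never leaves stale words behind; whether the real buffer tracks one pending partial or several is not stated here.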

Summary by CodeRabbit

Release Notes

  • New Features

    • Enhanced transcript buffering with improved handling of partial and final transcript events
    • Turn detection now includes confidence metrics for greater accuracy
    • Added word-level timestamp support for ElevenLabs STT integration
  • Bug Fixes

    • Improved turn-end event processing when STT is active with proper buffer management
  • Chores

    • Updated Deepgram SDK dependency to version 5.3.0 or higher
  • Documentation

    • Simplified ElevenLabs example assistant configuration

coderabbitai bot commented Nov 26, 2025

Caution

Review failed

The pull request is closed.

Walkthrough

This PR overhauls STT transcript buffering to distinguish partial and final events, adjusts turn-end processing to wait for transcripts when not eager, enables turn-start event emission, propagates STT confidence to turn events, and refactors ElevenLabs STT with commit synchronization and word-level timestamps.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Core Transcript & Agent Handling | `agents-core/vision_agents/core/agents/agents.py`, `agents-core/vision_agents/core/agents/transcript_buffer.py` | Log level for `STTPartialTranscriptEvent` elevated to info. Turn-end processing now conditionally clears STT and waits (0.02 s) when not eager. `TranscriptBuffer` refactored to track pending partials, distinguish partial vs. final events, add a `text` property, return segment copies, and expose `__len__` and `__bool__`. |
| Core STT Infrastructure | `agents-core/vision_agents/core/stt/stt.py` | Registers `TurnStartedEvent` in `EventManager`; adds `_emit_turn_started_event` method to emit turn-start events with participant and confidence. |
| Deepgram Plugin | `plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py`, `plugins/deepgram/pyproject.toml` | Event emission order adjusted to emit transcripts before turn-end events; turn-start detection added; end-of-turn metadata removed from `TranscriptResponse`; confidence propagated to both turn-ended and turn-started events. Dependency bumped from `deepgram-sdk==5.2.0` to `>=5.3.0`. |
| ElevenLabs Plugin & Tests | `plugins/elevenlabs/vision_agents/plugins/elevenlabs/stt.py`, `plugins/elevenlabs/tests/test_elevenlabs_stt.py` | STT class extended with `connection` field and `_commit_received` event for synchronization; `clear` method signature changed to include a `timeout` parameter and now commits pending audio; transcript handlers capture word-level data as optional metadata; timestamps enabled in audio options. Test updated with module-level `asyncio` import and modified STT clearing/waiting sequence. |
| ElevenLabs Documentation & Example | `plugins/elevenlabs/example/assistant.md`, `plugins/elevenlabs/example/elevenlabs_example.py` | Assistant persona guidelines removed from documentation. Example simplified: agent initialization text shortened, manual user/call creation removed, replaced with a single `simple_response` and immediate call finish. |
| Transcript Buffer Tests | `tests/test_transcript_buffer.py` | Expanded test suite to cover partial vs. final event semantics, multi-segment flows, duplicate handling, reset behavior, whitespace handling, and truthiness checks; imports added for `STTTranscriptEvent` and `STTPartialTranscriptEvent`. |
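
The commit synchronization and the non-eager turn-end wait summarized above could look roughly like the following. This is a sketch under stated assumptions, not the plugin's code: `FakeSTT`, `on_committed_transcript`, and `on_turn_ended` are hypothetical names, and only `_commit_received`, `clear(timeout=...)`, and the 0.02 s wait are taken from the summary.

```python
import asyncio


class FakeSTT:
    """Stand-in for an STT client; not the ElevenLabs plugin."""

    def __init__(self) -> None:
        self._commit_received = asyncio.Event()

    def on_committed_transcript(self) -> None:
        # Hypothetical handler invoked when the provider confirms a commit.
        self._commit_received.set()

    async def clear(self, timeout: float = 1.0) -> bool:
        """Wait for the commit confirmation; return False on timeout."""
        self._commit_received.clear()
        # A real client would ask the provider to commit pending audio here.
        try:
            await asyncio.wait_for(self._commit_received.wait(), timeout)
        except asyncio.TimeoutError:
            return False
        return True


async def on_turn_ended(stt: FakeSTT, eager: bool) -> str:
    """Hypothetical turn-end hook: only the non-eager path waits for STT."""
    if eager:
        return "eager"
    committed = await stt.clear(timeout=1.0)
    await asyncio.sleep(0.02)  # brief pause so transcript handlers can run
    return "committed" if committed else "timed out"


async def main() -> None:
    stt = FakeSTT()
    # Simulate the provider confirming the commit shortly after the request.
    asyncio.get_running_loop().call_later(0.01, stt.on_committed_transcript)
    print(await on_turn_ended(stt, eager=False))


asyncio.run(main())
```

Returning a boolean from `clear` rather than raising lets the caller decide whether a missing commit should block the reply; that choice is an assumption of this sketch.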

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

  • agents-core/vision_agents/core/agents/transcript_buffer.py — Significant refactoring with new partial/final event semantics and state tracking; requires understanding of the buffering contract and impact on callers.
  • plugins/elevenlabs/vision_agents/plugins/elevenlabs/stt.py — Multiple concurrent changes: commit synchronization, timeout handling in clear(), word-level metadata propagation, and timestamps integration; verify interaction between connection events and handlers.
  • tests/test_transcript_buffer.py — Comprehensive test expansion; validate coverage of edge cases (duplicates, resets, multi-utterance flows) and ensure new semantics are correctly captured.
  • agents-core/vision_agents/core/agents/agents.py — Turn-end conditional logic and timing (0.02s wait) interacts with multiple event handlers; ensure non-eager mode works correctly across scenarios.
  • plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py — Event emission order and confidence propagation must align with new turn-start/turn-end semantics.
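
The emission order noted for the Deepgram plugin (turn start, then transcript, then turn end, each carrying confidence) can be sketched as below. The message keys `start_of_turn`, `transcript`, `end_of_turn`, and `confidence` are hypothetical and do not reflect the Deepgram SDK's payload; `Event` and `events_for_message` are illustrative names as well.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Event:
    kind: str  # "turn_started", "transcript", or "turn_ended"
    text: str = ""
    confidence: Optional[float] = None


def events_for_message(msg: dict) -> List[Event]:
    """Map one (hypothetical) provider message to events in emission order."""
    confidence = msg.get("confidence")
    events: List[Event] = []
    if msg.get("start_of_turn"):
        events.append(Event("turn_started", confidence=confidence))
    if msg.get("transcript"):
        # The transcript goes out before any turn-end event, so listeners
        # already hold the text when the turn closes.
        events.append(Event("transcript", msg["transcript"], confidence))
    if msg.get("end_of_turn"):
        events.append(Event("turn_ended", confidence=confidence))
    return events


message = {"transcript": "hello there", "end_of_turn": True, "confidence": 0.9}
for event in events_for_message(message):
    print(event.kind, event.confidence)
```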

Possibly related PRs

  • [AI-269] Eager turn taking #169 — Directly addresses turn handling and transcript buffering semantics for eager vs. non-eager end-of-turn processing, including related transcript buffer handling.
  • [AI-271] Elevenhour labs Scribe2 #170 — Modifies ElevenLabs STT implementation and tests covering transcript/commit handling and similar infrastructure changes.
  • Cleanup stt #122 — Updates core STT event emission behavior and turn-start event registration alongside transcript response handling.

Suggested labels

plugin-elevenlabs, plugin-deepgram, tests

Suggested reviewers

  • d3xvn

Poem

Partial truths collect like bell jars—
each fragment waiting, waiting still
for the final word to shatter gently,
time pooling in the margins where
turn and transcript finally kiss,
no longer racing phantom schedules. 🎙️

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 16d1103 and e5d8d87.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (10)
  • agents-core/vision_agents/core/agents/agents.py (3 hunks)
  • agents-core/vision_agents/core/agents/transcript_buffer.py (2 hunks)
  • agents-core/vision_agents/core/stt/stt.py (3 hunks)
  • plugins/deepgram/pyproject.toml (1 hunks)
  • plugins/deepgram/vision_agents/plugins/deepgram/deepgram_stt.py (3 hunks)
  • plugins/elevenlabs/example/assistant.md (0 hunks)
  • plugins/elevenlabs/example/elevenlabs_example.py (2 hunks)
  • plugins/elevenlabs/tests/test_elevenlabs_stt.py (2 hunks)
  • plugins/elevenlabs/vision_agents/plugins/elevenlabs/stt.py (11 hunks)
  • tests/test_transcript_buffer.py (2 hunks)

@tschellenbach tschellenbach changed the title from "Elevenlabs Scribe 2 improvements" to "[AI-326] Elevenlabs Scribe 2 improvements" Nov 26, 2025
@tschellenbach tschellenbach marked this pull request as ready for review November 27, 2025 16:25
@tschellenbach tschellenbach merged commit 74b64ed into main Nov 27, 2025
5 of 8 checks passed
@tschellenbach tschellenbach deleted the time branch November 27, 2025 16:26
@coderabbitai coderabbitai bot mentioned this pull request Nov 27, 2025