Fixes #4243: Prevent immediate speech interruption in pause mode #4395

devbyteai · 2025-12-26T15:14:25Z

Summary

This PR fixes the phantom VAD activity issue that caused unwanted interruptions when using STT turn detection with resume_false_interruption=True.

Problem

When using STT turn detection (especially with Deepgram) and resume_false_interruption=True, the agent incorrectly interrupted speech during the false interruption timeout period. This caused:

The agent transitioned from "thinking" to "listening" without any actual user speech
The llm_node was cancelled unexpectedly
Console mode users were particularly affected (~30% reproduction rate)

User-Reported Behavior:

"the agent state changed from thinking to listening without the user state changing to speaking"

Root Cause

In on_final_transcript() (agent_activity.py), after pausing the speech and starting the false interruption timer, the code unconditionally called _interrupt_paused_speech():

def on_final_transcript(self, ev: stt.SpeechEvent, *, speaking: bool | None = None) -> None:
    # ...
    if self._audio_recognition and self._turn_detection not in ("manual", "realtime_llm"):
        self._interrupt_by_audio_activity()  # PAUSES speech when use_pause=True

        if speaking is False and self._paused_speech and timeout:
            self._start_false_interruption_timer(timeout)  # Start timer to resume

    # BUG: This ALWAYS runs, defeating the pause mode!
    self._interrupt_paused_speech_task = asyncio.create_task(
        self._interrupt_paused_speech(...)  # Cancels timer and interrupts!
    )

The _interrupt_paused_speech() method:

Cancels the false interruption timer
Calls interrupt() on the paused speech

This defeated the entire purpose of the pause mode, which was designed to allow speech to resume if the interruption was a false positive.
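
The failure mode can be reproduced in isolation. The following is a minimal sketch, not the livekit-agents code: PausedSpeechDemo, run() and the buggy flag are hypothetical names, and the timer is modelled with loop.call_later as an assumption about how such a timer might work. It shows that an interrupt task created right after the timer starts always cancels the timer before it can resume the speech.

```python
from __future__ import annotations

import asyncio


class PausedSpeechDemo:
    """Toy stand-in for paused-speech handling (hypothetical, not the real class)."""

    def __init__(self) -> None:
        self.events: list[str] = []
        self._timer: asyncio.TimerHandle | None = None

    def start_false_interruption_timer(self, timeout: float) -> None:
        # Schedule a resume once the timeout elapses.
        loop = asyncio.get_running_loop()
        self._timer = loop.call_later(timeout, self._resume)

    def _resume(self) -> None:
        self._timer = None
        self.events.append("resumed")

    async def interrupt_paused_speech(self) -> None:
        # Mirrors the two steps described above: cancel the timer, then interrupt.
        if self._timer is not None:
            self._timer.cancel()
            self._timer = None
        self.events.append("interrupted")


async def run(buggy: bool) -> list[str]:
    demo = PausedSpeechDemo()
    demo.start_false_interruption_timer(0.05)
    if buggy:
        # Unconditional interrupt: the timer never gets a chance to fire.
        await asyncio.create_task(demo.interrupt_paused_speech())
    await asyncio.sleep(0.1)  # wait past the timeout
    return demo.events
```

Calling asyncio.run(run(buggy=True)) records only an interruption, while asyncio.run(run(buggy=False)) lets the timer fire and records a resume.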

Solution

When in pause mode (resume_false_interruption=True, false_interruption_timeout set, audio supports pause) and speech is paused, return early instead of calling _interrupt_paused_speech(). The timer is started first when speaking is False. Let the timer decide whether to:

Resume - if it was a false interruption (no real user speech)
Let end-of-turn handle it - if it was a real interruption

if self._audio_recognition and self._turn_detection not in ("manual", "realtime_llm"):
    self._interrupt_by_audio_activity()

    # NEW: Check if we're in pause mode
    opt = self._session.options
    use_pause = opt.resume_false_interruption and opt.false_interruption_timeout is not None
    can_pause = self._session.output.audio and self._session.output.audio.can_pause

    if use_pause and can_pause and self._paused_speech:
        if speaking is False and (timeout := opt.false_interruption_timeout) is not None:
            self._start_false_interruption_timer(timeout)
        return  # KEY FIX: Don't call _interrupt_paused_speech in pause mode

    # ... existing code for non-pause mode ...
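
The gating condition can also be read as a pure function, which makes it easy to check each combination of inputs. This is a sketch under assumptions: Options and should_defer_to_timer are hypothetical names introduced for illustration, and only the two option fields shown in the snippet above are modelled.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Options:
    """Hypothetical subset of the session options used by the check."""

    resume_false_interruption: bool
    false_interruption_timeout: float | None


def should_defer_to_timer(
    opt: Options, *, can_pause: bool, has_paused_speech: bool
) -> bool:
    """Return True when the early return applies (pause mode with paused speech)."""
    use_pause = (
        opt.resume_false_interruption
        and opt.false_interruption_timeout is not None
    )
    return bool(use_pause and can_pause and has_paused_speech)
```

Only when all three conditions hold does the handler skip the immediate interrupt; every other combination falls through to the existing non-pause path.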

Testing

Unit Tests

All 15 existing agent session tests pass:

tests/test_agent_session.py::test_events_and_metrics PASSED
tests/test_agent_session.py::test_tool_call PASSED
tests/test_agent_session.py::test_interruption[False-5.5] PASSED
tests/test_agent_session.py::test_interruption[True-5.5] PASSED
tests/test_agent_session.py::test_interruption_options PASSED
tests/test_agent_session.py::test_interruption_by_text_input PASSED
tests/test_agent_session.py::test_interruption_before_speaking[False-3.5] PASSED
tests/test_agent_session.py::test_interruption_before_speaking[True-3.5] PASSED
tests/test_agent_session.py::test_generate_reply PASSED
tests/test_agent_session.py::test_preemptive_generation[True-0.8] PASSED
tests/test_agent_session.py::test_preemptive_generation[False-1.1] PASSED
tests/test_agent_session.py::test_interrupt_during_on_user_turn_completed[False-0.0] PASSED
tests/test_agent_session.py::test_interrupt_during_on_user_turn_completed[False-2.0] PASSED
tests/test_agent_session.py::test_interrupt_during_on_user_turn_completed[True-0.0] PASSED
tests/test_agent_session.py::test_interrupt_during_on_user_turn_completed[True-2.0] PASSED

======================== 15 passed in 76.00s ========================

What Tests Verify

False interruption with resume_false_interruption=True - Tests that speech correctly resumes after false interruption timeout
False interruption with resume_false_interruption=False - Tests backward compatibility
Interruption options - Tests various interruption configurations
Preemptive generation - Tests preemptive generation with interruptions

Manual Testing Recommended

For production verification, test with:

Console mode + Deepgram STT + STT turn detection
Verify no phantom "thinking→listening" transitions
Verify legitimate interruptions still work correctly

Backward Compatibility

No breaking changes - Only affects users with resume_false_interruption=True
Default behavior preserved - Users with resume_false_interruption=False see no change
Expected behavior change - Agent now correctly waits for false interruption timeout instead of immediately interrupting

Impact Analysis

Console Mode

Primary fix target - Resolves phantom interruptions
Users should no longer see unexpected "thinking→listening" transitions

WebRTC Mode

Same fix applies
Improves false interruption handling for all users with resume_false_interruption=True

Realtime LLM

Not affected (already skipped in code at line 1280)

Manual Turn Detection

Not affected (already skipped in code at lines 1297-1299)

Edge Cases Handled

User speaks again during timeout → Timer cancelled in on_start_of_speech() (already implemented)
Real end-of-turn detected → _user_turn_completed_task still interrupts correctly
Session close → _interrupt_paused_speech called during cleanup
Audio doesn't support pause → Falls through to immediate interrupt (correct behavior)
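
The four cases above can be summarised as a small decision procedure. This is a simplified model rather than the actual implementation: resolve_paused_speech and the event names are hypothetical, and the model assumes the false interruption timer was already started when the event sequence begins.

```python
from __future__ import annotations


def resolve_paused_speech(events: list[str], *, can_pause: bool = True) -> str:
    """Return the fate of the agent speech: 'resumed', 'interrupted' or 'paused'."""
    if not can_pause:
        # Audio output cannot pause: fall through to an immediate interrupt.
        return "interrupted"

    timer_running = True  # assumption: the timer was started before these events
    for ev in events:
        if ev == "start_of_speech":
            # The user speaks again: the timer is cancelled, speech stays paused.
            timer_running = False
        elif ev in ("end_of_turn", "session_close"):
            # A real end-of-turn or session cleanup interrupts the paused speech.
            return "interrupted"
        elif ev == "timer_expired" and timer_running:
            # No real user speech arrived: treat it as a false interruption.
            return "resumed"
    return "paused"
```

For example, resolve_paused_speech(["start_of_speech", "end_of_turn"]) yields "interrupted", while resolve_paused_speech(["timer_expired"]) yields "resumed".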

Files Changed

livekit-agents/livekit/agents/voice/agent_activity.py - Fix in on_final_transcript() method

Future Considerations

Framework-Wide Audit - Other methods that call _interrupt_paused_speech should be reviewed
Metrics - Consider adding metrics for false interruption events
Documentation - Update docs to explain resume_false_interruption behavior more clearly

When using STT turn detection with resume_false_interruption=True, the agent incorrectly interrupted speech during the false interruption timeout period, causing phantom "thinking->listening" transitions.

Root cause: In on_final_transcript(), after pausing the speech and starting the false interruption timer, the code unconditionally called _interrupt_paused_speech(), which cancelled the timer and interrupted the speech immediately, defeating the pause mode.

Solution: When in pause mode (resume_false_interruption=True, timeout set, audio supports pause), return early after starting the timer. Let the timer decide whether to resume (false interruption) or let the next real end-of-turn event handle the actual interruption.

This fix:
- Only affects users with resume_false_interruption=True
- Maintains backward compatibility for other configurations
- Preserves correct behavior for real end-of-turn interruptions
- Fixes console mode phantom VAD activity issues

Fixes livekit#4243

CLAassistant · 2025-12-26T15:14:31Z

All committers have signed the CLA.
