Conversation

@longcw
Contributor

@longcw longcw commented Jan 22, 2026

fix #4661

When a speech is generated alongside a tool call, interrupting the speech shouldn't cancel the tool execution if the tool is awaiting an AgentTask.

Summary by CodeRabbit

  • Bug Fixes

    • Enhanced tool execution reliability by preventing premature cancellation when speech generation is active.
    • Improved speech pause handling with better state tracking and proper recovery after cancellation.
    • Enhanced task logging for better debugging of cancellation events.
  • Chores

    • Updated email example to use OpenAI GPT-4.1 Mini as the default LLM model.

@chenghao-mou chenghao-mou requested a review from a team January 22, 2026 09:40
@coderabbitai
Contributor

coderabbitai bot commented Jan 22, 2026

Walkthrough

This PR adds a SpeechHandle tool_cancelable flag and uses it to prevent mid-execution cancellations, renames/refactors paused-speech interruption APIs to _cancel_speech_pause (with an interrupt parameter), assigns names to tool-execution tasks, changes SegmentSynchronizerImpl.resume to a no-op after close, and switches the email example LLM to openai/gpt-4.1-mini.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Tool execution safety: livekit-agents/livekit/agents/voice/speech_handle.py, livekit-agents/livekit/agents/voice/agent.py | Add _tool_cancelable + public tool_cancelable property on SpeechHandle. Agent code temporarily sets speech_handle.tool_cancelable = False while awaiting tool execution and restores the previous value in finally blocks to avoid race-condition cancellations. |
| Speech pause handling refactor: livekit-agents/livekit/agents/voice/agent_activity.py | Rename _interrupt_paused_speech_task to _cancel_speech_pause_task, replace _interrupt_paused_speech(...) calls with _cancel_speech_pause(..., interrupt=...), add interrupt parameter, and unify forwarded_text/speech scheduling and interruption-reset logic across flows. |
| Task naming for tooling: livekit-agents/livekit/agents/voice/generation.py | Give the asyncio task created for tool execution a name (function call name) so cancellations/logging can reference the task name. |
| Transcription resume behavior: livekit-agents/livekit/agents/voice/transcription/synchronizer.py | SegmentSynchronizerImpl.resume now silently returns when called after close (removed runtime warning). |
| Example LLM change: examples/voice_agents/email_example.py | Change default LLM backend from google/gemini-2.5-flash to openai/gpt-4.1-mini. |
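
The task-naming change in the table can be illustrated with plain asyncio. This is a minimal sketch, not the code from generation.py: the tool `lookup_weather` and the message format are hypothetical, and only `asyncio.create_task(..., name=...)` and `Task.get_name()` are standard library calls.

```python
import asyncio


async def lookup_weather() -> str:
    # Hypothetical tool body; it sleeps long enough to be cancelled.
    await asyncio.sleep(10)
    return "sunny"


async def main() -> str:
    # Name the task after the function call so logs can identify it later.
    task = asyncio.create_task(lookup_weather(), name="lookup_weather")
    await asyncio.sleep(0)  # give the task a chance to start
    task.cancel()
    try:
        await task
    except asyncio.CancelledError:
        pass
    # The name survives cancellation, so it can be used in a log message.
    return f"cancelled task: {task.get_name()}"


message = asyncio.run(main())
print(message)
```

Without an explicit name, asyncio assigns a generic one such as `Task-2`, which is much harder to correlate with a specific function call in cancellation logs.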

Sequence Diagram(s)

sequenceDiagram
    participant Agent as Agent
    participant Speech as SpeechHandle
    participant Tool as Tool Task
    participant Finally as Finally

    Agent->>Speech: read tool_cancelable (old_state)
    Agent->>Speech: set tool_cancelable = False
    Note over Speech: prevents mid-execution cancellation

    Agent->>Tool: create named asyncio task (tool execution)
    Tool->>Tool: run tool logic
    Note over Tool: task runs without being cancelled by speech pause

    Tool-->>Agent: tool completes / returns result
    Finally->>Speech: restore tool_cancelable = old_state
    Note over Speech: original cancelability restored
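
The save, disable, and restore flow in the diagram can be sketched as follows. The `SpeechHandle` class here is a toy stand-in that models only the `tool_cancelable` flag named in the walkthrough, and `run_agent_task` is a hypothetical helper; the real agent.py code may be organised differently.

```python
import asyncio


class SpeechHandle:
    """Toy stand-in; only the flag from the walkthrough is modelled."""

    def __init__(self) -> None:
        self._tool_cancelable = True

    @property
    def tool_cancelable(self) -> bool:
        return self._tool_cancelable

    @tool_cancelable.setter
    def tool_cancelable(self, value: bool) -> None:
        self._tool_cancelable = value


async def run_agent_task(speech: SpeechHandle, seen: list[bool]) -> str:
    old_state = speech.tool_cancelable  # remember the previous value
    speech.tool_cancelable = False      # block cancellation while awaiting
    try:
        await asyncio.sleep(0)          # stands in for awaiting the AgentTask
        seen.append(speech.tool_cancelable)
        return "done"
    finally:
        # Restore even if the awaited work raises or is cancelled.
        speech.tool_cancelable = old_state


speech = SpeechHandle()
seen: list[bool] = []
result = asyncio.run(run_agent_task(speech, seen))
```

Restoring the saved value rather than hard-coding `True` keeps nested callers correct: an outer scope that had already disabled cancellation stays disabled after the inner one exits.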

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related PRs

Poem

🐰 A rabbit hops where code flags gleam,
I tuck cancels away behind a seam,
Tasks run named, safe from sudden stops,
Pauses canceled with gentler plops,
I nibble bugs and dance on logs—hip, hop! 🥕

🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 8.70% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'prevent tool cancellation when AgentTask is called inside it' is specific and directly describes the main change, which involves preventing tool cancellation during AgentTask execution to fix a deadlock issue. |

@longcw longcw changed the title from "fix deadlock when interrupting a tool that awaiting for AgentTask" to "prervent tool cancellation when a AgentTask is called" Jan 22, 2026
@longcw longcw changed the title from "prervent tool cancellation when a AgentTask is called" to "prevent tool cancellation when AgentTask is called inside it" Jan 22, 2026
Member

@chenghao-mou chenghao-mou left a comment

LGTM. I tested with the email example, and it worked.

@longcw longcw requested a review from a team February 3, 2026 02:32

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 1 potential issue.

View issue and 5 additional flags in Devin Review.

@devin-ai-integration devin-ai-integration bot left a comment

Devin Review found 1 new potential issue.

View issue and 8 additional flags in Devin Review.

Comment on lines 2797 to 2799
self._paused_speech = None

if self._session.options.resume_false_interruption and self._session.output.audio:

🟡 Paused speech state cleared prematurely when allow_interruptions is False

When _cancel_speech_pause is called while an AgentTask has temporarily disabled interruptions (by setting speech_handle.allow_interruptions = False), the method still clears _paused_speech = None and calls resume() even though it skipped the interrupt logic.

Scenario

  1. Speech is playing with allow_interruptions=True
  2. User speaks, triggering _interrupt_by_audio_activity which pauses the audio and sets _paused_speech = self._current_speech
  3. Tool execution calls await AgentTask(), which sets speech_handle.allow_interruptions = False (line 769 in agent.py)
  4. User's final transcript triggers on_final_transcript which creates a task to call _cancel_speech_pause
  5. In _cancel_speech_pause, the condition at lines 2789-2792 evaluates to False because allow_interruptions is now False
  6. The interrupt block is skipped, but _paused_speech = None is still executed (line 2797)
  7. Audio is resumed if the resume_false_interruption option is set (lines 2799-2800)

Impact

The paused speech reference is cleared prematurely while an AgentTask is running. When the AgentTask completes and restores allow_interruptions, the false interruption handling state has already been cleared. This could cause:

  • Inconsistent state tracking where _paused_speech is None but the speech wasn't properly interrupted
  • The false interruption detection logic won't work correctly after AgentTask completes
  • Audio might resume unexpectedly during AgentTask execution

(Refers to lines 2797-2800)

Recommendation: Consider not clearing _paused_speech and not calling resume() when the speech's allow_interruptions is False due to an AgentTask lock. The cleanup should happen either when the speech is successfully interrupted or when the AgentTask completes and the original interruption handling can proceed.
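
A minimal sketch of the recommended guard, under loud assumptions: `Speech` and `Activity` are hypothetical, simplified models rather than the real SpeechHandle and AgentActivity, and the `resumed` flag merely stands in for the resume() call. It illustrates the reviewer's suggestion, not the behaviour that was merged.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Speech:
    allow_interruptions: bool = True
    interrupted: bool = False


class Activity:
    """Hypothetical, simplified model; not the real AgentActivity."""

    def __init__(self) -> None:
        self.paused_speech: Optional[Speech] = None
        self.resumed = False

    def cancel_speech_pause(self, interrupt: bool = True) -> None:
        speech = self.paused_speech
        if speech is None:
            return
        if interrupt and not speech.allow_interruptions:
            # Suggested guard: keep the paused state while the lock is held.
            return
        if interrupt:
            speech.interrupted = True
        self.paused_speech = None
        self.resumed = True  # stands in for the resume() call


activity = Activity()
activity.paused_speech = Speech(allow_interruptions=False)
activity.cancel_speech_pause(interrupt=True)
# While locked: the paused speech is kept and nothing is resumed.
locked_state = (activity.paused_speech is not None, activity.resumed)

activity.paused_speech.allow_interruptions = True
activity.cancel_speech_pause(interrupt=True)
# Once unlocked: the state is cleared and the resume step runs.
unlocked_state = (activity.paused_speech is None, activity.resumed)
```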

Comment on lines +2140 to +2142
if speech_handle.interrupted:
    await utils.aio.cancel_and_wait(exe_task)
    return
Member

I think this should be removed?

Contributor Author

There is a guard for cancellation at https://github.com/livekit/agents/blob/livekit-agents@1.3.12/livekit-agents/livekit/agents/voice/generation.py#L648-L658: we cancel the tool execution task, but not the user's function.
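
One common asyncio way to cancel a wrapper task while letting the wrapped user function finish is `asyncio.shield`. The sketch below shows that general technique only; whether the linked guard in generation.py is implemented this way is an assumption, not something confirmed in this thread. `user_function` and `tool_wrapper` are hypothetical names.

```python
import asyncio

events: list[str] = []


async def user_function() -> str:
    # Hypothetical user tool that must run to completion.
    await asyncio.sleep(0.05)
    events.append("user function finished")
    return "ok"


async def tool_wrapper(inner: "asyncio.Task[str]") -> str:
    # shield() lets this wrapper be cancelled without cancelling `inner`.
    return await asyncio.shield(inner)


async def main() -> str:
    inner = asyncio.create_task(user_function(), name="user_function")
    exe_task = asyncio.create_task(tool_wrapper(inner), name="exe_task")
    await asyncio.sleep(0)  # let both tasks start
    exe_task.cancel()
    try:
        await exe_task
    except asyncio.CancelledError:
        events.append("wrapper cancelled")
    return await inner  # the user function still completes


result = asyncio.run(main())
```

The wrapper observes the cancellation immediately, while the inner task keeps running and its result remains available to whoever still holds a reference to it.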

Comment on lines -2113 to -2131
msg = chat_ctx.add_message(
    role="assistant",
    content=forwarded_text,
    id=llm_gen_data.id,
    interrupted=True,
    created_at=reply_started_at,
    metrics=assistant_metrics,
)
self._agent._chat_ctx.insert(msg)
self._session._conversation_item_added(msg)
speech_handle._item_added([msg])
current_span.set_attribute(trace_types.ATTR_RESPONSE_TEXT, forwarded_text)

if self._session.agent_state == "speaking":
    self._session._update_agent_state("listening")

speech_handle._mark_generation_done()
await utils.aio.cancel_and_wait(exe_task)
return
Member

Was this some duplicated logic?

Contributor Author

Yes, we had some duplicated code for the interrupted and non-interrupted paths. I merged them in this PR.

@theomonnom
Member

Otherwise LGTM, but I'm not sure I follow the logic inside _close_session where we pass interrupt=False for the paused speech.

@longcw
Contributor Author

longcw commented Feb 3, 2026

_pause_scheduling_task makes sure all the speeches are either done or ignored (if they are in _drain_blocked_tasks), so in _close_session we only cancel the timer related to the pause.

@longcw longcw merged commit 1725929 into main Feb 3, 2026
18 checks passed
@longcw longcw deleted the longc/fix-agent-task-interruption branch February 3, 2026 03:08
sam-s10s added a commit to speechmatics/livekit-agents that referenced this pull request Feb 3, 2026
commit c46013d
Author: Long Chen <longch1024@gmail.com>
Date:   Tue Feb 3 20:02:57 2026 +0800

    add exclude_config_update to ChatContext copy (livekit#4700)

commit 7849a8c
Author: Chenghao Mou <chenghao.mou@livekit.io>
Date:   Tue Feb 3 09:51:07 2026 +0000

    fix: commit user turn with STT and realtime (livekit#4663)

commit edfa391
Author: Chenghao Mou <chenghao.mou@livekit.io>
Date:   Tue Feb 3 09:48:36 2026 +0000

    add STT usage for google (livekit#4599)

commit 34d0d62
Author: Long Chen <longch1024@gmail.com>
Date:   Tue Feb 3 15:53:42 2026 +0800

    fix gemini live tool execution interrupted by generation_complete event (livekit#4699)

commit 1725929
Author: Long Chen <longch1024@gmail.com>
Date:   Tue Feb 3 11:08:27 2026 +0800

    prevent tool cancellation when AgentTask is called inside it (livekit#4586)

Development

Successfully merging this pull request may close these issues.

Awaiting AgentTask in tool deadlocks

4 participants