-
Notifications
You must be signed in to change notification settings - Fork 2.8k
prevent tool cancellation when AgentTask is called inside it #4586
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
📝 WalkthroughWalkthroughThis PR adds a SpeechHandle Changes
Sequence Diagram(s)sequenceDiagram
participant Agent as Agent
participant Speech as SpeechHandle
participant Tool as Tool Task
participant Finally as Finally
Agent->>Speech: read tool_cancelable (old_state)
Agent->>Speech: set tool_cancelable = False
Note over Speech: prevents mid-execution cancellation
Agent->>Tool: create named asyncio task (tool execution)
Tool->>Tool: run tool logic
Note over Tool: task runs without being cancelled by speech pause
Tool-->>Agent: tool completes / returns result
Finally->>Speech: restore tool_cancelable = old_state
Note over Speech: original cancelability restored
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~25 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
Comment |
chenghao-mou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. I tested with the email example, and it worked.
livekit-agents/livekit/agents/voice/transcription/synchronizer.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| self._paused_speech = None | ||
|
|
||
| if self._session.options.resume_false_interruption and self._session.output.audio: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🟡 Paused speech state cleared prematurely when allow_interruptions is False
When _cancel_speech_pause is called while an AgentTask has temporarily disabled interruptions (by setting speech_handle.allow_interruptions = False), the method still clears _paused_speech = None and calls resume() even though it skipped the interrupt logic.
Click to expand
Scenario
- Speech is playing with
allow_interruptions=True - User speaks, triggering
_interrupt_by_audio_activitywhich pauses the audio and sets_paused_speech = self._current_speech - Tool execution calls
await AgentTask(), which setsspeech_handle.allow_interruptions = False(line 769 in agent.py) - User's final transcript triggers
on_final_transcriptwhich creates a task to call_cancel_speech_pause - In
_cancel_speech_pause, the condition at line 2789-2792 evaluates to False becauseallow_interruptionsis now False - The interrupt block is skipped, but
_paused_speech = Noneis still executed (line 2797) - Audio is resumed if
resume_false_interruptionoption is set (line 2799-2800)
Impact
The paused speech reference is cleared prematurely while an AgentTask is running. When the AgentTask completes and restores allow_interruptions, the false interruption handling state has already been cleared. This could cause:
- Inconsistent state tracking where
_paused_speechisNonebut the speech wasn't properly interrupted - The false interruption detection logic won't work correctly after
AgentTaskcompletes - Audio might resume unexpectedly during
AgentTaskexecution
(Refers to lines 2797-2800)
Recommendation: Consider not clearing _paused_speech and not calling resume() when the speech's allow_interruptions is False due to an AgentTask lock. The cleanup should happen either when the speech is successfully interrupted or when the AgentTask completes and the original interruption handling can proceed.
Was this helpful? React with 👍 or 👎 to provide feedback.
| if speech_handle.interrupted: | ||
| await utils.aio.cancel_and_wait(exe_task) | ||
| return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there is a guard for cancellation https://github.com/livekit/agents/blob/livekit-agents@1.3.12/livekit-agents/livekit/agents/voice/generation.py#L648-L658, we will cancel the tool execution task but not the user's function
| msg = chat_ctx.add_message( | ||
| role="assistant", | ||
| content=forwarded_text, | ||
| id=llm_gen_data.id, | ||
| interrupted=True, | ||
| created_at=reply_started_at, | ||
| metrics=assistant_metrics, | ||
| ) | ||
| self._agent._chat_ctx.insert(msg) | ||
| self._session._conversation_item_added(msg) | ||
| speech_handle._item_added([msg]) | ||
| current_span.set_attribute(trace_types.ATTR_RESPONSE_TEXT, forwarded_text) | ||
|
|
||
| if self._session.agent_state == "speaking": | ||
| self._session._update_agent_state("listening") | ||
|
|
||
| speech_handle._mark_generation_done() | ||
| await utils.aio.cancel_and_wait(exe_task) | ||
| return |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Was this some duplicated logic?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, we have some duplicated code for interrupted and not interrupted. I merged them in this pr.
|
Otherwise it lgtm, but I'm not sure to follow the logic inside _close_session where we put interrupt=False for the paused speech. |
|
|
commit c46013d Author: Long Chen <longch1024@gmail.com> Date: Tue Feb 3 20:02:57 2026 +0800 add exclude_config_update to ChatContext copy (livekit#4700) commit 7849a8c Author: Chenghao Mou <chenghao.mou@livekit.io> Date: Tue Feb 3 09:51:07 2026 +0000 fix: commit user turn with STT and realtime (livekit#4663) commit edfa391 Author: Chenghao Mou <chenghao.mou@livekit.io> Date: Tue Feb 3 09:48:36 2026 +0000 add STT usage for google (livekit#4599) commit 34d0d62 Author: Long Chen <longch1024@gmail.com> Date: Tue Feb 3 15:53:42 2026 +0800 fix gemini live tool execution interrupted by generation_complete event (livekit#4699) commit 1725929 Author: Long Chen <longch1024@gmail.com> Date: Tue Feb 3 11:08:27 2026 +0800 prevent tool cancellation when AgentTask is called inside it (livekit#4586)
fix #4661
when there is a speech generated alongside a tool call, the interruption to the speech shouldn't cancel the tool execution if it's await for an AgentTask.
Summary by CodeRabbit
Bug Fixes
Chores
✏️ Tip: You can customize this high-level summary in your review settings.