Skip to content

Python: Fix AG-UI message handling and MCP tool double-call bug#3635

Merged
moonbox3 merged 9 commits intomicrosoft:mainfrom
moonbox3:ag-ui-fixes
Feb 5, 2026
Merged

Python: Fix AG-UI message handling and MCP tool double-call bug#3635
moonbox3 merged 9 commits intomicrosoft:mainfrom
moonbox3:ag-ui-fixes

Conversation

@moonbox3
Copy link
Contributor

@moonbox3 moonbox3 commented Feb 3, 2026

Motivation and Context

Summary

Description

Issue #3568: TextMessageEndEvent missing after tool results

When a tool-only response was detected, a TextMessageStartEvent was emitted to create message context, but TextMessageEndEvent was not emitted after
tool results. Fixed by emitting the end event in _emit_tool_result().

Issue #3619: MessagesSnapshot merging tool_calls and content

The AG-UI protocol expects tool_calls and content to be in separate messages within MessagesSnapshotEvent. The previous implementation merged them into
a single message. Fixed _build_messages_snapshot() to emit separate messages.

JSONDecodeError crash

Malformed JSON in tool arguments could crash the streaming response. Now we skip the confirmation flow with a warning log instead of crashing.

MCP tool double-call bug

Two root causes:

  1. _replace_approval_contents_with_results() placed function_result content in user messages instead of tool messages. OpenAI requires tool results in
    role="tool" messages.
  2. _sanitize_tool_history() didn't remove call_ids from pending tracking after seeing their results, causing duplicate synthetic results.

Fixed by:

  • Adding _convert_approval_results_to_tool_messages() to extract function_result content from user messages into proper tool messages
  • Adding pending_tool_call_ids.discard(call_id) after processing tool results

confirm_changes not in MessagesSnapshotEvent

The confirm_changes tool call events were emitted but not tracked in flow.pending_tool_calls, so the frontend couldn't see them in the snapshot to
render the confirmation dialog. Fixed by tracking confirm_changes in both _emit_approval_request() and the predictive tools path.

Contribution Checklist

  • The code builds clean without any errors or warnings
  • The PR follows the Contribution Guidelines
  • All unit tests pass, and I have added new tests where possible
  • Is this a breaking change? If yes, add "[BREAKING]" prefix to the title of the PR.

Copilot AI review requested due to automatic review settings February 3, 2026 05:48
@moonbox3 moonbox3 self-assigned this Feb 3, 2026
@moonbox3 moonbox3 added the ag-ui label Feb 3, 2026
@moonbox3 moonbox3 moved this to In Review in Agent Framework Feb 3, 2026
@markwallace-microsoft
Copy link
Member

markwallace-microsoft commented Feb 3, 2026

Python Test Coverage

Python Test Coverage Report •
FileStmtsMissCoverMissing
packages/ag-ui/agent_framework_ag_ui
   _message_adapters.py4539579%89, 99–100, 109–112, 115–119, 121–126, 129, 138–144, 147, 151–153, 162–164, 184, 190–192, 222, 235–236, 246–247, 284, 287, 289, 292, 295, 311, 328, 350, 381, 386, 397–398, 449, 465–466, 532–535, 537, 543, 551–552, 554, 558–561, 574, 663–666, 668, 733, 768–770, 772–775, 778–779, 781, 787, 790, 792, 795, 797, 803–804, 806
   _run.py47211875%152–159, 302, 321–322, 337–338, 353, 381–383, 408, 411–414, 416–417, 420–426, 429–431, 434, 450–452, 459, 465–467, 471, 476–478, 480–481, 497–501, 512, 525, 527–528, 544, 565–566, 618–620, 632–634, 832, 843–844, 851, 898–900, 917, 923, 931, 933, 969–975, 978–981, 983–992, 995, 1003–1006, 1013, 1016–1017, 1022, 1028–1030, 1034, 1039–1042, 1056–1058
TOTAL16313192188% 

Python Unit Test Overview

Tests Skipped Failures Errors Time
3943 221 💤 0 ❌ 0 🔥 1m 10s ⏱️

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes several AG-UI streaming and snapshot edge-cases around tool calling, approvals, and MCP integration to align emitted events/messages with protocol and provider constraints.

Changes:

  • Emit TEXT_MESSAGE_END when a tool result arrives while a text message context is open.
  • Rebuild MESSAGES_SNAPSHOT so tool calls and assistant text appear as separate assistant messages.
  • Improve robustness around malformed JSON tool arguments and track confirm_changes in snapshots for UI rendering.

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 5 comments.

Show a summary per file
File Description
python/packages/ag-ui/agent_framework_ag_ui/_run.py Updates streaming event emission, snapshot construction, confirm flow tracking, and adds post-processing for approval/tool history.
python/packages/ag-ui/agent_framework_ag_ui/_message_adapters.py Adjusts tool-history sanitization to filter confirm_changes and fixes pending tool-call tracking behavior.
python/packages/ag-ui/tests/test_run.py Adds regression tests for message end balancing and snapshots; introduces a malformed-JSON test case.
python/packages/ag-ui/tests/test_message_hygiene.py Updates/extends hygiene tests around confirm_changes filtering and approval-result conversion.
python/packages/ag-ui/tests/test_message_adapters.py Updates expectations/docs for approval-modified arguments behavior (LLM context vs snapshot payload).

@moonbox3 moonbox3 enabled auto-merge February 4, 2026 06:54
@moonbox3 moonbox3 added this pull request to the merge queue Feb 5, 2026
Merged via the queue into microsoft:main with commit 4e25917 Feb 5, 2026
23 checks passed
@github-project-automation github-project-automation bot moved this from In Review to Done in Agent Framework Feb 5, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

Status: Done

4 participants