Description
The thread.message.completed event, emitted by the OpenAI/Azure Assistants API stream upon completion of each assistant message, is entirely unhandled in agent_framework/openai/_assistants_client.py. This event carries the full, resolved annotation array for a completed message — including file IDs, quote text, and start/end indices — and is the canonical source of complete annotation metadata. Because it is silently discarded, there is no code path in the framework that ever processes complete annotation data for Assistants API responses.
Related issue: #4316, which covers TextDeltaBlock.text.annotations being ignored during streaming deltas. That fix alone may be insufficient if delta annotations are empty or partial; this bug is the underlying reason why.
Background
The Assistants API streaming protocol emits a sequence of server-sent events during a run. Text content is delivered incrementally via thread.message.delta events, each carrying a MessageDeltaEvent with partial TextDeltaBlock objects. When the message is fully assembled, the API emits a thread.message.completed event carrying the complete Message object, including a fully populated .content[].text.annotations array.
The distinction matters because:
- Delta annotations (thread.message.delta) may be partial, empty, or inconsistently populated depending on the model and API version.
- Completed-message annotations (thread.message.completed) are always fully resolved: they contain the definitive file IDs, quote text, and character offsets for every citation in the message.
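For concreteness, a fully resolved file_citation annotation on a completed message looks roughly like the following (field values are invented for illustration; on some API versions the file_citation object carries only file_id, with quote omitted):

```python
# Illustrative shape of one resolved annotation from a completed message.
# All values here are made up; only the field names follow the API shape.
completed_annotation = {
    "type": "file_citation",
    "text": "【4:0†source】",  # the placeholder substring in the message text
    "start_index": 52,
    "end_index": 64,
    "file_citation": {
        "file_id": "assistant-abc123",
        "quote": "the cited passage",
    },
}
# A delta-level annotation, by contrast, may be an empty list or lack the
# file_citation object entirely, which is why deltas alone cannot be relied on.
```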
Currently, agent_framework/openai/_assistants_client.py handles thread.message.delta (line 551) but has no handler for thread.message.completed.
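The intended division of labor between the two events can be sketched with stand-in objects (collect_message and the SimpleNamespace events are illustrative, not framework code; only the event names come from the streaming protocol):

```python
# Minimal sketch: accumulate text from delta events, but take annotations
# only from the completed-message event, which carries the resolved array.
from types import SimpleNamespace


def collect_message(events):
    text_parts = []
    annotations = []
    for event in events:
        if event.event == "thread.message.delta":
            text_parts.append(event.data.text)
        elif event.event == "thread.message.completed":
            # The completed Message is the canonical source of annotations.
            annotations = list(event.data.annotations)
    return "".join(text_parts), annotations


stream = [
    SimpleNamespace(event="thread.message.delta",
                    data=SimpleNamespace(text="See 【4:0†source】")),
    SimpleNamespace(event="thread.message.completed",
                    data=SimpleNamespace(annotations=[
                        {"file_id": "file-abc", "start_index": 4, "end_index": 16},
                    ])),
]
text, anns = collect_message(stream)
# text holds the streamed content; anns holds the one resolved annotation
```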
Evidence
The full _process_stream_events method in agent_framework/openai/_assistants_client.py (v1.0.0rc2) handles the following events: thread.run.created, thread.run.step.created, thread.message.delta, thread.run.requires_action, and thread.run.completed. The else branch at the end catches everything else:
```python
else:
    yield ChatResponseUpdate(
        contents=[],
        conversation_id=thread_id,
        message_id=response_id,
        raw_representation=response.data,
        response_id=response_id,
        role="assistant",
    )
```

There is no elif branch for thread.message.completed anywhere in the method. Any thread.message.completed event received from the API falls through to this else branch and yields an empty ChatResponseUpdate(contents=[]). The completed Message object, including its fully resolved .content[].text.annotations array, is silently discarded.
Impact
All users of the Assistants API streaming path (Azure AI Agents and OpenAI Assistants) who use file search or file attachments are affected. Even after a fix to delta-level annotation handling (#4316), citation placeholders such as 【4:0†source】 may still appear unresolved if delta annotations are empty and the completed message event is never processed.
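To illustrate why the resolved offsets matter, the following sketch rewrites a placeholder into a readable citation once completed-message annotations are available (resolve_citations is a hypothetical helper, not a framework API):

```python
# Replace citation placeholders using resolved start/end offsets.
def resolve_citations(text, annotations):
    # Walk annotations from the end so earlier offsets stay valid.
    for ann in sorted(annotations, key=lambda a: a["start_index"], reverse=True):
        text = text[: ann["start_index"]] + f"[{ann['file_id']}]" + text[ann["end_index"]:]
    return text


message = "Quarterly revenue grew 12%【4:0†source】."
anns = [{"file_id": "file-abc", "start_index": 26, "end_index": 38}]
print(resolve_citations(message, anns))  # Quarterly revenue grew 12%[file-abc].
```

Without the completed-message annotations, the offsets and file IDs needed for this rewrite are simply never surfaced.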
Reproduction
- Configure an Azure AI Agent with a file search tool and one or more uploaded knowledge files.
- Issue a query that triggers a file search and causes the assistant to produce a response with file citations.
- Add a debug log at the else branch of the streaming event loop in _assistants_client.py to confirm thread.message.completed events are being received from the API but falling through unhandled.
- Observe that no Content update with resolved annotations is ever emitted after streaming completes.
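The debug log in step 3 could look like the following; the handler setup here is only so the snippet is self-contained, and in the framework the module logger would be used directly inside the else branch:

```python
import logging

records: list[str] = []


class ListHandler(logging.Handler):
    """Capture log messages in a list so the demo is observable."""

    def emit(self, record):
        records.append(record.getMessage())


logger = logging.getLogger("assistants_debug_demo")
logger.setLevel(logging.DEBUG)
logger.addHandler(ListHandler())

# Equivalent of: logger.debug("Unhandled stream event: %s", response.event)
logger.debug("Unhandled stream event: %s", "thread.message.completed")
print(records[-1])  # Unhandled stream event: thread.message.completed
```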
Expected Behaviour
thread.message.completed is handled in the streaming event loop. The completed Message object's .content array is walked, annotations are mapped to Annotation objects, and a final ChatResponseUpdate is yielded containing Content objects with fully resolved annotation metadata. This may either supplement or replace the streamed delta content, depending on the chosen implementation strategy.
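The annotation-to-Annotation mapping described above can be exercised in isolation with stand-in objects; a minimal sketch (map_annotation and the SimpleNamespace payload are illustrative, not framework APIs, and Annotation is modeled as a plain dict):

```python
from types import SimpleNamespace


def map_annotation(annotation):
    """Stand-alone version of the file_citation arm of the mapping;
    returns a plain dict where the framework would build an Annotation."""
    if annotation.type != "file_citation":
        return None
    return {
        "type": "citation",
        "file_id": annotation.file_citation.file_id,
        "text": annotation.text,
        "start_index": annotation.start_index,
        "end_index": annotation.end_index,
        "quote": annotation.file_citation.quote,
    }


ann = SimpleNamespace(
    type="file_citation",
    text="【4:0†source】",
    start_index=10,
    end_index=22,
    file_citation=SimpleNamespace(file_id="file-abc", quote="cited text"),
)
mapped = map_annotation(ann)
# mapped carries the resolved file ID, offsets, and quote
```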
Suggested Fix
Add a handler for thread.message.completed in _process_stream_events in openai/_assistants_client.py, applying the same annotation-to-Annotation mapping pattern used in _responses_client.py lines 1109–1158, adapted for the Assistants API's batched ThreadMessage shape and field paths:
```python
elif response.event == "thread.message.completed" and isinstance(response.data, ThreadMessage):
    contents: list[Content] = []
    for block in response.data.content:
        if block.type != "text":
            continue
        text_content = Content.from_text(
            text=block.text.value,
            raw_representation=block,
        )
        if block.text.annotations:
            text_content.annotations = []
            for annotation in block.text.annotations:
                match annotation.type:
                    case "file_citation":
                        text_content.annotations.append(
                            Annotation(
                                type="citation",
                                file_id=annotation.file_citation.file_id,
                                additional_properties={
                                    "text": annotation.text,
                                    "start_index": annotation.start_index,
                                    "end_index": annotation.end_index,
                                    "quote": annotation.file_citation.quote,
                                },
                                raw_representation=annotation,
                            )
                        )
                    case "file_path":
                        text_content.annotations.append(
                            Annotation(
                                type="citation",
                                file_id=annotation.file_path.file_id,
                                additional_properties={
                                    "text": annotation.text,
                                    "start_index": annotation.start_index,
                                    "end_index": annotation.end_index,
                                },
                                raw_representation=annotation,
                            )
                        )
                    case _:
                        logger.debug("Unparsed annotation type in thread.message.completed: %s", annotation.type)
        contents.append(text_content)
    if contents:
        yield ChatResponseUpdate(
            role="assistant",
            contents=contents,
            conversation_id=thread_id,
            message_id=response_id,
            raw_representation=response.data,
            response_id=response_id,
        )
```

Code Sample
Error Messages / Stack Traces
Package Versions
agent-framework-core v1.0.0rc2
agent-framework-ag-ui v1.0.0b260225
agent-framework-azure-ai v1.0.0rc2
Python Version
Python 3.12