Python: [Bug]: thread.message.completed Event Unhandled in Assistants API Streaming #4322

@djw-bsn

Description

The thread.message.completed event, emitted by the OpenAI/Azure Assistants API stream upon completion of each assistant message, is entirely unhandled in agent_framework/openai/_assistants_client.py. This event carries the full, resolved annotation array for a completed message — including file IDs, quote text, and start/end indices — and is the canonical source of complete annotation metadata. Because it is silently discarded, there is no code path in the framework that ever processes complete annotation data for Assistants API responses.

Related issue: #4316 TextDeltaBlock.text.annotations ignored during streaming deltas. The fix for that issue alone may be insufficient when delta annotations arrive empty or partial; this unhandled completed-message event is the underlying reason why.

Background

The Assistants API streaming protocol emits a sequence of server-sent events during a run. Text content is delivered incrementally via thread.message.delta events, each carrying a MessageDeltaEvent with partial TextDeltaBlock objects. When the message is fully assembled, the API emits a thread.message.completed event carrying the complete Message object, including a fully populated .content[].text.annotations array.

The distinction matters because:

  • Delta annotations (thread.message.delta) may be partial, empty, or inconsistently populated depending on the model and API version
  • Completed message annotations (thread.message.completed) are always fully resolved — they contain the definitive file IDs, quote text, and character offsets for every citation in the message
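To make the distinction concrete, here is a minimal sketch of the two event shapes, with payloads written as plain dicts (the real stream yields typed SDK objects; the file ID, offsets, and citation text below are invented for illustration):

```python
# Simplified stand-in for the Assistants API event stream. Real events are
# typed objects from the openai SDK; plain dicts are used here for clarity.
stream = [
    # Delta event: text arrives, but the annotations list may be empty.
    {"event": "thread.message.delta",
     "data": {"delta": {"content": [
         {"type": "text",
          "text": {"value": "See 【4:0†source】", "annotations": []}}]}}},
    # Completed event: the same text with fully resolved annotations.
    {"event": "thread.message.completed",
     "data": {"content": [
         {"type": "text",
          "text": {"value": "See 【4:0†source】",
                   "annotations": [{"type": "file_citation",
                                    "text": "【4:0†source】",
                                    "start_index": 4,
                                    "end_index": 16,
                                    "file_citation": {"file_id": "file-abc123"}}]}}]}},
]

def resolved_annotations(events):
    """Collect annotations only from the completed-message event, where the
    API guarantees they are fully populated."""
    for event in events:
        if event["event"] == "thread.message.completed":
            for block in event["data"]["content"]:
                if block["type"] == "text":
                    yield from block["text"]["annotations"]

annotations = list(resolved_annotations(stream))
```

A consumer that only reads the delta events in this stream sees an empty annotations list; the single resolved citation is only recoverable from the completed event.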

Currently, agent_framework/openai/_assistants_client.py handles thread.message.delta (line 551) but has no handler for thread.message.completed.

Evidence

The full _process_stream_events method in agent_framework/openai/_assistants_client.py (v1.0.0rc2) handles the following events: thread.run.created, thread.run.step.created, thread.message.delta, thread.run.requires_action, and thread.run.completed. The else branch at the end catches everything else:

else:
    yield ChatResponseUpdate(
        contents=[],
        conversation_id=thread_id,
        message_id=response_id,
        raw_representation=response.data,
        response_id=response_id,
        role="assistant",
    )

There is no elif branch for thread.message.completed anywhere in the method. Any thread.message.completed event received from the API falls through to this else branch and yields an empty ChatResponseUpdate(contents=[]). The completed Message object (including its fully resolved .content[].text.annotations array) is silently discarded.

Impact

All users of the Assistants API streaming path (Azure AI Agents and OpenAI Assistants) who use file search or file attachments are affected. Even after a fix to delta-level annotation handling (#4316), citation placeholders such as 【4:0†source】 may still appear unresolved if delta annotations are empty and the completed message event is never processed.

Reproduction

  1. Configure an Azure AI Agent with a file search tool and one or more uploaded knowledge files.
  2. Issue a query that triggers a file search and causes the assistant to produce a response with file citations.
  3. Add a debug log at the else branch of the streaming event loop in _assistants_client.py to confirm thread.message.completed events are being received from the API but falling through unhandled.
  4. Observe that no Content update with resolved annotations is ever emitted after streaming completes.
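For step 3, a hedged sketch of such a diagnostic, written as a standalone helper rather than a patch to the framework (the handled-event set mirrors the list in the Evidence section above):

```python
import logging

logger = logging.getLogger("assistants_stream_debug")

# Events that _process_stream_events handles explicitly in v1.0.0rc2;
# everything else falls through to the else branch.
HANDLED_EVENTS = {
    "thread.run.created",
    "thread.run.step.created",
    "thread.message.delta",
    "thread.run.requires_action",
    "thread.run.completed",
}

def log_if_unhandled(event_name: str) -> bool:
    """Return True (and log a warning) when an event would fall through
    to the empty-update else branch."""
    if event_name in HANDLED_EVENTS:
        return False
    logger.warning(
        "Unhandled stream event falling through to else branch: %s", event_name
    )
    return True
```

Calling this with each event name received from the API confirms that thread.message.completed is among the events reaching the else branch.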

Expected Behaviour

thread.message.completed is handled in the streaming event loop. The completed Message object's .content array is walked, annotations are mapped to Annotation objects, and a final ChatResponseUpdate is yielded containing Content objects with fully resolved annotation metadata. This may either supplement or replace the streamed delta content, depending on the chosen implementation strategy.

Suggested Fix

Add a handler for thread.message.completed in _process_stream_events in openai/_assistants_client.py, applying the same annotation-to-Annotation mapping pattern used in _responses_client.py lines 1109–1158, adapted for the Assistants API's batched ThreadMessage shape and field paths:

elif response.event == "thread.message.completed" and isinstance(response.data, ThreadMessage):
    contents: list[Content] = []
    for block in response.data.content:
        if block.type != "text":
            continue
        text_content = Content.from_text(
            text=block.text.value,
            raw_representation=block,
        )
        if block.text.annotations:
            text_content.annotations = []
            for annotation in block.text.annotations:
                match annotation.type:
                    case "file_citation":
                        text_content.annotations.append(
                            Annotation(
                                type="citation",
                                file_id=annotation.file_citation.file_id,
                                additional_properties={
                                    "text": annotation.text,
                                    "start_index": annotation.start_index,
                                    "end_index": annotation.end_index,
                                    "quote": annotation.file_citation.quote,
                                },
                                raw_representation=annotation,
                            )
                        )
                    case "file_path":
                        text_content.annotations.append(
                            Annotation(
                                type="citation",
                                file_id=annotation.file_path.file_id,
                                additional_properties={
                                    "text": annotation.text,
                                    "start_index": annotation.start_index,
                                    "end_index": annotation.end_index,
                                },
                                raw_representation=annotation,
                            )
                        )
                    case _:
                        logger.debug("Unparsed annotation type in thread.message.completed: %s", annotation.type)
        contents.append(text_content)
    if contents:
        yield ChatResponseUpdate(
            role="assistant",
            contents=contents,
            conversation_id=thread_id,
            message_id=response_id,
            raw_representation=response.data,
            response_id=response_id,
        )

Package Versions

  • agent-framework-core v1.0.0rc2
  • agent-framework-ag-ui v1.0.0b260225
  • agent-framework-azure-ai v1.0.0rc2

Python Version

Python 3.12
