Python: fix(python): Handle thread.message.completed event in Assistants API streaming #4333
Previously, `thread.message.completed` events fell through to the catch-all `else` branch and yielded empty `ChatResponseUpdate` objects, silently discarding fully-resolved annotation data (file citations, file paths, and their character-offset regions).

This commit adds a dedicated handler for `thread.message.completed` that:

- Walks the completed `ThreadMessage.content` array
- Extracts text blocks with their fully-resolved annotations
- Maps `FileCitationAnnotation` and `FilePathAnnotation` to the framework's `Annotation` type with proper `TextSpanRegion` data
- Yields a `ChatResponseUpdate` containing the complete text and annotations

Fixes microsoft#4322
Tests cover:

- File citation annotation extraction
- File path annotation extraction
- Multiple annotations on a single text block
- Text-only messages (no annotations)
- Non-text blocks are skipped
- Mixed content blocks (text + image)
- Conversation ID propagation
Pull request overview
This PR fixes a bug where thread.message.completed streaming events from the OpenAI/Azure Assistants API were falling through to a catch-all handler, yielding empty ChatResponseUpdate objects and discarding fully-resolved annotation metadata (file citations with IDs, quotes, and character offsets). The fix adds a dedicated event handler that extracts text content and annotations from completed messages, enabling proper citation rendering for users of the Assistants API with file_search or code_interpreter tools.
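The fall-through described above can be illustrated with a toy dispatcher. This is a minimal sketch, not the code in `_assistants_client.py`: `Update`, `process_event`, and the `handle_completed` flag are hypothetical stand-ins, and the event payload is simplified to a plain dict.

```python
from dataclasses import dataclass, field


@dataclass
class Update:
    """Hypothetical stand-in for ChatResponseUpdate."""
    text: str = ""
    annotations: list = field(default_factory=list)


def process_event(event_name: str, data: dict, handle_completed: bool) -> Update:
    """Toy dispatcher with the same if/elif/else shape as the description."""
    if event_name == "thread.message.delta":
        return Update(text=data.get("text", ""))
    elif handle_completed and event_name == "thread.message.completed":
        # Dedicated branch: keep the resolved text and its annotations.
        return Update(
            text=data.get("text", ""),
            annotations=list(data.get("annotations", [])),
        )
    else:
        # Catch-all: the payload is ignored and an empty update is produced.
        return Update()


completed = {"text": "See report [1].", "annotations": [{"file_id": "file-abc"}]}
before = process_event("thread.message.completed", completed, handle_completed=False)
after = process_event("thread.message.completed", completed, handle_completed=True)
```

With `handle_completed=False` the completed event lands in the catch-all and its annotations are lost; with the dedicated branch they survive.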
Changes:
- Added handler for `thread.message.completed` event in `_process_stream_events` to extract fully-resolved annotation data
- Added comprehensive test suite covering annotation extraction, edge cases, and conversation ID propagation
- Added imports for `FileCitationAnnotation`, `FilePathAnnotation`, `ThreadMessage`, `Annotation`, and `TextSpanRegion` types
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `python/packages/core/agent_framework/openai/_assistants_client.py` | Added imports and new `elif` branch in `_process_stream_events` to handle `thread.message.completed` events, extracting text content and mapping `FileCitationAnnotation` and `FilePathAnnotation` to framework `Annotation` objects with `TextSpanRegion` data |
| `python/packages/core/tests/openai/test_assistants_message_completed.py` | New test file with 7 test cases covering file citation extraction, file path extraction, multiple annotations, text-only messages, non-text block skipping, mixed content blocks, and conversation ID propagation |

    if isinstance(annotation, FileCitationAnnotation):
        ann: Annotation = Annotation(
            type="citation",
            additional_properties={
                "text": annotation.text,
            },
            raw_representation=annotation,
        )
        if annotation.file_citation and annotation.file_citation.file_id:
            ann["file_id"] = annotation.file_citation.file_id
        if annotation.start_index is not None and annotation.end_index is not None:
            ann["annotated_regions"] = [
                TextSpanRegion(
                    type="text_span",
                    start_index=annotation.start_index,
                    end_index=annotation.end_index,
                )
            ]
        text_content.annotations.append(ann)
    elif isinstance(annotation, FilePathAnnotation):
        ann = Annotation(
            type="citation",
            additional_properties={
                "text": annotation.text,
            },
            raw_representation=annotation,
        )
        if annotation.file_path and annotation.file_path.file_id:
            ann["file_id"] = annotation.file_path.file_id
        if annotation.start_index is not None and annotation.end_index is not None:
            ann["annotated_regions"] = [
                TextSpanRegion(
                    type="text_span",
                    start_index=annotation.start_index,
                    end_index=annotation.end_index,
                )
            ]
        text_content.annotations.append(ann)

Missing default case for unrecognized annotation types. The implementation should include an else clause to log unhandled annotation types at debug level, consistent with the pattern established in _responses_client.py lines 1176-1180. This ensures visibility into any future annotation types that may be added by the API without requiring code changes.
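One way the suggested fallback could look is sketched below. The `Fake*` classes are hypothetical stand-ins for the OpenAI annotation types, `map_annotations` is an illustrative helper, and the log message wording is an assumption rather than the text used in `_responses_client.py`.

```python
from __future__ import annotations

import logging
from dataclasses import dataclass

logger = logging.getLogger(__name__)


@dataclass
class FakeCitation:  # hypothetical stand-in for FileCitationAnnotation
    text: str


@dataclass
class FakePath:  # hypothetical stand-in for FilePathAnnotation
    text: str


@dataclass
class FakeUnknown:  # represents an annotation type the API might add later
    text: str


def map_annotations(annotations: list) -> list[dict]:
    """Map known annotation types; log and skip anything unrecognized."""
    mapped = []
    for annotation in annotations:
        if isinstance(annotation, (FakeCitation, FakePath)):
            mapped.append({"type": "citation", "text": annotation.text})
        else:
            # Default case: nothing is raised, but the skip is visible in debug logs.
            logger.debug("Unparsed annotation type: %s", type(annotation).__name__)
    return mapped


result = map_annotations(
    [FakeCitation("[1]"), FakeUnknown("[?]"), FakePath("sandbox:/out.csv")]
)
```

The unknown annotation is dropped from `result` without interrupting the stream, and the debug record names its type.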

    if isinstance(annotation, FileCitationAnnotation):
        ann: Annotation = Annotation(
            type="citation",
            additional_properties={
                "text": annotation.text,
            },
            raw_representation=annotation,
        )
        if annotation.file_citation and annotation.file_citation.file_id:
            ann["file_id"] = annotation.file_citation.file_id
        if annotation.start_index is not None and annotation.end_index is not None:
            ann["annotated_regions"] = [
                TextSpanRegion(
                    type="text_span",
                    start_index=annotation.start_index,
                    end_index=annotation.end_index,
                )
            ]
        text_content.annotations.append(ann)

The quote field from annotation.file_citation.quote should be included in additional_properties to preserve complete citation metadata. The issue description (#4322) explicitly mentions that completed message annotations contain "quote text" that needs to be captured. This field contains the exact text snippet from the source file that was cited, which is valuable for rendering proper citations.
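A minimal sketch of how the quote could be carried into `additional_properties` only when it is present. `FakeFileCitation`, `FakeCitationAnnotation`, and `citation_properties` are hypothetical names used for illustration; the real annotation types come from the OpenAI SDK and may differ between SDK versions.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional


@dataclass
class FakeFileCitation:  # hypothetical stand-in for annotation.file_citation
    file_id: str
    quote: Optional[str] = None


@dataclass
class FakeCitationAnnotation:  # hypothetical stand-in for FileCitationAnnotation
    text: str
    file_citation: Optional[FakeFileCitation] = None


def citation_properties(annotation: FakeCitationAnnotation) -> dict:
    """Build additional_properties, adding 'quote' only when one exists."""
    props = {"text": annotation.text}
    citation = annotation.file_citation
    if citation is not None and citation.quote is not None:
        props["quote"] = citation.quote
    return props


with_quote = citation_properties(
    FakeCitationAnnotation("[1]", FakeFileCitation("file-abc", quote="Revenue grew 12%"))
)
without_quote = citation_properties(
    FakeCitationAnnotation("[2]", FakeFileCitation("file-def"))
)
```

Omitting the key when the quote is `None` avoids handing consumers a placeholder value they would have to filter out.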
…notations

- Include `quote` from `annotation.file_citation.quote` in `additional_properties` for `FileCitationAnnotation`, preserving the exact cited text snippet from the source file
- Add `else` clause to log unrecognized annotation types at debug level, consistent with the pattern in `_responses_client.py`
- Add `import logging` and module-level logger
- `test_message_completed_with_file_citation_quote`: verifies quote is included in `additional_properties`
- `test_message_completed_with_file_citation_no_quote`: verifies quote is omitted when `None`
- `test_message_completed_unrecognized_annotation_logged`: verifies unknown annotation types are logged at debug level and skipped
Thanks @copilot for the review! Both suggestions were great catches. I've addressed them in the latest commits:

1. Quote field (
2. Unrecognized annotation fallback (
3. Test coverage (
Summary
Fixes #4322
Previously, `thread.message.completed` streaming events fell through to the catch-all `else` branch in `_process_stream_events`, yielding empty `ChatResponseUpdate` objects. This silently discarded fully-resolved annotation data: file citations with IDs, quotes, and character-offset regions.

Changes

`_assistants_client.py`

- Imports `FileCitationAnnotation`, `FilePathAnnotation`, `Message as ThreadMessage` from `openai.types.beta.threads`; `Annotation`, `TextSpanRegion` from `_types`
- Adds an `elif` branch for `thread.message.completed` in `_process_stream_events` that:
  - Walks the `ThreadMessage.content` array
  - Extracts text blocks, skipping non-text blocks (e.g. `image_file`)
  - Maps `FileCitationAnnotation` → `Annotation(type="citation")` with `file_id` and `TextSpanRegion`
  - Maps `FilePathAnnotation` → same mapping pattern
  - Yields a `ChatResponseUpdate` with the complete text and annotations

`test_assistants_message_completed.py` (new)

7 test cases covering: file citation extraction, file path extraction, multiple annotations, text-only messages, non-text block skipping, mixed content blocks, and conversation ID propagation.
Impact
Users of the Assistants API with `file_search` or `code_interpreter` will now receive resolved citation annotations in streaming responses, enabling proper citation rendering.
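As an illustration of the citation rendering this enables, a consumer could replace each annotated span with a numbered marker and collect footnotes. This is a sketch under stated assumptions: annotations are modeled as plain dicts with the `file_id` and `annotated_regions` keys seen in the diff, `end_index` is assumed to be exclusive, and `render_citations` is a hypothetical helper that is not part of the framework.

```python
from __future__ import annotations


def render_citations(text: str, annotations: list[dict]) -> tuple[str, list[str]]:
    """Replace each annotated span with [n] and return matching footnotes."""
    # Number annotations in reading order (by start offset).
    ordered = sorted(
        annotations, key=lambda a: a["annotated_regions"][0]["start_index"]
    )
    numbered = list(enumerate(ordered, start=1))
    # Substitute right to left so earlier offsets stay valid after each edit.
    for number, ann in reversed(numbered):
        region = ann["annotated_regions"][0]
        # Assumption: end_index is exclusive, like a Python slice bound.
        text = text[: region["start_index"]] + f"[{number}]" + text[region["end_index"]:]
    footnotes = [f"[{number}] {ann['file_id']}" for number, ann in numbered]
    return text, footnotes


message = "Revenue grew 12% [src] in Q3."
start = message.index("[src]")
annotation = {
    "type": "citation",
    "file_id": "file-abc",
    "annotated_regions": [
        {"type": "text_span", "start_index": start, "end_index": start + len("[src]")}
    ],
}
rendered, footnotes = render_citations(message, [annotation])
```

Without the offsets from the completed message, a renderer would have to search for the placeholder text instead of slicing at known positions.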