
feat: Add OTel Gen AI semantic convention input / output messages#1666

Open
brightsparc wants to merge 27 commits into pydantic:main from brightsparc:julian/semconvs-message

Conversation

@brightsparc
Contributor

Adds OpenTelemetry Gen AI Semantic Convention message attributes including:

gen_ai.system_instructions
gen_ai.input.messages
gen_ai.output.messages

Converts message formats (text, tool_use, tool_result, images) to the standardized semconv message part format.
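
For reference, a minimal sketch of the standardized shape these converters target (field names assume the TypedDicts added in semconv.py and the Gen AI semconv draft; treat the exact definitions as illustrative, not the PR's verbatim code):

```python
from typing import Any, TypedDict


class TextPart(TypedDict):
    type: str  # always 'text'
    content: str


class ToolCallPart(TypedDict):
    type: str  # always 'tool_call'
    id: str
    name: str
    arguments: Any


class ChatMessage(TypedDict):
    role: str  # 'system' | 'user' | 'assistant' | 'tool'
    parts: list[Any]  # TextPart | ToolCallPart | other part types


# A provider-specific user message converted into the semconv shape:
message: ChatMessage = {
    'role': 'user',
    'parts': [{'type': 'text', 'content': 'What is the weather in Paris?'}],
}
```

The gen_ai.input.messages and gen_ai.output.messages attributes then carry JSON lists of such messages.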

brightsparc and others added 14 commits January 24, 2026 12:57
- Add TypedDicts in semconv.py (TextPart, ToolCallPart, ChatMessage, etc.)
- Add convert_anthropic_messages_to_semconv() + _convert_anthropic_content_part()
- Add convert_chat_completions_to_semconv() + helpers
- Add convert_responses_inputs_to_semconv()
- Set gen_ai.input.messages and gen_ai.system_instructions on spans

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add OutputMessage TypedDict in semconv.py
- Add convert_anthropic_response_to_semconv() for Anthropic
- Add convert_openai_response_to_semconv() for OpenAI chat completions
- Add convert_responses_outputs_to_semconv() for OpenAI Responses API
- Set gen_ai.output.messages on spans in on_response and stream state

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Anthropic tests still need manual updates due to inline-snapshot issues.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Combines both branches to add semantic convention attributes for:
- Input messages (gen_ai.input.messages)
- System instructions (gen_ai.system_instructions)
- Output messages (gen_ai.output.messages)

Resolves merge conflicts in:
- semconv.py: Combined InputMessages, SystemInstructions, OutputMessage types
- openai.py: Combined imports from both branches
- anthropic.py: Kept both input and output conversion functions
- test_anthropic.py: Combined expected snapshots with both input and output
- test_anthropic_bedrock.py: Combined expected snapshots

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@codecov

codecov bot commented Jan 27, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.


Contributor

Copilot AI left a comment

Pull request overview

Adds OpenTelemetry Gen AI semantic convention message attributes to LLM provider instrumentation, including conversion of provider-specific message formats into standardized semconv messages/parts structures.

Changes:

  • Introduces typed semconv definitions for message parts/messages and new attribute constants for input/output/system instructions.
  • Updates OpenAI + Anthropic integrations to emit gen_ai.input.messages, gen_ai.output.messages, and gen_ai.system_instructions where applicable.
  • Expands and updates OTel integration tests to validate semconv message conversion across text/tool/image and edge cases.

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 1 comment.

Summary per file:

  • logfire/_internal/integrations/llm_providers/semconv.py: Adds message attribute constants and TypedDict-based semconv message/part type definitions.
  • logfire/_internal/integrations/llm_providers/openai.py: Emits semconv input/output/system instruction attributes and adds conversion helpers for Chat Completions + Responses API.
  • logfire/_internal/integrations/llm_providers/anthropic.py: Emits semconv input/output/system instruction attributes and adds conversion helpers for Anthropic message formats.
  • tests/otel_integrations/test_openai.py: Updates snapshots and adds coverage for semconv conversion across tool calls/images/empty inputs/None finish_reason.
  • tests/otel_integrations/test_anthropic.py: Updates snapshots and adds new tests for missing system, missing content, and list/multi-part content conversion.
  • tests/otel_integrations/test_anthropic_bedrock.py: Updates mocked responses and snapshots to include semconv message attributes and finish reason.


if otel_span is not None and hasattr(otel_span, 'attributes') and otel_span.attributes:
    events_attr = otel_span.attributes.get('events')
    if isinstance(events_attr, list):  # pragma: no cover
        existing_events = cast(list[Any], events_attr)
Copilot AI Jan 28, 2026

In on_response for Response, the current logic tries to read existing events from otel_span.attributes only when it’s already a list. However, LogfireSpan.set_attribute() serializes non-OTel types (like list-of-dicts) to a JSON str, so events_attr will typically be a string here. This causes existing_events to stay empty and overwrites the input-side events, breaking the “keep 'events' for backward compatibility” intent.

Parse the existing events attribute when it’s a JSON string (or read the pre-serialized value from span._otlp_attributes) and append responses_output_events(response) to that list before setting the attribute.

Suggested change

        existing_events = cast(list[Any], events_attr)
    elif isinstance(events_attr, str):
        # LogfireSpan.set_attribute() serializes non-OTel types (like list-of-dicts) to JSON strings.
        # Try to recover the original list from the JSON string if possible.
        with contextlib.suppress(Exception):
            parsed = json.loads(events_attr)
            if isinstance(parsed, list):
                existing_events = cast(list[Any], parsed)
    # If we still don't have existing events, try reading the pre-serialized value from _otlp_attributes.
    if not existing_events:
        otlp_attrs = getattr(span, '_otlp_attributes', None)
        if isinstance(otlp_attrs, dict):
            otlp_events = otlp_attrs.get('events')
            if isinstance(otlp_events, list):
                existing_events = cast(list[Any], otlp_events)

Contributor Author

Made this change and resolved merge conflicts.

brightsparc and others added 6 commits February 3, 2026 12:48
The `events` attribute was serialized to a JSON string by
`prepare_otlp_attribute`, so the `isinstance(events_attr, list)` check
always failed and input events were silently overwritten by output events.
Parse the JSON string back to a list when reading existing events.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
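
The fix described above can be sketched as follows (a hypothetical parse_events_attr helper, not the PR's actual code; it assumes only that the serialized form is a JSON string):

```python
import contextlib
import json
from typing import Any


def parse_events_attr(events_attr: Any) -> list[Any]:
    """Recover the events list whether the attribute is still a list
    or was already serialized to a JSON string."""
    if isinstance(events_attr, list):
        return events_attr
    if isinstance(events_attr, str):
        with contextlib.suppress(json.JSONDecodeError):
            parsed = json.loads(events_attr)
            if isinstance(parsed, list):
                return parsed
    return []  # unexpected type: start fresh rather than crash


# A list that was serialized to JSON round-trips back to a list:
events = [{'event.name': 'gen_ai.user.message'}]
recovered = parse_events_attr(json.dumps(events))
```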
The non-streaming Responses API tests now correctly show both input
and output events in the backward-compat 'events' attribute.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Resolve conflicts between semconv message attributes (INPUT_MESSAGES,
OUTPUT_MESSAGES, SYSTEM_INSTRUCTIONS) and upstream's new semconv
attributes (INPUT_TOKENS, OUTPUT_TOKENS, RESPONSE_FINISH_REASONS,
RESPONSE_ID, RESPONSE_MODEL). Both sets of attributes are now included.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
These paths handle message types (multimodal list content, tool calls
in input messages, tool responses, named messages) that are not
exercised by the existing VCR-based integration tests.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@brightsparc
Contributor Author

@alexmojaki ready for final review from you.

@alexmojaki alexmojaki self-requested a review February 4, 2026 11:22

@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 1 potential issue.

View issue and 5 additional flags in Devin Review.


brightsparc and others added 3 commits February 4, 2026 05:34
OpenAI's input_audio content part nests data under
input_audio.data, not at the top level. Also narrow the type
check to just 'input_audio' since that's the actual OpenAI type.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 1 new potential issue.

View issue and 11 additional flags in Devin Review.



@devin-ai-integration devin-ai-integration bot left a comment


Devin Review found 1 new potential issue.

View issue and 14 additional flags in Devin Review.


Comment on lines +405 to +417
if isinstance(tc, ChatCompletionMessageFunctionToolCall):  # pragma: no cover
    func_args: Any = tc.function.arguments
    if isinstance(func_args, str):
        with contextlib.suppress(json.JSONDecodeError):
            func_args = json.loads(func_args)
    parts.append(
        ToolCallPart(
            type='tool_call',
            id=tc.id,
            name=tc.function.name,
            arguments=func_args,
        )
    )


🔴 Tool calls in OpenAI responses are silently dropped due to incorrect type check

The convert_openai_response_to_semconv function checks if tool calls are instances of ChatCompletionMessageFunctionToolCall, but standard OpenAI chat completion responses return ChatCompletionMessageToolCall objects in message.tool_calls.


Issue Details

At lines 402-417, the code iterates over message.tool_calls and checks:

if isinstance(tc, ChatCompletionMessageFunctionToolCall):  # pragma: no cover

However, ChatCompletionMessageFunctionToolCall is a specialized type used for parsed/structured output scenarios, not the standard ChatCompletionMessageToolCall type that is actually returned in message.tool_calls for regular chat completions with tool use.

Impact

When an OpenAI assistant response includes tool calls, the isinstance check fails silently (the inner block is never executed), and the tool calls are not included in the gen_ai.output.messages semantic convention attribute. This results in incomplete telemetry data where tool call information is lost from the output messages.

Expected vs Actual

  • Expected: Tool calls should be included in the output message parts
  • Actual: Tool calls are silently skipped because the type check fails

Recommendation: Check for ChatCompletionMessageToolCall instead of ChatCompletionMessageFunctionToolCall, or remove the isinstance check entirely since all tool calls in message.tool_calls should have the expected structure with id, function.name, and function.arguments attributes.



Contributor Author

The Devin review is incorrect. ChatCompletionMessageToolCall and ChatCompletionMessageFunctionToolCall are the same class — the SDK aliases them:

>>> from openai.types.chat.chat_completion_message_function_tool_call import ChatCompletionMessageFunctionToolCall
>>> from openai.types.chat import ChatCompletionMessageToolCall
>>> ChatCompletionMessageFunctionToolCall is ChatCompletionMessageToolCall
True

The tool_calls field on ChatCompletionMessage is typed as list[ChatCompletionMessageFunctionToolCall | ChatCompletionMessageCustomToolCall] | None, so the isinstance check exists to filter out the CustomToolCall variant, which is intentional.
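
The filtering pattern at issue, sketched with stand-in dataclasses rather than the real SDK types: isinstance narrows the union-typed list, and the custom-tool variant (which lacks function.name/arguments) is skipped on purpose.

```python
from dataclasses import dataclass


@dataclass
class FunctionToolCall:
    """Stand-in for ChatCompletionMessageFunctionToolCall."""
    id: str
    name: str
    arguments: str


@dataclass
class CustomToolCall:
    """Stand-in for ChatCompletionMessageCustomToolCall."""
    id: str
    input: str


tool_calls = [
    FunctionToolCall(id='call_1', name='get_weather', arguments='{"city": "Paris"}'),
    CustomToolCall(id='call_2', input='raw payload'),
]

# Only the function variant has the fields a semconv ToolCallPart needs.
parts = [
    {'type': 'tool_call', 'id': tc.id, 'name': tc.name, 'arguments': tc.arguments}
    for tc in tool_calls
    if isinstance(tc, FunctionToolCall)
]
```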

