✨ Add OpenTelemetry GenAI auto-instrumentation for MLflow compatibility#114
pdettori merged 27 commits into kagenti:main on Feb 12, 2026
Conversation
This adds opentelemetry-instrumentation-openai to emit spans with gen_ai.* attributes alongside the existing OpenInference spans. The OTEL Collector transform processor converts GenAI semantic convention spans to OpenInference format before sending to MLflow, enabling trace visibility in both Phoenix and MLflow observability backends.
Changes:
- pyproject.toml: Add opentelemetry-instrumentation-openai dependency
- agent.py: Instrument OpenAI calls with GenAI semantic conventions
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Adds instrumentation to propagate A2A context_id, task_id, and user input to OTEL spans. This enables:
- Session/conversation grouping in MLflow UI
- Trace filtering by context_id
- Request/response visibility in trace details
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Replace custom a2a.* attributes with standard GenAI semantic conventions:
- gen_ai.conversation.id (for session tracking in MLflow)
- gen_ai.agent.name, gen_ai.agent.id
- gen_ai.request.model, gen_ai.system
- gen_ai.prompt, gen_ai.completion
This allows MLflow to automatically parse and display session/trace info.
See: https://opentelemetry.io/docs/specs/semconv/gen-ai/
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
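The attribute set above can be sketched as a plain dictionary. This is an illustrative helper, not the PR's actual code; the attribute keys follow the OTel GenAI semantic conventions, but the function name and arguments are assumptions.

```python
# Illustrative builder for the GenAI semantic-convention attributes
# described in this commit; keys follow the OTel GenAI spec, the
# helper itself is hypothetical.
def build_genai_attributes(context_id, agent_name, agent_id, model,
                           prompt, completion):
    return {
        "gen_ai.conversation.id": context_id,  # session tracking in MLflow
        "gen_ai.agent.name": agent_name,
        "gen_ai.agent.id": agent_id,
        "gen_ai.request.model": model,
        "gen_ai.system": "openai",
        "gen_ai.prompt": prompt,
        "gen_ai.completion": completion,
    }

attrs = build_genai_attributes("ctx-123", "weather-agent", "agent-1",
                               "gpt-4o", "What's the weather?", "Sunny, 22C")
```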
Agents now emit only gen_ai.* attributes. The OTEL Collector transforms these to OpenInference (llm.*) for Phoenix and to MLflow metadata (mlflow.trace.session) for session tracking.
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
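A collector-side mapping like the one described could look roughly like the following transform processor fragment. The OTTL statements are an illustrative sketch, not the PR's exact configuration.

```yaml
# Illustrative OTEL Collector transform processor mapping GenAI
# attributes to OpenInference and MLflow equivalents (assumed config,
# not the PR's actual file).
processors:
  transform/genai_to_openinference:
    trace_statements:
      - context: span
        statements:
          - set(attributes["input.value"], attributes["gen_ai.prompt"]) where attributes["gen_ai.prompt"] != nil
          - set(attributes["output.value"], attributes["gen_ai.completion"]) where attributes["gen_ai.completion"] != nil
          - set(attributes["mlflow.trace.session"], attributes["gen_ai.conversation.id"]) where attributes["gen_ai.conversation.id"] != nil
```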
Add input.value and output.value span attributes for MLflow UI columns. These are set alongside gen_ai.prompt/completion for dual compatibility with Phoenix (OpenInference) and MLflow.
Signed-off-by: Claude <noreply@anthropic.com>
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Make gen_ai.agent.invoke a new root span instead of an A2A child span. This allows the MLflow UI to read Request/Response from the root span.
- Create a new trace context (no parent) for the agent span
- Add a link to the A2A parent span for debugging reference
- Set mlflow.spanInputs/Outputs on the root span directly
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
The previous approach using an empty Context() didn't break the trace chain. Use trace.set_span_in_context(INVALID_SPAN) to force a new trace_id.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
The start_as_current_span approach wasn't breaking the chain. Use start_span() with explicit context management:
- Create an empty Context() for the new trace
- Manually attach/detach the context for child spans
- Properly end the span in a finally block
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Properly break the trace chain by attaching an empty context before creating the span. This ensures the tracer sees no parent span. Also fixes span lifecycle with a proper end() and context detach.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Port observability.py from PR kagenti#105 to provide a clean abstraction for:
- setup_observability() for one-time tracer initialization
- create_agent_span() context manager for AGENT spans
- W3C Trace Context propagation for distributed tracing
Added OpenInference dependencies (semantic-conventions and instrumentation-langchain) to enable LangChain auto-instrumentation with AGENT span semantics for MLflow compatibility.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Replace ~100 lines of manual OTEL context manipulation in agent.py with the new create_agent_span() context manager. Update __init__.py to call setup_observability() for centralized tracer configuration. This simplifies agent code while maintaining:
- LangChain/OpenAI auto-instrumentation
- New root span creation (breaks the A2A trace chain for the MLflow UI)
- GenAI/OpenInference semantic conventions
- W3C trace context propagation
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Change the break_parent_chain default to False so our gen_ai.agent.invoke span becomes a proper child of A2A request spans. This enables:
- Full distributed trace visibility in Phoenix and MLflow
- A proper tree structure with the A2A -> Agent -> LangChain hierarchy
- All spans visible under a single trace instead of as orphaned roots
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Instead of creating a child span, enrich_current_span() adds GenAI semantic conventions directly to the existing A2A span. This:
- Keeps the A2A span as the trace root
- Adds gen_ai.prompt, gen_ai.completion, etc. to the root span
- Avoids extra nesting in the trace tree
- Makes MLflow/Phoenix show GenAI attributes on the root span
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Set mlflow.spanInputs, mlflow.spanOutputs, mlflow.spanType, mlflow.traceName, mlflow.user, and mlflow.source directly in the agent's observability.py. This removes the dependency on the OTEL Collector transform/genai_to_mlflow processor. The transform can be re-enabled as a fallback for agents that don't set these attributes.
Added helpers:
- set_span_output(): Sets gen_ai.completion, output.value, mlflow.spanOutputs
- set_token_usage(): Sets token count attributes
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
When the A2A SDK doesn't have an active span, get_current_span() returns a non-recording span and our attributes would be lost. Now enrich_current_span():
1. Checks if the current span is recording
2. If yes, enriches it with GenAI/MLflow attributes
3. If no, creates a new 'gen_ai.agent.invoke' span
This ensures traces are always captured regardless of A2A SDK state.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
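The enrich-or-create fallback can be sketched in pure Python. StubSpan stands in for an OTEL span, and this enrich_current_span is a simplified illustration under assumed names, not the PR's actual implementation.

```python
# Pure-Python sketch of the enrich-or-create fallback; StubSpan mimics
# the OTEL span interface (is_recording / set_attribute).
class StubSpan:
    def __init__(self, recording=True):
        self.recording = recording
        self.attributes = {}

    def is_recording(self):
        return self.recording

    def set_attribute(self, key, value):
        self.attributes[key] = value

def enrich_current_span(current_span, start_span, attributes):
    """Enrich the active span if it records; otherwise start a fresh one."""
    if current_span.is_recording():
        span = current_span                 # A2A gave us a real span: enrich it
    else:
        span = start_span("gen_ai.agent.invoke")  # non-recording: make our own
    for key, value in attributes.items():
        span.set_attribute(key, value)
    return span

# A non-recording current span forces creation of a new agent span.
span = enrich_current_span(StubSpan(recording=False),
                           lambda name: StubSpan(),
                           {"gen_ai.prompt": "hi"})
```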
Add Starlette middleware that creates a root span BEFORE A2A handlers:
- Our gen_ai.agent.invoke span becomes the true trace root
- A2A spans become children of our span
- Full control over MLflow/GenAI attributes on the root span
Also add static resource attributes for MLflow:
- mlflow.traceName, mlflow.source on the TracerProvider Resource
- gen_ai.agent.name, gen_ai.agent.version, gen_ai.system
The middleware parses the A2A JSON-RPC request/response to set:
- mlflow.spanInputs from the user message
- mlflow.spanOutputs from the agent response
- gen_ai.conversation.id for session tracking
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
The middleware now attaches an empty context before creating the span, ensuring our gen_ai.agent.invoke span is a true root span without inheriting a parent from W3C Trace Context headers.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
The middleware now works correctly: it creates a true root span by breaking the parent chain before creating the gen_ai.agent.invoke span.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Adds span attributes for MLflow trace list columns:
- mlflow.user / enduser.id - from auth header or "anonymous"
- mlflow.traceName - agent name
- mlflow.runName - agent invoke name
- mlflow.source - service name
- mlflow.version - agent version
- mlflow.trace.session / gen_ai.conversation.id - from context_id or message_id
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
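The fallback rules above reduce to a small attribute builder. This is a sketch with assumed names and inputs, not the PR's actual helper.

```python
# Illustrative builder for the MLflow trace-list column attributes,
# including the "anonymous" and message_id fallbacks described above.
def mlflow_trace_attributes(agent_name, service_name, version,
                            auth_user=None, context_id=None, message_id=None):
    user = auth_user or "anonymous"          # auth header or "anonymous"
    session_id = context_id or message_id    # context_id or message_id
    return {
        "mlflow.user": user,
        "enduser.id": user,
        "mlflow.traceName": agent_name,
        "mlflow.source": service_name,
        "mlflow.version": version,
        "mlflow.trace.session": session_id,
        "gen_ai.conversation.id": session_id,
    }

attrs = mlflow_trace_attributes("weather-agent", "weather-service", "1.0",
                                auth_user=None, context_id=None,
                                message_id="msg-42")
```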
When reading the response body to extract output for MLflow attributes, we must always recreate the Response object, since the body_iterator is consumed and cannot be read again.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
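The underlying issue is generic iterator semantics: once a body iterator is drained, the original stream is empty and a replacement must be handed back to the client. A minimal, framework-free illustration (the helper name is hypothetical):

```python
# Why the Response must be rebuilt: an iterator can be consumed only
# once, so after reading the body for span attributes we return a
# fresh iterator over the buffered chunks.
def read_body(body_iterator):
    chunks = list(body_iterator)   # drains the original iterator
    body = b"".join(chunks)        # full body for span attributes
    return body, iter(chunks)      # replacement iterator for the client

stream = iter([b"hello ", b"world"])
body, replacement = read_body(stream)
```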
The tracing middleware cannot capture output from streaming responses. This fix calls set_span_output() in the agent's execute() method after extracting the final answer, which populates:
- mlflow.spanOutputs (for the MLflow Response column)
- output.value (for Phoenix/OpenInference)
- gen_ai.completion (for GenAI semantic conventions)
Uses trace.get_current_span() to get the active span created by the tracing middleware.
Fixes Task kagenti#8: Streaming response output capture.
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The previous approach used trace.get_current_span() in execute() to set mlflow.spanOutputs, but this returned the A2A span, not the middleware-created root span. MLflow reads attributes from the root span only, so the Response column was empty for streaming responses.
Fix:
- Add a ContextVar _root_span_var to store the middleware's root span
- Add a get_root_span() function to retrieve it in agent code
- Update agent.py to use get_root_span() instead of trace.get_current_span()
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
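The ContextVar hand-off can be sketched with the standard library alone; the names mirror the commit message, but the implementation (and FakeSpan) is illustrative, not the PR's code.

```python
# Sketch of the ContextVar hand-off between middleware and agent code.
from contextvars import ContextVar

_root_span_var: ContextVar = ContextVar("root_span", default=None)

def set_root_span(span):
    """Called by the tracing middleware once it creates the root span."""
    return _root_span_var.set(span)

def get_root_span():
    """Called from agent code (e.g. execute()) to reach the true root span."""
    return _root_span_var.get()

class FakeSpan:
    def __init__(self):
        self.attributes = {}
    def set_attribute(self, key, value):
        self.attributes[key] = value

# Middleware stores the root span; agent code later enriches it.
set_root_span(FakeSpan())
get_root_span().set_attribute("mlflow.spanOutputs", "final answer")
```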
Per OpenTelemetry GenAI Agent Spans specification:
- Span name: "invoke_agent {agent_name}" instead of "gen_ai.agent.invoke"
- SpanKind: INTERNAL for in-process agents (not SERVER)
- Add gen_ai.operation.name = "invoke_agent" (required)
- Add gen_ai.provider.name (required)
- Remove duplicate attribute settings
Ref: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-agent-spans/
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
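The spec-conformant naming and required attributes above can be sketched as follows; the helper is hypothetical, while the span name pattern, kind, and required attribute keys come from the cited OTel GenAI agent-spans conventions.

```python
# Illustrative construction of a spec-conformant agent invoke span
# ("invoke_agent {gen_ai.agent.name}", INTERNAL kind, required attrs).
def agent_invoke_span(agent_name, provider):
    return {
        "name": f"invoke_agent {agent_name}",         # per the spec's naming rule
        "kind": "INTERNAL",                           # in-process agent, not SERVER
        "attributes": {
            "gen_ai.operation.name": "invoke_agent",  # required
            "gen_ai.provider.name": provider,         # required
            "gen_ai.agent.name": agent_name,
        },
    }

span = agent_invoke_span("weather-agent", "openai")
```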
ghcr.io/astral-sh/uv requires authentication, which fails on HyperShift clusters without GHCR pull secrets. Switch to python:3.12-slim-bookworm (Docker Hub, no auth needed) and install uv via pip.
Affects: weather_service, weather_tool (E2E test agents)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Summary
This adds opentelemetry-instrumentation-openai to emit spans with gen_ai.* attributes alongside the existing OpenInference spans.
The OTEL Collector transform processor converts GenAI semantic convention spans to OpenInference format before sending to MLflow, enabling trace visibility in both Phoenix and MLflow observability backends.
Changes:
- pyproject.toml: Add opentelemetry-instrumentation-openai dependency
- agent.py: Instrument OpenAI calls with GenAI semantic conventions
Related issue(s)
kagenti/kagenti#569