✨ Add OpenTelemetry GenAI auto-instrumentation for MLflow compatibility #114

Merged: pdettori merged 27 commits into kagenti:main from Ladas:genai-autoinstrumentation on Feb 12, 2026

Contributor

@Ladas Ladas commented Feb 1, 2026

Summary

This adds opentelemetry-instrumentation-openai to emit spans with gen_ai.* attributes alongside the existing OpenInference spans.

The OTEL Collector transform processor converts GenAI semantic convention spans to OpenInference format before sending to MLflow, enabling trace visibility in both Phoenix and MLflow observability backends.

Changes:

  • pyproject.toml: Add opentelemetry-instrumentation-openai dependency
  • agent.py: Instrument OpenAI calls with GenAI semantic conventions

Related issue(s)

kagenti/kagenti#569

@Ladas Ladas marked this pull request as draft February 1, 2026 07:49
@Ladas Ladas force-pushed the genai-autoinstrumentation branch 2 times, most recently from 990dea6 to 3524675, on February 4, 2026 20:14
@Ladas Ladas force-pushed the genai-autoinstrumentation branch from 9955083 to 54b5c4d on February 6, 2026 20:40
Ladas and others added 25 commits February 11, 2026 15:54
This adds opentelemetry-instrumentation-openai to emit spans with gen_ai.*
attributes alongside the existing OpenInference spans.

The OTEL Collector transform processor converts GenAI semantic convention
spans to OpenInference format before sending to MLflow, enabling trace
visibility in both Phoenix and MLflow observability backends.

Changes:
- pyproject.toml: Add opentelemetry-instrumentation-openai dependency
- agent.py: Instrument OpenAI calls with GenAI semantic conventions

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Adds instrumentation to propagate A2A context_id, task_id, and user input
to OTEL spans. This enables:
- Session/conversation grouping in MLflow UI
- Trace filtering by context_id
- Request/response visibility in trace details

Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Replace custom a2a.* attributes with standard GenAI semantic conventions:
- gen_ai.conversation.id (for session tracking in MLflow)
- gen_ai.agent.name, gen_ai.agent.id
- gen_ai.request.model, gen_ai.system
- gen_ai.prompt, gen_ai.completion

This allows MLflow to automatically parse and display session/trace info.

See: https://opentelemetry.io/docs/specs/semconv/gen-ai/

Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Agents now emit only gen_ai.* attributes. The OTEL Collector
transforms these to OpenInference (llm.*) for Phoenix and
MLflow metadata (mlflow.trace.session) for session tracking.

Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
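
The commit above moves attribute translation out of the agent and into the OTEL Collector. The real translation lives in collector configuration, which this page does not show. As a rough illustration of the mapping logic only, here is a minimal Python sketch. The `gen_ai.*` keys and `mlflow.trace.session` are taken from the commit messages; the OpenInference target names (`llm.model_name`, `session.id`) and the helper `translate_attributes` are assumptions made for the example.

```python
from typing import Any, Dict, Tuple

# Hypothetical mapping table: GenAI attribute key -> keys the backends read.
# Target names on the right are assumptions for illustration.
GENAI_TO_BACKEND: Dict[str, Tuple[str, ...]] = {
    "gen_ai.request.model": ("llm.model_name",),
    "gen_ai.prompt": ("input.value",),
    "gen_ai.completion": ("output.value",),
    "gen_ai.conversation.id": ("session.id", "mlflow.trace.session"),
}


def translate_attributes(attrs: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of attrs with backend-specific aliases added.

    Original gen_ai.* keys are kept; existing target keys are not overwritten.
    """
    out = dict(attrs)
    for source_key, target_keys in GENAI_TO_BACKEND.items():
        if source_key in attrs:
            for target_key in target_keys:
                out.setdefault(target_key, attrs[source_key])
    return out
```

Keeping the source keys in place means a backend that already understands `gen_ai.*` still sees them after translation.
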
Add input.value and output.value span attributes for MLflow UI columns.
These are set alongside gen_ai.prompt/completion for dual compatibility
with Phoenix (OpenInference) and MLflow.

Signed-off-by: Claude <noreply@anthropic.com>
Co-authored-by: Claude <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Make gen_ai.agent.invoke a new root span instead of A2A child span.
This allows MLflow UI to read Request/Response from the root span.

- Create new trace context (no parent) for agent span
- Add link to A2A parent span for debugging reference
- Set mlflow.spanInputs/Outputs on root span directly

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
The previous approach using empty Context() didn't break the trace chain.
Use trace.set_span_in_context(INVALID_SPAN) to force a new trace_id.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
The start_as_current_span approach wasn't breaking the chain.
Use start_span() with explicit context management:
- Create empty Context() for new trace
- Manually attach/detach context for child spans
- Properly end span in finally block

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Properly break trace chain by attaching an empty context before
creating the span. This ensures the tracer sees no parent span.

Also fixes span lifecycle with proper end() and context detach.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Port observability.py from PR kagenti#105 to provide clean abstraction for:
- setup_observability() for one-time tracer initialization
- create_agent_span() context manager for AGENT spans
- W3C Trace Context propagation for distributed tracing

Added OpenInference dependencies (semantic-conventions and
instrumentation-langchain) to enable LangChain auto-instrumentation
with AGENT span semantics for MLflow compatibility.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Replace ~100 lines of manual OTEL context manipulation in agent.py
with the new create_agent_span() context manager. Update __init__.py
to call setup_observability() for centralized tracer configuration.

This simplifies agent code while maintaining:
- LangChain/OpenAI auto-instrumentation
- New root span creation (breaks A2A trace chain for MLflow UI)
- GenAI/OpenInference semantic conventions
- W3C trace context propagation

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Change break_parent_chain default to False so our gen_ai.agent.invoke
span becomes a proper child of A2A request spans. This enables:

- Full distributed trace visibility in Phoenix and MLflow
- Proper tree structure with A2A -> Agent -> LangChain hierarchy
- All spans visible under a single trace instead of orphaned roots

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Instead of creating a child span, enrich_current_span() adds GenAI
semantic conventions directly to the existing A2A span. This:

- Keeps the A2A span as the trace root
- Adds gen_ai.prompt, gen_ai.completion, etc. to the root span
- Avoids extra nesting in the trace tree
- Makes MLflow/Phoenix show GenAI attributes on the root span

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Set mlflow.spanInputs, mlflow.spanOutputs, mlflow.spanType,
mlflow.traceName, mlflow.user, mlflow.source directly in the
agent's observability.py.

This removes the dependency on OTEL Collector transform/genai_to_mlflow
processor. The transform can be re-enabled as a fallback for agents
that don't set these attributes.

Added helpers:
- set_span_output(): Sets gen_ai.completion, output.value, mlflow.spanOutputs
- set_token_usage(): Sets token count attributes

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
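
The two helpers named in the commit above could look roughly like the sketch below. It is not the code from observability.py. `StubSpan` is a stand-in that models only `set_attribute()`, so the example runs without the OpenTelemetry SDK. JSON-encoding `mlflow.spanOutputs` and the token attribute names (`gen_ai.usage.input_tokens`, `gen_ai.usage.output_tokens`) are assumptions, since the commit only says "token count attributes".

```python
import json
from typing import Any, Dict


class StubSpan:
    """Stand-in for an OpenTelemetry span; only set_attribute() is modelled."""

    def __init__(self) -> None:
        self.attributes: Dict[str, Any] = {}

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value


def set_span_output(span: Any, output: str) -> None:
    """Write the agent's answer under the three keys listed in the commit."""
    span.set_attribute("gen_ai.completion", output)
    span.set_attribute("output.value", output)
    # Assumption: the MLflow attribute carries a JSON-encoded value.
    span.set_attribute("mlflow.spanOutputs", json.dumps(output))


def set_token_usage(span: Any, input_tokens: int, output_tokens: int) -> None:
    """Record token counts; the attribute names here are assumed."""
    span.set_attribute("gen_ai.usage.input_tokens", input_tokens)
    span.set_attribute("gen_ai.usage.output_tokens", output_tokens)
```

Any object with a `set_attribute(key, value)` method works here, so the same helpers can be called with a real span.
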
When A2A SDK doesn't have an active span, get_current_span() returns
a non-recording span and our attributes would be lost.

Now enrich_current_span():
1. Checks if current span is recording
2. If yes, enriches it with GenAI/MLflow attributes
3. If no, creates a new 'gen_ai.agent.invoke' span

This ensures traces are always captured regardless of A2A SDK state.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
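
The three-step fallback described above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation. The function takes the current span and a span factory as parameters; in real code these would plausibly be `trace.get_current_span()` and a tracer's `start_span`. `StubSpan` is a hypothetical stand-in so the example runs without OpenTelemetry installed.

```python
from typing import Any, Callable, Dict


class StubSpan:
    """Stand-in span exposing the two methods the sketch relies on."""

    def __init__(self, name: str, recording: bool) -> None:
        self.name = name
        self._recording = recording
        self.attributes: Dict[str, Any] = {}

    def is_recording(self) -> bool:
        return self._recording

    def set_attribute(self, key: str, value: Any) -> None:
        self.attributes[key] = value


def enrich_current_span(
    current_span: Any,
    start_span: Callable[[str], Any],
    attributes: Dict[str, Any],
) -> Any:
    """Enrich the active span if it records, else fall back to a new span."""
    if current_span.is_recording():
        span = current_span
    else:
        # No usable active span: create one so the attributes are not lost.
        span = start_span("gen_ai.agent.invoke")
    for key, value in attributes.items():
        span.set_attribute(key, value)
    return span
```

The caller remains responsible for ending a span created by the fallback path; that lifecycle handling is omitted here.
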
Add Starlette middleware that creates a root span BEFORE A2A handlers:
- Our gen_ai.agent.invoke span becomes the true trace root
- A2A spans become children of our span
- Full control over MLflow/GenAI attributes on root span

Also add static resource attributes for MLflow:
- mlflow.traceName, mlflow.source on TracerProvider Resource
- gen_ai.agent.name, gen_ai.agent.version, gen_ai.system

The middleware parses A2A JSON-RPC request/response to set:
- mlflow.spanInputs from user message
- mlflow.spanOutputs from agent response
- gen_ai.conversation.id for session tracking

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
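
To illustrate the request-parsing step the middleware performs, here is a small sketch that pulls the user text and context id out of a JSON-RPC body. The payload shape (`params.message.parts[].text`, `contextId`) is an assumption about the A2A request format, and `extract_user_input` is a hypothetical helper name, not one from the PR.

```python
import json
from typing import Optional, Tuple


def extract_user_input(raw_body: bytes) -> Tuple[Optional[str], Optional[str]]:
    """Return (user_text, context_id), or (None, None) if the body is unusable."""
    try:
        payload = json.loads(raw_body)
    except ValueError:
        # Covers invalid JSON and undecodable bytes.
        return None, None
    params = payload.get("params") if isinstance(payload, dict) else None
    message = params.get("message") if isinstance(params, dict) else None
    if not isinstance(message, dict):
        return None, None
    parts = message.get("parts")
    if not isinstance(parts, list):
        parts = []
    # Join all text parts; non-text parts are skipped.
    texts = [
        part["text"]
        for part in parts
        if isinstance(part, dict) and isinstance(part.get("text"), str)
    ]
    user_text = "\n".join(texts) if texts else None
    return user_text, message.get("contextId")
```

The function never raises on malformed input, which matters in middleware: a tracing helper should not turn a bad request into a server error.
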
The middleware now attaches an empty context before creating the
span, ensuring our gen_ai.agent.invoke span is a true root span
without inheriting parent from W3C Trace Context headers.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
The middleware is now working correctly - creates a true root span
by breaking the parent chain before creating gen_ai.agent.invoke span.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Adds span attributes for MLflow trace list columns:
- mlflow.user / enduser.id - from auth header or "anonymous"
- mlflow.traceName - agent name
- mlflow.runName - agent invoke name
- mlflow.source - service name
- mlflow.version - agent version
- mlflow.trace.session / gen_ai.conversation.id - from context_id or message_id

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
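
The attribute list above, including the "anonymous" default and the context_id-to-message_id fallback, can be expressed as one pure function. This is a sketch: `trace_list_attributes` and its parameters are hypothetical, and how the user is derived from the auth header is not shown in the PR, so the function takes an already-resolved `user`.

```python
from typing import Dict, Optional


def trace_list_attributes(
    agent_name: str,
    run_name: str,
    service_name: str,
    agent_version: str,
    user: Optional[str] = None,
    context_id: Optional[str] = None,
    message_id: Optional[str] = None,
) -> Dict[str, str]:
    """Build the span attributes that feed the MLflow trace list columns."""
    resolved_user = user or "anonymous"
    attrs = {
        "mlflow.user": resolved_user,
        "enduser.id": resolved_user,
        "mlflow.traceName": agent_name,
        "mlflow.runName": run_name,
        "mlflow.source": service_name,
        "mlflow.version": agent_version,
    }
    # Prefer the conversation's context_id; fall back to the message_id.
    session = context_id or message_id
    if session:
        attrs["mlflow.trace.session"] = session
        attrs["gen_ai.conversation.id"] = session
    return attrs
```
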
When reading the response body to extract output for MLflow attributes,
we must always recreate the Response object since the body_iterator
is consumed and cannot be read again.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
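
The reason the Response must be rebuilt is that an async body iterator can be drained only once. The sketch below demonstrates this with a stand-in `StubResponse` dataclass and plain asyncio, so it runs without Starlette; the names `StubResponse`, `read_and_rebuild`, and `demo` are hypothetical. In a real middleware the rebuilt object would be a framework response carrying the same status and headers.

```python
import asyncio
from dataclasses import dataclass, field
from typing import AsyncIterator, Dict, List, Tuple


@dataclass
class StubResponse:
    """Stand-in for a streaming HTTP response."""
    body_iterator: AsyncIterator[bytes]
    status_code: int = 200
    headers: Dict[str, str] = field(default_factory=dict)


async def _chunks(*parts: bytes) -> AsyncIterator[bytes]:
    for part in parts:
        yield part


async def read_and_rebuild(response: StubResponse) -> Tuple[bytes, StubResponse]:
    """Drain the body for inspection, then return a fresh, replayable response."""
    body = b"".join([chunk async for chunk in response.body_iterator])
    rebuilt = StubResponse(
        body_iterator=_chunks(body),
        status_code=response.status_code,
        headers=dict(response.headers),
    )
    return body, rebuilt


async def demo() -> Tuple[bytes, List[bytes], bytes]:
    original = StubResponse(_chunks(b'{"result":', b'"ok"}'))
    body, rebuilt = await read_and_rebuild(original)
    # The original iterator is now exhausted and yields nothing more.
    leftover = [chunk async for chunk in original.body_iterator]
    replay = b"".join([chunk async for chunk in rebuilt.body_iterator])
    return body, leftover, replay
```

Running `asyncio.run(demo())` shows that the original iterator is empty after the first read, while the rebuilt response still delivers the full body.
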
The tracing middleware cannot capture output from streaming responses.
This fix calls set_span_output() in the agent's execute() method after
extracting the final answer, which populates:
- mlflow.spanOutputs (for MLflow Response column)
- output.value (for Phoenix/OpenInference)
- gen_ai.completion (for GenAI semantic conventions)

Uses trace.get_current_span() to get the active span created by the
tracing middleware.

Fixes Task kagenti#8: Streaming response output capture.

Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
The previous approach used trace.get_current_span() in execute() to
set mlflow.spanOutputs, but this returned the A2A span, not the
middleware-created root span. MLflow reads attributes from the root
span only, so the Response column was empty for streaming responses.

Fix:
- Add ContextVar _root_span_var to store middleware's root span
- Add get_root_span() function to retrieve it in agent code
- Update agent.py to use get_root_span() instead of trace.get_current_span()

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
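
A minimal sketch of the ContextVar approach from the fix above. `_root_span_var` and `get_root_span()` are named in the commit; `set_root_span()` and `reset_root_span()` are hypothetical companions added so the example is complete. The stored value is typed as `Any` to keep the sketch independent of the OpenTelemetry SDK.

```python
from contextvars import ContextVar, Token
from typing import Any, Optional

# Holds the middleware-created root span for the current request context.
_root_span_var: ContextVar[Optional[Any]] = ContextVar("root_span", default=None)


def set_root_span(span: Any) -> Token:
    """Called by the middleware right after it creates the root span."""
    return _root_span_var.set(span)


def get_root_span() -> Optional[Any]:
    """Called by agent code; returns None when no middleware span is set."""
    return _root_span_var.get()


def reset_root_span(token: Token) -> None:
    """Called by the middleware when the request finishes."""
    _root_span_var.reset(token)
```

Because each asyncio task runs in its own copy of the context, concurrent requests do not see each other's root span, which a module-level global would not guarantee.
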
Ladas and others added 2 commits February 11, 2026 16:51
Per OpenTelemetry GenAI Agent Spans specification:
- Span name: "invoke_agent {agent_name}" instead of "gen_ai.agent.invoke"
- SpanKind: INTERNAL for in-process agents (not SERVER)
- Add gen_ai.operation.name = "invoke_agent" (required)
- Add gen_ai.provider.name (required)
- Remove duplicate attribute settings

Ref: https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-agent-spans/

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
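
The naming rules in the commit above reduce to a small helper. This is a sketch; `agent_span_spec` is a hypothetical name, and the span kind (INTERNAL) would be passed separately when the span is created, which is omitted here to keep the example dependency-free.

```python
from typing import Dict, Tuple


def agent_span_spec(agent_name: str, provider_name: str) -> Tuple[str, Dict[str, str]]:
    """Return the span name and the attributes the commit lists as required."""
    span_name = f"invoke_agent {agent_name}"
    attributes = {
        "gen_ai.operation.name": "invoke_agent",
        "gen_ai.provider.name": provider_name,
        "gen_ai.agent.name": agent_name,
    }
    return span_name, attributes
```
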
ghcr.io/astral-sh/uv requires authentication which fails on
HyperShift clusters without GHCR pull secrets. Switch to
python:3.12-slim-bookworm (Docker Hub, no auth needed) and
install uv via pip.

Affects: weather_service, weather_tool (E2E test agents)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Signed-off-by: Ladislav Smola <lsmola@redhat.com>
Contributor

@pdettori pdettori left a comment

/lgtm

@pdettori pdettori merged commit 0ced706 into kagenti:main Feb 12, 2026
2 checks passed