marklysze commented Jan 1, 2026

Why are these changes needed?

This WIP PR introduces OpenTelemetry-based distributed tracing for AG2 multi-agent conversations. It enables observability into agent workflows, LLM calls, tool executions, and human-in-the-loop interactions.

Approach

OpenTelemetry GenAI Semantic Conventions

The implementation follows the OpenTelemetry GenAI Semantic Conventions with AG2-specific extensions. This ensures compatibility with standard observability tools (Grafana, Jaeger, Honeycomb, etc.) while capturing AG2-specific context.

Trace Hierarchy

```
initiate_chats (multi-chat workflow)
  └── conversation (initiate_chat / a_initiate_chat)
        ├── invoke_agent (generate_reply / a_generate_reply)
        │     ├── chat (LLM API call)
        │     ├── execute_tool (execute_function)
        │     ├── execute_code (code execution)
        │     └── speaker_selection (group chat)
        │           └── invoke_agent (internal)
        │                 └── chat (LLM API call)
        └── await_human_input (get_human_input)
```
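The nesting above comes from spans being opened inside the context of their parent. A minimal toy sketch of that mechanism (illustrative only; the real implementation uses the OpenTelemetry SDK, which parents spans via context propagation):

```python
# Toy span tree: each span opened inside another becomes its child.
# This only illustrates the nesting mechanism; it is not AG2 code.
import contextlib

class ToySpanTree:
    def __init__(self):
        self.lines = []
        self._depth = 0

    @contextlib.contextmanager
    def span(self, name):
        # Record the span at the current depth, then descend for children.
        self.lines.append("  " * self._depth + name)
        self._depth += 1
        try:
            yield
        finally:
            self._depth -= 1

tree = ToySpanTree()
with tree.span("conversation"):
    with tree.span("invoke_agent"):
        with tree.span("chat"):
            pass
    with tree.span("await_human_input"):
        pass

print("\n".join(tree.lines))
# conversation
#   invoke_agent
#     chat
#   await_human_input
```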

Instrumentation Points

| Span Type | Operation | What's Traced |
| --- | --- | --- |
| conversation | conversation | initiate_chat, a_initiate_chat, run_chat, a_run_chat |
| agent | invoke_agent | generate_reply, a_generate_reply, remote A2A calls |
| llm | chat | All LLM API calls via OpenAIWrapper.create() |
| tool | execute_tool | execute_function, a_execute_function |
| speaker_selection | speaker_selection | Group chat speaker selection |
| human_input | await_human_input | Human-in-the-loop wait time |
| code_execution | execute_code | Code block execution |
| multi_conversation | initiate_chats | Sequential/parallel multi-chat workflows |

Central LLM Instrumentation

All LLM providers (OpenAI, Anthropic, Gemini, Bedrock, Mistral, etc.) are instrumented through a single point: OpenAIWrapper.create(). This captures:

  • Provider and model names
  • Token usage (input/output)
  • Request parameters (temperature, max_tokens, etc.)
  • Response metadata (finish reasons, cost)
  • Optional input/output message capture
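The single-choke-point idea can be sketched as a plain wrapper around a `create()` callable. This is a hedged illustration, not the actual AG2 code: `fake_create` and its response shape are invented here, and the recorded dict stands in for real span attributes.

```python
# Sketch: instrument every provider through one point by wrapping
# a single create() callable. Names and shapes are hypothetical.
def instrument_create(create_fn, record):
    def traced_create(**params):
        response = create_fn(**params)
        # Capture request/usage data as GenAI-convention attributes.
        record.append({
            "gen_ai.request.model": params.get("model"),
            "gen_ai.usage.input_tokens": response["usage"]["prompt_tokens"],
            "gen_ai.usage.output_tokens": response["usage"]["completion_tokens"],
        })
        return response
    return traced_create

def fake_create(**params):
    # Stand-in for a provider call; returns an OpenAI-style usage block.
    return {"usage": {"prompt_tokens": 12, "completion_tokens": 5}}

spans = []
traced = instrument_create(fake_create, spans)
traced(model="gpt-4o-mini", temperature=0.2)
print(spans[0]["gen_ai.usage.input_tokens"])  # 12
```

Because every provider funnels through the same `create()` entry point, one wrapper covers all of them without per-provider instrumentation.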

Distributed Tracing (A2A)

For remote agents using the A2A protocol, trace context is automatically propagated via W3C Trace Context headers, enabling end-to-end traces across service boundaries.
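For reference, the W3C `traceparent` header carries `version-trace_id-parent_id-flags`. A small stdlib-only sketch of building and parsing that header (the helper names are made up for illustration; real propagation would use OpenTelemetry's propagator API):

```python
# Build and parse a W3C Trace Context `traceparent` header.
# Format per the W3C spec: version-trace_id-parent_id-flags.
import re
import secrets

def make_traceparent(trace_id=None, span_id=None, sampled=True):
    trace_id = trace_id or secrets.token_hex(16)  # 32 hex chars
    span_id = span_id or secrets.token_hex(8)     # 16 hex chars
    return f"00-{trace_id}-{span_id}-{'01' if sampled else '00'}"

def parse_traceparent(header):
    m = re.fullmatch(r"(\d{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})", header)
    if not m:
        raise ValueError("malformed traceparent")
    _version, trace_id, span_id, flags = m.groups()
    return {"trace_id": trace_id, "parent_id": span_id, "sampled": flags == "01"}

hdr = make_traceparent(trace_id="a" * 32, span_id="b" * 16)
print(parse_traceparent(hdr))
```

A remote agent receiving this header can continue the same trace, which is what makes the end-to-end view across service boundaries possible.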

Current API (WIP)

```python
from autogen.instrumentation import (
    setup_instrumentation,
    instrument_agent,
    instrument_llm_wrapper,
    instrument_pattern,
    instrument_chats,
    instrument_a2a_server,
)

# 1. Set up the tracer
tracer = setup_instrumentation("my-service", "http://localhost:4317")

# 2. Instrument LLM calls (global, once)
instrument_llm_wrapper(tracer)

# 3. Instrument agents
instrument_agent(my_agent, tracer)

# 4. For group chats, instrument the pattern (auto-instruments all agents)
instrument_pattern(pattern, tracer)

# 5. For multi-chat workflows
instrument_chats(tracer)

# 6. For A2A remote agents
instrument_a2a_server(server, tracer)
```

Standard Attributes (OTEL GenAI)

  • gen_ai.operation.name - Operation type
  • gen_ai.agent.name - Agent name
  • gen_ai.provider.name - LLM provider
  • gen_ai.request.model / gen_ai.response.model
  • gen_ai.usage.input_tokens / gen_ai.usage.output_tokens
  • gen_ai.tool.name, gen_ai.tool.call.id, gen_ai.tool.call.arguments
  • gen_ai.input.messages / gen_ai.output.messages

AG2-Specific Extensions

  • ag2.span.type - Span classification
  • ag2.speaker_selection.candidates / ag2.speaker_selection.selected
  • ag2.human_input.prompt / ag2.human_input.response
  • ag2.code_execution.exit_code / ag2.code_execution.output
  • ag2.chats.count, ag2.chats.mode, ag2.chats.recipients
  • gen_ai.usage.cost - AG2 cost tracking
  • gen_ai.conversation.id / gen_ai.conversation.turns
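As a concrete illustration, a `speaker_selection` span might carry a mix of standard GenAI keys and `ag2.*` extensions. All values below are made up:

```python
# Hypothetical attribute set for one speaker_selection span.
# Keys follow the conventions listed above; values are invented.
speaker_selection_attrs = {
    "gen_ai.operation.name": "speaker_selection",
    "ag2.span.type": "speaker_selection",
    "ag2.speaker_selection.candidates": ["planner", "coder", "critic"],
    "ag2.speaker_selection.selected": "coder",
}

print(speaker_selection_attrs["ag2.speaker_selection.selected"])  # coder
```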

Files

Note: This draft PR contains temporary files under the root tracing folder that you can use to test out the implementation.

| File | Purpose |
| --- | --- |
| autogen/instrumentation.py | Core instrumentation functions |
| autogen/tracing/utils.py | Helper functions (message conversion, attribute extraction) |
| tracing/TRACING.md | Developer documentation |
| tracing/OTEL_GENAI_CONVENTION_AG2.md | Attribute reference |
| tracing/agents/*.py | Example scripts / playground |
| tracing/docker-compose.yaml | Local Tempo + Grafana stack |

Local Testing

```shell
cd tracing
docker-compose up -d                   # Start Tempo + Grafana
python -m tracing.agents.local_agents  # Run example

# View traces at http://localhost:3333 (Grafana)
```

Status

DRAFT: open for feedback! Suggestions on the approach, and in particular on the API design, are welcome.

Tracing examples

(Two screenshots of example traces, captured Jan 2, 2026.)

Related issue number

N/A

Checks


@marklysze marklysze changed the title feat: Instrumentation feat: Tracing and Instrumentation Jan 2, 2026

codecov bot commented Jan 2, 2026

Codecov Report

❌ Patch coverage is 12.24165% with 552 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| autogen/instrumentation.py | 10.98% | 478 Missing ⚠️ |
| autogen/tracing/utils.py | 15.47% | 71 Missing ⚠️ |
| autogen/a2a/server.py | 62.50% | 2 Missing and 1 partial ⚠️ |

| Files with missing lines | Coverage Δ |
| --- | --- |
| autogen/a2a/server.py | 93.22% <62.50%> (-4.90%) ⬇️ |
| autogen/tracing/utils.py | 15.47% <15.47%> (ø) |
| autogen/instrumentation.py | 10.98% <10.98%> (ø) |

... and 20 files with indirect coverage changes

