Skip to content

Python: Add telemetry for server latency tracing #2249

@castlenthesky

Description

@castlenthesky

otel has established semantic conventions for tracing server latency.

Propose setting up latency metrics for the following:

  • gen_ai.server.request.duration
  • gen_ai.server.time_per_output_token
  • gen_ai.server.time_to_first_token

These metrics allow users to diagnose latency and identify opportunities for pipeline enhancements. This feature will also elevate MS Agent Framework for production use-cases instead of a fun prototyping tool.

Metadata

Metadata

Labels

observabilityIssues related to observability or telemetrypython

Projects

Status

No status

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions