feat: Adds telemetry and fixes usage metadata for live mode #2325
The first hunk rewrites the span handling in `trace_call_llm`:

```diff
@@ -230,9 +230,31 @@ def trace_call_llm(
     llm_request: The LLM request object.
     llm_response: The LLM response object.
   """
-  span = trace.get_current_span()
-  # Special standard Open Telemetry GenaI attributes that indicate
-  # that this is a span related to a Generative AI system.
+  # For live events with usage metadata, create a new span for each event
+  # For regular events or live events without usage data, use the current span
+  if (
+      hasattr(invocation_context, 'live_request_queue')
+      and invocation_context.live_request_queue
+      and llm_response.usage_metadata is not None
+  ):
+    # Live mode with usage data: create new span for each event
+    span_name = f'llm_call_live_event [{event_id[:8]}]'
+    with tracer.start_as_current_span(span_name) as span:
+      _set_llm_span_attributes(
+          span, invocation_context, event_id, llm_request, llm_response
+      )
+  else:
+    # Regular mode or live mode without usage data: use current span
+    span = trace.get_current_span()
+    _set_llm_span_attributes(
+        span, invocation_context, event_id, llm_request, llm_response
+    )
+
+
+def _set_llm_span_attributes(
+    span, invocation_context, event_id, llm_request, llm_response
+):
+  """Set LLM span attributes."""
   span.set_attribute('gen_ai.system', 'gcp.vertex.agent')
   span.set_attribute('gen_ai.request.model', llm_request.model)
   span.set_attribute(
```

Review thread on the per-event span logic:

**Collaborator:** Hm, what's the reasoning for creating a new span for every live event? Just wondering whether it would cause too much overhead by generating too many spans.

**Author:** The content of each back-and-forth exchange with the live API was overwriting the content of the previous exchange in the span.
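To see why the author moved to one span per live event: OpenTelemetry span attributes are a key-value map, so writing the same key twice keeps only the last value. A minimal sketch using the OpenTelemetry Python SDK (the attribute key and payloads here are made up for illustration, not taken from the PR):

```python
from opentelemetry import trace
from opentelemetry.sdk.trace import TracerProvider
from opentelemetry.sdk.trace.export import ConsoleSpanExporter, SimpleSpanProcessor

# Wire up a tracer that prints finished spans to stdout.
trace.set_tracer_provider(TracerProvider())
trace.get_tracer_provider().add_span_processor(
    SimpleSpanProcessor(ConsoleSpanExporter())
)
tracer = trace.get_tracer('overwrite-demo')

with tracer.start_as_current_span('call_llm') as span:
    # The first live exchange records its payload...
    span.set_attribute('llm_response', '{"turn": 1}')
    # ...and the second silently replaces it under the same key.
    span.set_attribute('llm_response', '{"turn": 2}')
# The exported span shows only '{"turn": 2}'; hence one span per live event.
```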
The second hunk adds `None` checks before recording token counts:

```diff
@@ -271,10 +293,11 @@ def trace_call_llm(
   )

   if llm_response.usage_metadata is not None:
-    span.set_attribute(
-        'gen_ai.usage.input_tokens',
-        llm_response.usage_metadata.prompt_token_count,
-    )
-    span.set_attribute(
-        'gen_ai.usage.output_tokens',
-        llm_response.usage_metadata.candidates_token_count,
+    if llm_response.usage_metadata.prompt_token_count is not None:
+      span.set_attribute(
+          'gen_ai.usage.input_tokens',
+          llm_response.usage_metadata.prompt_token_count,
+      )
+    if llm_response.usage_metadata.candidates_token_count is not None:
+      span.set_attribute(
+          'gen_ai.usage.output_tokens',
```
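The per-field checks above matter because the OpenTelemetry SDK rejects `None` attribute values (it logs a warning and drops the attribute), and live-mode events can carry partial usage metadata. A small sketch of the guarded pattern, with a plain dict standing in for `llm_response.usage_metadata`:

```python
from opentelemetry import trace

tracer = trace.get_tracer('usage-demo')

# Stand-in for usage_metadata where only the prompt count is populated.
usage = {'prompt_token_count': 42, 'candidates_token_count': None}

with tracer.start_as_current_span('llm_call') as span:
    # Guard each field; an unguarded set_attribute with a None value would
    # be dropped with a warning instead of being recorded.
    if usage['prompt_token_count'] is not None:
        span.set_attribute('gen_ai.usage.input_tokens', usage['prompt_token_count'])
    if usage['candidates_token_count'] is not None:
        span.set_attribute('gen_ai.usage.output_tokens', usage['candidates_token_count'])
```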
Review thread on the usage-metadata handling:

**Reviewer:** Will this cause duplicated usage metadata, since we're adding it to all the `LlmResponse`s? For example, in the case that a message contains both `content.parts` and `message.server_content.input_transcription`.

**Author:** I'll check. When I created the PR, that was not the case, but perhaps it has changed since then.
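One way to settle the reviewer's question would be a regression test along these lines. Everything here is hypothetical scaffolding except `usage_metadata`, which appears in the diff above; the scenario to feed it is the reviewer's example of a turn whose message carries both `content.parts` and `server_content.input_transcription`:

```python
def assert_usage_reported_once(responses):
    """Fail if usage metadata is attached to more than one response per turn."""
    tagged = [r for r in responses if r.usage_metadata is not None]
    assert len(tagged) <= 1, (
        f'usage metadata appears on {len(tagged)} responses; summing token '
        'counts downstream would double-count this turn'
    )
```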