fix(llmobs): fix input token counting for bedrock prompt caching #13919
base: main
Conversation
Bootstrap import analysis
Comparison of import times between this PR and base.

Summary
The average import time from this PR is: 279 ± 3 ms.
The average import time from base is: 281 ± 3 ms.
The import time difference between this PR and base is: -1.7 ± 0.1 ms.

Import time breakdown
The following import paths have shrunk:

Benchmarks
Benchmark execution time: 2025-07-15 03:56:37
Comparing candidate commit f2db3db in PR branch
Found 0 performance improvements and 2 performance regressions! Performance is the same for 480 metrics, 2 unstable metrics.
scenario:iastaspects-lstrip_aspect
scenario:iastaspects-replace_aspect

releasenotes/notes/fix-bedrock-input-tokens-unified-calculation-b2c3d4e5f6g7h8i9.yaml
Cosmetic nits but LGTM
releasenotes/notes/fix-bedrock-input-tokens-unified-calculation-b2c3d4e5f6g7h8i9.yaml
Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
…n-b2c3d4e5f6g7h8i9.yaml Co-authored-by: Yun Kim <35776586+Yun-Kim@users.noreply.github.com>
When prompt caching is used with Bedrock, the number of input tokens returned in the usage field is the number of non-cached tokens, not the total number of tokens sent to the model (which is what we expect in Datadog).
This PR fixes this by setting input tokens to the total number of input tokens (cache read + cache write + input tokens).
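
As a rough illustration of the calculation described above (not the actual code from this PR), the sum could look like the sketch below. It assumes a Converse-style usage dict with the keys `inputTokens`, `cacheReadInputTokens`, and `cacheWriteInputTokens`; the helper name `total_input_tokens` is hypothetical and only for illustration.

```python
from typing import Any, Mapping


def total_input_tokens(usage: Mapping[str, Any]) -> int:
    """Hypothetical helper: total input tokens including cached ones.

    Assumes `usage` follows the Bedrock Converse-style shape, where
    `inputTokens` counts only the non-cached input tokens.
    """
    # Missing or None values are treated as zero (e.g. no caching in use).
    non_cached = int(usage.get("inputTokens") or 0)
    cache_read = int(usage.get("cacheReadInputTokens") or 0)
    cache_write = int(usage.get("cacheWriteInputTokens") or 0)
    # Total = cache read + cache write + non-cached input tokens.
    return non_cached + cache_read + cache_write


# Example usage payload (values are made up for illustration).
example_usage = {
    "inputTokens": 12,
    "cacheReadInputTokens": 1000,
    "cacheWriteInputTokens": 0,
    "outputTokens": 40,
}
print(total_input_tokens(example_usage))
```

With no caching fields present, the helper simply returns `inputTokens`, so requests that do not use prompt caching should be unaffected under these assumptions.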