Fix/gemini prompt caching usage feedback #11095

daarko10 · 2025-05-23T15:16:26Z

Title

Fix async callback cache hit reporting and improve token details handling

Relevant issues

Fixes #11058

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

- I have added testing in the tests/litellm/ directory (adding at least 1 test is a hard requirement)
- I have added a screenshot of my new test passing locally
- My PR passes all unit tests on make test-unit
- My PR's scope is as isolated as possible

Type

🐛 Bug Fix
✅ Test

Changes

Async callback cache hit
- Ensure cache_hit flag is set in model_call_details before invoking async success handlers (caching_handler.py)
- Detect and await callable classes with async __call__ in async_log_event, passing through cache_hit to callbacks (custom_logger.py)
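As a rough illustration of the callable-class detection described above, the sketch below shows one way to tell whether a callback needs to be awaited. It is not the code from custom_logger.py: `is_async_callable`, `invoke_callback`, and `RecordingCallback` are hypothetical names, and the only library call relied on is the standard `inspect.iscoroutinefunction`.

```python
import asyncio
import inspect
from typing import Any, Callable, Optional


def is_async_callable(callback: Any) -> bool:
    """True for coroutine functions and for instances whose __call__ is async."""
    if inspect.iscoroutinefunction(callback):
        return True
    # An instance of a class with `async def __call__` is not itself a
    # coroutine function, so its bound __call__ has to be inspected instead.
    call = getattr(callback, "__call__", None)
    return inspect.iscoroutinefunction(call)


async def invoke_callback(callback: Callable[..., Any], **kwargs: Any) -> Any:
    """Call the callback with kwargs, awaiting it when it is asynchronous."""
    if is_async_callable(callback):
        return await callback(**kwargs)
    return callback(**kwargs)


class RecordingCallback:
    """Hypothetical callable class with an async __call__ (not a LiteLLM class)."""

    def __init__(self) -> None:
        self.seen_cache_hit: Optional[bool] = None

    async def __call__(self, **kwargs: Any) -> None:
        # Record the flag so the caller can check that it was passed through.
        self.seen_cache_hit = kwargs.get("cache_hit")


def run_demo() -> Optional[bool]:
    callback = RecordingCallback()
    asyncio.run(invoke_callback(callback, cache_hit=True))
    return callback.seen_cache_hit
```

Checking `__call__` separately matters because `inspect.iscoroutinefunction(instance)` is false for such an instance, so a naive check would call it without awaiting and the callback body would never run.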
Token details handling
- Simplify token-count logic
- Copy chunk.usage into model_response when available
- Switch prompt token detail types from PromptTokensDetails to PromptTokensDetailsWrapper and streamline attribute setting (streaming_chunk_builder_utils.py)
- Introduce extract_cached_tokens helper and use it in _calculate_usage for Gemini integrations, including prompt token detail propagation (vertex_and_google_ai_studio_gemini.py)
- Preserve prompt_tokens_details when converting dict responses to streaming model objects (convert_dict_to_response.py)
- Extend Usage model with explicit prompt_tokens_details and completion_tokens_details fields (types/utils.py)
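To make the cached-token handling more concrete, here is a minimal sketch of what a helper like extract_cached_tokens might do. The actual implementation in vertex_and_google_ai_studio_gemini.py may differ. The sketch assumes the Gemini usage metadata arrives as a plain dict with `promptTokenCount`, `candidatesTokenCount`, and `cachedContentTokenCount` keys, and `build_usage` is a hypothetical stand-in for the usage calculation that returns a plain dict instead of the Usage model.

```python
from typing import Any, Dict, Optional


def extract_cached_tokens(usage_metadata: Optional[Dict[str, Any]]) -> int:
    """Return the cached prompt token count, or 0 when it is missing."""
    if not usage_metadata:
        return 0
    # Assumed key name for the cached-content count in the usage metadata.
    value = usage_metadata.get("cachedContentTokenCount")
    return int(value) if value is not None else 0


def build_usage(usage_metadata: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical usage calculation that propagates prompt token details."""
    prompt_tokens = int(usage_metadata.get("promptTokenCount") or 0)
    completion_tokens = int(usage_metadata.get("candidatesTokenCount") or 0)
    return {
        "prompt_tokens": prompt_tokens,
        "completion_tokens": completion_tokens,
        "total_tokens": prompt_tokens + completion_tokens,
        # Counts default to 0 rather than None so consumers can sum them.
        "prompt_tokens_details": {
            "cached_tokens": extract_cached_tokens(usage_metadata),
        },
    }
```

Keeping the extraction in one small function means every response path (streaming, non-streaming, cached) reads the cached count the same way.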
Testing
- New test utility file for Gemini token detail scenarios (tests/litellm/llms/vertex_ai/gemini/gemini_token_details_test_utils.py)
- Updated existing tests to use helpers
- Verified that:
  1. No more “invalid imports” warnings
  2. Async callbacks receive cache_hit=True on cache hits
  3. Token detail wrappers are correctly populated in all response paths
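The third check in the list above could look roughly like the following test. It is a sketch under stated assumptions, not one of the tests added in this PR: `PromptDetails`, `UsageSketch`, and `usage_from_dict` are hypothetical stand-ins for the token detail wrapper, the Usage model, and the dict-to-response conversion.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class PromptDetails:
    """Hypothetical stand-in for a prompt token details wrapper."""

    cached_tokens: int = 0
    text_tokens: int = 0


@dataclass
class UsageSketch:
    """Hypothetical stand-in for a usage model with explicit detail fields."""

    prompt_tokens: int = 0
    completion_tokens: int = 0
    prompt_tokens_details: Optional[PromptDetails] = None


def usage_from_dict(raw: Dict[str, Any]) -> UsageSketch:
    """Convert a usage dict to an object without dropping prompt_tokens_details."""
    details = raw.get("prompt_tokens_details")
    wrapped = None
    if details:
        wrapped = PromptDetails(
            cached_tokens=details.get("cached_tokens") or 0,
            text_tokens=details.get("text_tokens") or 0,
        )
    return UsageSketch(
        prompt_tokens=raw.get("prompt_tokens") or 0,
        completion_tokens=raw.get("completion_tokens") or 0,
        prompt_tokens_details=wrapped,
    )


def test_details_survive_conversion() -> None:
    usage = usage_from_dict(
        {
            "prompt_tokens": 100,
            "completion_tokens": 20,
            "prompt_tokens_details": {"cached_tokens": 40, "text_tokens": 60},
        }
    )
    assert usage.prompt_tokens_details is not None
    assert usage.prompt_tokens_details.cached_tokens == 40
    assert usage.prompt_tokens_details.text_tokens == 60
```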

vercel · 2025-05-23T15:16:30Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Comments	Updated (UTC)
litellm	✅ Ready (Inspect)	Visit Preview	💬 Add feedback	May 23, 2025 3:17pm

daarko10 · 2025-05-23T15:26:05Z

@krrishdholakia can you have a look at why the test fails? I didn't touch anything related to it, and it passes when I run it locally.

daarko10 added 7 commits May 23, 2025 00:13

Add detailed token usage handling (text, audio, image, cached) and coverage tests

43b9c73

Remove unused imports from Gemini token details unit test.

0d6a368

Handle detailed text token extraction in cached responses for streaming handler.

0945325

Set default token counts to 0 instead of None in streaming handler.

7555ed7

Refactor token details handling with PromptTokensDetailsWrapper, improve test utilities, and streamline usage metadata extraction in Gemini integration.

05c6bb3

Refactor test to use expected token count variables for clarity and maintainability

3c2bd0a

Refactor token detail handling with helper method and enhance Usage model with additional attributes for caching and usage statistics.

a282b33

vercel bot deployed to Preview May 23, 2025 15:17 View deployment
