What happened?
- When using `gemini-2.5-pro` through LiteLLM with context caching enabled, a single request produced the following usage: `prompt_tokens=262,960`, `prompt_tokens_details.cached_tokens=257,955`, `completion_tokens=1,744`.
- LiteLLM reported `response_cost = 0.7642 USD`, but recalculating with Google Vertex pricing (cache miss tokens × $1.25/million + cache hit tokens × $0.625/million + output tokens × $10/million) gives `0.1849 USD`.
- The gap matches charging cache hits twice: once via `text_tokens * input_cost_per_token` and once via `cache_hit_tokens * cache_read_input_token_cost`.
- Reading `litellm/litellm_core_utils/llm_cost_calc/utils.py` shows `_parse_prompt_tokens_details` keeps `text_tokens` equal to the full prompt count, and `_calculate_input_cost` adds both terms, so Gemini cache hits are double-counted (see the sketch below).
- Expected behaviour: cache hit tokens should only be charged at the cache-read rate (after removing them from the normal prompt bucket).
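A minimal sketch of the behaviour described above, using the per-token rates quoted later in this report. The function names only loosely mirror the LiteLLM helpers; this is not the actual implementation in `llm_cost_calc/utils.py`, which handles many more fields.

```python
# Simplified reconstruction of the double-count described above.
# Rates are the Google Vertex prices quoted in this report (per token).
INPUT_COST_PER_TOKEN = 1.25 / 1e6          # cache-miss input rate
CACHE_READ_INPUT_TOKEN_COST = 0.625 / 1e6  # cache-hit (cached) input rate


def buggy_input_cost(prompt_tokens: int, cached_tokens: int) -> float:
    """What the report describes: text_tokens stays at the full prompt count."""
    text_tokens = prompt_tokens  # cached tokens are NOT subtracted here
    return (
        text_tokens * INPUT_COST_PER_TOKEN
        + cached_tokens * CACHE_READ_INPUT_TOKEN_COST  # charged a second time
    )


def expected_input_cost(prompt_tokens: int, cached_tokens: int) -> float:
    """Expected behaviour: cache hits leave the normal prompt bucket."""
    uncached_tokens = prompt_tokens - cached_tokens
    return (
        uncached_tokens * INPUT_COST_PER_TOKEN
        + cached_tokens * CACHE_READ_INPUT_TOKEN_COST
    )


print(buggy_input_cost(262_960, 257_955))     # ~0.4899 (input side only)
print(expected_input_cost(262_960, 257_955))  # ~0.1675 (input side only)
```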
Relevant log output
LiteLLM usage block:
```json
{
  "total_tokens": 264704,
  "prompt_tokens": 262960,
  "completion_tokens": 1744,
  "prompt_tokens_details": {
    "text_tokens": 262960,
    "cached_tokens": 257955
  },
  "completion_tokens_details": {
    "reasoning_tokens": 0
  },
  "response_cost": 0.7642
}
```

Manual recomputation:
- cache miss tokens = 262,960 - 257,955 = 5,005 → 5,005 × 1.25 / 1e6 = 0.0062563
- cache hit tokens = 257,955 × 0.625 / 1e6 = 0.1612219
- output tokens = 1,744 × 10 / 1e6 = 0.01744
- expected total = 0.1849182 USD (verified in the snippet below)
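The same arithmetic as a quick check in plain Python (nothing LiteLLM-specific):

```python
# Recompute the expected charge from the usage block above, using the
# Vertex rates quoted in this report (per million tokens).
prompt_tokens = 262_960
cached_tokens = 257_955
completion_tokens = 1_744

cache_miss_tokens = prompt_tokens - cached_tokens  # 5,005
expected_cost = (
    cache_miss_tokens * 1.25 / 1e6     # 0.0062563
    + cached_tokens * 0.625 / 1e6      # 0.1612219
    + completion_tokens * 10 / 1e6     # 0.01744
)
print(round(expected_cost, 4))  # 0.1849 -- versus the reported response_cost of 0.7642
```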
Are you an ML Ops Team?
No
What LiteLLM version are you on?
v1.77.3
Twitter / LinkedIn details
N/A