fix(prometheus): stop hitting DB on every request for budget_reset_at#25588

Open
ishaan-berri wants to merge 1 commit into main from
litellm_ishaan_fix_prom_metrics_2

@ishaan-berri
Contributor

Relevant issues

Fixes #24875

Pre-Submission checklist

  • Added at least 1 test in tests/test_litellm/
  • make test-unit passes
  • PR scope is isolated to 1 problem

Type

🐛 Bug Fix

Changes

The prometheus logging path was making up to 3 DB calls per request — one each for get_key_object, get_team_object, and get_user_object — just to fetch budget_reset_at, a date field that changes at most once per billing period.

With 200 pods this is a serious DB load problem. The fix propagates budget_reset_at through request metadata at auth time (where team/user objects are already fetched and cached) so prometheus never needs to hit the DB.

  • UserAPIKeyAuth gets two new fields: team_budget_reset_at and user_budget_reset_at
  • Auth path populates these from team/user objects at auth time (they ride in the 60s DualCache with the rest of the token)
  • litellm_pre_call_utils.py writes all three budget_reset_at values (key + team + user) into request metadata as ISO strings
  • prometheus.py: _assemble_key_object, _assemble_team_object, _assemble_user_object now read from metadata — no DB calls
  • Added _parse_budget_reset_at helper to parse the ISO strings
  • Fixed pre-existing type error: _initialize_budget_metrics data_type literal now includes "orgs"
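The PR does not show the body of `_parse_budget_reset_at`, so the following is only a minimal sketch of what such a helper might look like. The function name `parse_budget_reset_at`, the tolerance for a trailing `Z`, and the choice to treat naive timestamps as UTC are all assumptions, not the actual implementation.

```python
from datetime import datetime, timezone
from typing import Optional


def parse_budget_reset_at(value: Optional[str]) -> Optional[datetime]:
    """Hypothetical stand-in for _parse_budget_reset_at: ISO string -> datetime."""
    if not isinstance(value, str) or not value:
        return None
    # Older Python versions reject a trailing "Z", so normalise it first.
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    try:
        parsed = datetime.fromisoformat(value)
    except ValueError:
        # Malformed metadata should not break metric emission.
        return None
    # Assumption: a naive timestamp is interpreted as UTC.
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed
```

Returning `None` for missing or malformed values keeps the logging path tolerant of requests whose metadata predates this change.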

budget_reset_at (key/team/user) is now propagated through request
metadata at auth time instead of being fetched from DB on every request
in the prometheus logging path.

- add team_budget_reset_at + user_budget_reset_at fields to UserAPIKeyAuth
- populate them from team/user objects during auth (cached in 60s DualCache)
- write all three budget_reset_at values to request metadata in pre_call_utils
- remove get_key_object / get_team_object / get_user_object DB calls from
  _assemble_key_object, _assemble_team_object, _assemble_user_object in prometheus.py
- add _parse_budget_reset_at helper to parse ISO strings from metadata

eliminates up to 3 DB queries per request in the prometheus logging worker
@vercel

vercel bot commented Apr 12, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| litellm | Ready | Preview, Comment | Apr 12, 2026 2:07am |

@codspeed-hq
Contributor

codspeed-hq bot commented Apr 12, 2026

Merging this PR will not alter performance

✅ 16 untouched benchmarks


Comparing litellm_ishaan_fix_prom_metrics_2 (287c4ae) with main (1edf41c)

@greptile-apps
Contributor

greptile-apps bot commented Apr 12, 2026

Greptile Summary

This PR eliminates up to 3 DB round-trips per request in the Prometheus logging path by propagating budget_reset_at for key, team, and user through UserAPIKeyAuth at auth time and reading it from request metadata in prometheus.py. The _assemble_team_object, _assemble_user_object, and _assemble_key_object helpers no longer hit the DB at all; a new _parse_budget_reset_at helper handles ISO-string parsing. A pre-existing Literal type error in _initialize_budget_metrics ("orgs" was missing) is also fixed.

  • P1 — test file: Several async tests added in this PR (test_assemble_team_object_uses_db_max_budget_when_metadata_is_none, test_set_team_budget_metrics_after_api_request_no_inf_when_metadata_budget_none, and their user-budget counterparts) assert DB-fallback behaviour that no longer exists in the new code. With asyncio_mode = "auto" set in pyproject.toml these tests will be collected and run — and will fail.
  • P2: get_sanitized_user_information_from_key hard-codes None for the two new fields even though the values are available on the dict; callers outside add_litellm_data_to_request (e.g. add_headers_to_llm_call) will always see None.

Confidence Score: 4/5

Safe to merge after fixing the stale test assertions that will fail under asyncio_mode=auto.

The production code change is well-structured and correctly propagates budget_reset_at through auth→metadata→prometheus without DB calls. The P1 finding is confined to the test file: several newly-added async tests assert that _assemble_team_object/_assemble_user_object fall back to the DB when max_budget is None, but the PR explicitly removes that DB fallback. With asyncio_mode="auto" these tests will run and fail, blocking CI.

tests/test_litellm/integrations/test_prometheus_user_team_metrics.py — stale DB-fallback assertions at lines 270–395 and 404–525

Important Files Changed

| Filename | Overview |
| --- | --- |
| tests/test_litellm/integrations/test_prometheus_user_team_metrics.py | New regression tests for DB-fallback behavior that no longer exists in _assemble_team_object/_assemble_user_object; multiple async test functions will fail when run under asyncio_mode="auto" |
| litellm/integrations/prometheus.py | DB calls for budget_reset_at removed from _assemble_key_object, _assemble_team_object, _assemble_user_object; new _parse_budget_reset_at helper added; _initialize_budget_metrics literal type fixed to include "orgs" |
| litellm/proxy/_types.py | Adds team_budget_reset_at: Optional[datetime] to LiteLLM_VerificationTokenView and user_budget_reset_at: Optional[datetime] to UserAPIKeyAuth; straightforward additions |
| litellm/proxy/auth/user_api_key_auth.py | Propagates team_budget_reset_at from the team object at auth time (line 1457) and user_budget_reset_at from the user object (line 1664); exception fallback path correctly inherits None default |
| litellm/proxy/litellm_pre_call_utils.py | Writes team/user budget_reset_at from UserAPIKeyAuth into request metadata as ISO strings (lines 1172–1180); get_sanitized_user_information_from_key hard-codes both new fields to None (lines 698–699) even though the correct values are on user_api_key_dict |
| litellm/types/utils.py | Adds user_api_key_team_budget_reset_at and user_api_key_user_budget_reset_at Optional[str] fields to StandardLoggingUserAPIKeyMetadata TypedDict |

Sequence Diagram

sequenceDiagram
    participant Client
    participant AuthLayer as user_api_key_auth.py
    participant Cache as DualCache (60s TTL)
    participant PreCall as litellm_pre_call_utils.py
    participant Prometheus as prometheus.py

    Client->>AuthLayer: Request with API key
    AuthLayer->>Cache: get_team_object / get_user_object
    Cache-->>AuthLayer: team_obj (budget_reset_at), user_obj (budget_reset_at)
    AuthLayer->>AuthLayer: valid_token.team_budget_reset_at = team_obj.budget_reset_at
    AuthLayer->>AuthLayer: user_budget_reset_at = user_obj.budget_reset_at
    AuthLayer-->>PreCall: UserAPIKeyAuth (with team/user budget_reset_at)
    PreCall->>PreCall: Write ISO strings to request metadata
    Note over PreCall: user_api_key_team_budget_reset_at
    PreCall-->>Prometheus: Request metadata
    Prometheus->>Prometheus: _parse_budget_reset_at() from metadata
    Prometheus->>Prometheus: Emit budget_remaining_hours gauges
    Note over Prometheus: NO DB calls for key/team/user
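The last step of the diagram, turning a parsed reset time into a `budget_remaining_hours` gauge value, can be illustrated roughly as follows. This is a sketch under stated assumptions: the helper name `remaining_hours` is hypothetical, the timestamps are assumed to be timezone-aware, and whether the real code clamps negative values is not shown in the PR.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def remaining_hours(
    reset_at: Optional[datetime], now: Optional[datetime] = None
) -> Optional[float]:
    """Hypothetical helper: hours left until the budget resets."""
    if reset_at is None:
        # No reset time in metadata, so there is nothing to report.
        return None
    # Assumption: both timestamps are timezone-aware.
    now = now or datetime.now(timezone.utc)
    # May be negative if the reset time has already passed (no clamping here).
    return (reset_at - now).total_seconds() / 3600


# Usage with a fixed clock so the result is deterministic.
fixed_now = datetime(2026, 4, 12, tzinfo=timezone.utc)
hours_left = remaining_hours(fixed_now + timedelta(hours=36), now=fixed_now)
```

Because the reset time now arrives with the request metadata, this calculation needs no database access.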

Comments Outside Diff (1)

  1. tests/test_litellm/integrations/test_prometheus_user_team_metrics.py, line 270-395 (link)

    P1 Stale DB-fallback assertions will fail against current code

    These tests mock get_team_object / get_user_object and assert that _assemble_team_object / _assemble_user_object pull max_budget and budget_reset_at from the DB when metadata is None. But the new implementations (lines 2987–2994 and 3276–3282 of prometheus.py) never call those helpers — they build the objects purely from the parameters they receive. With asyncio_mode = "auto" in pyproject.toml these tests will be executed and will fail:

    • test_assemble_team_object_uses_db_max_budget_when_metadata_is_none (line 292): asserts team_object.max_budget == 3000.0, but new code passes max_budget=None straight through, returning None.
    • test_set_team_budget_metrics_after_api_request_no_inf_when_metadata_budget_none (line 353): asserts remaining budget ≠ +Inf, but _safe_get_remaining_budget(max_budget=None, ...) returns float("inf").

    The same problem applies to the parallel user-budget tests at lines 404–491. The tests need to be updated to verify the new metadata-propagation contract — e.g., confirm that when budget_reset_at IS supplied as a parameter it is stored correctly — rather than the removed DB-fallback behaviour.
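As a rough illustration of what tests for the new contract could assert, the sketch below uses simplified synchronous stand-ins rather than the real async methods. `TeamObject`, `assemble_team_object`, and `safe_get_remaining_budget` are hypothetical substitutes for the proxy's team model, `_assemble_team_object`, and `_safe_get_remaining_budget`; only the two behaviours described above (parameters passed straight through, `float("inf")` when `max_budget` is `None`) are taken from the review.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class TeamObject:  # hypothetical stand-in for the proxy's team model
    team_id: str
    spend: float
    max_budget: Optional[float]
    budget_reset_at: Optional[datetime]


def assemble_team_object(team_id, spend, max_budget, budget_reset_at):
    # Stand-in: built purely from the parameters, with no DB fallback.
    return TeamObject(team_id, spend, max_budget, budget_reset_at)


def safe_get_remaining_budget(max_budget, spend):
    # Stand-in: an unset budget is reported as unlimited.
    if max_budget is None:
        return float("inf")
    return max_budget - (spend or 0.0)


def test_assemble_team_object_keeps_supplied_budget_reset_at():
    reset_at = datetime(2026, 5, 1, tzinfo=timezone.utc)
    team = assemble_team_object("team-1", 10.0, 100.0, reset_at)
    assert team.budget_reset_at == reset_at
    assert safe_get_remaining_budget(team.max_budget, team.spend) == 90.0


def test_assemble_team_object_passes_none_through():
    team = assemble_team_object("team-1", 10.0, None, None)
    assert team.max_budget is None
    assert team.budget_reset_at is None
    assert safe_get_remaining_budget(team.max_budget, team.spend) == float("inf")
```

The second test encodes the behaviour the stale assertions contradict: with no DB fallback, a `None` budget stays `None` and the remaining budget is infinite.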

Reviews (1): Last reviewed commit: "fix(prometheus): eliminate DB calls in p..."

Comment on lines 696 to +699
else None
),
user_api_key_team_budget_reset_at=None,
user_api_key_user_budget_reset_at=None,
P2 Team/user budget_reset_at always None in get_sanitized_user_information_from_key

Both new fields are hard-coded to None even though the values are already on user_api_key_dict.team_budget_reset_at / user_api_key_dict.user_budget_reset_at. Within add_litellm_data_to_request this is harmless because lines 1172–1180 overwrite them. But the same function is called standalone from add_headers_to_llm_call (line 516), where the override never happens — so those fields will always be None in the LLM-forwarded headers. Populate them directly here the same way user_api_key_budget_reset_at is populated just above.
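One way the suggested fix might look is sketched below. `AuthInfo` is a hypothetical stand-in for `UserAPIKeyAuth`, and `budget_reset_metadata` is an illustrative helper rather than the real `get_sanitized_user_information_from_key`; only the three metadata key names and the ISO-string encoding come from the PR.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional


@dataclass
class AuthInfo:  # hypothetical stand-in for UserAPIKeyAuth
    budget_reset_at: Optional[datetime] = None
    team_budget_reset_at: Optional[datetime] = None
    user_budget_reset_at: Optional[datetime] = None


def _iso(value: Optional[datetime]) -> Optional[str]:
    # Metadata carries ISO strings; a missing value stays None.
    return value.isoformat() if value is not None else None


def budget_reset_metadata(auth: AuthInfo) -> Dict[str, Optional[str]]:
    """Populate all three fields from the auth object instead of hard-coding None."""
    return {
        "user_api_key_budget_reset_at": _iso(auth.budget_reset_at),
        "user_api_key_team_budget_reset_at": _iso(auth.team_budget_reset_at),
        "user_api_key_user_budget_reset_at": _iso(auth.user_budget_reset_at),
    }


# Usage: only the team reset time is known for this request.
auth = AuthInfo(team_budget_reset_at=datetime(2026, 5, 1, tzinfo=timezone.utc))
metadata = budget_reset_metadata(auth)
```

Populating the values at the source means standalone callers see the same data as the path that goes through `add_litellm_data_to_request`.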
