fix(prometheus): stop hitting DB on every request for budget_reset_at #25588
ishaan-berri wants to merge 1 commit into main
budget_reset_at (key/team/user) is now propagated through request metadata at auth time instead of being fetched from DB on every request in the prometheus logging path.

- add team_budget_reset_at + user_budget_reset_at fields to UserAPIKeyAuth
- populate them from team/user objects during auth (cached in 60s DualCache)
- write all three budget_reset_at values to request metadata in pre_call_utils
- remove get_key_object / get_team_object / get_user_object DB calls from _assemble_key_object, _assemble_team_object, _assemble_user_object in prometheus.py
- add _parse_budget_reset_at helper to parse ISO strings from metadata

eliminates up to 3 DB queries per request in the prometheus logging worker
Greptile Summary
This PR eliminates up to 3 DB round-trips per request in the Prometheus logging path by propagating
Confidence Score: 4/5

Safe to merge after fixing the stale test assertions that will fail under asyncio_mode=auto.

The production code change is well-structured and correctly propagates budget_reset_at through auth→metadata→prometheus without DB calls. The P1 finding is confined to the test file: several newly-added async tests assert that _assemble_team_object/_assemble_user_object fall back to the DB when max_budget is None, but the PR explicitly removes that DB fallback. With asyncio_mode="auto" these tests will run and fail, blocking CI.

tests/test_litellm/integrations/test_prometheus_user_team_metrics.py: stale DB-fallback assertions at lines 270–395 and 404–525
| Filename | Overview |
|---|---|
| tests/test_litellm/integrations/test_prometheus_user_team_metrics.py | New regression tests for DB-fallback behavior that no longer exists in _assemble_team_object/_assemble_user_object; multiple async test functions will fail when run under asyncio_mode="auto" |
| litellm/integrations/prometheus.py | DB calls for budget_reset_at removed from _assemble_key_object, _assemble_team_object, _assemble_user_object; new _parse_budget_reset_at helper added; _initialize_budget_metrics literal type fixed to include "orgs" |
| litellm/proxy/_types.py | Adds team_budget_reset_at: Optional[datetime] to LiteLLM_VerificationTokenView and user_budget_reset_at: Optional[datetime] to UserAPIKeyAuth; straightforward additions |
| litellm/proxy/auth/user_api_key_auth.py | Propagates team_budget_reset_at from the team object at auth time (line 1457) and user_budget_reset_at from the user object (line 1664); exception fallback path correctly inherits None default |
| litellm/proxy/litellm_pre_call_utils.py | Writes team/user budget_reset_at from UserAPIKeyAuth into request metadata as ISO strings (lines 1172–1180); get_sanitized_user_information_from_key hard-codes both new fields to None (lines 698–699) even though the correct values are on user_api_key_dict |
| litellm/types/utils.py | Adds user_api_key_team_budget_reset_at and user_api_key_user_budget_reset_at Optional[str] fields to StandardLoggingUserAPIKeyMetadata TypedDict |
Sequence Diagram
sequenceDiagram
participant Client
participant AuthLayer as user_api_key_auth.py
participant Cache as DualCache (60s TTL)
participant PreCall as litellm_pre_call_utils.py
participant Prometheus as prometheus.py
Client->>AuthLayer: Request with API key
AuthLayer->>Cache: get_team_object / get_user_object
Cache-->>AuthLayer: team_obj (budget_reset_at), user_obj (budget_reset_at)
AuthLayer->>AuthLayer: valid_token.team_budget_reset_at = team_obj.budget_reset_at
AuthLayer->>AuthLayer: user_budget_reset_at = user_obj.budget_reset_at
AuthLayer-->>PreCall: UserAPIKeyAuth (with team/user budget_reset_at)
PreCall->>PreCall: Write ISO strings to request metadata
Note over PreCall: user_api_key_team_budget_reset_at
PreCall-->>Prometheus: Request metadata
Prometheus->>Prometheus: _parse_budget_reset_at() from metadata
Prometheus->>Prometheus: Emit budget_remaining_hours gauges
Note over Prometheus: NO DB calls for key/team/user
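The last step of the diagram, turning a propagated reset time into a remaining-hours gauge value, could look roughly like the sketch below. The function name `budget_remaining_hours` and the choice to treat naive datetimes as UTC are assumptions made for illustration; the actual gauge code in `prometheus.py` may differ.

```python
from datetime import datetime, timezone
from typing import Optional


def budget_remaining_hours(
    reset_at: Optional[datetime], now: Optional[datetime] = None
) -> Optional[float]:
    """Hours until reset_at, or None when no reset time was propagated.

    Hypothetical helper: not the function used in prometheus.py.
    """
    if reset_at is None:
        return None
    if reset_at.tzinfo is None:
        # Assumption: naive timestamps are interpreted as UTC.
        reset_at = reset_at.replace(tzinfo=timezone.utc)
    current = now if now is not None else datetime.now(timezone.utc)
    return (reset_at - current).total_seconds() / 3600
```

Because the reset time arrives with the request, this computation needs only the metadata value and the current time.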
Comments Outside Diff (1)
tests/test_litellm/integrations/test_prometheus_user_team_metrics.py, lines 270-395: Stale DB-fallback assertions will fail against current code

These tests mock `get_team_object`/`get_user_object` and assert that `_assemble_team_object`/`_assemble_user_object` pull `max_budget` and `budget_reset_at` from the DB when metadata is `None`. But the new implementations (lines 2987–2994 and 3276–3282 of `prometheus.py`) never call those helpers; they build the objects purely from the parameters they receive. With `asyncio_mode = "auto"` in `pyproject.toml` these tests will be executed and will fail:

- `test_assemble_team_object_uses_db_max_budget_when_metadata_is_none` (line 292): asserts `team_object.max_budget == 3000.0`, but the new code passes `max_budget=None` straight through, returning `None`.
- `test_set_team_budget_metrics_after_api_request_no_inf_when_metadata_budget_none` (line 353): asserts remaining budget ≠ `+Inf`, but `_safe_get_remaining_budget(max_budget=None, ...)` returns `float("inf")`.

The same problem applies to the parallel user-budget tests at lines 404–491. The tests need to be updated to verify the new metadata-propagation contract (e.g., confirm that when `budget_reset_at` IS supplied as a parameter it is stored correctly) rather than the removed DB-fallback behaviour.
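A replacement test for the new contract might look something like the following sketch. `TeamBudget`, `assemble_team_object`, and `safe_get_remaining_budget` are hypothetical stand-ins written for this example, not the real `prometheus.py` methods; they only mirror the behaviour described above (parameters passed straight through, `max_budget=None` yielding `float("inf")`).

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass
class TeamBudget:
    """Hypothetical stand-in for the object _assemble_team_object returns."""
    team_id: str
    spend: float
    max_budget: Optional[float]
    budget_reset_at: Optional[datetime]


def assemble_team_object(team_id, spend, max_budget, budget_reset_at) -> TeamBudget:
    # Builds the object purely from its parameters; no DB lookup.
    return TeamBudget(team_id, spend, max_budget, budget_reset_at)


def safe_get_remaining_budget(max_budget: Optional[float], spend: Optional[float]) -> float:
    # With no max budget configured, the remaining budget is unbounded.
    if max_budget is None:
        return float("inf")
    return max_budget - (spend or 0.0)


def test_supplied_reset_at_is_stored():
    reset_at = datetime(2025, 3, 1, tzinfo=timezone.utc)
    team = assemble_team_object("team-1", 10.0, 100.0, reset_at)
    assert team.budget_reset_at == reset_at
    assert safe_get_remaining_budget(team.max_budget, team.spend) == 90.0


def test_none_max_budget_passes_through():
    team = assemble_team_object("team-1", 10.0, None, None)
    assert team.max_budget is None
    assert safe_get_remaining_budget(team.max_budget, team.spend) == float("inf")
```

The second test encodes the behaviour the stale assertions contradict: with no metadata budget, the result is `None` and `+Inf`, not a DB-derived value.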
Reviews (1): Last reviewed commit: "fix(prometheus): eliminate DB calls in p..."
        else None
    ),
    user_api_key_team_budget_reset_at=None,
    user_api_key_user_budget_reset_at=None,
Team/user budget_reset_at always None in get_sanitized_user_information_from_key
Both new fields are hard-coded to None even though the values are already on user_api_key_dict.team_budget_reset_at / user_api_key_dict.user_budget_reset_at. Within add_litellm_data_to_request this is harmless because lines 1172–1180 overwrite them. But the same function is called standalone from add_headers_to_llm_call (line 516), where the override never happens — so those fields will always be None in the LLM-forwarded headers. Populate them directly here the same way user_api_key_budget_reset_at is populated just above.
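One way to populate all three fields in a single place is sketched below. The helper names `to_iso` and `budget_reset_metadata` are assumptions made for illustration; only the three metadata key names come from this PR.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


def to_iso(value: Optional[datetime]) -> Optional[str]:
    """Serialize a datetime to an ISO 8601 string, passing None through."""
    return value.isoformat() if value is not None else None


def budget_reset_metadata(
    key_reset_at: Optional[datetime],
    team_reset_at: Optional[datetime],
    user_reset_at: Optional[datetime],
) -> Dict[str, Optional[str]]:
    # Hypothetical helper: in the PR the values would come from
    # user_api_key_dict rather than being passed in individually.
    return {
        "user_api_key_budget_reset_at": to_iso(key_reset_at),
        "user_api_key_team_budget_reset_at": to_iso(team_reset_at),
        "user_api_key_user_budget_reset_at": to_iso(user_reset_at),
    }
```

Building the values where the sanitized metadata is constructed would keep the standalone call path and the `add_litellm_data_to_request` path consistent, without relying on a later overwrite.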
Relevant issues
Fixes #24875
Pre-Submission checklist
tests/test_litellm/
make test-unit passes
Type
🐛 Bug Fix
Changes
The prometheus logging path was making up to 3 DB calls per request (one each for `get_key_object`, `get_team_object`, and `get_user_object`) just to fetch `budget_reset_at`, a date field that changes at most once per billing period.

With 200 pods this is a serious DB load problem. The fix propagates `budget_reset_at` through request metadata at auth time (where team/user objects are already fetched and cached) so prometheus never needs to hit the DB.

- `UserAPIKeyAuth` gets two new fields: `team_budget_reset_at` and `user_budget_reset_at`
- `litellm_pre_call_utils.py` writes all three `budget_reset_at` values (key + team + user) into request metadata as ISO strings
- `prometheus.py`: `_assemble_key_object`, `_assemble_team_object`, `_assemble_user_object` now read from metadata, with no DB calls
- `_parse_budget_reset_at` helper to parse the ISO strings
- `_initialize_budget_metrics` `data_type` literal now includes `"orgs"`
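A minimal sketch of what a parsing helper of this kind could look like follows. The name `parse_budget_reset_at` (without the leading underscore) marks it as an illustration rather than the PR's actual `_parse_budget_reset_at`, and the tolerant handling of bad input is an assumption.

```python
from datetime import datetime, timezone
from typing import Optional, Union


def parse_budget_reset_at(
    value: Union[str, datetime, None],
) -> Optional[datetime]:
    """Parse an ISO 8601 string from request metadata into a datetime.

    Returns None for missing or unparseable values so that a malformed
    metadata entry cannot break the logging path (assumed behaviour).
    """
    if value is None:
        return None
    if isinstance(value, datetime):
        return value
    try:
        # datetime.fromisoformat rejects a trailing "Z" before Python 3.11,
        # so normalize it to an explicit UTC offset first.
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    except (AttributeError, TypeError, ValueError):
        return None
```

Returning `None` instead of raising means a request with absent or corrupt metadata would simply have no reset time to report.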