
[Feature]: Proxy - return OpenAI compatible remaining_requests and remaining_tokens headers  #5957

Closed
@ishaan-jaff

Description

The Feature

A user was relying on these headers to track usage and remaining quota. He was very surprised to find that his metric broke after migrating to the LiteLLM proxy, since LiteLLM is supposed to be a drop-in replacement for the OpenAI API.
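For context, this is roughly how a client typically reads these headers from an OpenAI-compatible endpoint; a minimal sketch, where the base URL, API key, and model name are placeholders rather than values from this issue:

```python
# Minimal sketch: read OpenAI-compatible rate-limit headers from a proxy response.
# base_url, api_key, and model are placeholders, not values from this issue.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:4000", api_key="sk-...")

# with_raw_response exposes the raw HTTP response so headers can be inspected
raw = client.chat.completions.with_raw_response.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "ping"}],
)

remaining_requests = raw.headers.get("x-ratelimit-remaining-requests")
remaining_tokens = raw.headers.get("x-ratelimit-remaining-tokens")
print(remaining_requests, remaining_tokens)

completion = raw.parse()  # the usual ChatCompletion object
```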

Action items for this ticket

  • Return x-ratelimit-* headers in responses: https://platform.openai.com/docs/guides/rate-limits/usage-tiers
  • If the virtual key / team / user has no rate limit set, return the rate limit from the litellm model group
  • If there is just one deployment in a model group -> return the headers in OpenAI-compatible format
  • If there are multiple deployments in a model group, return the remaining tokens / requests for the model group (see the sketch after this list)
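A minimal sketch of the fallback logic described above; the function names and data shapes are assumptions for illustration, not LiteLLM's actual internals:

```python
# Hypothetical sketch of the fallback logic above; names and data shapes are
# assumptions for illustration, not LiteLLM's actual internals.
from typing import Dict, List, Optional, Tuple


def model_group_remaining(deployments: List[Dict]) -> Tuple[int, int]:
    """Sum remaining requests / tokens across all deployments in a model group.

    With a single deployment this reduces to that deployment's own values,
    i.e. the plain OpenAI-compatible case.
    """
    remaining_requests = sum(d.get("remaining_requests", 0) for d in deployments)
    remaining_tokens = sum(d.get("remaining_tokens", 0) for d in deployments)
    return remaining_requests, remaining_tokens


def build_rate_limit_headers(
    key_remaining_requests: Optional[int],
    key_remaining_tokens: Optional[int],
    deployments: List[Dict],
) -> Dict[str, str]:
    """Build OpenAI-compatible x-ratelimit-* response headers.

    If the virtual key / team / user has no rate limit configured,
    fall back to the model group's remaining requests / tokens.
    """
    group_requests, group_tokens = model_group_remaining(deployments)
    remaining_requests = (
        key_remaining_requests if key_remaining_requests is not None else group_requests
    )
    remaining_tokens = (
        key_remaining_tokens if key_remaining_tokens is not None else group_tokens
    )
    return {
        "x-ratelimit-remaining-requests": str(remaining_requests),
        "x-ratelimit-remaining-tokens": str(remaining_tokens),
    }
```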

Motivation, pitch

Twitter / LinkedIn details

No response

Metadata


Labels

enhancement (New feature or request)
