[Feature]: Proxy - return OpenAI compatible remaining_requests and remaining_tokens headers #5957
Closed
Description
The Feature
A user was relying on OpenAI's `x-ratelimit-*` response headers to track usage and remaining quota. He was surprised to find that his metric broke after the migration, since LiteLLM is supposed to be a drop-in replacement.
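For context, a minimal sketch of how a client might collect such a quota metric from response headers. The proxy URL and virtual key below are hypothetical placeholders, not values from this issue:

```python
import requests

# Hypothetical proxy URL and virtual key; adjust to your own deployment.
PROXY_URL = "http://localhost:4000/v1/chat/completions"
VIRTUAL_KEY = "sk-1234"

resp = requests.post(
    PROXY_URL,
    headers={"Authorization": f"Bearer {VIRTUAL_KEY}"},
    json={
        "model": "gpt-3.5-turbo",
        "messages": [{"role": "user", "content": "hi"}],
    },
)

# OpenAI returns these headers on every response; this ticket asks the
# LiteLLM proxy to return the same headers so existing metrics keep working.
print("remaining requests:", resp.headers.get("x-ratelimit-remaining-requests"))
print("remaining tokens:", resp.headers.get("x-ratelimit-remaining-tokens"))
```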
Action items for this ticket
- Return `x-ratelimit-*` headers in responses (https://platform.openai.com/docs/guides/rate-limits/usage-tiers)
- If the virtual key / team / user has no rate limit set, return the rate limit from the LiteLLM model group
- If there is just one deployment in a model group, return the headers in OpenAI-compatible format
- If there are multiple deployments in a model group, return the remaining tokens / requests for the whole model group (see the sketch after this list)
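A rough sketch of the fallback logic described above. The function and argument names are hypothetical and do not come from the LiteLLM codebase; this only illustrates the precedence (key/team/user limit first, model-group limit otherwise) and the header names to emit:

```python
from typing import Optional


def resolve_ratelimit_headers(
    key_remaining: Optional[dict],
    model_group_remaining: dict,
    num_deployments: int,
) -> dict:
    """Pick the remaining-requests / remaining-tokens values to expose as headers.

    key_remaining: remaining quota for the virtual key / team / user, if one is
        configured, e.g. {"requests": 90, "tokens": 45_000}; None if no limit set.
    model_group_remaining: aggregated remaining quota for the model group.
    num_deployments: number of deployments behind the model group.
    """
    # Prefer the virtual key / team / user limit; fall back to the model group.
    remaining = key_remaining if key_remaining is not None else model_group_remaining

    # With a single deployment the values map 1:1 to the underlying provider;
    # with multiple deployments they are the aggregate for the whole model group.
    return {
        "x-ratelimit-remaining-requests": str(remaining["requests"]),
        "x-ratelimit-remaining-tokens": str(remaining["tokens"]),
    }
```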
Motivation, pitch
Twitter / LinkedIn details
No response