
Unable to access rate limit headers due to exceptions #1477

Open
@Enrico2

Description


Confirm this is a Node library issue and not an underlying OpenAI API issue

  • This is an issue with the Node library

Describe the bug

I have encountered an issue when handling rate limits. Specifically, when a 429 error occurs, an exception is thrown, and the exception object does not include rate limit headers such as x-ratelimit-reset-requests. This makes it challenging to implement proper retry logic based on the server’s suggested wait times.

This happens with the responses.create function, both with and without withResponse().
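For reference, a minimal sketch of both call shapes (the model and input here are placeholders, not the original request):

```ts
import OpenAI from 'openai';

const client = new OpenAI();

async function demo() {
  // Without withResponse(): the 429 surfaces as a thrown OpenAI.RateLimitError
  // (a subclass of OpenAI.APIError) that only carries the truncated header set.
  try {
    await client.responses.create({ model: 'gpt-4.1', input: 'hello' });
  } catch (e) {
    if (e instanceof OpenAI.APIError) {
      console.log(e.status);                                  // 429
      console.log(e.headers?.['x-ratelimit-reset-requests']); // undefined
    }
  }

  // With withResponse(): the raw Response (and its headers) is only available
  // when the call succeeds; on a 429 it throws and we land in the same catch.
  try {
    const { response } = await client.responses
      .create({ model: 'gpt-4.1', input: 'hello' })
      .withResponse();
    console.log(response.headers.get('x-ratelimit-reset-requests')); // visible only on success
  } catch (e) {
    // same truncated header set as above
  }
}

demo();
```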

The exception object does contain some headers, but not the rate limit headers, e.g.:

```json
{
  "status": 429,
  "headers": {
    "alt-svc": "h3=\":443\"; ma=86400",
    "cf-cache-status": "DYNAMIC",
    "cf-ray": "[redacted]",
    "connection": "keep-alive",
    "content-length": "354",
    "content-type": "application/json",
    "date": "Mon, 21 Apr 2025 19:13:46 GMT",
    "openai-organization": "[redacted]",
    "openai-processing-ms": "5977",
    "openai-version": "2020-10-01",
    "server": "cloudflare",
    "set-cookie": "[redacted]",
    "strict-transport-security": "max-age=31536000; includeSubDomains; preload",
    "x-content-type-options": "nosniff",
    "x-request-id": "[redacted]"
  },
  "request_id": "[redacted]",
  "error": {
    "message": "Rate limit reached for gpt-4.1 in organization [redacted] on tokens per min (TPM): Limit 30000, Used 18271, Requested 16921. Please try again in 10.384s. Visit https://platform.openai.com/account/rate-limits to learn more.",
    "type": "tokens",
    "param": null,
    "code": "rate_limit_exceeded"
  },
  "code": "rate_limit_exceeded",
  "param": null,
  "type": "tokens"
}
```

Is there a recommended approach to access these headers when a 429 error is thrown? Alternatively, is there a way to prevent the SDK from throwing an exception so that I can inspect the full response, including headers?

Right now my best-effort workaround is to parse the error message, which looks like

"Rate limit reached for gpt-4.1 in organization org-anrH4B5v1lvciEaBycvI6qJ4 on tokens per min (TPM): Limit 30000, Used 18271, Requested 16921. Please try again in 10.384s. Visit https://platform.openai.com/account/rate-limits to learn more"

which is obviously really bad practice.
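Concretely, that parsing looks roughly like this (a fragile sketch; the `ms` variant of the message is an assumption):

```ts
// Hypothetical helper: extract the server's suggested wait time from the
// human-readable error message, e.g. "... Please try again in 10.384s. ..."
function suggestedWaitMs(message: string): number | undefined {
  const match = message.match(/try again in ([\d.]+)\s*(ms|s)\b/i);
  if (!match) return undefined;
  const value = parseFloat(match[1]);
  return Math.round(match[2].toLowerCase() === 'ms' ? value : value * 1000);
}

// suggestedWaitMs("... Please try again in 10.384s. Visit ...") -> 10384
```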

Any guidance on how to handle this scenario effectively would be appreciated.

To Reproduce

  1. Create responses with client.responses.create(req) in a loop, to mimic exhausting the rate limit.
  2. See that an exception e is thrown; print JSON.stringify(e).
  3. See that there is no way to access rate limit headers such as x-ratelimit-reset-requests (see the sketch under Code snippets below).

Code snippets
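(Not the original snippet; a minimal sketch of the steps above with a placeholder model and prompt.)

```ts
import OpenAI from 'openai';

// maxRetries: 0 so the 429 is thrown immediately instead of being retried by the SDK.
const client = new OpenAI({ maxRetries: 0 });

async function main() {
  for (let i = 0; i < 100; i++) {
    try {
      await client.responses.create({
        model: 'gpt-4.1',
        input: 'a reasonably large prompt', // anything that burns through the TPM limit
      });
    } catch (e) {
      console.log(JSON.stringify(e, null, 2)); // the dump shown in the description
      if (e instanceof OpenAI.APIError) {
        // None of the x-ratelimit-* headers are present on the thrown error.
        console.log(e.headers?.['x-ratelimit-reset-requests']); // undefined
      }
      break;
    }
  }
}

main();
```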

OS

macOS (but not limited to it)

Node version

Node v22

Library version

openai v4.95.1
