Skip to content

fix(proxy/ProviderClient): honour Retry-After on 429 in callProviderWithRetry#5676

Open
Chen17-sq wants to merge 1 commit into
Helicone:mainfrom
Chen17-sq:fix/5672-retry-after-on-429
Open

fix(proxy/ProviderClient): honour Retry-After on 429 in callProviderWithRetry#5676
Chen17-sq wants to merge 1 commit into
Helicone:mainfrom
Chen17-sq:fix/5672-retry-after-on-429

Conversation

@Chen17-sq
Copy link
Copy Markdown

Closes #5672.

Summary

valhalla/jawn's callProviderWithRetry retried 429 responses with the same async-retry exponentialDelay it uses for 5xx, never consulting the provider's Retry-After header. With the defaults that means three retries fire well within most providers' cooldown windows, each one earning another 429 — exactly the loop the original report calls out.

This PR keeps the rest of the retry machinery untouched and adds two pieces:

parseRetryAfter

A small pure helper that accepts either RFC 7231 form:

  • delta-seconds — e.g. Retry-After: 3030_000 ms.
  • HTTP-date — e.g. Retry-After: Wed, 21 Oct 2026 07:28:00 GMT → milliseconds remaining from now.

It returns null for missing, malformed, or already-in-the-past values so the caller can transparently fall back to the existing backoff. The result is also clamped to 60 s so a hostile or misconfigured upstream cannot freeze the proxy for an unbounded amount of time.

Retry loop change

Inside the async-retry callback, when a 429 is received the proxy now sleeps for the parsed Retry-After value (when present) before throwing. async-retry's exponential backoff still applies on top, so the provider-mandated minimum can only ever lengthen the wait, never shorten one of the existing retries.

if (res.status === 429) {
  const retryAfterMs = parseRetryAfter(res.headers.get("retry-after"));
  if (retryAfterMs !== null && retryAfterMs > 0) {
    await sleep(retryAfterMs);
  }
}
throw new Error(`Status code ${res.status}`);

5xx behaviour is unchanged: 5xx responses still throw immediately and rely on the exponential backoff alone.

Scope

  • One file modified: valhalla/jawn/src/lib/proxy/ProviderClient.ts (+50 lines).
  • One new unit-test file: valhalla/jawn/src/lib/proxy/__tests__/ProviderClient.test.ts covering the helper end-to-end (delta-seconds, HTTP-date, negative cases, the 60 s cap).
  • worker/src/lib/clients/ProviderClient.ts has the same retry pattern and would benefit from the same treatment in a follow-up. I intentionally kept this PR narrow to the file the issue cites; happy to spin off the worker patch in a second PR if you'd prefer that shape.

Tested

$ cd valhalla/jawn
$ npx jest src/lib/proxy/__tests__/ProviderClient.test.ts
PASS src/lib/proxy/__tests__/ProviderClient.test.ts
  parseRetryAfter
    ✓ returns null for missing or empty values
    ✓ parses delta-seconds form into milliseconds
    ✓ trims surrounding whitespace before parsing seconds
    ✓ caps very large delta-seconds at 60 seconds
    ✓ rejects non-numeric, non-date strings
    ✓ parses HTTP-date form into milliseconds from `now`
    ✓ returns null for HTTP-date values in the past
    ✓ caps HTTP-date values far in the future at 60 seconds

Tests: 8 passed, 8 total

$ npx tsc --noEmit       # 0 errors

…ithRetry

Closes Helicone#5672.

valhalla/jawn's ProviderClient retried 429 responses with the same exponentialDelay it uses for 5xx, never consulting the Retry-After header. With async-retry's defaults this means three retries fire well within most providers' cooldown windows, each one earning another 429.

Add a small parseRetryAfter helper that accepts the RFC 7231 delta-seconds and HTTP-date forms, returns the wait in milliseconds, and caps the result at 60 s so a hostile or misconfigured provider cannot freeze the proxy for an unbounded amount of time. Past dates and malformed values return null and the caller falls back to its existing backoff.

Inside the retry callback, after detecting a 429, the proxy now sleeps for the parsed Retry-After value (when present) before throwing, so async-retry's exponential backoff applies on top of the provider-mandated minimum rather than racing it.

5xx behaviour is unchanged. The worker variant at worker/src/lib/clients/ProviderClient.ts has the same retry pattern and would benefit from the same treatment in a follow-up; this PR stays narrowly scoped to the file the issue cites.
Copy link
Copy Markdown
Contributor

@greptile-apps greptile-apps Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your free trial has ended. If you'd like to continue receiving code reviews, you can add a payment method here.

@vercel
Copy link
Copy Markdown

vercel Bot commented May 16, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

3 Skipped Deployments
Project Deployment Actions Updated (UTC)
helicone Skipped Skipped May 16, 2026 5:38am
helicone-bifrost Skipped Skipped May 16, 2026 5:38am
helicone-eu Skipped Skipped May 16, 2026 5:38am

Request Review

@vercel vercel Bot temporarily deployed to Preview – helicone May 16, 2026 05:38 Inactive
@vercel vercel Bot temporarily deployed to Preview – helicone-eu May 16, 2026 05:38 Inactive
@vercel vercel Bot temporarily deployed to Preview – helicone-bifrost May 16, 2026 05:38 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

ProviderClient.ts — 429 and 5xx use same exponential backoff, Retry-After never consulted

1 participant