feat(adapters): auto-retry with exponential backoff [closes #118] #292

Merged
EmersonBraun merged 1 commit into main from phase1/adapter-retry on Apr 15, 2026

@EmersonBraun
Owner

Summary

Phase 1 #118. Every adapter now retries transient failures by default — no caller code needed.

Retry policy

| What | Behavior |
| --- | --- |
| HTTP 408, 429, 500, 502, 503, 504 | retried with exponential backoff |
| Network errors (fetch throws) | retried |
| HTTP 4xx (except 408/429) | terminal (bad request / auth, won't get better) |
| AbortError | terminal (caller cancelled) |
| Retry-After header | respected (seconds or HTTP date) |
| Mid-stream errors | not retried (partial output already on the wire) |
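
The status-code rows of the policy above can be read as a small decision function. What follows is a minimal sketch, not the code shipped in `@agentskit/adapters`: the `classify` helper and the `Outcome` type are hypothetical names, and the handling of 5xx codes outside the listed set is an assumption, since this description does not mention them.

```typescript
// Hypothetical helper mirroring the retry policy; not the library's implementation.
type Outcome = 'ok' | 'retry' | 'terminal'

function classify(status: number): Outcome {
  switch (status) {
    case 408: // request timeout
    case 429: // rate limit
    case 500:
    case 502:
    case 503:
    case 504:
      return 'retry'
    default:
      // Other 4xx codes (bad request, auth) will not improve on retry.
      // Assumption: any error status not in the list above is terminal.
      return status >= 400 ? 'terminal' : 'ok'
  }
}
```

Network errors and `AbortError` never reach a function like this because no response exists in those cases; they would be handled where the `fetch` call throws.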

Defaults: 3 attempts, base 500ms × 2^n with full jitter, max 8s per delay.
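
To make the default delay math concrete, here is a minimal sketch of capped exponential backoff with full jitter. The helper name `backoffDelayMs`, the zero-based retry index `n`, and the injectable `random` parameter are assumptions for illustration; only the `baseDelayMs`, `maxDelayMs`, and `jitter` names and their default values come from this description.

```typescript
// Hypothetical helper; parameter names follow the RetryOptions fields shown in this PR.
function backoffDelayMs(
  n: number, // retry index, assumed to start at 0
  baseDelayMs = 500,
  maxDelayMs = 8_000,
  jitter = true,
  random: () => number = Math.random, // injectable so tests are deterministic
): number {
  // Exponential growth, capped at the per-delay maximum.
  const capped = Math.min(baseDelayMs * 2 ** n, maxDelayMs)
  // Full jitter: pick uniformly in [0, capped) rather than sleeping the full delay.
  return jitter ? Math.floor(random() * capped) : capped
}
```

With jitter disabled and the defaults above, successive delays in this sketch are 500, 1000, 2000, 4000, then 8000 ms for every later retry. Full jitter spreads clients that failed at the same moment across the whole interval so they do not retry in lockstep.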

Configuration

import { openai } from '@agentskit/adapters'

const adapter = openai({
  apiKey: KEY,
  model: 'gpt-4o',
  retry: {
    maxAttempts: 5,
    baseDelayMs: 1000,
    maxDelayMs: 30_000,
    jitter: true,
    onRetry: ({ attempt, delayMs, reason }) =>
      console.log(`retry #${attempt} in ${delayMs}ms — ${reason}`),
  },
})

Opt out entirely: retry: { maxAttempts: 1 } (a single attempt, so no retries).
Custom decision logic: retry: { retryOn: ({ response, error }) => ... }.
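
As an illustration of custom decision logic, the sketch below writes a predicate that could be passed as `retryOn`. It assumes the predicate returns a boolean and that `response` and `error` are each optional; this description shows only the two field names, so the `RetryOnInput` type and the `retryOnConflict` name are hypothetical.

```typescript
// Assumed shape of the retryOn argument; only `response` and `error` appear in this PR.
interface RetryOnInput {
  response?: { status: number }
  error?: unknown
}

// Example policy: retry 409 Conflict in addition to 429 and 503, and retry any
// thrown error except a caller-initiated abort.
const retryOnConflict = ({ response, error }: RetryOnInput): boolean => {
  if (error !== undefined) {
    const isAbort =
      typeof error === 'object' &&
      error !== null &&
      (error as { name?: unknown }).name === 'AbortError'
    return !isAbort
  }
  const status = response?.status
  return status === 409 || status === 429 || status === 503
}
```

Under those assumptions it would be wired in as `retry: { retryOn: retryOnConflict }`.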

Standalone primitive

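import { fetchWithRetry } from '@agentskit/adapters'

// The caller owns the AbortController; aborting it cancels the request and any pending retry.
const controller = new AbortController()

const res = await fetchWithRetry(
  signal => fetch('https://api.example.com', { signal }),
  controller.signal,
  { maxAttempts: 4 },
)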

Wired into

  • openai, anthropic, gemini, ollama, vercelAI
  • All accept the same retry?: RetryOptions field on their config

Tests

13 new cases on @agentskit/adapters (was 50, now 63 passing):

  • Success path (no retry)
  • Terminal 4xx (401, etc.)
  • 429 retry chain
  • 5xx retry chain
  • Retry-After (seconds + HTTP date)
  • Exponential backoff math
  • Exhaustion returns last response
  • Abort propagation (rethrows AbortError)
  • Network error retry chain
  • Mid-attempt abort
  • onRetry hook firing
  • Custom retryOn predicate

Test plan

  • Build succeeds
  • All 63 tests pass
  • Per-adapter config field threaded everywhere
  • Standalone fetchWithRetry exported
  • Reviewer: point the openai adapter at a mock server that returns 503 first → success after retry

Refs #118 #211

Phase 1, story #118. Every adapter now retries transient failures by
default — no caller code needed.

Retry policy (default 3 attempts, configurable per-adapter):

  Retried (transient):
    HTTP 408 request timeout
    HTTP 429 rate limit
    HTTP 500 / 502 / 503 / 504 server errors
    Network errors (fetch throws TypeError, ECONNRESET, etc.)

  NOT retried (terminal):
    HTTP 4xx other than 408/429 — bad request or auth issue,
    retrying just repeats the failure
    AbortError — caller cancelled

Backoff: exponential (base 500ms × 2^attempt, capped at 8s) with full
jitter by default. Respects Retry-After response header when present
(both seconds-format and HTTP-date).
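
The two Retry-After formats mentioned above can be handled by one small parser. This is a minimal sketch under stated assumptions: `parseRetryAfterMs` is a hypothetical name, and how the library combines the parsed value with `maxDelayMs` is not stated in this description, so that step is left out.

```typescript
// Hypothetical parser; returns a delay in milliseconds, or undefined if unparseable.
function parseRetryAfterMs(header: string, nowMs: number = Date.now()): number | undefined {
  const value = header.trim()
  // delay-seconds form, e.g. "Retry-After: 120"
  if (/^\d+$/.test(value)) return Number(value) * 1000
  // HTTP-date form, e.g. "Retry-After: Wed, 21 Oct 2015 07:28:00 GMT"
  const dateMs = Date.parse(value)
  if (Number.isNaN(dateMs)) return undefined
  // A date in the past means "retry now", never a negative delay.
  return Math.max(0, dateMs - nowMs)
}
```

Passing `nowMs` explicitly keeps the function pure, which makes the HTTP-date branch testable without mocking the clock.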

Mid-stream: once the response body starts streaming, retries stop.
Partial output is already on the wire — restarting would duplicate it.

Configurable per-adapter:

  openai({
    apiKey, model,
    retry: {
      maxAttempts: 5,
      baseDelayMs: 1000,
      maxDelayMs: 30_000,
      jitter: true,
      onRetry: ({ attempt, delayMs, reason }) =>
        console.log('retry', attempt, 'in', delayMs, 'ms —', reason),
    },
  })

Opt out: retry: { maxAttempts: 1 }.
Custom retryOn predicate also supported.

Architecture:
- New fetchWithRetry primitive in utils.ts — pure, testable, injectable
  sleep for unit tests
- createStreamSource takes optional retry param (back-compat — undefined
  = use defaults)
- Every adapter's config gets an optional retry?: RetryOptions field
- Threaded into openai, anthropic, gemini, ollama, vercel-ai
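
The "injectable sleep" design noted in the list above can be sketched as follows. This is a simplified model and not the code in utils.ts: `retrySketch`, `SketchResponse`, and `SketchOptions` are hypothetical names, the retry decision is reduced to "429 or any 5xx", and jitter, Retry-After, and abort handling are omitted.

```typescript
// Hypothetical, simplified model of a retry loop with an injectable sleep.
interface SketchResponse { status: number }
interface SketchOptions {
  maxAttempts?: number
  baseDelayMs?: number
  sleep?: (ms: number) => Promise<void> // tests pass a fake that records delays
}

const realSleep = (ms: number) =>
  new Promise<void>(resolve => setTimeout(resolve, ms))

async function retrySketch(
  attemptFn: () => Promise<SketchResponse>,
  { maxAttempts = 3, baseDelayMs = 500, sleep = realSleep }: SketchOptions = {},
): Promise<SketchResponse> {
  for (let attempt = 1; ; attempt++) {
    const res = await attemptFn()
    // Simplification: the real policy lists specific status codes.
    const retryable = res.status === 429 || res.status >= 500
    // On exhaustion the last response is returned rather than thrown.
    if (!retryable || attempt >= maxAttempts) return res
    await sleep(baseDelayMs * 2 ** (attempt - 1)) // jitter omitted for clarity
  }
}

// Usage in a unit test: a fake sleep records delays instead of waiting.
const delays: number[] = []
const statuses = [503, 503, 200]
const result = retrySketch(
  async () => ({ status: statuses.shift() ?? 200 }),
  { sleep: async ms => { delays.push(ms) } },
)
```

Because the sleep function is a parameter, the backoff sequence can be asserted without real timers, which is presumably what keeps the new test cases fast.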

Tests:
- 13 new cases covering: success path, terminal 4xx, 429 retry,
  5xx retry chain, Retry-After (seconds + HTTP date), exponential
  backoff math, exhaustion behavior, abort propagation, network
  error retry, mid-attempt abort, onRetry hook, custom retryOn
- Total: 63 / 63 passing on @agentskit/adapters (was 50)

Refs #118 #211
@EmersonBraun merged commit b15b11d into main on Apr 15, 2026
1 of 4 checks passed
@EmersonBraun deleted the phase1/adapter-retry branch on April 15, 2026 13:21