Implement circuit breaker pattern for external API calls #7

@DarrenZal

Background

On Jan 9-10, 2026, the BGE Server hit OpenAI API rate limits (429 errors) due to an event flood. Retry logic was added in 18b197c, but retries alone keep sending requests to an API that is already rejecting them. A circuit breaker would fail fast and stop those calls until the limit clears.

Problem

Currently, when OpenAI returns 429 errors:

  1. Each request retries up to 5 times with exponential backoff
  2. During a flood, hundreds of requests queue up, all retrying
  3. This creates a "thundering herd" when the rate limit clears

Proposed Solution

Implement circuit breaker pattern for OpenAI API calls:

States:
┌────────┐     failures ≥ threshold     ┌────────┐
│ CLOSED │ ──────────────────────────▶ │  OPEN  │
└────────┘                              └────────┘
    ▲                                       │
    │         success                       │ timeout
    │    ┌─────────────┐                    │
    └────│ HALF-OPEN   │◀───────────────────┘
         └─────────────┘

CLOSED: Normal operation, requests go through
OPEN: All requests fail immediately (no API call); return a cached result or an error
HALF-OPEN: Allow test requests through; after CIRCUIT_BREAKER_SUCCESS_THRESHOLD consecutive successes → CLOSED, on any failure → OPEN
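
A minimal sketch of the state machine above, roughly what the "custom implementation" option could look like. The names `CircuitBreaker`, `CircuitOpenError`, and the `clock` parameter are hypothetical and not taken from bge_server.py. It counts consecutive failures, is not thread-safe, and in HALF-OPEN lets every caller through rather than strictly one, so a real version would likely need a lock.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"
    OPEN = "open"
    HALF_OPEN = "half-open"


class CircuitOpenError(RuntimeError):
    """Raised when a call is rejected without contacting the API."""


class CircuitBreaker:
    def __init__(self, failure_threshold=5, success_threshold=2,
                 timeout=60.0, clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.success_threshold = success_threshold
        self.timeout = timeout
        self._clock = clock          # injectable so tests can fake time
        self.state = State.CLOSED
        self._failures = 0
        self._successes = 0
        self._opened_at = 0.0

    def call(self, func, *args, **kwargs):
        if self.state is State.OPEN:
            if self._clock() - self._opened_at < self.timeout:
                raise CircuitOpenError("circuit open; call skipped")
            # Timeout elapsed: move to HALF-OPEN and let test requests through.
            self.state = State.HALF_OPEN
            self._successes = 0
        try:
            result = func(*args, **kwargs)
        except Exception:
            self._on_failure()
            raise
        self._on_success()
        return result

    def _on_failure(self):
        self._failures += 1
        # Any failure in HALF-OPEN reopens the circuit immediately.
        if (self.state is State.HALF_OPEN
                or self._failures >= self.failure_threshold):
            self.state = State.OPEN
            self._opened_at = self._clock()
            self._failures = 0

    def _on_success(self):
        if self.state is State.HALF_OPEN:
            self._successes += 1
            if self._successes >= self.success_threshold:
                self.state = State.CLOSED
                self._failures = 0
        else:
            self._failures = 0       # only consecutive failures count
```

Usage would be along the lines of `breaker.call(embed_fn, text)`, where `embed_fn` stands for whatever function makes the OpenAI request. In practice only 429s and similar transient errors should probably count as failures. This sketch counts every exception.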

Configuration

CIRCUIT_BREAKER_FAILURE_THRESHOLD = 5      # failures before opening
CIRCUIT_BREAKER_SUCCESS_THRESHOLD = 2      # successes to close
CIRCUIT_BREAKER_TIMEOUT = 60               # seconds before half-open
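
One way these settings could be loaded, assuming they are exposed as environment variables (the issue does not say where they should live). `BreakerConfig` and `load_breaker_config` are hypothetical names. The defaults are the values listed above.

```python
import os
from typing import Mapping, NamedTuple, Optional


class BreakerConfig(NamedTuple):
    failure_threshold: int   # failures before opening
    success_threshold: int   # successes to close
    timeout: float           # seconds before half-open


def load_breaker_config(env: Optional[Mapping[str, str]] = None) -> BreakerConfig:
    """Read the three settings, falling back to the defaults listed above."""
    env = os.environ if env is None else env
    config = BreakerConfig(
        failure_threshold=int(env.get("CIRCUIT_BREAKER_FAILURE_THRESHOLD", "5")),
        success_threshold=int(env.get("CIRCUIT_BREAKER_SUCCESS_THRESHOLD", "2")),
        timeout=float(env.get("CIRCUIT_BREAKER_TIMEOUT", "60")),
    )
    # Zero or negative values would make the breaker meaningless.
    if min(config) <= 0:
        raise ValueError(f"circuit breaker settings must be positive: {config}")
    return config
```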

Benefits

  1. Fast failure - Don't waste time on doomed requests
  2. Reduced load - Stop hammering rate-limited API
  3. Graceful degradation - Return cached embeddings or skip
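
To illustrate the graceful-degradation point, here is a rough sketch of a caller that falls back to a cached embedding, or skips the item, when the breaker rejects a call. `embed_with_fallback`, `CircuitOpenError`, and the dict-based cache are all assumptions for illustration. The actual caching in the BGE Server may look quite different.

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]


class CircuitOpenError(RuntimeError):
    """Hypothetical error raised by the breaker while it is OPEN."""


def embed_with_fallback(
    text: str,
    embed: Callable[[str], Vector],
    cache: Dict[str, Vector],
) -> Optional[Vector]:
    """Return a fresh embedding, a cached one, or None (meaning: skip)."""
    try:
        vector = embed(text)
    except CircuitOpenError:
        # Breaker is open: no API call was made, so serve the cache if we can.
        return cache.get(text)
    cache[text] = vector
    return vector
```

Returning `None` leaves the skip-or-requeue decision to the caller, which seems safer than hiding it inside the helper.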

Implementation Options

  1. pybreaker - Python circuit breaker library
  2. Custom implementation - Simple state machine
  3. tenacity - Already handles retries but, as far as I know, has no built-in circuit breaker; one would have to be layered on top

Files to Modify

  • /opt/projects/koi-processor/src/core/bge_server.py
  • Possibly the event bridge, if it makes direct API calls

Related

Labels

enhancement, resilience
