evalops/circuit-breaker-llm

Circuit Breaker for LLM Output Monitoring

Clean, pluggable safeguards for LLM responses. Verify outputs, enforce budgets, escalate to a stronger second opinion, and log everything for observability.

Features

  • Budgets: max tokens, cost (USD), latency, reasoning steps.
  • Verifiers: categorical judges (hallucination/no_hallucination) with confidence scores.
  • Second opinion: ensemble referee with strict/majority/weighted policies and optional corrected answer.
  • Adapters: Verdict (judges, judge→verify) and DSPy (categorical/yes-no).
  • Telemetry: JSONL logs + eval2otel-compatible conversion.
  • Examples: Ollama-ready judge→verify and ensemble demos.

Install

pip install -e .
# Optional extras
pip install verdict      # Verdict adapters / examples
pip install dspy-ai      # DSPy adapters / examples

Quick Start

Run a simple demo with heuristic verifiers:

python -m examples.demo

Run tests:

python -m unittest -q

CLI

python -m scripts.cb_cli --prompt "Explain widgets" --log logs/cb.jsonl \
  --budget-yaml budgets.yaml --eval2otel-json out/eval.json

# Optional DSPy judge
python -m scripts.cb_cli --prompt "Explain widgets" --use-dspy --context "widgets" --log logs/cb.jsonl

# Second-opinion (Verdict + Ollama)
python -m scripts.cb_cli --prompt "Explain widgets" --second-opinion \
  --model ollama/mistral:7b --api-base http://localhost:11434 \
  --temp-judge-a 0.2 --temp-judge-b 0.4 --temp-verify 0.0 --policy strict_pass

Verdict + Ollama (Judge→Verify)

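# Pre-reqs
# `ollama serve` runs in the foreground, so start it in the background
# (or in a separate terminal) before pulling the model.
ollama serve &
ollama pull mistral:7b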
pip install verdict

MODEL=ollama/mistral:7b OLLAMA_API_BASE=http://localhost:11434 \
  python -m examples.verdict_ollama_demo

Or build programmatically:

from examples.verdict_integration import build_judge_then_verify_verifier
from circuit_breaker import CircuitBreaker

v = build_judge_then_verify_verifier(model_name="ollama/mistral:7b", api_base="http://localhost:11434")
breaker = CircuitBreaker(generate_fn=your_generate_fn, verifiers=[v], gamma=0.7)
print(breaker("Use the context to answer..."))

Second Opinion (Ensemble Referee)

Combine multiple verifiers for a stronger decision. With the strict_pass policy, a response is accepted only if every verifier passes:

from circuit_breaker import EnsembleReferee, CircuitBreaker
from examples.verdict_integration import build_judge_then_verify_verifier

v1 = build_judge_then_verify_verifier(model_name="ollama/mistral:7b", api_base="http://localhost:11434")
v2 = build_judge_then_verify_verifier(model_name="ollama/mistral:7b", api_base="http://localhost:11434")

referee = EnsembleReferee(
    verifiers=[v1, v2],
    policy="strict_pass",
    min_confidence=0.7,
    corrected_answer_fn=lambda r, c: r + "\n[Corrected or caveated answer here]",
)

breaker = CircuitBreaker(generate_fn=your_generate_fn, verifiers=[], referee=referee)

CLI demo:

MODEL=ollama/mistral:7b OLLAMA_API_BASE=http://localhost:11434 \
  python -m examples.second_opinion_demo

Policies:

  • strict_pass: all judges must say no_hallucination
  • majority: simple majority
  • weighted_majority: compare summed confidences
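
The three policies above can be illustrated with a small standalone sketch. It is not the library's EnsembleReferee implementation: the `Vote` class and `accept` function are hypothetical names introduced here, tie handling (ties reject) is an assumption, and the `min_confidence` setting is not modeled.

```python
from dataclasses import dataclass
from typing import List

PASS = "no_hallucination"


@dataclass
class Vote:
    """One judge's verdict (hypothetical type, for illustration only)."""
    label: str         # "hallucination" or "no_hallucination"
    confidence: float  # assumed to lie in [0, 1]


def accept(votes: List[Vote], policy: str = "strict_pass") -> bool:
    """Return True if the votes accept the response under the given policy."""
    if not votes:
        raise ValueError("need at least one vote")
    passes = [v for v in votes if v.label == PASS]
    fails = [v for v in votes if v.label != PASS]
    if policy == "strict_pass":
        # Every judge must say no_hallucination.
        return not fails
    if policy == "majority":
        # Simple head count; a tie rejects (assumption).
        return len(passes) > len(fails)
    if policy == "weighted_majority":
        # Compare summed confidences; a tie rejects (assumption).
        return sum(v.confidence for v in passes) > sum(v.confidence for v in fails)
    raise ValueError(f"unknown policy: {policy}")


# Two hesitant passes against one confident fail.
votes = [Vote(PASS, 0.3), Vote(PASS, 0.2), Vote("hallucination", 0.9)]
for policy in ("strict_pass", "majority", "weighted_majority"):
    print(policy, accept(votes, policy))
```

With these votes, majority accepts (two passes against one fail) while strict_pass and weighted_majority reject, which shows why the weighted policy can be the more conservative choice when the dissenting judge is confident.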

Budgets via YAML

from circuit_breaker import CircuitBreaker
from circuit_breaker.budgets import load_budget_from_yaml

budgets = load_budget_from_yaml("budgets.yaml")
breaker = CircuitBreaker(..., budgets=budgets)

Keys: max_tokens, max_cost_usd, latency_threshold_s, max_reasoning_steps, skip_escalation_if_over_budget.
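
To make the keys concrete, here is a rough sketch of how such limits might be checked against one response's usage. It is an illustration, not the library's internals: `exceeded_limits`, the usage field names, and all numeric values are assumptions, and the reading of skip_escalation_if_over_budget (skip the second opinion once any limit is exceeded) is inferred from the key name.

```python
from typing import Dict, List

# Hypothetical mapping from each budget key to a usage field name.
USAGE_FIELD = {
    "max_tokens": "tokens",
    "max_cost_usd": "cost_usd",
    "latency_threshold_s": "latency_s",
    "max_reasoning_steps": "reasoning_steps",
}


def exceeded_limits(budget: Dict[str, object], usage: Dict[str, float]) -> List[str]:
    """Return the budget keys whose limit the usage exceeds."""
    over = []
    for key, field in USAGE_FIELD.items():
        limit = budget.get(key)
        # A missing limit means "unbounded" in this sketch.
        if limit is not None and usage.get(field, 0) > limit:
            over.append(key)
    return over


# Example values only; choose limits that fit your deployment.
budget = {
    "max_tokens": 512,
    "max_cost_usd": 0.05,
    "latency_threshold_s": 2.0,
    "max_reasoning_steps": 8,
    "skip_escalation_if_over_budget": True,
}
usage = {"tokens": 600, "cost_usd": 0.01, "latency_s": 2.5, "reasoning_steps": 3}

over = exceeded_limits(budget, usage)
# Assumed semantics: do not escalate when over budget and the flag is set.
escalate = not (over and budget.get("skip_escalation_if_over_budget", False))
print(over, escalate)
```

Here the response exceeds the token and latency limits, so under the assumed semantics escalation is skipped.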

Telemetry (eval2otel)

from circuit_breaker.telemetry import decision_to_eval2otel
eval_obj = decision_to_eval2otel(record, result, operation="chat", system="your-system", request_model="llama3")
# Forward to your eval2otel converter / OpenTelemetry pipeline

Project Layout

  • circuit_breaker/core.py: CircuitBreaker, models, decisions
  • circuit_breaker/verifiers.py: base + heuristic verifiers, callable adapter
  • circuit_breaker/referees.py: EnsembleReferee (second opinion)
  • circuit_breaker/dspy_adapter.py: DSPy judges (Hallucination, YesNo)
  • circuit_breaker/verdict_adapter.py: Verdict adapters (categorical, unit)
  • circuit_breaker/budgets.py: YAML loader
  • circuit_breaker/logging.py: JSONL logger
  • circuit_breaker/telemetry.py: eval2otel mapping
  • examples/*: demos (heuristics, Verdict+Ollama, second opinion)
  • scripts/cb_cli.py: CLI with budgets, DSPy, second-opinion, eval2otel export
  • tests/*: unit tests

License

MIT © EvalOps
