Clean, pluggable safeguards for LLM responses. Verify outputs, enforce budgets, escalate to a stronger second opinion, and log everything for observability.
## Features

- Budgets: max tokens, cost (USD), latency, reasoning steps.
- Verifiers: categorical judges (hallucination/no_hallucination) with confidence.
- Second opinion: ensemble referee with strict/majority/weighted policies and optional corrected answer.
- Adapters: Verdict (judges, judge→verify) and DSPy (categorical/yes-no).
- Telemetry: JSONL logs + eval2otel-compatible conversion.
- Examples: Ollama-ready judge→verify and ensemble demos.
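The core loop is generate → verify → escalate on failure. Here is a minimal, self-contained sketch of that flow — not the package's actual API; `keyword_verifier`, `cheap`, and `strong` are made up for illustration:

```python
# Toy sketch of the generate -> verify -> escalate loop.
# All names here are illustrative; the real package wires verifiers,
# budgets, and referees around the same basic flow.

def keyword_verifier(prompt: str, response: str) -> tuple[str, float]:
    """Crude heuristic judge: flag responses that ignore the prompt."""
    overlap = len(set(prompt.lower().split()) & set(response.lower().split()))
    if overlap == 0:
        return "hallucination", 0.9
    return "no_hallucination", min(1.0, 0.5 + 0.1 * overlap)

def breaker(prompt: str, generate_fn, escalate_fn, gamma: float = 0.7) -> str:
    """Accept the cheap answer only if the judge passes with confidence >= gamma."""
    response = generate_fn(prompt)
    verdict, confidence = keyword_verifier(prompt, response)
    if verdict == "no_hallucination" and confidence >= gamma:
        return response
    return escalate_fn(prompt)  # second opinion from a stronger model

cheap = lambda p: "Widgets are small composable UI elements."
strong = lambda p: "[strong model] " + cheap(p)
print(breaker("Explain widgets", cheap, strong))
```

With the default `gamma=0.7` the low-confidence cheap answer is rejected and the stronger model is consulted; lowering the threshold accepts it.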
## Install

```bash
pip install -e .

# Optional extras
pip install verdict    # Verdict adapters / examples
pip install dspy-ai    # DSPy adapters / examples
```

## Quickstart

Run a simple demo with heuristic verifiers:

```bash
python -m examples.demo
```

Run tests:
```bash
python -m unittest -q
```

## CLI

```bash
python -m scripts.cb_cli --prompt "Explain widgets" --log logs/cb.jsonl \
  --budget-yaml budgets.yaml --eval2otel-json out/eval.json

# Optional DSPy judge
python -m scripts.cb_cli --prompt "Explain widgets" --use-dspy --context "widgets" --log logs/cb.jsonl

# Second-opinion (Verdict + Ollama)
python -m scripts.cb_cli --prompt "Explain widgets" --second-opinion \
  --model ollama/mistral:7b --api-base http://localhost:11434 \
  --temp-judge-a 0.2 --temp-judge-b 0.4 --temp-verify 0.0 --policy strict_pass
```

## Verdict + Ollama

Pre-reqs:
```bash
ollama serve && ollama pull mistral:7b
pip install verdict

MODEL=ollama/mistral:7b OLLAMA_API_BASE=http://localhost:11434 \
  python -m examples.verdict_ollama_demo
```

Or build programmatically:
```python
from examples.verdict_integration import build_judge_then_verify_verifier
from circuit_breaker import CircuitBreaker

v = build_judge_then_verify_verifier(model_name="ollama/mistral:7b", api_base="http://localhost:11434")
breaker = CircuitBreaker(generate_fn=your_generate_fn, verifiers=[v], gamma=0.7)
print(breaker("Use the context to answer..."))
```

## Second opinion

Combine multiple verifiers for a stronger decision; strict acceptance requires every judge to pass:
```python
from circuit_breaker import EnsembleReferee, CircuitBreaker
from examples.verdict_integration import build_judge_then_verify_verifier

v1 = build_judge_then_verify_verifier(model_name="ollama/mistral:7b", api_base="http://localhost:11434")
v2 = build_judge_then_verify_verifier(model_name="ollama/mistral:7b", api_base="http://localhost:11434")

referee = EnsembleReferee(
    verifiers=[v1, v2],
    policy="strict_pass",
    min_confidence=0.7,
    corrected_answer_fn=lambda r, c: r + "\n[Corrected or caveated answer here]",
)
breaker = CircuitBreaker(generate_fn=your_generate_fn, verifiers=[], referee=referee)
```

CLI demo:

```bash
MODEL=ollama/mistral:7b OLLAMA_API_BASE=http://localhost:11434 \
  python -m examples.second_opinion_demo
```

Policies:
- strict_pass: all judges must say no_hallucination
- majority: simple majority
- weighted_majority: compare summed confidences
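As a rough sketch of how these policies can combine per-judge `(verdict, confidence)` pairs — illustrative decision logic, not the package's `EnsembleReferee` implementation:

```python
# Illustrative policy logic; the real EnsembleReferee may differ
# in tie-breaking and threshold handling.

Verdicts = list[tuple[str, float]]  # (verdict, confidence) per judge

def decide(results: Verdicts, policy: str = "strict_pass") -> bool:
    passes = [c for v, c in results if v == "no_hallucination"]
    fails = [c for v, c in results if v != "no_hallucination"]
    if policy == "strict_pass":
        return len(fails) == 0               # every judge must pass
    if policy == "majority":
        return len(passes) > len(fails)      # simple head count
    if policy == "weighted_majority":
        return sum(passes) > sum(fails)      # compare summed confidences
    raise ValueError(f"unknown policy: {policy}")

judges = [("no_hallucination", 0.9), ("hallucination", 0.6), ("no_hallucination", 0.4)]
```

With these judges, `strict_pass` rejects (one judge failed) while `majority` and `weighted_majority` (1.3 vs 0.6) accept.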
## Budgets

```python
from circuit_breaker.budgets import load_budget_from_yaml

budgets = load_budget_from_yaml("budgets.yaml")
breaker = CircuitBreaker(..., budgets=budgets)
```

Supported keys: `max_tokens`, `max_cost_usd`, `latency_threshold_s`, `max_reasoning_steps`, `skip_escalation_if_over_budget`.
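An illustrative `budgets.yaml` using those keys, assuming a flat key layout; the values are placeholders, not recommended defaults:

```yaml
# Illustrative values only
max_tokens: 1024
max_cost_usd: 0.05
latency_threshold_s: 10
max_reasoning_steps: 8
skip_escalation_if_over_budget: true
```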
## Telemetry

```python
from circuit_breaker.telemetry import decision_to_eval2otel

eval_obj = decision_to_eval2otel(record, result, operation="chat", system="your-system", request_model="llama3")
# Forward to your eval2otel converter / OpenTelemetry pipeline
```

## Layout

- `circuit_breaker/core.py`: `CircuitBreaker`, models, decisions
- `circuit_breaker/verifiers.py`: base + heuristic verifiers, callable adapter
- `circuit_breaker/referees.py`: `EnsembleReferee` (second opinion)
- `circuit_breaker/dspy_adapter.py`: DSPy judges (Hallucination, YesNo)
- `circuit_breaker/verdict_adapter.py`: Verdict adapters (categorical, unit)
- `circuit_breaker/budgets.py`: YAML loader
- `circuit_breaker/logging.py`: JSONL logger
- `circuit_breaker/telemetry.py`: eval2otel mapping
- `examples/*`: demos (heuristics, Verdict+Ollama, second opinion)
- `scripts/cb_cli.py`: CLI with budgets, DSPy, second-opinion, eval2otel export
- `tests/*`: unit tests
## License

MIT © EvalOps