Skip to content

drewburchfield/braintrust

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

19 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

braintrust

A Claude Code plugin from the not-my-job marketplace.

License

What it does

Delegates tasks to other AI CLIs running in parallel. Get second opinions on architecture decisions, offload research to models with larger context windows, or run design reviews across multiple models simultaneously.

Braintrust members (consulted in parallel, gated by what's installed and authenticated):

  • Antigravity CLI (agy) — primary Google AI path (runs your Antigravity account-tier Gemini model)
  • Gemini CLI — fallback Google AI path with explicit model selection (-m) and @path file context
  • Codex — GPT-5.x
  • Grok Build (grok) — Grok 4.3, a distinct xAI opinion set
  • Claude — via Task tool subagent

agy and Gemini share one "Google AI" slot (agy preferred); the others each contribute an independent opinion, so a full consult is up to four parallel voices.

Commands

Command What it does
/braintrust Orchestrate a task across multiple AI CLIs
/consult Alias for /braintrust

Use Cases

  • Get second opinions from different models simultaneously
  • Cross-model code review (Codex exec review or parallel across every available CLI)
  • Validate architecture decisions
  • Parallel research across multiple models
  • Security audits with diverse model perspectives
  • Offload large-context work to Google AI (agy / gemini 1M context)

v1.7.1 Highlights

  • agy actually works headless now. agy --print only flushes its answer to a real TTY; as a subprocess it hangs on macOS / returns empty on Win+Linux (upstream bug antigravity-cli#76, open through agy 1.0.4). braintrust now runs agy through a bundled PTY wrapper (/tmp/bt_agy_pty.py, written by the probe) that gives it a pseudo-terminal, strips ANSI, and enforces a real timeout. Verified: a bare call hangs rc=124; the wrapped call returns a full review in ~5-7s, so agy is a live Google AI voice again instead of "always down".
  • Removed the broken --print-timeout guidance from v1.7.0: agy's own --print-timeout is non-functional in headless use, so the external wrapper timeout is the only real bound.
  • macOS keychain warm-up added before agy calls to reduce the 1s keyringAuth OAuth re-prompt (antigravity-cli#51).
  • Windows caveat documented: the PTY trick does not help on Windows; fall back to gemini or scrape the persisted transcript.

v1.7.0 Highlights

  • Codex now runs as a true clean-slate reviewer. codex exec --ephemeral was never a blank slate: it loaded ~/.codex memories, MCP servers, and the global AGENTS.md, which contaminated reviews (observed: a diff review that answered an unrelated project's spec pulled from a memory) and biased synthesis. Every Codex consult now runs against an isolated throwaway CODEX_HOME (no memories, no MCP, no global AGENTS.md). This also removes the MCP-boot startup hangs.
  • Stop blackholing stderr. CLI calls now capture stderr to /tmp/bt_<cli>.err instead of 2>/dev/null, so failures are diagnosable. This fixed two long-standing misdiagnoses: an agy "transient empty" that is actually a hard rc=124 timeout, and a Grok AuthorizationRequired that is actually a 403 billing cap (out of credits / spending-limit) that grok login cannot fix.
  • agy failure handling is exit-code-aware. Cold start (retry once) vs a hard hang (rc=124, stop retrying and fall back to gemini) are now distinguished. Calls set an explicit --print-timeout so agy fails fast with its own diagnostics instead of being killed silently.
  • Grok error detection. Consults parse the {"type":"error","message":...} object and surface the real cause (e.g. billing) instead of a generic "empty/unauthenticated".

v1.6.0 Highlights

  • Grok Build (Grok 4.3) added as a first-class braintrust member. Headless: grok -p "..." -m grok-build --output-format json | jq -r '.text'.
  • Default consult is now "every installed + authenticated CLI" (up to four voices), not a fixed three.
  • Reliability fix for agy ↔ gemini "thrashing": the model probe now uses generous cold-start timeouts and a warm-up retry per CLI, so a slow first call no longer false-negatives a CLI and silently flips the Google AI path.
  • Gemini default model is now gemini-3.1-pro-preview (newest-best first). Dogfooding showed gemini-2.5-pro was actually the flakier model on current accounts; it is now a fallback only.
  • agy vs Gemini model paths documented: agy has no -m flag and runs your account-tier model; Gemini lets you pick the exact model. They are different access paths, not interchangeable.
  • June 18, 2026 sunset: Gemini CLI free-tier/OAuth access ends. agy is the primary path; the probe handles the switch automatically.

Requirements

All CLIs are optional; braintrust consults whichever are installed and authenticated.

  • Claude Code CLI — always available as a subagent from Claude Code
  • Antigravity CLI (agy)curl -fsSL https://antigravity.google/cli/install.sh | bash (primary Google AI path)
  • Codex CLI
  • Grok Build (grok)curl -fsSL https://x.ai/cli/install.sh | bash, then grok login (needs a Grok subscription for grok-build)
  • Gemini CLI (power-user fallback for explicit model selection / @path; free-tier sunset 2026-06-18)

Install

claude plugins install braintrust@not-my-job

License

MIT

About

Orchestrate other AI CLIs (Gemini, Codex, Claude Code) for second opinions, research, and codebase analysis

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages