GitHub - drewburchfield/braintrust: Orchestrate other AI CLIs (Gemini, Codex, Claude Code) for second opinions, research, and codebase analysis

A Claude Code plugin from the not-my-job marketplace.

What it does

Delegates tasks to other AI CLIs running in parallel. Get second opinions on architecture decisions, offload research to models with larger context windows, or run design reviews across multiple models simultaneously.

Braintrust members (consulted in parallel, gated by what's installed and authenticated):

Antigravity CLI (agy) — primary Google AI path (runs your Antigravity account-tier Gemini model)
Gemini CLI — fallback Google AI path with explicit model selection (-m) and @path file context
Codex — GPT-5.x
Grok Build (grok) — Grok 4.3, a distinct xAI opinion set
Claude — via Task tool subagent

agy and Gemini share one "Google AI" slot (agy preferred); the others each contribute an independent opinion, so a full consult is up to four parallel voices.

Commands

Command	What it does
`/braintrust`	Orchestrate a task across multiple AI CLIs
`/consult`	Alias for `/braintrust`

Use Cases

Get second opinions from different models simultaneously
Cross-model code review (Codex exec review or parallel across every available CLI)
Validate architecture decisions
Parallel research across multiple models
Security audits with diverse model perspectives
Offload large-context work to Google AI (agy / gemini 1M context)

v1.7.1 Highlights

agy actually works headless now. agy --print only flushes its answer to a real TTY; as a subprocess it hangs on macOS / returns empty on Win+Linux (upstream bug antigravity-cli#76, open through agy 1.0.4). braintrust now runs agy through a bundled PTY wrapper (/tmp/bt_agy_pty.py, written by the probe) that gives it a pseudo-terminal, strips ANSI, and enforces a real timeout. Verified: a bare call hangs rc=124; the wrapped call returns a full review in ~5-7s, so agy is a live Google AI voice again instead of "always down".
Removed the broken --print-timeout guidance from v1.7.0: agy's own --print-timeout is non-functional in headless use, so the external wrapper timeout is the only real bound.
macOS keychain warm-up added before agy calls to reduce the 1s keyringAuth OAuth re-prompt (antigravity-cli#51).
Windows caveat documented: the PTY trick does not help on Windows; fall back to gemini or scrape the persisted transcript.

v1.7.0 Highlights

Codex now runs as a true clean-slate reviewer. codex exec --ephemeral was never a blank slate: it loaded ~/.codex memories, MCP servers, and the global AGENTS.md, which contaminated reviews (observed: a diff review that answered an unrelated project's spec pulled from a memory) and biased synthesis. Every Codex consult now runs against an isolated throwaway CODEX_HOME (no memories, no MCP, no global AGENTS.md). This also removes the MCP-boot startup hangs.
Stop blackholing stderr. CLI calls now capture stderr to /tmp/bt_<cli>.err instead of 2>/dev/null, so failures are diagnosable. This fixed two long-standing misdiagnoses: an agy "transient empty" that is actually a hard rc=124 timeout, and a Grok AuthorizationRequired that is actually a 403 billing cap (out of credits / spending-limit) that grok login cannot fix.
agy failure handling is exit-code-aware. Cold start (retry once) vs a hard hang (rc=124, stop retrying and fall back to gemini) are now distinguished. Calls set an explicit --print-timeout so agy fails fast with its own diagnostics instead of being killed silently.
Grok error detection. Consults parse the {"type":"error","message":...} object and surface the real cause (e.g. billing) instead of a generic "empty/unauthenticated".

v1.6.0 Highlights

Grok Build (Grok 4.3) added as a first-class braintrust member. Headless: grok -p "..." -m grok-build --output-format json | jq -r '.text'.
Default consult is now "every installed + authenticated CLI" (up to four voices), not a fixed three.
Reliability fix for agy ↔ gemini "thrashing": the model probe now uses generous cold-start timeouts and a warm-up retry per CLI, so a slow first call no longer false-negatives a CLI and silently flips the Google AI path.
Gemini default model is now gemini-3.1-pro-preview (newest-best first). Dogfooding showed gemini-2.5-pro was actually the flakier model on current accounts; it is now a fallback only.
agy vs Gemini model paths documented: agy has no -m flag and runs your account-tier model; Gemini lets you pick the exact model. They are different access paths, not interchangeable.
June 18, 2026 sunset: Gemini CLI free-tier/OAuth access ends. agy is the primary path; the probe handles the switch automatically.

Requirements

All CLIs are optional; braintrust consults whichever are installed and authenticated.

Claude Code CLI — always available as a subagent from Claude Code
Antigravity CLI (agy) — curl -fsSL https://antigravity.google/cli/install.sh | bash (primary Google AI path)
Codex CLI
Grok Build (grok) — curl -fsSL https://x.ai/cli/install.sh | bash, then grok login (needs a Grok subscription for grok-build)
Gemini CLI (power-user fallback for explicit model selection / @path; free-tier sunset 2026-06-18)

Install

claude plugins install braintrust@not-my-job

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 19 Commits
.claude-plugin		.claude-plugin
commands		commands
hooks		hooks
skills/braintrust		skills/braintrust
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

What it does

Commands

Use Cases

v1.7.1 Highlights

v1.7.0 Highlights

v1.6.0 Highlights

Requirements

Install

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

What it does

Commands

Use Cases

v1.7.1 Highlights

v1.7.0 Highlights

v1.6.0 Highlights

Requirements

Install

License

About

Topics

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages