Delegates tasks to other AI CLIs running in parallel. Get second opinions on architecture decisions, offload research to models with larger context windows, or run design reviews across multiple models simultaneously.
Braintrust members (consulted in parallel, gated by what's installed and authenticated):
- Antigravity CLI (agy) — primary Google AI path (runs your Antigravity account-tier Gemini model)
- Gemini CLI — fallback Google AI path with explicit model selection (`-m`) and `@path` file context
- Codex — GPT-5.x
- Grok Build (grok) — Grok 4.3, a distinct xAI opinion set
- Claude — via Task tool subagent
agy and Gemini share one "Google AI" slot (agy preferred); the others each contribute an independent opinion, so a full consult is up to four parallel voices.
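
The slot rule above can be sketched as a small selection function. This is an illustration only, not braintrust's actual code: the names `GOOGLE_SLOT`, `INDEPENDENT`, and `select_voices`, and the short identifiers used for each member, are assumptions made for the example.

```python
from typing import Iterable, List

# Hypothetical identifiers for the members listed above (not real config keys).
GOOGLE_SLOT = ("agy", "gemini")            # order encodes preference: agy first
INDEPENDENT = ("codex", "grok", "claude")  # each contributes its own opinion


def select_voices(available: Iterable[str]) -> List[str]:
    """Pick at most one Google AI CLI plus every other usable member."""
    usable = set(available)
    voices: List[str] = []
    # agy and gemini share one slot, so only the first usable one is taken.
    for cli in GOOGLE_SLOT:
        if cli in usable:
            voices.append(cli)
            break
    voices.extend(cli for cli in INDEPENDENT if cli in usable)
    return voices
```

With every member available the result has four entries, which matches the "up to four parallel voices" ceiling; if agy is missing, gemini takes the Google AI slot instead.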
| Command | What it does |
|---|---|
| `/braintrust` | Orchestrate a task across multiple AI CLIs |
| `/consult` | Alias for `/braintrust` |
- Get second opinions from different models simultaneously
- Cross-model code review (Codex `exec review` or parallel across every available CLI)
- Validate architecture decisions
- Parallel research across multiple models
- Security audits with diverse model perspectives
- Offload large-context work to Google AI (agy / gemini 1M context)
- agy actually works headless now. `agy --print` only flushes its answer to a real TTY; as a subprocess it hangs on macOS / returns empty on Win+Linux (upstream bug antigravity-cli#76, open through agy 1.0.4). braintrust now runs agy through a bundled PTY wrapper (`/tmp/bt_agy_pty.py`, written by the probe) that gives it a pseudo-terminal, strips ANSI, and enforces a real timeout. Verified: a bare call hangs (`rc=124`); the wrapped call returns a full review in ~5-7s, so agy is a live Google AI voice again instead of "always down".
- Removed the broken `--print-timeout` guidance from v1.7.0: agy's own `--print-timeout` is non-functional in headless use, so the external wrapper timeout is the only real bound.
- macOS keychain warm-up added before agy calls to reduce the 1s `keyringAuth` OAuth re-prompt (antigravity-cli#51).
- Windows caveat documented: the PTY trick does not help on Windows; fall back to gemini or scrape the persisted transcript.
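
The PTY wrapper idea described above (give the child a pseudo-terminal, strip ANSI, enforce an external timeout) might look roughly like the sketch below. It is not the contents of the bundled wrapper, which are not shown here; `run_in_pty` and `ANSI_RE` are hypothetical names, the regex covers only CSI-style escape sequences, and returning 124 on timeout is an assumption chosen to mirror the `rc=124` convention mentioned above. It relies on Python's `pty` module, so it is POSIX-only.

```python
import os
import pty
import re
import select
import subprocess
import time
from typing import List, Tuple

# CSI escape sequences (colors, cursor movement); a subset of all ANSI codes.
ANSI_RE = re.compile(r"\x1b\[[0-9;?]*[ -/]*[@-~]")


def run_in_pty(cmd: List[str], timeout: float) -> Tuple[int, str]:
    """Run cmd with a pseudo-terminal on stdout/stderr; return (rc, clean text)."""
    master_fd, slave_fd = pty.openpty()
    proc = subprocess.Popen(cmd, stdin=subprocess.DEVNULL,
                            stdout=slave_fd, stderr=slave_fd, close_fds=True)
    os.close(slave_fd)  # the child holds its own copy of the slave end
    chunks = []
    deadline = time.monotonic() + timeout
    timed_out = False
    try:
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                timed_out = True
                break
            # Check exit status before polling so no late output is dropped.
            exited = proc.poll() is not None
            ready, _, _ = select.select([master_fd], [], [], min(remaining, 0.1))
            if ready:
                try:
                    data = os.read(master_fd, 4096)
                except OSError:
                    break  # Linux reports EIO once the child side is closed
                if not data:
                    break  # other platforms report EOF as an empty read
                chunks.append(data)
            elif exited:
                break
    finally:
        if timed_out:
            proc.kill()
        proc.wait()
        os.close(master_fd)
    text = b"".join(chunks).decode("utf-8", errors="replace")
    text = ANSI_RE.sub("", text).replace("\r\n", "\n")
    return (124 if timed_out else proc.returncode), text
```

A caller would treat a 124 result as a hard hang and anything else as the child's own exit code. Whether agy also needs a terminal on stdin is not covered by this sketch.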
- Codex now runs as a true clean-slate reviewer. `codex exec --ephemeral` was never a blank slate: it loaded `~/.codex` memories, MCP servers, and the global `AGENTS.md`, which contaminated reviews (observed: a diff review that answered an unrelated project's spec pulled from a memory) and biased synthesis. Every Codex consult now runs against an isolated throwaway `CODEX_HOME` (no memories, no MCP, no global `AGENTS.md`). This also removes the MCP-boot startup hangs.
- Stop blackholing stderr. CLI calls now capture stderr to `/tmp/bt_<cli>.err` instead of `2>/dev/null`, so failures are diagnosable. This fixed two long-standing misdiagnoses: an agy "transient empty" that is actually a hard `rc=124` timeout, and a Grok `AuthorizationRequired` that is actually a 403 billing cap (out of credits / spending-limit) that `grok login` cannot fix.
- agy failure handling is exit-code-aware. Cold start (retry once) vs a hard hang (`rc=124`, stop retrying and fall back to gemini) are now distinguished. Calls set an explicit `--print-timeout` so agy fails fast with its own diagnostics instead of being killed silently.
- Grok error detection. Consults parse the `{"type":"error","message":...}` object and surface the real cause (e.g. billing) instead of a generic "empty/unauthenticated".
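
As a rough illustration of the exit-code-aware handling and the Grok error detection described above, the sketch below classifies one agy result and pulls the message out of an error object. The function names, the `"ok"` / `"retry"` / `"fallback"` labels, and the assumption that the error object arrives as a single JSON line are all hypothetical; only the `rc=124` meaning and the `{"type":"error","message":...}` shape come from the notes above.

```python
import json
from typing import Optional

TIMEOUT_RC = 124  # exit code the notes above associate with a hard hang


def classify_agy_failure(returncode: int, stdout: str, attempt: int) -> str:
    """Return 'ok', 'retry', or 'fallback' for one agy call (0-based attempt)."""
    if returncode == TIMEOUT_RC:
        return "fallback"  # hard hang: stop retrying and use gemini
    if returncode == 0 and stdout.strip():
        return "ok"
    # Anything else is treated as a possible cold start: retry once only.
    return "retry" if attempt == 0 else "fallback"


def grok_error_message(raw: str) -> Optional[str]:
    """Return the message of a {"type": "error", ...} JSON line, if any."""
    for line in raw.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue
        try:
            obj = json.loads(line)
        except json.JSONDecodeError:
            continue
        if isinstance(obj, dict) and obj.get("type") == "error":
            return str(obj.get("message", ""))
    return None
```

Surfacing the parsed message directly is what lets a billing problem be reported as a billing problem rather than as a generic authentication failure.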
- Grok Build (Grok 4.3) added as a first-class braintrust member. Headless: `grok -p "..." -m grok-build --output-format json | jq -r '.text'`.
- Default consult is now "every installed + authenticated CLI" (up to four voices), not a fixed three.
- Reliability fix for agy ↔ gemini "thrashing": the model probe now uses generous cold-start timeouts and a warm-up retry per CLI, so a slow first call no longer false-negatives a CLI and silently flips the Google AI path.
- Gemini default model is now `gemini-3.1-pro-preview` (newest-best first). Dogfooding showed `gemini-2.5-pro` was actually the flakier model on current accounts; it is now a fallback only.
- agy vs Gemini model paths documented: agy has no `-m` flag and runs your account-tier model; Gemini lets you pick the exact model. They are different access paths, not interchangeable.
- June 18, 2026 sunset: Gemini CLI free-tier/OAuth access ends. agy is the primary path; the probe handles the switch automatically.
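
The probe behaviour mentioned above (generous cold-start timeouts plus a warm-up retry per CLI) can be sketched generically as follows. This is not the plugin's probe: `probe_cli`, its parameters, and the 60-second default are assumptions for illustration, since the notes only say the timeouts are "generous". The `run` parameter exists so the retry logic can be exercised without a real CLI.

```python
import subprocess
from typing import Callable, List


def probe_cli(cmd: List[str], cold_timeout: float = 60.0, retries: int = 1,
              run: Callable[..., subprocess.CompletedProcess] = subprocess.run) -> bool:
    """Return True if cmd exits 0 with non-empty output within the attempts."""
    for _ in range(retries + 1):
        try:
            result = run(cmd, capture_output=True, text=True, timeout=cold_timeout)
        except FileNotFoundError:
            return False  # not installed: retrying cannot help
        except subprocess.TimeoutExpired:
            continue      # slow first call: allow the warm-up retry
        if result.returncode == 0 and result.stdout.strip():
            return True
    return False
```

Retrying before declaring a CLI unavailable is the design point: a single slow first call should not be enough to flip the Google AI path from agy to gemini.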
All CLIs are optional; braintrust consults whichever are installed and authenticated.
- Claude Code CLI — always available as a subagent from Claude Code
- Antigravity CLI (agy) — `curl -fsSL https://antigravity.google/cli/install.sh | bash` (primary Google AI path)
- Codex CLI
- Grok Build (grok) — `curl -fsSL https://x.ai/cli/install.sh | bash`, then `grok login` (needs a Grok subscription for `grok-build`)
- Gemini CLI (power-user fallback for explicit model selection / `@path`; free-tier sunset 2026-06-18)
claude plugins install braintrust@not-my-job
MIT