A harness-agnostic orchestration engine for multi-agent consensus workflows.
Multiple AI agents analyze the same problem, debate claims across rounds, merge overlaps, vote per claim, and converge on structured consensus via one start() call.

- `packages/argue`: reusable npm library (engine + contracts)
- `packages/argue-cli`: command-line host (`argue` command)

CLI config lookup order:
1. `--config <path>`
2. `./argue.config.json`
3. `~/.config/argue/config.json`

Run input (--input <path>, e.g. task.json) is optional and overrides config defaults for the current run. CLI flags override both.
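
The lookup order and override precedence described above can be sketched as two small pure functions. This is only an illustration: `resolveConfigPath`, `mergeSettings`, and the `Settings` type are hypothetical names, not part of the argue CLI, and the sketch assumes that an explicit `--config` path is used as-is and that settings are merged shallowly.

```typescript
// Hypothetical helpers; only the lookup order and precedence come from the README.
type Settings = Record<string, unknown>;

// Picks the config file to load, following the documented lookup order.
function resolveConfigPath(
  flagPath: string | undefined,
  homeDir: string,
  exists: (path: string) => boolean
): string | undefined {
  // Assumption: an explicit --config path wins without an existence check.
  if (flagPath !== undefined) return flagPath;
  const candidates = [
    "./argue.config.json",                  // project-local config
    `${homeDir}/.config/argue/config.json`  // user-level config (~ expanded)
  ];
  return candidates.find((candidate) => exists(candidate));
}

// Later sources override earlier ones: config defaults < run input < CLI flags.
// Assumption: a shallow merge is enough for illustration.
function mergeSettings(
  configDefaults: Settings,
  runInput: Settings = {},
  cliFlags: Settings = {}
): Settings {
  return { ...configDefaults, ...runInput, ...cliFlags };
}
```

Passing the `exists` predicate in, rather than calling the file system directly, keeps the sketch free of runtime dependencies and easy to test.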
Quick workspace commands:
- `npm run check`
- `npm test`
- `npm run build`

| Problem | How argue solves it |
|---|---|
| Agents produce isolated partial answers | Parallel initial round + cross-round peer context |
| Debate quality is ad-hoc | Claim-level judgements (agree / disagree / revise) |
| Duplicate findings pollute results | Explicit claim merge with deterministic survivor rule |
| Binary final vote loses nuance | Per-claim consensus with configurable threshold |
| Hard to choose a final spokesperson | Peer-review weighted scoring + representative selection |
| Host code gets orchestration-heavy | Engine owns state machine, wait, elimination, and consensus |

    ┌──────────────────────────────────────────────────────┐
    │                     argue Engine                     │
    │                                                      │
    │  ┌────────┐   ┌────────┐   ┌────────────┐            │
    │  │Initial │──>│ Debate │──>│ Final Vote │            │
    │  │ Round  │   │ Rounds │   │ per claim  │            │
    │  │  (0)   │   │ (1..N) │   │   (N+1)    │            │
    │  └────────┘   └────────┘   └──────┬─────┘            │
    │        early-stop? + claim merge                     │
    │                                   │                  │
    │                                   v                  │
    │                         threshold consensus +        │
    │                       elimination-aware voting       │
    │                                   │                  │
    │                                   v                  │
    │                            ┌──────────────┐          │
    │                            │   Scoring    │          │
    │                            │ peer-review  │          │
    │                            └──────┬───────┘          │
    │                                   │                  │
    │                                   v                  │
    │                            ┌──────────────┐          │
    │                            │    Report    │          │
    │                            │  builtin /   │          │
    │                            │representative│          │
    │                            └──────┬───────┘          │
    │                                   │                  │
    │                                   v                  │
    │                            ┌──────────────┐          │
    │                            │    Action    │          │
    │                            │  (optional)  │          │
    │                            └──────┬───────┘          │
    └───────────────────────────────────|──────────────────┘
                                        |
        ^                               v
        |  AgentTaskDelegate       ArgueResult
           host dispatches         status + claims + rounds
           round/report/action     + representative + metrics
           tasks                   + action output

- Initial round: all participants produce claims.
- Debate rounds (`1..N`): participants judge claims, can propose merges, and can revise statements.
- Early-stop: after `minRounds`, stop early if all returned judgements agree and no new claims appear.
- Final vote: participants vote `accept`/`reject` per active claim.
- Claim consensus: each claim is resolved by `acceptCount / totalVoters >= threshold`.
- Representative: selected from active participants by score (or host-designated active id).
- Report compose:
  - `builtin`: engine-generated report with structured summary
  - `representative`: engine dispatches a separate `kind="report"` task; failures fall back to builtin
- Action (optional): if `actionPolicy` is set, the engine dispatches a `kind="action"` task to an agent (default: representative) to perform real-world operations based on the debate outcome. Action failure does not invalidate the debate result.

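
The early-stop and claim-consensus rules above can be read as two small predicates. The sketch below is a literal reading of those bullets, not the engine's actual code: the type and function names are hypothetical, "after `minRounds`" is assumed to mean `round >= minRounds`, and a round with no returned judgements is assumed not to trigger an early stop.

```typescript
// Illustrative names only; the rules they encode come from the list above.
type Judgement = "agree" | "disagree" | "revise";
type Vote = "accept" | "reject";

interface DebateRoundSummary {
  round: number;            // debate round index, 1..N
  judgements: Judgement[];  // judgements actually returned this round
  newClaimCount: number;    // claims first proposed in this round
}

// Early-stop: only once minRounds is reached, and only if every returned
// judgement is "agree" and no new claims appeared.
function shouldStopEarly(summary: DebateRoundSummary, minRounds: number): boolean {
  if (summary.round < minRounds) return false;
  // Assumption: an empty round is not treated as unanimous agreement.
  if (summary.judgements.length === 0) return false;
  const allAgree = summary.judgements.every((j) => j === "agree");
  return allAgree && summary.newClaimCount === 0;
}

// Claim consensus: acceptCount / totalVoters >= threshold, where `votes`
// holds one entry per voter that counts toward the denominator.
function isClaimAccepted(votes: Vote[], threshold: number): boolean {
  if (votes.length === 0) return false; // assumption: no voters means not accepted
  const acceptCount = votes.filter((v) => v === "accept").length;
  return acceptCount / votes.length >= threshold;
}
```

Under this reading, two accepts out of three voters give a ratio of about 0.667, which does not reach a threshold of 0.67; a threshold of 0.66 would accept the same vote.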
Host implements a single delegate with task kind routing:
- `kind="round"` + `phase="initial" | "debate" | "final_vote"`: round task
- `kind="report"`: representative report composition task
- `kind="action"`: post-debate action execution task

All use the same dispatch + awaitResult interface.
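
As a rough illustration of kind-based routing behind one dispatch + awaitResult pair, the sketch below defines its own task and delegate shapes. Everything in it is an assumption made for the example: `AgentTask`, `TaskDelegate`, `describeTask`, and `InMemoryDelegate` are hypothetical, and the real contracts exported by `argue` may use different fields and return types.

```typescript
// Hypothetical shapes for illustration; not the contracts exported by argue.
type Phase = "initial" | "debate" | "final_vote";

type AgentTask =
  | { kind: "round"; phase: Phase; participantId: string }
  | { kind: "report"; participantId: string }
  | { kind: "action"; participantId: string; prompt: string };

interface TaskDelegate {
  dispatch(task: AgentTask): Promise<string>;      // resolves with a ticket id
  awaitResult(ticketId: string): Promise<string>;  // resolves with the agent output
}

// Routes on `kind` (and on `phase` for round tasks) to build a task label.
function describeTask(task: AgentTask): string {
  switch (task.kind) {
    case "round":
      return `round:${task.phase} for ${task.participantId}`;
    case "report":
      return `report by ${task.participantId}`;
    case "action":
      return `action by ${task.participantId}: ${task.prompt}`;
  }
}

// Stub delegate: a real host would call its agent runtime inside dispatch.
class InMemoryDelegate implements TaskDelegate {
  private nextId = 0;
  private readonly pending = new Map<string, Promise<string>>();

  async dispatch(task: AgentTask): Promise<string> {
    const ticketId = `ticket-${this.nextId++}`;
    this.pending.set(ticketId, Promise.resolve(describeTask(task)));
    return ticketId;
  }

  async awaitResult(ticketId: string): Promise<string> {
    const result = this.pending.get(ticketId);
    if (result === undefined) throw new Error(`unknown ticket: ${ticketId}`);
    this.pending.delete(ticketId);
    return result;
  }
}
```

Modelling the task as a discriminated union lets the compiler check that every `kind` is handled in the `switch`, which is one way to keep a single delegate from silently ignoring a new task type.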

    import { ArgueEngine, MemorySessionStore, DefaultWaitCoordinator } from "argue";

    // myTaskDelegate is the host's own delegate implementation.
    const taskDelegate = myTaskDelegate;

    const engine = new ArgueEngine({
      taskDelegate,
      sessionStore: new MemorySessionStore(),
      waitCoordinator: new DefaultWaitCoordinator(taskDelegate)
    });

    const result = await engine.start({
      requestId: "review-42",
      task: "Review PR #42",
      participants: [
        { id: "agent-a", role: "security-reviewer" },
        { id: "agent-b", role: "architecture-reviewer" },
        { id: "agent-c", role: "correctness-reviewer" }
      ],
      roundPolicy: { minRounds: 2, maxRounds: 4 },
      consensusPolicy: { threshold: 0.67 },
      reportPolicy: { composer: "representative" },
      actionPolicy: {
        prompt: "Fix all identified issues and post a summary comment on the PR."
      }
    });

    // result.status: consensus | partial_consensus | unresolved | failed
    // result.claimResolutions: per-claim vote outcome
    // result.representative: selected spokesperson
    // result.eliminations: timeout/error removals
    // result.rounds: full round records
    // result.action?: action output (if actionPolicy was set)

Implemented in current master:
- Early-stop (`minRounds` gated)
- Claim merge lifecycle (`active`/`merged`/`withdrawn`, `mergedInto`, proposer union)
- Claim-level final vote + `consensus / partial_consensus / unresolved`
- Effective-voter denominator per claim
- Permanent elimination on timeout/error (including final_vote)
- Representative report mode via separate report session
- Builtin fallback on representative report failure
- Strict schema cleanup (removed `waitingPolicy.mode`, `lateArrivalPolicy`, `reportPolicy.maxReportChars`)

Reference implementation ADR: docs/adr/0002-m2-implementation.md
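
To make the claim merge lifecycle more concrete, here is a minimal sketch of merging two active claims with a proposer union. The `status` values and the `mergedInto` field follow the list above; the `Claim` shape, the `mergeClaims` function, and in particular the survivor rule (the lexicographically smaller id survives) are assumptions for illustration, since the actual deterministic rule is defined in the ADR rather than here.

```typescript
// Illustrative claim model; only status values and mergedInto come from the README.
type ClaimStatus = "active" | "merged" | "withdrawn";

interface Claim {
  id: string;
  statement: string;
  status: ClaimStatus;
  mergedInto?: string;   // id of the surviving claim once merged
  proposers: string[];   // participant ids that proposed this claim
}

// Merges two active claims and returns updated copies of both.
// ASSUMED survivor rule: the claim with the lexicographically smaller id survives.
function mergeClaims(a: Claim, b: Claim): { survivor: Claim; merged: Claim } {
  if (a.status !== "active" || b.status !== "active") {
    throw new Error("only active claims can be merged");
  }
  const [keep, drop] = a.id <= b.id ? [a, b] : [b, a];
  // Proposer union: the survivor is credited to everyone who proposed either claim.
  const proposers = Array.from(new Set([...keep.proposers, ...drop.proposers]));
  return {
    survivor: { ...keep, proposers },
    merged: { ...drop, status: "merged", mergedInto: keep.id }
  };
}
```

Because the rule depends only on the two ids, merging the same pair in either argument order yields the same survivor, which is the property a deterministic survivor rule needs.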
Implemented in current master:
- `argue run` / `argue exec`: execute end-to-end
- `argue act`: execute action against existing result
- provider types: `mock`, `cli`, `api`, `sdk`
- CLI agent types: `claude`, `codex`, `copilot`, `gemini`, `pi`, `opencode`, `droid`, `amp`, `generic`
- output artifacts: `result.json`, `events.jsonl`, `summary.md`
- `--action <prompt>` / `--action-agent <id>` for post-debate action
- `--verbose` for detailed per-agent output with stances, confidence, and full responses


    npm install
    npm run build   # full one-shot build
    npm run dev     # watch mode: auto-recompiles on save

In watch mode, open another terminal and test with:

    npx argue <command>   # uses the latest compiled output

Before committing:

    npm run ci   # check + test + build

License: MIT