argue

A harness-agnostic orchestration engine for multi-agent consensus workflows.

Multiple AI agents analyze the same problem, debate claims across rounds, merge overlaps, vote per claim, and converge on structured consensus via one start() call.

Monorepo layout

packages/argue: reusable NPM library (engine + contracts)
packages/argue-cli: command line host (argue command)

CLI config lookup order:

--config <path>
./argue.config.json
~/.config/argue/config.json

Run input (--input <path>, e.g. task.json) is optional and overrides config defaults for the current run. CLI flags override both.

Quick workspace commands:

npm run check
npm test
npm run build

Why argue

Problem	How argue solves it
Agents produce isolated partial answers	Parallel initial round + cross-round peer context
Debate quality is ad-hoc	Claim-level judgements (`agree` / `disagree` / `revise`)
Duplicate findings pollute results	Explicit claim merge with deterministic survivor rule
Binary final vote loses nuance	Per-claim consensus with configurable threshold
Hard to choose a final spokesperson	Peer-review weighted scoring + representative selection
Host code gets orchestration-heavy	Engine owns state machine, wait, elimination, and consensus

Architecture at a glance

┌──────────────────────────────────────────────────────┐
│                     argue Engine                     │
│                                                      │
│  ┌────────┐   ┌────────┐   ┌────────────┐            │
│  │Initial │──>│ Debate │──>│ Final Vote │            │
│  │ Round  │   │ Rounds │   │ per claim  │            │
│  │  (0)   │   │ (1..N) │   │  (N+1)     │            │
│  └────────┘   └────────┘   └──────┬─────┘            │
│              early-stop? + claim merge               │
│                              │                       │
│                              v                       │
│                 threshold consensus +                │
│                 elimination-aware voting             │
│                              │                       │
│                              v                       │
│                        ┌──────────────┐              │
│                        │   Scoring    │              │
│                        │ peer-review  │              │
│                        └──────┬───────┘              │
│                              │                       │
│                              v                       │
│                        ┌──────────────┐              │
│                        │    Report    │              │
│                        │  builtin /   │              │
│                        │representative│              │
│                        └──────┬───────┘              │
│                              │                       │
│                              v                       │
│                        ┌──────────────┐              │
│                        │   Action     │              │
│                        │  (optional)  │              │
│                        └──────┬───────┘              │
└───────────────────────────────|──────────────────────┘
                                |
      ^                         v
      | AgentTaskDelegate   ArgueResult
  host dispatches          status + claims + rounds
  round/report/action      + representative + metrics
  tasks                    + action output

Core flow

Initial round: all participants produce claims.
Debate rounds (1..N): participants judge claims, can propose merges, and can revise statements.
Early-stop: after minRounds, stop early if all returned judgements agree and no new claims appear.
Final vote: participants vote accept/reject per active claim.
Claim consensus: each claim is resolved by acceptCount / totalVoters >= threshold.
Representative: selected from active participants by score (or host-designated active id).
Report compose:
- builtin: engine-generated report with structured summary
- representative: engine dispatches a separate kind="report" task; failures fallback to builtin
Action (optional): if actionPolicy is set, engine dispatches a kind="action" task to an agent (default: representative) to perform real-world operations based on the debate outcome. Action failure does not invalidate the debate result.

Delegate contract

Host implements a single delegate with task kind routing:

kind="round" + phase="initial" | "debate" | "final_vote": round task
kind="report": representative report composition task
kind="action": post-debate action execution task

All use the same dispatch + awaitResult interface.

Quick start

import { ArgueEngine, MemorySessionStore, DefaultWaitCoordinator } from "argue";

const taskDelegate = myTaskDelegate;

const engine = new ArgueEngine({
  taskDelegate,
  sessionStore: new MemorySessionStore(),
  waitCoordinator: new DefaultWaitCoordinator(taskDelegate)
});

const result = await engine.start({
  requestId: "review-42",
  task: "Review PR #42",
  participants: [
    { id: "agent-a", role: "security-reviewer" },
    { id: "agent-b", role: "architecture-reviewer" },
    { id: "agent-c", role: "correctness-reviewer" }
  ],
  roundPolicy: { minRounds: 2, maxRounds: 4 },
  consensusPolicy: { threshold: 0.67 },
  reportPolicy: { composer: "representative" },
  actionPolicy: {
    prompt: "Fix all identified issues and post a summary comment on the PR."
  }
});

// result.status: consensus | partial_consensus | unresolved | failed
// result.claimResolutions: per-claim vote outcome
// result.representative: selected spokesperson
// result.eliminations: timeout/error removals
// result.rounds: full round records
// result.action?: action output (if actionPolicy was set)

M2 status

Implemented in current master:

Early-stop (minRounds gated)
Claim merge lifecycle (active/merged/withdrawn, mergedInto, proposer union)
Claim-level final vote + consensus / partial_consensus / unresolved
Effective-voter denominator per claim
Permanent elimination on timeout/error (including final_vote)
Representative report mode via separate report session
Builtin fallback on representative report failure
Strict schema cleanup (removed waitingPolicy.mode, lateArrivalPolicy, reportPolicy.maxReportChars)

Reference implementation ADR: docs/adr/0002-m2-implementation.md

CLI status

Implemented in current master:

argue run / argue exec execute end-to-end
argue act execute action against existing result
provider types: mock, cli, api, sdk
CLI agent types: claude, codex, copilot, gemini, pi, opencode, droid, amp, generic
output artifacts: result.json, events.jsonl, summary.md
--action <prompt> / --action-agent <id> for post-debate action
--verbose for detailed per-agent output with stances, confidence, and full responses

Development

npm install
npm run build            # full one-shot build
npm run dev              # watch mode — auto-recompiles on save

In watch mode, open another terminal and test with:

npx argue <command>      # uses the latest compiled output

Before committing:

npm run ci               # check + test + build

License

MIT

Name		Name	Last commit message	Last commit date
Latest commit History 82 Commits
.github/workflows		.github/workflows
docs		docs
packages		packages
.gitignore		.gitignore
README.md		README.md
package-lock.json		package-lock.json
package.json		package.json
tsconfig.base.json		tsconfig.base.json

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

argue

Monorepo layout

Why argue

Architecture at a glance

Core flow

Delegate contract

Quick start

M2 status

CLI status

Development

License

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

argue

Monorepo layout

Why argue

Architecture at a glance

Core flow

Delegate contract

Quick start

M2 status

CLI status

Development

License

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages