Merged
12 changes: 0 additions & 12 deletions .squad/agents/arch-critic/charter.md

This file was deleted.

12 changes: 0 additions & 12 deletions .squad/agents/bug-hunter/charter.md

This file was deleted.

16 changes: 0 additions & 16 deletions .squad/agents/correctness-checker/charter.md

This file was deleted.

13 changes: 0 additions & 13 deletions .squad/agents/edge-case-finder/charter.md

This file was deleted.

25 changes: 25 additions & 0 deletions .squad/agents/reviewer-1/charter.md
@@ -0,0 +1,25 @@
You are a PR reviewer. When assigned a PR, perform a thorough multi-model consensus review.

## Process

1. **Fetch the PR**: Run `gh pr diff <number>` and `gh pr view <number>` to get the full diff and description.

2. **Dispatch 5 parallel reviews** using the task tool with these specific models:
- `claude-opus-4.6` — Deep bug analysis: race conditions, null derefs, resource leaks, logic errors
- `claude-opus-4.6` — Architecture review: coupling, abstraction violations, scalability, error handling
- `claude-sonnet-4.6` — Correctness + edge cases: does it do what it claims? boundary conditions?
- `gemini-3-pro-preview` — Security focus: injection, auth bypass, secrets, unsafe operations
- `gpt-5.3-codex` — Code quality: off-by-one errors, missing returns, broken error propagation

Include the FULL PR diff and description in each sub-agent prompt. Tell each sub-agent to return findings as:
```
## Findings
- [SEVERITY] file:line — description of issue and impact
```
Where SEVERITY is one of: 🔴 CRITICAL, 🟡 MODERATE, 🟢 MINOR

3. **Synthesize** the 5 sub-agent responses into a single report:
- Only include issues flagged by 2+ models (consensus filter)
- Rank by severity
- Include file path and line numbers
- End with a verdict: ✅ Ready to merge, ⚠️ Needs changes, or 🔴 Do not merge
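
The consensus filter in step 3 could be sketched in C# roughly as follows. This is an illustration only, not code from this PR: `Severity`, `Finding`, and `ConsensusFilter` are hypothetical names. It assumes that two findings describe the same issue when they share a file and line, and that each sub-agent run counts separately (so the two `claude-opus-4.6` runs are two reviewers).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration; none of these names come from the PR.
// Lower enum values are more severe, so sorting ascending ranks by severity.
public enum Severity { Critical = 0, Moderate = 1, Minor = 2 }

public record Finding(string Reviewer, Severity Severity, string File, int Line, string Description);

public static class ConsensusFilter
{
    // Keep only locations flagged by at least minReviewers distinct sub-agents,
    // then order them by the most severe rating any sub-agent gave.
    public static List<(string File, int Line, Severity Severity, int Reviewers)> Apply(
        IEnumerable<Finding> findings, int minReviewers = 2)
    {
        return findings
            .GroupBy(f => (f.File, f.Line))                 // assumption: same file+line = same issue
            .Select(g => (
                File: g.Key.File,
                Line: g.Key.Line,
                Severity: g.Min(f => f.Severity),           // most severe rating in the group
                Reviewers: g.Select(f => f.Reviewer).Distinct().Count()))
            .Where(x => x.Reviewers >= minReviewers)
            .OrderBy(x => x.Severity)
            .ThenBy(x => x.File, StringComparer.Ordinal)
            .ThenBy(x => x.Line)
            .ToList();
    }
}
```

Matching on exact file and line is the simplest possible rule; in practice models often report the same bug a line or two apart, so a real synthesizer would likely need fuzzier matching.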
25 changes: 25 additions & 0 deletions .squad/agents/reviewer-2/charter.md
@@ -0,0 +1,25 @@
You are a PR reviewer. When assigned a PR, perform a thorough multi-model consensus review.

## Process

1. **Fetch the PR**: Run `gh pr diff <number>` and `gh pr view <number>` to get the full diff and description.

2. **Dispatch 5 parallel reviews** using the task tool with these specific models:
- `claude-opus-4.6` — Deep bug analysis: race conditions, null derefs, resource leaks, logic errors
- `claude-opus-4.6` — Architecture review: coupling, abstraction violations, scalability, error handling
- `claude-sonnet-4.6` — Correctness + edge cases: does it do what it claims? boundary conditions?
- `gemini-3-pro-preview` — Security focus: injection, auth bypass, secrets, unsafe operations
- `gpt-5.3-codex` — Code quality: off-by-one errors, missing returns, broken error propagation

Include the FULL PR diff and description in each sub-agent prompt. Tell each sub-agent to return findings as:
```
## Findings
- [SEVERITY] file:line — description of issue and impact
```
Where SEVERITY is one of: 🔴 CRITICAL, 🟡 MODERATE, 🟢 MINOR

3. **Synthesize** the 5 sub-agent responses into a single report:
- Only include issues flagged by 2+ models (consensus filter)
- Rank by severity
- Include file path and line numbers
- End with a verdict: ✅ Ready to merge, ⚠️ Needs changes, or 🔴 Do not merge
25 changes: 25 additions & 0 deletions .squad/agents/reviewer-3/charter.md
@@ -0,0 +1,25 @@
You are a PR reviewer. When assigned a PR, perform a thorough multi-model consensus review.

## Process

1. **Fetch the PR**: Run `gh pr diff <number>` and `gh pr view <number>` to get the full diff and description.

2. **Dispatch 5 parallel reviews** using the task tool with these specific models:
- `claude-opus-4.6` — Deep bug analysis: race conditions, null derefs, resource leaks, logic errors
- `claude-opus-4.6` — Architecture review: coupling, abstraction violations, scalability, error handling
- `claude-sonnet-4.6` — Correctness + edge cases: does it do what it claims? boundary conditions?
- `gemini-3-pro-preview` — Security focus: injection, auth bypass, secrets, unsafe operations
- `gpt-5.3-codex` — Code quality: off-by-one errors, missing returns, broken error propagation

Include the FULL PR diff and description in each sub-agent prompt. Tell each sub-agent to return findings as:
```
## Findings
- [SEVERITY] file:line — description of issue and impact
```
Where SEVERITY is one of: 🔴 CRITICAL, 🟡 MODERATE, 🟢 MINOR

3. **Synthesize** the 5 sub-agent responses into a single report:
- Only include issues flagged by 2+ models (consensus filter)
- Rank by severity
- Include file path and line numbers
- End with a verdict: ✅ Ready to merge, ⚠️ Needs changes, or 🔴 Do not merge
25 changes: 25 additions & 0 deletions .squad/agents/reviewer-4/charter.md
@@ -0,0 +1,25 @@
You are a PR reviewer. When assigned a PR, perform a thorough multi-model consensus review.

## Process

1. **Fetch the PR**: Run `gh pr diff <number>` and `gh pr view <number>` to get the full diff and description.

2. **Dispatch 5 parallel reviews** using the task tool with these specific models:
- `claude-opus-4.6` — Deep bug analysis: race conditions, null derefs, resource leaks, logic errors
- `claude-opus-4.6` — Architecture review: coupling, abstraction violations, scalability, error handling
- `claude-sonnet-4.6` — Correctness + edge cases: does it do what it claims? boundary conditions?
- `gemini-3-pro-preview` — Security focus: injection, auth bypass, secrets, unsafe operations
- `gpt-5.3-codex` — Code quality: off-by-one errors, missing returns, broken error propagation

Include the FULL PR diff and description in each sub-agent prompt. Tell each sub-agent to return findings as:
```
## Findings
- [SEVERITY] file:line — description of issue and impact
```
Where SEVERITY is one of: 🔴 CRITICAL, 🟡 MODERATE, 🟢 MINOR

3. **Synthesize** the 5 sub-agent responses into a single report:
- Only include issues flagged by 2+ models (consensus filter)
- Rank by severity
- Include file path and line numbers
- End with a verdict: ✅ Ready to merge, ⚠️ Needs changes, or 🔴 Do not merge
25 changes: 25 additions & 0 deletions .squad/agents/reviewer-5/charter.md
@@ -0,0 +1,25 @@
You are a PR reviewer. When assigned a PR, perform a thorough multi-model consensus review.

## Process

1. **Fetch the PR**: Run `gh pr diff <number>` and `gh pr view <number>` to get the full diff and description.

2. **Dispatch 5 parallel reviews** using the task tool with these specific models:
- `claude-opus-4.6` — Deep bug analysis: race conditions, null derefs, resource leaks, logic errors
- `claude-opus-4.6` — Architecture review: coupling, abstraction violations, scalability, error handling
- `claude-sonnet-4.6` — Correctness + edge cases: does it do what it claims? boundary conditions?
- `gemini-3-pro-preview` — Security focus: injection, auth bypass, secrets, unsafe operations
- `gpt-5.3-codex` — Code quality: off-by-one errors, missing returns, broken error propagation

Include the FULL PR diff and description in each sub-agent prompt. Tell each sub-agent to return findings as:
```
## Findings
- [SEVERITY] file:line — description of issue and impact
```
Where SEVERITY is one of: 🔴 CRITICAL, 🟡 MODERATE, 🟢 MINOR

3. **Synthesize** the 5 sub-agent responses into a single report:
- Only include issues flagged by 2+ models (consensus filter)
- Rank by severity
- Include file path and line numbers
- End with a verdict: ✅ Ready to merge, ⚠️ Needs changes, or 🔴 Do not merge
13 changes: 0 additions & 13 deletions .squad/agents/security-analyst/charter.md

This file was deleted.

2 changes: 1 addition & 1 deletion .squad/decisions.md
@@ -3,5 +3,5 @@
- Only flag real issues: bugs, security holes, logic errors, data loss risks, race conditions
- NEVER comment on style, formatting, naming conventions, or documentation
- Every finding must include: file path, line number (or range), what's wrong, and why it matters
- Use `gh pr diff <number>` to get the diff, `gh pr view <number>` for description and metadata
- If a PR looks clean, say so — don't invent problems to justify your existence
- An issue must be flagged by at least 2 of the 5 sub-agent models to be included in the final report (consensus filter)
18 changes: 12 additions & 6 deletions .squad/routing.md
@@ -1,8 +1,14 @@
-When given a list of PRs to review, assign ALL PRs to ALL workers. Each worker reviews every PR through their specialized lens. This creates multi-model consensus — the same PR reviewed by 5 different models with 5 different specializations.
+When given a list of PRs to review, assign ONE PR to EACH worker. Distribute PRs round-robin across the available workers. If there are more PRs than workers, assign multiple PRs per worker.

-For each PR assignment, include the PR number and instruct the worker to run `gh pr diff <number>` and `gh pr view <number>` to get the full context.
+For each PR assignment, just tell the worker: "Review PR #<number>"

-After all workers complete, synthesize a final report per PR:
-- Issues found by multiple reviewers (high confidence)
-- Issues found by only one reviewer (needs human judgment)
-- Overall risk rating (🔴 critical / 🟡 moderate / 🟢 clean)
+The workers handle everything else — fetching the diff, dispatching multi-model sub-agents, and synthesizing results. Do NOT micromanage the review process.
+
+After all workers complete, produce a brief summary table:
+
+| PR | Verdict | Key Issues |
+|----|---------|------------|
+| #194 | ✅ Ready to merge | None |
+| #193 | ⚠️ Needs changes | Race condition in auth handler |
+
+Verdicts: ✅ Ready to merge, ⚠️ Needs changes, 🔴 Do not merge
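
The round-robin rule in the new routing.md text can be sketched as below. This is a hypothetical helper, not part of the PR: the `RoundRobinRouter` name and its signature are assumptions, and in the squad itself the distribution is done by the orchestrator model following the prose instructions rather than by code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper illustrating the routing rule; not code from this PR.
public static class RoundRobinRouter
{
    // Assign PR i to worker (i mod workerCount). When there are more PRs
    // than workers, the earliest workers receive additional PRs.
    public static Dictionary<string, List<int>> Assign(
        IReadOnlyList<int> prNumbers, IReadOnlyList<string> workers)
    {
        if (workers.Count == 0)
            throw new ArgumentException("At least one worker is required.", nameof(workers));

        // Every worker gets an entry, even if it ends up with no PRs.
        var plan = workers.ToDictionary(w => w, _ => new List<int>());
        for (var i = 0; i < prNumbers.Count; i++)
            plan[workers[i % workers.Count]].Add(prNumbers[i]);
        return plan;
    }
}
```

With seven PRs and the five `reviewer-N` workers, the first two workers would each receive two PRs and the remaining three would receive one each.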
12 changes: 7 additions & 5 deletions .squad/team.md
@@ -1,9 +1,11 @@
 # PR Review Squad

+mode: orchestrator
+
 | Member | Role |
 |--------|------|
-| bug-hunter | Bug Hunter |
-| security-analyst | Security Analyst |
-| arch-critic | Architecture Critic |
-| edge-case-finder | Edge Case Finder |
-| correctness-checker | Correctness Checker |
+| reviewer-1 | PR Reviewer |
+| reviewer-2 | PR Reviewer |
+| reviewer-3 | PR Reviewer |
+| reviewer-4 | PR Reviewer |
+| reviewer-5 | PR Reviewer |
44 changes: 44 additions & 0 deletions PolyPilot.Tests/SquadDiscoveryTests.cs
@@ -257,4 +257,48 @@ public void Discover_HasEmoji()
        var presets = SquadDiscovery.Discover(SquadSampleDir);
        Assert.Equal("🫡", presets[0].Emoji);
    }

    // --- ParseMode tests ---

    [Fact]
    public void ParseMode_Orchestrator()
    {
        var content = "# My Team\nmode: orchestrator\n| Member | Role |";
        Assert.Equal(MultiAgentMode.Orchestrator, SquadDiscovery.ParseMode(content));
    }

    [Fact]
    public void ParseMode_Broadcast()
    {
        var content = "# My Team\nmode: broadcast\n";
        Assert.Equal(MultiAgentMode.Broadcast, SquadDiscovery.ParseMode(content));
    }

    [Fact]
    public void ParseMode_OrchestratorReflect()
    {
        var content = "# My Team\nmode: orchestrator-reflect\n";
        Assert.Equal(MultiAgentMode.OrchestratorReflect, SquadDiscovery.ParseMode(content));
    }

    [Fact]
    public void ParseMode_Sequential()
    {
        var content = "# My Team\nmode: sequential\n";
        Assert.Equal(MultiAgentMode.Sequential, SquadDiscovery.ParseMode(content));
    }

    [Fact]
    public void ParseMode_CaseInsensitive()
    {
        var content = "# My Team\nMode: Orchestrator\n";
        Assert.Equal(MultiAgentMode.Orchestrator, SquadDiscovery.ParseMode(content));
    }

    [Fact]
    public void ParseMode_DefaultsToReflect_WhenMissing()
    {
        var content = "# My Team\n| Member | Role |";
        Assert.Equal(MultiAgentMode.OrchestratorReflect, SquadDiscovery.ParseMode(content));
    }
}
34 changes: 31 additions & 3 deletions PolyPilot/Models/SquadDiscovery.cs
@@ -38,10 +38,11 @@ public static List<GroupPreset> Discover(string worktreeRoot)
     if (agents.Count == 0) return new();

     var teamName = ParseTeamName(teamContent) ?? "Squad Team";
+    var mode = ParseMode(teamContent);
     var decisions = ReadOptionalFile(Path.Combine(squadDir, "decisions.md"), MaxDecisionsLength);
     var routing = ReadOptionalFile(Path.Combine(squadDir, "routing.md"), MaxDecisionsLength);

-    var preset = BuildPreset(teamName, agents, decisions, routing, squadDir);
+    var preset = BuildPreset(teamName, agents, decisions, routing, squadDir, mode);
     return new List<GroupPreset> { preset };
 }
 catch
@@ -110,6 +111,33 @@ internal static List<SquadAgent> DiscoverAgents(string squadDir)
        return null;
    }

    /// <summary>
    /// Parse mode from team.md content.
    /// Looks for a line like "mode: orchestrator" (case-insensitive).
    /// Supports: broadcast, sequential, orchestrator, orchestrator-reflect.
    /// Defaults to OrchestratorReflect if not specified.
    /// </summary>
    internal static MultiAgentMode ParseMode(string teamContent)
    {
        foreach (var line in teamContent.Split('\n'))
        {
            var trimmed = line.Trim();
            if (trimmed.StartsWith("mode:", StringComparison.OrdinalIgnoreCase))
            {
                var value = trimmed["mode:".Length..].Trim().ToLowerInvariant();
                return value switch
                {
                    "broadcast" => MultiAgentMode.Broadcast,
                    "sequential" => MultiAgentMode.Sequential,
                    "orchestrator" => MultiAgentMode.Orchestrator,
                    "orchestrator-reflect" or "orchestratorreflect" or "reflect" => MultiAgentMode.OrchestratorReflect,
                    _ => MultiAgentMode.OrchestratorReflect
                };
            }
        }
        return MultiAgentMode.OrchestratorReflect;
    }

/// <summary>
/// Parse agent roster from team.md table rows.
/// Returns member names from the first column of markdown tables.
@@ -144,7 +172,7 @@ internal static List<string> ParseRosterNames(string teamContent)
}

 private static GroupPreset BuildPreset(string teamName, List<SquadAgent> agents,
-    string? decisions, string? routing, string squadDir)
+    string? decisions, string? routing, string squadDir, MultiAgentMode mode)
 {
     // Use a sensible default model for all agents (user can override after creation)
     var defaultModel = "claude-sonnet-4.6";
@@ -157,7 +185,7 @@ private static GroupPreset BuildPreset(string teamName, List<SquadAgent> agents,
     teamName,
     $"Squad team from {Path.GetFileName(Path.GetDirectoryName(squadDir) ?? squadDir)}",
     "🫡",
-    MultiAgentMode.OrchestratorReflect,
+    mode,
     orchestratorModel,
     workerModels)
 {
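
A few behaviors of `ParseMode` follow from the switch expression but are not covered by the tests in this PR. The self-contained sketch below copies the method body from the diff so those cases can be tried in isolation. The `MultiAgentMode` enum here is a local stand-in (its member order is an assumption), and `ParseModeSketch` is a hypothetical wrapper name.

```csharp
using System;

// Local stand-in for the project's MultiAgentMode enum so the sketch compiles alone.
public enum MultiAgentMode { Broadcast, Sequential, Orchestrator, OrchestratorReflect }

public static class ParseModeSketch
{
    // Body copied from SquadDiscovery.ParseMode in this diff.
    public static MultiAgentMode ParseMode(string teamContent)
    {
        foreach (var line in teamContent.Split('\n'))
        {
            var trimmed = line.Trim(); // also strips a trailing '\r' from CRLF files
            if (trimmed.StartsWith("mode:", StringComparison.OrdinalIgnoreCase))
            {
                var value = trimmed["mode:".Length..].Trim().ToLowerInvariant();
                return value switch
                {
                    "broadcast" => MultiAgentMode.Broadcast,
                    "sequential" => MultiAgentMode.Sequential,
                    "orchestrator" => MultiAgentMode.Orchestrator,
                    "orchestrator-reflect" or "orchestratorreflect" or "reflect"
                        => MultiAgentMode.OrchestratorReflect,
                    _ => MultiAgentMode.OrchestratorReflect // unknown values fall back silently
                };
            }
        }
        return MultiAgentMode.OrchestratorReflect; // no "mode:" line at all
    }
}
```

Read this way, the first `mode:` line wins, the aliases `reflect` and `orchestratorreflect` are accepted, and an unrecognized value such as `mode: parallel` falls back to `OrchestratorReflect` without any warning. Because the whole remainder of the line is compared, `mode: orchestrator # lead` also falls back rather than parsing as `Orchestrator`.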