
feat: Multi-agent orchestration with Squad integration #104

Merged
PureWeen merged 58 commits into main from copilot/add-multi-agent-support
Feb 21, 2026


Copilot AI commented Feb 14, 2026

Multi-Agent Orchestration with Squad Integration

Adds a complete multi-agent system to PolyPilot — create teams of AI sessions that work together using different models, with full bradygaster/squad format compatibility.

Features

Orchestration Modes

  • Broadcast — Same prompt to all sessions simultaneously
  • Sequential — Sessions process one at a time in order
  • Orchestrator — Single-pass plan → dispatch → collect → synthesize
  • OrchestratorReflect — Iterative loop with stall detection, quality evaluation, and auto-adjustment

Squad Integration (.squad/ format)

  • Discovery — Reads .squad/ (or legacy .ai-team/) directories from worktree roots → GroupPreset with agent charters as system prompts, decisions.md as shared context, routing.md as orchestrator planning context
  • Write-back — Saving a preset writes .squad/ format back to the worktree for repo-level sharing, plus presets.json as personal backup
  • Round-trip — Write → discover → compare verified in tests
  • Three-tier merge — Built-in presets < User presets (~/.polypilot/presets.json) < Repo teams (.squad/)
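
The three-tier priority above can be sketched as a last-writer-wins merge. This is a minimal illustration in Python rather than the project's C#; matching presets by name and the `source` field are assumptions, since the description only states the priority order.

```python
def merge_presets(built_in, user, repo):
    """Merge preset lists; later tiers override earlier ones.

    Assumption: presets are matched by their "name" key.
    Priority (lowest to highest): built-in < user < repo.
    """
    merged = {}
    for tier, presets in (("built-in", built_in), ("user", user), ("repo", repo)):
        for preset in presets:
            # A later tier replaces an earlier preset with the same name.
            merged[preset["name"]] = {**preset, "source": tier}
    return list(merged.values())
```

With this shape, a repo-level `.squad/` team with the same name as a built-in preset would shadow it in the picker.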

Preset Picker UI

  • Sectioned picker: "📂 From Repo" (Squad) / "⚙️ Built-in" / "👤 My Presets"
  • 4 built-in presets: Code Review Team, Multi-Perspective Analysis, Quick Reflection Cycle, Deep Research
  • Squad presets show 🫡 badge indicating repo-level source

Reflection Loop

  • Sentinel detection ([[REFLECTION_COMPLETE]]) for goal-met signaling
  • Jaccard similarity stall detection (0.9 threshold, sliding window of 5)
  • Dedicated evaluator session support (independent scoring)
  • Auto-adjustment: failed workers, brief responses, quality degradation
  • Pause/resume with stall state reset
  • Error budget (3 consecutive errors → stall)
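
As a rough illustration of the Jaccard-based stall check listed above, here is a small Python sketch (the project itself is C#). Whitespace tokenization and comparing each new response against every entry in the window are assumptions; only the 0.9 threshold and the window of 5 come from the description.

```python
from collections import deque


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity of the lowercase word sets of two strings."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa and not sb:
        return 1.0  # two empty responses count as identical
    return len(sa & sb) / len(sa | sb)


class StallDetector:
    """Flags a stall when a response is near-identical to a recent one."""

    def __init__(self, threshold: float = 0.9, window: int = 5) -> None:
        self.threshold = threshold
        self.recent = deque(maxlen=window)  # sliding window of past responses

    def is_stalled(self, response: str) -> bool:
        stalled = any(
            jaccard(response, previous) >= self.threshold
            for previous in self.recent
        )
        self.recent.append(response)
        return stalled
```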

Group Management

  • Per-worker system prompts (agent personas from Squad charters)
  • Per-session model assignment with capability warnings
  • Group deletion: multi-agent groups close all sessions; regular groups move sessions to default
  • Reconciliation protection: multi-agent sessions never auto-moved to other groups

Key Files

| File | Purpose |
| --- | --- |
| PolyPilot/Services/CopilotService.Organization.cs | Orchestration engine, group lifecycle, preset management |
| PolyPilot/Models/SquadDiscovery.cs | .squad/ → GroupPreset parser |
| PolyPilot/Models/SquadWriter.cs | GroupPreset → .squad/ writer |
| PolyPilot/Models/ModelCapabilities.cs | GroupPreset, UserPresets, model registry |
| PolyPilot/Models/ReflectionCycle.cs | Reflection state, stall detection, evaluator prompts |
| PolyPilot/Components/Layout/SessionSidebar.razor | Preset picker UI, multi-agent controls |
| docs/multi-agent-orchestration.md | Architecture spec |

Test Coverage

  • 1114 tests passing (19 new gap tests from multi-model review)
  • SquadDiscoveryTests — 22 tests (discovery, parsing, merge, edge cases)
  • SquadWriterTests — 16 tests (write, round-trip, sanitize, stale dir cleanup)
  • MultiAgentRegressionTests — 37 tests (orchestration, TCS ordering, reconciliation)
  • MultiAgentGapTests — 17 tests (ParseTaskAssignments, ModelCapabilities, BuildCompletionSummary)
  • SessionOrganizationTests — Group CRUD, deletion, preset save with worktree
  • 26 CDP scenarios in multi-agent-scenarios.json

Review Process

Feature reviewed by Opus 4.6, Sonnet 4.6, and Codex 5.2 in parallel. Two bugs found and fixed:

  1. BuildCompletionSummary ternary ordering — stalls now correctly show "⚠️ Stalled" instead of "⏹️ Cancelled"
  2. SquadWriter stale agent cleanup — overwriting now removes old agent dirs before re-writing
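
The first fix comes down to check ordering. A minimal Python sketch (not the project's C# code) of why the stalled check has to come first: per the commit notes, stall termination also sets the cancelled flag, so testing cancellation first hides the stall. The `"Completed"` fallback label is hypothetical.

```python
def completion_label(is_stalled: bool, is_cancelled: bool) -> str:
    """Pick the status label, giving stalls priority over cancellation."""
    if is_stalled:
        return "⚠️ Stalled"
    if is_cancelled:
        return "⏹️ Cancelled"
    return "Completed"  # hypothetical fallback label
```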

Note: 4 pre-existing DiffParserTests failures from main (line-ending \r issue) — not related to this PR.

Copilot AI and others added 2 commits February 14, 2026 05:17
…e, dashboard UI, and tests

Co-authored-by: PureWeen <5375137+PureWeen@users.noreply.github.com>
…chestrator prompt formatting

Co-authored-by: PureWeen <5375137+PureWeen@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add support for multi agent mode in dashboard" to "Add multi-agent orchestration mode for session groups" Feb 14, 2026
Copilot AI requested a review from PureWeen February 14, 2026 05:24
PureWeen and others added 3 commits February 17, 2026 09:47
…text prefix, bridge role support

- SetSessionRole enforces max 1 orchestrator per group (auto-demotes previous)
- Broadcast/Sequential dispatch prepends multi-agent context prefix with role and team info
- Add multi_agent_set_role bridge message type, payload, and WsBridgeServer handler
- Fix _copilotService -> _copilot field reference bug in bridge handlers
- Add .card-role-badge CSS styles for orchestrator/worker badges
- Add 3 new tests (orchestrator invariant, payload serialization, message type)
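
The single-orchestrator invariant from the first bullet can be sketched as follows. This is an illustrative Python stand-in for the C# `SetSessionRole`; the dict-based group and the role strings are assumptions.

```python
def set_session_role(group: dict, session: str, role: str) -> None:
    """Assign a role; promoting an orchestrator demotes the previous one.

    `group` maps session name -> role ("orchestrator" or "worker").
    """
    if role == "orchestrator":
        for name, current in group.items():
            if current == "orchestrator" and name != session:
                group[name] = "worker"  # auto-demote the old orchestrator
    group[session] = role
```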

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Resolves conflicts in BridgeMessages.cs, WsBridgeServer.cs, and
SessionCard.razor.css to include both multi-agent and fiesta/unread-badge
changes from main.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Orchestrator mode now runs a complete plan→dispatch→collect→synthesize loop:
- Orchestrator receives prompt and assigns tasks using @worker:name markers
- Tasks dispatched to workers in parallel with SendPromptAndWaitAsync
- Worker results collected and sent back to orchestrator for synthesis
- Phase events (Planning/Dispatching/WaitingForWorkers/Synthesizing/Complete)
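
To make the `@worker:name` marker idea concrete, here is a hedged Python sketch of parsing an orchestrator plan into per-worker tasks. The exact marker grammar is not given in this PR, so the assumption here is that each marker is followed by its task text, which runs until the next marker; the real `ParseTaskAssignments` may differ.

```python
import re

# Assumed marker shape: "@worker:" followed by a name without whitespace.
MARKER = re.compile(r"@worker:(\S+)")


def parse_task_assignments(plan: str) -> list[tuple[str, str]]:
    """Return (worker, task) pairs in the order they appear in the plan."""
    # With one capture group, split() yields:
    # [preamble, name1, body1, name2, body2, ...]
    parts = MARKER.split(plan)
    pairs = zip(parts[1::2], parts[2::2])
    return [(name, body.strip()) for name, body in pairs if body.strip()]
```

Text before the first marker is treated as preamble and ignored, and markers with an empty body are dropped.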

Sidebar improvements:
- Add '🤖 + Multi-Agent' button to create multi-agent groups from sidebar
- Add '🤖 Convert to Multi-Agent' option in group context menu
- Show 🤖 badge on multi-agent group headers in sidebar
- Add ConvertToMultiAgent method to CopilotService

Tests: 5 new tests for ParseTaskAssignments and ConvertToMultiAgent

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen force-pushed the copilot/add-multi-agent-support branch from 7857070 to dce3134 February 17, 2026 16:23
PureWeen and others added 5 commits February 17, 2026 10:29
Sessions in multi-agent groups now show:
- 🎯 Set as Orchestrator / 👷 Set as Worker in the ⋯ menu
- 🎯 badge next to the session name when it's the orchestrator

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Add compact orchestration controls (mode dropdown + textarea + send button)
under multi-agent group headers in the session sidebar, so users can
send broadcast messages and change modes without collapsing the expanded
session view.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…nt groups

When a session belonging to a multi-agent group is expanded (full-screen),
the dashboard grid and its multi-agent controls become hidden. This adds a
compact sticky toolbar at the top of the expanded view showing:
- Group name with multi-agent badge
- Mode selector dropdown (Broadcast/Sequential/Orchestrator)
- Inline text input with Send All button
- Phase progress indicator when orchestration is running

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Users can now access multi-agent orchestration without leaving the
expanded session view:

Sidebar:
- Mode selector (Broadcast/Sequential/Orchestrator) under group header
- Compact Send All input bar with textarea + send button
- Real-time orchestrator phase progress indicator

Expanded session view:
- Sticky toolbar when active session is in a multi-agent group
- Shows group name, mode selector, Send All input
- Phase progress indicator during orchestration loop

Phase indicators show animated status:
- 🎯 Planning... → 📡 Dispatching... → ⏳ Waiting... → 🔄 Synthesizing...

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen changed the title from "Add multi-agent orchestration mode for session groups" to "Multi-agent orchestration: full orchestrator loop, sidebar controls, and role management" Feb 17, 2026
PureWeen and others added 11 commits February 17, 2026 21:53
- Use fixed positioning instead of absolute for mobile to prevent off-screen clipping
- Center dropdown on screen with translateX(-50%)
- Increase touch target size to 44px minimum (iOS guideline)
- Improve max-height calculation for small screens (50vh)
- Add touch event handling support (@ontouchstart:preventDefault)

Fixes model dropdown being cut off at edges of mobile screens.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Per-Agent Model Assignment:
- Add PreferredModel to SessionMeta for per-session model override
- Add DefaultWorkerModel/DefaultOrchestratorModel to SessionGroup
- Add SetSessionPreferredModel() and GetEffectiveModel() to CopilotService
- Add EnsureSessionModelAsync() called before all dispatch paths (broadcast,
  sequential, orchestrator) to switch models at dispatch time
- Include model info in BuildMultiAgentPrefix and planning prompts so
  orchestrators know each worker's capabilities
- Add inline model picker in SessionListItem context menu for multi-agent groups
- Show model override indicator (⚡) in session metadata row

OrchestratorReflect Mode:
- Add OrchestratorReflect enum value to MultiAgentMode
- Add GroupReflectionState class with goal, iteration tracking, stall detection
- Implement SendViaOrchestratorReflectAsync with iterative loop:
  Plan -> Dispatch -> Collect -> Synthesize+Evaluate -> repeat until
  [[GROUP_REFLECT_COMPLETE]] sentinel or stall/max iterations
- Add StartGroupReflection/StopGroupReflection/PauseGroupReflection methods
- Add OrchestratorReflect option in sidebar mode selector
- Add group reflection status bar with iteration counter, goal, pause/stop

Tests (20 new, 650 total passing):
- PerAgentModelAssignmentTests: store/clear/effective model, serialization
- GroupReflectionStateTests: creation, stall detection, completion summaries,
  serialization, evaluation extraction

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Model Capability System:
- Add ModelCapabilities static registry with capability flags per model family
  (CodeExpert, ReasoningExpert, Fast, CostEfficient, ToolUse, Vision, LargeContext)
- GetRoleWarnings() warns when assigning cheap models as orchestrator or
  non-tool-use models as workers
- GetStrengths() returns human-readable model description
- Warnings displayed inline in SessionListItem model picker

Group Presets (one-click multi-agent creation):
- Add GroupPreset with 4 built-in templates:
  Code Review Team, Multi-Perspective Analysis, Fast Iteration Squad, Deep Research
- Each preset defines orchestrator model, worker models, and dispatch mode
- 🚀 Preset button in sidebar toolbar opens picker panel
- CreateGroupFromPresetAsync creates group + sessions with correct roles/models

Race-Safe Model Switching:
- Add per-session SemaphoreSlim in EnsureSessionModelAsync via ConcurrentDictionary
- Double-check pattern: re-verify model after acquiring lock
- Prevents concurrent dispatches from racing on model switch
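
The per-session lock with a double-check can be sketched with `asyncio` in Python (the project uses `SemaphoreSlim` in C#). The class and method names are hypothetical, and `asyncio.sleep(0)` stands in for the real asynchronous model-switch call.

```python
import asyncio
from collections import defaultdict


class ModelSwitcher:
    """Switches a session's model at most once per requested change."""

    def __init__(self) -> None:
        self.current = {}                        # session -> active model
        self.locks = defaultdict(asyncio.Lock)   # one lock per session
        self.switch_count = 0

    async def ensure_model(self, session: str, model: str) -> None:
        if self.current.get(session) == model:
            return  # fast path, no lock needed
        async with self.locks[session]:
            # Double-check: another dispatch may have switched while we waited.
            if self.current.get(session) == model:
                return
            await asyncio.sleep(0)  # stand-in for the real switch call
            self.current[session] = model
            self.switch_count += 1

    def forget(self, session: str) -> None:
        """Drop per-session state on close so locks do not accumulate."""
        self.locks.pop(session, None)
        self.current.pop(session, None)
```

Five concurrent `ensure_model("w1", "model-a")` calls would perform a single switch: the first holder of the lock does the work and the rest return at the re-check.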

Richer Evaluation Prompts:
- BuildSynthesisWithEvalPrompt now includes quality assessment criteria
  (completeness, correctness, relevance) and iteration-aware urgency hints
- Cross-iteration feedback tracking: previous evaluation included in next iteration

Tests (10 new, 660 total passing):
- ModelCapabilitiesTests: known/unknown models, fuzzy match, role warnings
- GroupPresetTests: built-in validation, OrchestratorReflect mode coverage

Architecture consulted with: Claude Opus 4.6 (system design), Gemini 3 Pro
(UX/prompt design), GPT-5 (extensibility/concurrency patterns).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…erence, and auto-adjust banners

- GroupReflectionState: EvaluationResult history, QualityTrend tracking, optional EvaluatorSession, PendingAdjustments for UI banners
- Dedicated evaluator: separate session scores each iteration independently via SCORE/RATIONALE format
- ModelCapabilities: InferFromName() for unknown model variants (opus/sonnet/haiku/codex/mini/max patterns)
- ParseEvaluationScore: robust 0-1 score extraction with clamping
- AutoAdjustFromFeedback: quality degradation from eval history, PendingAdjustments for banner UX
- Sidebar: evaluation score display, adjustment banner with warning styling
- 690 tests passing (21 new: EvaluationTracking, ModelNameInference, ParseEvaluationScore)
- Consulted: Gemini 3 Pro (UX), GPT-5 (extensibility architecture)
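
A minimal Python sketch of score extraction with clamping, in the spirit of `ParseEvaluationScore`. The regular expression and the default returned when no score is found are assumptions; the PR only states that the evaluator replies in a SCORE/RATIONALE format and that the score is clamped to 0-1.

```python
import re

# Assumed reply shape: a line such as "SCORE: 0.85" somewhere in the text.
SCORE_RE = re.compile(r"SCORE:\s*(-?\d+(?:\.\d+)?)", re.IGNORECASE)


def parse_evaluation_score(text: str, default: float = 0.0) -> float:
    """Extract the first SCORE value and clamp it into [0.0, 1.0]."""
    match = SCORE_RE.search(text)
    if match is None:
        return default  # assumption: treat a missing score as the default
    return max(0.0, min(1.0, float(match.group(1))))
```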

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
9 scenario tests that serve as executable documentation of complete user flows:
- Scenario_CreateGroupFromPreset: preset picker → group creation → role/model validation
- Scenario_WeakOrchestratorWarnings: model picker → role warnings → group diagnostics
- Scenario_FullReflectCycleWithScoring: 4-iteration cycle with eval scores (0.4→0.7→0.65→0.92)
- Scenario_AutoAdjustDetectsIssuesAndSurfacesBanner: quality degradation → amber banner
- Scenario_SaveAndReuseCustomPreset: save → persist → reload with user badge
- Scenario_DedicatedEvaluatorScoring: SCORE/RATIONALE parsing → completion detection
- Scenario_StallDetectionStopsLoop: hash-based repeat detection → auto-stop
- Scenario_NewModelReleasesHandledGracefully: name-pattern inference for future models
- Scenario_DiagnosticsGuideMisconfiguration: error/warning/info diagnostics flow

Also enhanced GetStrengths to generate descriptions from inferred capabilities.
699 tests passing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Merge EvaluationHistory, QualityTrend, EvaluationResult, PendingAdjustments into ReflectionCycle
- Remove GroupReflectionState from SessionOrganization.cs
- Update CopilotService.Organization.cs to use ReflectionCycle for multi-agent reflect
- Update tests: replace GroupReflectionState.Create -> ReflectionCycle.Create
- 699 tests passing, 0 regressions

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- HandleCreateGroup: use CreateGroupFromPresetAsync instead of old CreateMultiAgentGroupAsync
- CreateFromPreset: remove nonexistent GetActiveSessionWorkingDirectory call
- CopilotService: expose BaseDir as internal for sidebar preset access

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…teSessionForm

The main merge added OnCreateGroup bindings to CreateSessionForm, but that
component has no such parameter. This caused InvalidOperationException on render.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen and others added 11 commits February 18, 2026 15:28
Both the grid-view group input bar and the expanded toolbar now
show 'Iterations: [number]' when Reflect mode is selected.
Auto-starts reflection with configured max iterations on send.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Root cause: TrySetResult was called BEFORE IsProcessing=false in the
event handler's CompleteResponse. If the TCS continuation runs
synchronously (common in orchestrator reflection loops), the next
SendPromptAsync call sees IsProcessing=true and throws.

Fix: Clear IsProcessing before calling TrySetResult/TrySetException
in both the success and error paths.

Also add try-catch around the reflection while loop body so iteration
errors don't silently kill the entire reflection cycle. Failed iterations
retry up to 3 times before stalling.
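
The ordering invariant can be reproduced in a few lines of Python. `concurrent.futures.Future` runs its done-callbacks synchronously inside `set_result`, which mirrors a `TaskCompletionSource` continuation running inline. The `Session` class below is a hypothetical stand-in, not the project's type.

```python
from concurrent.futures import Future


class Session:
    def __init__(self) -> None:
        self.is_processing = False
        self.pending = None

    def send_prompt(self, prompt: str) -> Future:
        if self.is_processing:
            raise RuntimeError("session is already processing")
        self.is_processing = True
        self.pending = Future()
        return self.pending

    def complete_response(self, text: str) -> None:
        pending = self.pending
        # Clear the flag BEFORE completing the future: callbacks run
        # synchronously inside set_result and may call send_prompt again.
        self.is_processing = False
        pending.set_result(text)
```

If the two statements in `complete_response` were swapped, a callback that immediately sends the next prompt would still see `is_processing` set and fail, which is the failure mode described above.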

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
… tests

Root cause: ReconcileOrganization would auto-move sessions from _default to
repo groups based on WorktreeId, even if the sessions were orphaned from a
deleted multi-agent group. This silently scattered team sessions across repos.

Fixes:
- Skip auto-reassignment for sessions in existing multi-agent groups
- Skip auto-move to repo group for sessions with Orchestrator role or
  PreferredModel set (markers of multi-agent team membership)
- Make ReconcileOrganization internal for testability
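
The protection rules above amount to a guard evaluated before any auto-move. A hedged Python sketch follows; the dict field names are hypothetical and the real `ReconcileOrganization` works on the project's own C# types.

```python
def should_auto_move(session: dict, groups: dict) -> bool:
    """Decide whether reconciliation may move a session to a repo group."""
    group = groups.get(session.get("group_id"))
    # Sessions inside an existing multi-agent group are never reassigned.
    if group is not None and group.get("is_multi_agent"):
        return False
    # An orchestrator role or a preferred model marks team membership,
    # even if the original multi-agent group has since been deleted.
    if session.get("role") == "orchestrator" or session.get("preferred_model"):
        return False
    # Otherwise a worktree id is what ties the session to a repo group.
    return session.get("worktree_id") is not None
```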

New GroupingStabilityTests class (14 tests):
- Multi-agent group full-state JSON round-trip
- Multiple group types survive serialization
- DeleteGroup moves sessions to default and preserves metadata
- Reconciliation protects multi-agent group sessions
- Orphaned sessions from deleted groups stay in default
- Orphaned multi-agent sessions not auto-moved to repo groups
- Full lifecycle: create → delete → reconcile stability
- wasMultiAgent heuristic detection
- ReflectionState survives JSON round-trip

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Keeps lightweight Debug logs for:
- LoadOrganization: group/session count on load
- ReconcileOrganization: orphaned sessions moved to _default
- ReconcileOrganization: sessions pruned (no longer known)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Tests cover:
- Organization JSON corruption resilience (missing fields, extra fields, complex round-trips)
- Reconciliation scattering protection (multi-agent sessions, orphaned workers/orchestrators)
- Preset creation Role/PreferredModel markers (round-trip preservation)
- Mode enum completeness (all values, string serialization)
- Reflection loop error resilience (retry logic, sentinel detection, stall handling)
- TCS ordering invariant (IsProcessing before TrySetResult)
- Full lifecycle delete-recreate scenarios (no contamination)
- App restart simulation (serialize-reconcile-verify)
- wasMultiAgent heuristic Theory tests

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- docs/multi-agent-orchestration.md: Full spec covering OrchestratorReflect loop,
  all 4 modes, sentinel protocol, stall detection, TCS ordering invariant,
  reconciliation rules, error handling, and testing guide
- PolyPilot.Tests/Scenarios/multi-agent-scenarios.json: 10 executable CDP
  scenarios covering reflection loop, stall detection, reconciliation,
  broadcast, preset creation, and TCS ordering regression
- .github/copilot-instructions.md: Updated pointer to docs and scenarios

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
# Conflicts:
#	PolyPilot/Components/Layout/SessionListItem.razor
The 10-second hardcoded timeout in ResumeSessionAsync was prematurely
clearing IsProcessing on sessions that were still actively working.
Tool calls (dotnet build, git push, etc.) can easily go 30-60 seconds
between events, causing the resume logic to declare the turn dead.

Changes:
- Remove the 10-second resume timeout entirely — the processing
  watchdog (120s inactivity / 600s tool execution) already handles
  stuck sessions properly
- Move event handler subscription (copilotSession.On) BEFORE the
  watchdog setup to fix a race where events arriving immediately
  after SDK resume were missed because the handler wasn't wired yet

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The processing watchdog was incorrectly using the 120s inactivity timeout
even when the session was actively running multi-turn tool calls. This
happened because AssistantTurnStartEvent resets ActiveToolCallCount to 0
between tool rounds, making the model's 'thinking' gap between tools
look like inactivity.

Added HasUsedToolsThisTurn flag that stays true for the entire processing
cycle once any tool executes. The watchdog now uses the 600s tool timeout
when: a tool is actively running (hasActiveTool), the session was resumed
mid-turn (IsResumed), or tools have been used this turn (HasUsedToolsThisTurn).
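
The timeout selection described here reduces to one predicate. A Python sketch, with the two constants taken from the commit text and the function name chosen for illustration:

```python
INACTIVITY_TIMEOUT_S = 120  # no events at all for this long
TOOL_TIMEOUT_S = 600        # longer budget while tools are involved


def watchdog_timeout(has_active_tool: bool, is_resumed: bool,
                     has_used_tools_this_turn: bool) -> int:
    """Return the watchdog timeout in seconds for the current turn."""
    if has_active_tool or is_resumed or has_used_tools_this_turn:
        return TOOL_TIMEOUT_S
    return INACTIVITY_TIMEOUT_S
```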

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Workers can now have a SystemPrompt on SessionMeta that defines their
specialization (e.g., 'security auditor', 'performance optimizer').

- SessionMeta.SystemPrompt: nullable, serializable to org.json
- BuildOrchestratorPlanningPrompt: includes worker descriptions so the
  orchestrator routes tasks based on expertise
- ExecuteWorkerAsync: prepends worker's system prompt instead of generic
- GroupPreset.WorkerSystemPrompts: per-worker prompts indexed to models
- CreateGroupFromPresetAsync: applies preset system prompts to workers
- SetSessionSystemPrompt: public API for setting/clearing prompts
- Built-in presets updated with meaningful personas:
  Code Review Team (correctness + security reviewers)
  Quick Reflection Cycle (implementation + testing + docs specialists)
  Deep Research (analyst + creative problem solver)
- 7 new regression tests (JSON round-trip, null safety, prompt content)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Findings from OrchestratorReflect review with Sonnet 4.6 + GPT-5.3-Codex:

Critical:
- Carry ProcessingGeneration across SessionState replacement on reconnect
  to prevent stale callbacks from passing generation checks

High:
- Add atomic SendingFlag to prevent TOCTOU race in SendPromptAsync
- Gate orphaned event handlers: skip SessionIdleEvent/SessionErrorEvent
  on !isCurrentState to prevent stale handlers clearing IsProcessing
- Add lock around _queuedImagePaths inner List mutations
- JsonIgnore ConsecutiveStalls — private stall state not recoverable
  from JSON, persisting counter creates inconsistent state on restart
- Split ConsecutiveErrors from ConsecutiveStalls — different thresholds
  and recovery strategies for errors vs stalls

Medium:
- Reset IsResumed after first CompleteResponse so subsequent turns use
  normal 120s watchdog timeout instead of permanent 600s
- Add RunContinuationsAsynchronously to all TaskCompletionSource<string>
  to prevent inline continuation reentrancy
- Empty worker assignment on first iteration treated as error not goal met

Low:
- Add IsCancelled flag to ReflectionCycle for StopGroupReflection
- Replace GetHashCode() with full string equality in stall detection
- Prune ghost __evaluator_* sessions on startup if not referenced by
  active reflection cycle

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen and others added 11 commits February 19, 2026 16:39
- Fix #9 off-by-one: CurrentIteration==0 check unreachable (now ==1)
- Fix #10 incomplete: set IsCancelled on OperationCanceledException
- Fix ConsecutiveErrors: reset to 0 after successful iteration
- Fix #11 stale comments: update hash references to string equality
- Fix #12 incomplete: mark pruned ghost evaluators in _closedSessionIds

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Concurrency fixes:
- Swap _sessions before wiring event handler on reconnect (#2)
- Block ALL events from orphaned handlers, not just terminal (#3)
- Add lock(_imageQueueLock) to all image queue mutations (#4)
  including dequeue, reinsert, ClearQueue, rename, close, dispose
- Clear IsResumed on error and watchdog paths (#5)
- Add RunContinuationsAsynchronously to remaining TCS (#6)

Architecture/contract fixes:
- Add [JsonIgnore] to ShouldWarnOnStall, LastSimilarity (#7)
- Fix ConsecutiveErrors increment-before-check ordering (#8)
- Set IsCancelled on all non-success termination paths (#10)
  including stall, error-stall, max-iteration, OperationCanceled,
  empty-assignment error stall, and single-agent StopReflectionCycle
- Add session dir deletion for ghost evaluator pruning (#12)
- Add CompletedAt to StopReflectionCycle (#12 related)

Already correct (no changes needed):
- #9: CurrentIteration == 1 check was already fixed
- #11: Comments already reference string-based stall detection

Documentation:
- Update stall detection from 'hash match' to 'string equality'
- Update error handling to show ConsecutiveErrors (not ConsecutiveStalls)
- Add IsCancelled invariant to exit conditions table
- Add 5 new invariants: orphan gate, reconnect ordering,
  image queue locking, IsResumed clearing, TCS creation
- Document empty-assignment retry behavior

817/817 tests pass.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Architecture spec: new 'Squad Integration' section with mapping table,
  discovery flow, preset priority, security constraints, GroupPreset extensions
- Copilot instructions: document Squad discovery from .squad/ directories
- Scenarios: 7 new CDP scenarios for Squad discovery, charter→system prompt,
  decisions.md injection, legacy .ai-team/ compat, preset priority, graceful
  handling of missing files, worker descriptions in orchestrator planning
- Tests: 2 new ScenarioReferenceTests validating multi-agent scenario IDs
  and verifying Squad integration scenarios are present (819 tests passing)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add SquadDiscovery.cs: parses .squad/ and .ai-team/ directories into GroupPresets
  (team.md roster, agent charters as system prompts, decisions.md shared context,
  routing.md orchestrator context)
- Extend GroupPreset with IsRepoLevel, SourcePath, SharedContext, RoutingContext
- Extend SessionGroup with SharedContext and RoutingContext for orchestration
- Add three-tier preset merge in UserPresets.GetAll(baseDir, repoWorkingDirectory)
- Update SessionSidebar.razor: sectioned preset picker with From Repo / Built-in /
  My Presets sections and repo source badge
- Inject shared context (decisions.md) into worker prompts via ExecuteWorkerAsync
- Inject routing context into orchestrator planning prompt
- Store Squad context on group during CreateGroupFromPresetAsync
- Add 20 SquadDiscoveryTests covering discovery, parsing, merge, edge cases
- Add test data fixtures for .squad/ and .ai-team/ formats
- Document copilot-instructions.md auto-inheritance via SDK

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When deleting a multi-agent group, sessions were moved to the default
'Sessions' group but kept their Role, PreferredModel, and WorktreeId
markers — appearing as orphaned multi-agent sessions in the sidebar.

Now DeleteGroup checks IsMultiAgent: if true, sessions are removed from
organization and closed asynchronously. Non-multi-agent groups retain
the old behavior (move sessions to default).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Architecture spec: add Group Deletion section documenting multi-agent
  vs regular group behavior, add SharedContext/RoutingContext to data model,
  fix stale GroupPreset code block, update key files table with
  SquadDiscovery and ModelCapabilities, update test counts
- Copilot instructions: expand Squad section with three-tier merge,
  preset picker sections, routing.md injection, deletion behavior
- Scenarios: add delete-multi-agent-group-closes-sessions scenario
  verifying sessions are removed not orphaned
- ScenarioReferenceTests: add group deletion scenario presence check

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Resolved conflicts in:
- PolyPilot.Tests/PolyPilot.Tests.csproj (added both ErrorMessageHelper and Squad includes)
- CopilotService.Events.cs (took main's Volatile.Read/Write, InvokeOnUI for error handler, IsResumed clearing)
- CopilotService.cs (whitespace in ResumeSessionAsync)

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add SquadWriter.cs: writes GroupPreset as .squad/ directory structure
  (team.md, agents/{name}/charter.md, decisions.md, routing.md)
- Update SaveGroupAsPreset to write .squad/ when worktree is available,
  with presets.json as personal backup
- 15 SquadWriter tests (write, round-trip, sanitize, edge cases)
- Add 3 CDP scenarios: save-preset-creates-squad-dir,
  round-trip-squad-write-read, squad-write-sanitizes-names
- Update docs and copilot instructions for write-back behavior
- 912 tests passing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Resolve 3 conflicts:
- BridgeMessages.cs: keep both multi-agent + change-model/rename constants
- CopilotService.cs: take main's fresh TCS + On() ordering, keep our
  ProcessingGeneration/HasUsedToolsThisTurn carry-forward
- WsBridgeClient.cs: keep multi-agent methods + main's ConcurrentDictionary

4 pre-existing DiffParser test failures from main PR #159 (not ours).
1014 tests passing.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Resolve 3 conflicts:
- CopilotService.Events.cs: keep both SendingFlag reset and ProcessingStartedAt/ToolCallCount/ProcessingPhase cleanup
- CopilotService.Organization.cs: keep internal visibility + main's _lastReconcileSessionHash field
- WsBridgeServer.cs: keep both multi-agent handlers and repo management handlers

1078 tests passing (4 pre-existing DiffParser \r line-ending failures from main).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Drop 3 Copilot Console debug commits (fixed properly on main in #172).
Resolve WsBridgeServer conflict: keep multi-agent + FetchImage handlers.
1095 tests passing (4 pre-existing DiffParser \r failures from main).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen force-pushed the copilot/add-multi-agent-support branch from 1a442ca to 947b23a February 21, 2026 17:16
Bugs fixed:
- BuildCompletionSummary: reorder ternary so IsStalled takes priority
  over IsCancelled (stalls were showing as 'Cancelled by user')
- SquadWriter: clean stale agent dirs before re-writing to prevent
  phantom agents on re-discovery

Tests added (19 new, 1114 total passing):
- MultiAgentGapTests.cs: ParseTaskAssignments (6), ModelCapabilities (4),
  BuildCompletionSummary (4)
- ScenarioReferenceTests: structural validation + reflect loop checks (2)
- SquadWriterTests: stale dir cleanup verification (1)
- SessionOrganizationTests: SaveGroupAsPreset with worktree write-back (1)
- ReflectionCycleTests: stalled summary priority (1)

Scenarios added (5 new, 26 total):
- sequential-mode-processes-in-order
- pause-resume-reflection-cycle
- dedicated-evaluator-session
- routing-context-in-orchestrator-plan

Docs fixed:
- team.md format: 'Name | Role | Model' → 'Member | Role'

Verified live: Squad .squad/ discovery shows 'PolyPilot Review Squad'
in '📂 From Repo' section with 🫡 badge in preset picker.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen marked this pull request as ready for review February 21, 2026 19:18
PureWeen changed the title from "Multi-agent orchestration: full orchestrator loop, sidebar controls, and role management" to "feat: Multi-agent orchestration with Squad integration" Feb 21, 2026
- SendPromptAndWaitAsync: use SendPromptAsync return value directly instead
  of capturing stale state TCS (prevents 10-min hang after reconnection)
- Add RunContinuationsAsynchronously to reconnect TCS (matches normal path)
- Fix WaitingForWorkers phase: fire BEFORE Task.WhenAll, not after
- Deduplicate worker assignments before parallel dispatch (prevents
  concurrent send failure for same-worker duplicate @worker blocks)
- Mark sessions as hidden during multi-agent group deletion to prevent
  ReconcileOrganization ghost sessions in default group
- Clean up _modelSwitchLocks semaphore on session close (memory leak)
- Fix cross-platform test: use Path.Combine instead of hardcoded backslash
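
Two of these fixes, deduplicating assignments and raising the WaitingForWorkers phase before awaiting the workers, can be sketched together with `asyncio` in Python. Merging duplicate tasks by joining their text is an assumption (the commit only says assignments are deduplicated), and `send` and `on_phase` are hypothetical callbacks.

```python
import asyncio


def dedupe_assignments(assignments):
    """Merge repeated assignments so each worker is dispatched to once."""
    merged = {}
    for worker, task in assignments:
        merged[worker] = f"{merged[worker]}\n{task}" if worker in merged else task
    return merged


async def dispatch(assignments, send, on_phase):
    """Send one prompt per worker in parallel and collect the replies."""
    tasks = dedupe_assignments(assignments)
    # Report the phase before awaiting, so observers see it while
    # workers are still running rather than after they finish.
    on_phase("WaitingForWorkers")
    results = await asyncio.gather(*(send(w, t) for w, t in tasks.items()))
    return dict(zip(tasks, results))
```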

Findings from: Opus 4.6, Codex 5.3, Sonnet 4.6 reviews
1118/1118 tests passing

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
PureWeen merged commit 85f21f1 into main Feb 21, 2026
PureWeen deleted the copilot/add-multi-agent-support branch February 22, 2026 00:15