Skip to content

feat: Subagent Architecture for Skillchain Execution#3

Open
ancoleman wants to merge 9 commits intomainfrom
feature/subagent-architecture
Open

feat: Subagent Architecture for Skillchain Execution#3
ancoleman wants to merge 9 commits intomainfrom
feature/subagent-architecture

Conversation

@ancoleman
Copy link
Owner

@ancoleman ancoleman commented Dec 10, 2025

Summary

This PR implements a comprehensive subagent architecture for skillchain execution, solving the context rot problem and enabling resumable sessions. Now includes the Claude Agent Manager package for programmatic agent orchestration.

Key Features

🚀 NEW: Claude Agent Manager Package (packages/claude_agent_manager/)

A complete Python library (~6,500+ lines) for programmatic Claude Code CLI orchestration:

  • Core Module: AgentDetector, EventEmitter (16 events), StreamJsonParser, ProcessManager
  • Session Module: SessionManager (auto-resume), SessionWatcher, SessionStorage
  • Orchestration Module: AgentCoordinator, TaskQueue (5 priorities), CircuitBreaker
  • Skillchain Integration: SkillchainExecutor, RegistryManager, ProgressManager
  • CLI: claude-agent command with run/sessions/watch/skillchain subcommands
  • Tests: 122 tests across 3 test files
  • Docs: 6 comprehensive pages in pages/docs/agents/
# Install
pip install -e "./packages/claude_agent_manager[skillchain]"

# Verify
claude-agent check

# Run
claude-agent run "Say hello" --cwd .
claude-agent skillchain route "dashboard with charts"

Phase 1 & 2: Specialized Subagents

  • 6 custom subagent definitions with tool restrictions:
    • skill-executor.md - Base execution specialist
    • skillchain-validator.md - Read-only validation (permissionMode: plan)
    • skillchain-planner.md - Dynamic chain planning
    • frontend-skill-executor.md - UI specialist (no Bash)
    • backend-skill-executor.md - API/database specialist (security-first)
    • infra-skill-executor.md - Infrastructure specialist (safety guardrails)
  • Standardized SKILL COMPLETE report format
  • Install via ./install.sh agents install

Phase 3: Resumable Sessions

  • .skillchain-progress.json schema for session persistence
  • /skillchain resume command for interrupted sessions
  • Progress file I/O in delegated.md orchestrator
  • Accumulated context merging across skills

Phase 4: Testing Framework

  • subagent_tester.py - 825-line Python CLI test framework
  • 22 test cases across 3 test suites
  • CI workflow for validation (test-subagents.yml)
  • Resume scenario documentation

Test plan

  • Run ./install.sh agents install to install subagents
  • Execute a skillchain with delegated mode to verify subagent usage
  • Interrupt a skillchain and verify /skillchain resume works
  • Run python evaluation/subagent_tester.py --all to validate test infrastructure
  • Verify CI workflow passes on this branch
  • Install agent manager: pip install -e "./packages/claude_agent_manager[skillchain]"
  • Verify claude-agent check works
  • Test claude-agent run "Say hello" --cwd .
  • Test claude-agent skillchain route "dashboard with charts"

Files Changed

New: Claude Agent Manager Package (50 files)

  • packages/claude_agent_manager/ - Complete Python package
    • claude_agent_manager/core/ - Detector, Events, Parser, Process
    • claude_agent_manager/session/ - Manager, Watcher, Storage
    • claude_agent_manager/orchestration/ - Coordinator, Queue, CircuitBreaker
    • claude_agent_manager/skillchain/ - Executor, Progress, Registry
    • claude_agent_manager/cli/ - CLI interface
    • tests/ - 122 tests

New: Documentation (6 files)

  • pages/docs/agents/overview.md - Agent ecosystem overview
  • pages/docs/agents/manager-overview.md - Package reference
  • pages/docs/agents/architecture.md - Component diagrams
  • pages/docs/agents/skillchain-integration.md - Integration patterns
  • pages/docs/agents/real-world-usage.md - 7 practical scenarios
  • pages/docs/agents/cli-reference.md - CLI command reference

Updated

  • README.md - Added Agent Manager section
  • CHANGELOG.md - Added v0.7.0 release notes
  • VERSION - Bumped to 0.7.0
  • pages/sidebars.ts - Added agentsSidebar
  • pages/docusaurus.config.ts - Added Agents to navbar/footer
  • .gitignore - Added Python build artifacts

Subagent Files (24)

  • .claude-commands/agents/ - 6 subagent definitions
  • .claude-commands/skillchain-data/shared/progress-schema.yaml
  • .claude-commands/skillchain/resume.md
  • .github/workflows/test-subagents.yml
  • evaluation/subagent_tester.py
  • evaluation/subagent-tests/ - 5 test files

🤖 Generated with Claude Code

ancoleman and others added 9 commits December 10, 2025 12:12
Phase 1 - Core Agent Definitions:
- skill-executor.md: Base skill execution specialist
- skillchain-validator.md: Read-only validation (permissionMode: plan)
- skillchain-planner.md: Dynamic skill chain planning

Phase 2 - Domain Executors:
- frontend-skill-executor.md: UI skills, no Bash, theme-aware
- backend-skill-executor.md: API/DB skills, security-first
- infra-skill-executor.md: K8s/Terraform/CI, safety guardrails

Integration:
- Updated delegated.md to use skill-executor and skillchain-validator
- Updated install.sh with agents installation option
- Added subagent test framework (evaluation/subagent-tests/)

Architecture:
- Coordinator (delegated.md) spawns specialized subagents via Task tool
- Each skill executes in fresh context (prevents context rot)
- >80% skill activation rate guaranteed

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Phase 3 - Persistence Layer:
- progress-schema.yaml: Complete JSON schema for .skillchain-progress.json
- resume.md: New /skillchain resume command for interrupted sessions
- delegated.md: Progress file I/O (create, update, cleanup)
- Accumulated context merging across skill executions

Phase 4 - Testing Framework:
- subagent_tester.py: 825-line Python CLI test framework
  - Claude Code CLI integration with JSONL parsing
  - YAML test case loading with validation
  - Color-coded output and JSON export
- 22 test cases across 3 test files:
  - skill-executor/protocol.yaml (7 tests)
  - skillchain-validator/basic.yaml (7 tests)
  - frontend-skill-executor/theming.yaml (8 tests)
- test-subagents.yml: CI workflow for validation
- Resume test scenarios with sample fixtures

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Based on thorough research of:
- source_data/ideas/cc-integration/patterns.md
- Claude Code CLI reference documentation
- GitHub issues for subprocess spawning

Key fixes:
- Removed non-existent --agent flag
- Fixed prompt passing to use -p flag correctly
- Added CLI detection with expanded PATH
- Added authentication verification (API key, Bedrock, Vertex)
- Fixed stdin handling (subprocess.DEVNULL) to prevent blocking
- Added --check command for setup verification
- Added --max-turns for cost control
- Improved JSONL output parsing with system:init handling
- Added proper environment passthrough

New features:
- get_expanded_path() for finding claude in common locations
- detect_claude_binary() with fallback search
- check_authentication() for setup verification
- --check mode to verify CLI setup before running tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Created evaluation/SUBAGENT_TESTER.md with 9 sections:
1. Overview - Purpose and architecture
2. Prerequisites - CLI, auth, dependencies
3. Quick Start - Common usage examples
4. CLI Reference - All command-line arguments
5. Test File Format - Complete YAML schema
6. How It Works - CLI spawning, JSONL parsing, validation
7. Output Formats - Console and JSON export
8. Troubleshooting - Common errors and solutions
9. CI Integration - GitHub Actions workflow

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
CLI error: "When using --print, --output-format=stream-json requires --verbose"

Added --verbose as mandatory flag in _build_cli_args().

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Problem: Invalid API key in environment was overriding OAuth authentication,
causing "Invalid API key" errors even when logged in via 'claude login'.

Solution:
- Remove ANTHROPIC_API_KEY from subprocess environment
- Prioritize OAuth (claude login) over API key in auth detection
- Update documentation to reflect OAuth as preferred method

This ensures Claude Code subscriptions work correctly with OAuth.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Add Authentication Status section at startup showing:
  - Environment variable status (shell vs subprocess)
  - OAuth/Keychain detection
  - CLI path verification
- Add auth error detection during JSONL parsing
- Log clear error messages when Invalid API key detected
- Suggest fix: 'claude logout && claude login'

This helps debug authentication issues when tests fail due to
stored credentials conflicting with OAuth.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Introduces a complete Python library for programmatic Claude Code CLI
orchestration with real-time streaming, session management, multi-agent
coordination, and skillchain integration.

## Package: claude_agent_manager (~6,500+ lines)

### Core Module
- AgentDetector: 3-tier CLI binary detection with PATH expansion
- EventEmitter: Async event system with 16 event types
- StreamJsonParser: JSONL stream parsing with usage tracking
- ProcessManager: Async subprocess management with streaming I/O
- Fixed session ID parsing for type:"system"+subtype:"init" format

### Session Module
- SessionManager: Session lifecycle with auto-resume via --resume flag
- SessionWatcher: Poll-based file monitoring for ~/.claude/projects/
- SessionStorage: JSON persistence to ~/.claude_agent_manager/sessions/
- Session state machine: IDLE → BUSY → IDLE/ERROR/CLOSED

### Orchestration Module
- AgentCoordinator: Multi-agent coordination with semaphore concurrency
- TaskQueue: Priority queue (heapq) with 5 priority levels
- CircuitBreaker: State machine (CLOSED→OPEN→HALF_OPEN) for fault tolerance
- Pipeline execution with {prev_result} context substitution

### Skillchain Integration Module
- SkillchainExecutor: Programmatic skill chain execution
- RegistryManager: Load skills from skillchain-data/registries/
- ProgressManager: Manage .skillchain-progress.json for resumable sessions
- Goal routing with blueprint detection and dependency sorting

### CLI (claude-agent command)
- claude-agent check: Verify Claude CLI installation
- claude-agent run: Execute single prompts with streaming
- claude-agent sessions: Session management (list/show/delete)
- claude-agent watch: Real-time session file monitoring
- claude-agent skillchain: run/route/status/resume/blueprints

### Tests
- 122 tests across test_core.py, test_session.py, test_orchestration.py
- pytest + pytest-asyncio test framework

### Documentation (6 pages in docs/agents/)
- overview.md: Agent ecosystem with Mermaid diagrams
- manager-overview.md: Package reference with code examples
- architecture.md: Detailed component diagrams
- skillchain-integration.md: SkillchainExecutor patterns
- real-world-usage.md: 7 practical scenarios
- cli-reference.md: Complete CLI command reference

### Updated
- README.md: Added Agent Manager section with examples
- CHANGELOG.md: Added 0.7.0 release notes
- VERSION: Bumped to 0.7.0
- sidebars.ts: Added agentsSidebar
- docusaurus.config.ts: Added Agents to navbar/footer

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
- Rename PAR to PARSER (PAR is reserved keyword for parallel blocks)
- Remove colon from "system:init" in state diagram transitions
- Add spaces around colons in state transitions per Mermaid syntax

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant