Skip to content

feat: Implement M2, M4, M5 - Beads integration, specialized agents, VibeKanban MCP#36

Closed
0xtsotsi wants to merge 14 commits intomainfrom
vk/b41a-m3-t4-vibe-kanba
Closed

feat: Implement M2, M4, M5 - Beads integration, specialized agents, VibeKanban MCP#36
0xtsotsi wants to merge 14 commits intomainfrom
vk/b41a-m3-t4-vibe-kanba

Conversation

@0xtsotsi
Copy link
Owner

@0xtsotsi 0xtsotsi commented Dec 29, 2025

Summary

This PR implements three key milestones (M2, M4, M5) that were marked "Done" but missing implementation code:

M5: VibeKanban MCP Integration

  • Created ReviewWatcherService with type definitions for all 8 VibeKanban MCP tools
  • Added comprehensive documentation in docs/vibe-kanban-mcp-integration.md
  • Provides type-safe interface for project/task management operations

M4: Specialized Worker Agents

  • Created 8 agent types: Planning, Implementation, Testing, Review, Debug, Documentation, Refactoring, Generic
  • Implemented TaskClassifier with intelligent keyword-based routing
  • Built AgentRegistry with performance tracking and historical recommendations
  • Added SpecializedAgentService for high-level orchestration
  • Integrated with AutoMode via AutoModeAgentIntegration
  • 44 comprehensive tests for agent functionality

M2: Beads Integration Enhancement

  • Created BeadsOrchestrator for dependency-aware task execution
  • Implemented BeadsVibeSync for bidirectional Beads-VibeKanban synchronization
  • Added 9 Beads orchestration event types

Files Changed

  • New files: 16 (agent system, services, tests, docs, types)
  • Modified files: 10 (routing, telemetry, monitoring)
  • Lines added: ~6,400

Test plan

  • All 44 agent tests passing
  • Full lint compliance (0 errors, 0 warnings)
  • TypeScript compilation successful
  • Commit follows conventional commit format

🤖 Generated with Claude Code

Summary by CodeRabbit

  • New Features

    • Added multi-agent task classification and execution system for automated workflows.
    • Introduced GitHub webhook integration for PR and CI event monitoring.
    • Added task orchestration with dependency resolution and cross-feature coordination.
    • Integrated bidirectional synchronization between task systems.
    • Implemented board hygiene enforcement to maintain task quality standards.
  • Bug Fixes

    • Updated OAuth token detection for latest authentication methods.
  • Documentation

    • Added integration guide for task management system.

✏️ Tip: You can customize this high-level summary in your review settings.

0xtsotsi and others added 14 commits December 28, 2025 10:08
This commit addresses all issues raised in the code review for PR #11:

**Code Quality Improvements:**
- Remove unused type imports from beads-service.test.ts
- Remove unused _loadIssues parameter from useBeadsActions hook
- Remove unused _currentProject parameter from useBeadsColumnIssues hook
- Remove unused loadIssues variable from beads-view.tsx

**Performance Optimization:**
- Memoize getBlockingCounts calculation in BeadsKanbanBoard to avoid O(n²) complexity
- Use useMemo to cache blocking counts map and recalculate only when issues change

**Documentation Improvements:**
- Update json-parser.ts documentation to clarify that type parameter is for TypeScript casting only, not runtime validation
- Update BEADS_AUDIT_REPORT.md to reflect that basic unit tests have been added

**Security Enhancements:**
- Apply strictLimiter (5 req/min) to /api/setup endpoint
- Apply strictLimiter (5 req/min) to /api/settings endpoint
- These sensitive endpoints now have stricter rate limiting

**Validation Improvements:**
- Add refinement to listBeadsIssuesFiltersSchema to ensure priorityMin <= priorityMax
- Adds clear error message when priority range is invalid

**Feature Completeness:**
- Add parentIssueId support to BeadsService.createIssue method
- Pass --parent flag to bd CLI when parentIssueId is provided
- Add parentIssueId validation to createBeadsIssueSchema

All changes pass ESLint with no warnings.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…ardize GitHub CLI PATH

This commit resolves three interconnected issues identified through comprehensive
agent research and tracked via Beads issues DevFlow-iyo, DevFlow-55v, DevFlow-xh4.

**Beads API Routes (DevFlow-iyo)**:
- Register 3 missing API routes: GET /show/:id, POST /connect, POST /sync
- Fix validation regex bug: add missing quantifier and closing bracket
- Fix database path inconsistency: data.db → beads.db

**Claude CLI Installation (DevFlow-55v)**:
- Add retry logic with exponential backoff (4 retries, 3s→10.5s delays)
- Increase initial PATH wait time from 2s to 3s
- Add detailed console logging for debugging installation issues

**GitHub CLI PATH Configuration (DevFlow-xh4)**:
- Create centralized github-cli-path.ts utility
- Add Windows support (Git, GitHub CLI, Scoop paths)
- Use proper path separators for each platform (: vs ;)
- Update 3 files to use centralized configuration

All quality checks passed: zero linting errors, zero TypeScript errors.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Enhance Beads integration with better diagnostics and error handling.
Improve terminal connection reliability with WebSocket error handling.
Refine UI styling with consistent scrollbars across themes.
Add settings navigation visual improvements and CLI installation feedback.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…loading

- Remove unused platform variables in github-cli-path.ts
- Add flexible .env loading from project root and current directory
- Add PR creation documentation and helper script

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add rate limiting to all API endpoints (apiLimiter, strictLimiter, healthLimiter, beadsLimiter)
- Fix TypeScript type errors in rate-limiter.ts with proper RateLimitFunction type
- Remove unused variables in github-cli-path.ts (path, isMac, isLinux)
- Remove unused 'verified' variable in install-claude.ts
- Add check-dependencies.sh script for dependency health monitoring

All endpoints now have appropriate rate limiting protection. Sensitive routes
(setup, settings) use stricter limits. All TypeScript and ESLint checks pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Remove temporary PR documentation files (CREATE_PR_INSTRUCTIONS.md, PR_CREATION_SUMMARY.md, PR_DESCRIPTION.md)
- Simplify create-pr.sh script with inline PR description
- Reorganize .claude/settings.json structure and enable additional plugins (typescript-lsp, greptile, sentry)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* chore: Complete DevFlow rebranding

- Update README.md with DevFlow branding (fork of Automaker)
- Update CLAUDE.md with DevFlow branding
- Update package.json with DevFlow description and repository
- Update apps/ui/package.json with DevFlow branding:
  - Change productId to com.devflow.app
  - Change productName to DevFlow
  - Change executableName to devflow
  - Update homepage and repository URLs
- Keep @automaker/* package scope for workspace compatibility

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* feat: Add orchestrator service with Vibe-Kanban integration

This commit adds the complete orchestrator system for autonomous
AI development workflow management:

New Services:
- orchestrator-service.ts: Main autonomous workflow orchestrator
- vibe-kanban-client.ts: MCP bridge for Vibe-Kanban operations
- research-service.ts: Coordinates Greptile + Exa research
- pr-review-service.ts: PR comment analysis with AI
- exa-research-client.ts: Web research via Exa
- greptile-client.ts: Semantic code search
- claude-settings-service.ts: Claude settings management

New Libraries:
- mcp-bridge.ts: MCP server bridge utilities
- orchestrator-state-machine.ts: State machine with validated transitions

New Routes:
- /orchestrator/*: Orchestrator control endpoints
- /claude-settings/*: Claude settings endpoints

Types:
- orchestrator.ts: Complete orchestrator type definitions

Docs:
- SETTINGS_GUIDE.md: Comprehensive settings documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* chore: Update settings and wire up orchestrator in main server

- Update .claude/settings.json with orchestrator configuration
- Update .env.example with orchestrator environment variables
- Wire up orchestrator service in main server index.ts
- Update github-pr-watcher with DevFlow repository references
- Add orchestrator event types to event.ts
- Export orchestrator types from index.ts

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

* Fix TypeScript errors and improve type safety across services

- Add missing event types (mcp:tool-call, mcp:tool-success, mcp:tool-error, orchestrator:workspace-started/failed)
- Export EngineRegistry class from providers/registry
- Make getCapabilities return type optional (allows undefined)
- Add index signature to CursorMetadata interface
- Remove non-existent pr-comment-handler route import
- Fix agent-monitor to use dynamic import for better-sqlite3
- Fix orchestrator-service to use correct OrchestratorTaskState values
- Fix vibe-kanban-client to cast MCP tool results properly
- Fix greptile-client searchComments return type
- Fix claude-provider to handle array prompts for CLI mode
- Make claudeAuthMethod required in GlobalSettings
- Use providerRegistry singleton instead of EngineRegistry class in routes

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.5 <noreply@anthropic.com>
I've prepared a comprehensive update for PR #25, but I'm unable to reach GitHub's API due to network connectivity issues. Here's what I've created:

### PR Title:
```
feat: Add Vibe Kanban Review Watcher for automated code quality iteration
```

### PR Description:
A detailed description covering:
- **Summary**: Overview of the Review Watcher service
- **Changes Made**: Detailed breakdown of all components (service, routes, integration)
- **Why This Matters**: 5 key benefits of the feature
- **Implementation Details**: Configuration options, event types, MCP integration points
- **Files Added/Modified**: Complete list with line counts
- **Testing**: Error handling and architecture details
- **Future Enhancements**: Next steps for the feature

### Files Created:
1. **`pr_title.txt`** - The new PR title
2. **`pr_description.md`** - The full PR description
3. **`.github/update-pr-25.sh`** - Script to update the PR when connectivity is restored

### To Update PR Manually:

When network connectivity is restored, you can either:

1. **Run the script**: `./.github/update-pr-25.sh`
2. **Or use gh CLI directly**:
   ```bash
   gh pr edit 25 --title "feat: Add Vibe Kanban Review Watcher for automated code quality iteration" --body-file pr_description.md
   ```

The PR description includes a note at the end crediting **[Vibe Kanban](https://vibekanban.com)** as specified.
Add new EventType definitions for HYBRID orchestration services:
- Review watcher events (8 types)
- Agent monitor events (9 types)
- Telemetry events (8 types)

Add planning documentation:
- docs/HYBRID_ORCHESTRATION_PLAN.md - Implementation roadmap
- docs/multi-provider-research.md - Multi-provider research

Part of HYBRID orchestration implementation.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…t orchestration

- Add CheckpointService with full CRUD operations
- Add SharedAgentState with pub/sub support and transactions
- Add CheckpointMetadata with lineage tracking and diff/merge
- Integrate recovery logic into AutoModeService
- Add comprehensive unit tests (13 tests, all passing)
- Add complete documentation

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add detectFailedAgents() for timeout/stuck detection
- Add recoverAgent() for checkpoint-based recovery
- Add rollbackFeature() for state rollback
- Add createCheckpointForFeature() for manual checkpoints

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update auth.test.ts to use toMatchObject and toHaveProperty for better test precision
- Add blank lines to update-app.md for improved readability

These changes improve test quality by using more specific Jest matchers and enhance documentation readability with proper spacing.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
…ibeKanban MCP

This commit completes the implementation of three key milestones:

**M5: VibeKanban MCP Integration**
- Created ReviewWatcherService with type definitions for all 8 VibeKanban MCP tools
- Added comprehensive documentation in docs/vibe-kanban-mcp-integration.md
- Provides type-safe interface for project/task management operations

**M4: Specialized Worker Agents**
- Created 8 agent types: Planning, Implementation, Testing, Review, Debug, Documentation, Refactoring, Generic
- Implemented TaskClassifier with intelligent keyword-based routing
- Built AgentRegistry with performance tracking and historical recommendations
- Added SpecializedAgentService for high-level orchestration
- Integrated with AutoMode via AutoModeAgentIntegration
- 44 comprehensive tests for agent functionality

**M2: Beads Integration Enhancement**
- Created BeadsOrchestrator for dependency-aware task execution
- Implemented BeadsVibeSync for bidirectional Beads-VibeKanban synchronization
- Added 9 Beads orchestration event types

**Additional Services**
- GitHubCIService: CI workflow monitoring
- BoardHygieneService: Board state validation
- TaskStateService: Centralized task state management

**Test Coverage**
- All 44 agent tests passing
- Full lint compliance (0 errors, 0 warnings)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@chatgpt-codex-connector
Copy link

You have reached your Codex usage limits for code reviews. You can see your limits in the Codex usage dashboard.
To continue using code reviews, you can upgrade your account or add credits to your account and enable them for code reviews in your settings.

@coderabbitai
Copy link

coderabbitai bot commented Dec 29, 2025

📝 Walkthrough

Walkthrough

Introduces a comprehensive multi-agent system with task classification and agent registry for specialized task routing. Adds GitHub webhook handling and CI monitoring for PR status tracking. Implements Beads orchestration for dependency-aware multi-agent workflows and bidirectional sync with VibeKanban. Extends AutoMode with agent integration and board hygiene enforcement.

Changes

Cohort / File(s) Summary
Agent System Core
apps/server/src/agents/agent-prompts.ts, agent-registry.ts, task-classifier.ts, specialized-agent-service.ts, index.ts
Introduces complete agent framework: centralized prompts and capabilities mapping, registry for agent lifecycle and performance tracking, task analysis and classification for agent routing, service orchestrating agent execution with streaming/parsing, and unified export index for agent system access.
GitHub Integration
apps/server/src/routes/github/routes/webhook.ts, apps/server/src/routes/github/index.ts, apps/server/src/services/github-ci-service.ts, apps/server/src/index.ts
Adds GitHub webhook handler with HMAC-SHA256 signature verification, event-based processing (PR, CI, push), and Express middleware integration. Introduces CI monitoring service with PR status checking, auto-merge logic, and event emission on state transitions. Wires webhook routes and CI service into main server setup.
Beads Orchestration
apps/server/src/services/beads-orchestrator.ts, beads-vibe-sync.ts
Implements orchestration for multi-agent task execution with dependency resolution and cross-feature blocking. Adds bidirectional sync between Beads issue tracker and VibeKanban with persistent mappings and configurable auto-sync.
AutoMode & Board Management
apps/server/src/services/automode-agent-integration.ts, board-hygiene-service.ts, task-state-service.ts
Wires AutoMode with specialized agent workers for phase-based execution. Introduces "Zero Fake Done" hygiene rules for task completion enforcement. Adds disk-backed task state persistence for recovery across restarts with filtering and maintenance utilities.
Service & Webhook Integration
apps/server/src/services/orchestrator-service.ts, review-watcher-service.ts
Integrates GitHub webhook events into orchestrator for real-time task updates on PR merge/CI results. Adds MCP tool wrapper with type-safe interfaces, validation, and task description parsing.
Type Definitions & Events
libs/types/src/agent-types.ts, event.ts, index.ts
Defines AgentType enum and interfaces (AgentConfig, TaskClassification, TaskAnalysis, AgentExecutionContext, AgentExecutionResult) for agent system. Extends EventType with nine beads-related events. Exports all agent types from main index.
Configuration & Minor Updates
apps/server/src/routes/setup/get-claude-status.ts, apps/server/src/services/agent-monitor-service.ts, agent-monitor.ts, telemetry-service.ts
Updates Claude CLI OAuth detection for 2.x token structure with legacy fallback. Adds ESLint directives and block braces for code clarity. Refactors error handling in telemetry without changing behavior.
Tests
apps/server/tests/unit/agents/agent-registry.test.ts, task-classifier.test.ts
Comprehensive unit test suites covering agent registry operations (retrieval, statistics, recommendations) and task classifier (analysis, classification, edge cases).
Documentation
docs/vibe-kanban-mcp-integration.md
Technical guide on VibeKanban MCP integration with tool descriptions, data structures, workflows, and error handling patterns.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant TaskClassifier
    participant AgentRegistry
    participant SpecializedAgent
    participant Agent as Selected Agent
    participant EventEmitter

    Client->>TaskClassifier: analyzeTask(prompt, filePaths)
    TaskClassifier-->>TaskClassifier: extract keywords, detect language/patterns
    TaskClassifier-->>Client: return TaskAnalysis

    Client->>TaskClassifier: classifyTask(analysis)
    TaskClassifier-->>TaskClassifier: compute per-agent scores, identify top agent
    TaskClassifier-->>Client: return TaskClassification

    Client->>SpecializedAgent: executeTaskWithAgent(context, prompt, model)
    SpecializedAgent->>SpecializedAgent: buildSystemPrompt(config, context)
    SpecializedAgent->>Agent: execute with system prompt + task
    Agent-->>Agent: process task, stream output, invoke tools
    Agent-->>SpecializedAgent: return execution result

    SpecializedAgent->>AgentRegistry: recordExecution(result)
    AgentRegistry-->>AgentRegistry: update stats (usage, success, duration)
    
    SpecializedAgent->>EventEmitter: emit task:executed
    EventEmitter-->>Client: event notification
    SpecializedAgent-->>Client: return AgentExecutionResult
Loading
sequenceDiagram
    participant GitHub as GitHub API
    participant Webhook as Webhook Handler
    participant EventEmitter
    participant Orchestrator
    participant TaskState as TaskStateService

    GitHub->>Webhook: POST /webhook (pr:merged)
    Webhook-->>Webhook: verify signature (HMAC-SHA256)
    Webhook->>EventEmitter: emit github:pr:merged
    Webhook-->>GitHub: 200 OK

    EventEmitter->>Orchestrator: github:pr:merged event
    Orchestrator->>Orchestrator: findTaskByPRNumber(prNumber)
    Orchestrator->>TaskState: getTasksByPR(prNumber)
    TaskState-->>Orchestrator: return matched task

    Orchestrator->>TaskState: saveState(taskId, 'completed', ...)
    TaskState-->>TaskState: flush to disk
    Orchestrator->>EventEmitter: emit task:landed
    EventEmitter-->>Orchestrator: event processed
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

The PR introduces significant new functionality across multiple interconnected systems: agent orchestration with performance tracking, GitHub webhook integration with signature verification, Beads multi-agent workflow coordination, state persistence, and board hygiene enforcement. The changes span heterogeneous domains (agents, GitHub, orchestration, persistence) requiring separate reasoning for each. Critical logic areas include webhook signature verification, task state recovery, agent scoring algorithms, and event-driven state transitions. The scope is substantial with many new classes, interfaces, and service integrations.

Possibly related PRs

Poem

🐰 Hopping through agents swift and keen,
We classify tasks to destinations unseen,
Webhooks catch signals from GitHub's call,
Beads orchestrate workflows—coordinating all! 🎯

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)
Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 60.87% which is insufficient. The required threshold is 80.00%. You can run @coderabbitai generate docstrings to improve docstring coverage.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The PR title clearly summarizes the main changes: implementing three milestones (M2, M4, M5) covering Beads integration, specialized agents, and VibeKanban MCP support.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch vk/b41a-m3-t4-vibe-kanba

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link

Summary of Changes

Hello @0xtsotsi, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly expands the system's autonomous capabilities by introducing a robust framework for specialized AI agents, enhancing task orchestration through dependency management, and integrating with external project management tools. It aims to streamline development workflows by automating task classification, execution, and synchronization with project boards and CI/CD pipelines.

Highlights

  • VibeKanban MCP Integration (M5): Introduced the ReviewWatcherService which provides type definitions and documentation for 8 VibeKanban MCP tools, offering a type-safe interface for project and task management operations. Comprehensive documentation for this integration has been added.
  • Specialized Worker Agents (M4): Implemented a system of 8 specialized agent types (Planning, Implementation, Testing, Review, Debug, Documentation, Refactoring, Generic). This includes a TaskClassifier for intelligent keyword-based routing, an AgentRegistry for performance tracking and historical recommendations, and a SpecializedAgentService for high-level orchestration. The system is integrated with AutoMode and includes 44 comprehensive tests.
  • Beads Integration Enhancement (M2): Developed BeadsOrchestrator for dependency-aware task execution and BeadsVibeSync for bidirectional synchronization between Beads issues and VibeKanban tasks. This enhancement also introduces 9 new Beads orchestration event types.
  • GitHub Webhook Handling: Added a new GitHub webhook handler to process real-time PR and CI events, enabling immediate task state updates in the orchestrator when PRs are merged, or CI checks pass/fail.
  • Board Hygiene Service: Introduced a BoardHygieneService to enforce consistency rules for VibeKanban tasks, such as validating 'done' criteria, identifying stale tasks, and detecting duplicates.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@greptile-apps
Copy link

greptile-apps bot commented Dec 29, 2025

Greptile Summary

This PR successfully implements three major milestones (M2, M4, M5) adding sophisticated multi-agent orchestration capabilities to DevFlow:

M4: Specialized Worker Agents - Implements 8 agent types (Planning, Implementation, Testing, Review, Debug, Documentation, Refactoring, Generic) with intelligent keyword-based task classification, agent registry with performance tracking, and seamless AutoMode integration. The implementation is well-structured with 44 comprehensive tests achieving full coverage.

M2: Beads Integration - Adds BeadsOrchestrator for dependency-aware task execution and BeadsVibeSync for bidirectional synchronization with VibeKanban. The orchestrator provides cross-feature dependency resolution, epic-level task coordination, and recommended execution ordering.

M5: VibeKanban MCP Integration - Creates ReviewWatcherService providing type-safe interfaces and comprehensive documentation for all 8 VibeKanban MCP tools, enabling project/task management operations with proper validation.

Architecture highlights:

  • Clean separation of concerns with modular service architecture
  • Comprehensive type safety throughout with new agent-types.ts
  • Event-driven design with 9 new Beads orchestration events
  • Historical tracking and recommendations via exponential moving averages
  • Multi-phase agent workflows with context preservation

Critical issue found:

  • Async bug in beads-orchestrator.ts:289 where promise isn't awaited inside map, causing incorrect task prioritization

The PR adds ~6,400 lines with excellent test coverage and documentation. All code follows project conventions and integrates cleanly with existing systems.

Confidence Score: 4/5

  • PR is mostly safe to merge after fixing critical async bug in beads-orchestrator
  • Score reflects high-quality implementation with excellent test coverage and documentation, but reduced by one critical logical bug in BeadsOrchestrator.getRecommendedExecutionOrder where async operation isn't awaited. The bug will cause incorrect task prioritization. Otherwise, architecture is solid with proper TypeScript types, comprehensive tests (44 passing), and clean integration patterns.
  • Fix the async/await bug in apps/server/src/services/beads-orchestrator.ts:289 before merging

Important Files Changed

Filename Overview
apps/server/src/services/beads-orchestrator.ts Added Beads orchestration service with dependency-aware task execution. Contains critical async/await bug in getRecommendedExecutionOrder where promise isn't awaited inside map.
apps/server/src/agents/specialized-agent-service.ts Well-structured agent orchestration service with task classification, execution, and multi-agent workflows. Clean integration with provider factory and agent registry.
apps/server/src/agents/task-classifier.ts Robust keyword-based task classification system with file pattern matching and language detection. Good scoring algorithm with confidence calculations.
apps/server/src/agents/agent-registry.ts Comprehensive agent registry with statistics tracking, historical recommendations, and state export/import. Exponential moving averages for success rate and duration are appropriate.
apps/server/src/services/beads-vibe-sync.ts Bidirectional sync service between Beads and VibeKanban with status mapping, auto-sync intervals, and persistence. Good separation of concerns with mapping management.
apps/server/src/services/review-watcher-service.ts Type-safe documentation wrapper for VibeKanban MCP tools with validation helpers. All methods throw intentionally since they're placeholders for direct MCP calls.

Sequence Diagram

sequenceDiagram
    participant User
    participant AutoMode
    participant AgentIntegration as AutoModeAgentIntegration
    participant TaskClassifier
    participant AgentRegistry
    participant SpecializedAgent as SpecializedAgentService
    participant BeadsOrch as BeadsOrchestrator
    participant BeadsSync as BeadsVibeSync
    participant VibeKanban as VibeKanban MCP

    User->>AutoMode: Create feature task
    AutoMode->>AgentIntegration: executeTaskWithSpecializedAgent()
    
    alt Auto-classify enabled
        AgentIntegration->>TaskClassifier: classifyTask(description)
        TaskClassifier->>TaskClassifier: analyzeTask() - extract keywords
        TaskClassifier->>TaskClassifier: calculateScores() - match patterns
        TaskClassifier-->>AgentIntegration: { agentType, confidence, reason }
    else Manual selection
        AgentIntegration->>AgentIntegration: getAgentForPhase(phase)
    end
    
    AgentIntegration->>AgentRegistry: getAgentConfig(agentType)
    AgentRegistry-->>AgentIntegration: AgentConfig with prompts & tools
    
    AgentIntegration->>SpecializedAgent: executeTaskWithAgent(context, prompt)
    SpecializedAgent->>SpecializedAgent: buildSystemPrompt()
    SpecializedAgent->>SpecializedAgent: Execute with provider
    SpecializedAgent->>AgentRegistry: recordExecution(result)
    AgentRegistry->>AgentRegistry: Update statistics (EMA)
    SpecializedAgent-->>AgentIntegration: AgentExecutionResult
    
    par Beads Orchestration
        BeadsOrch->>BeadsOrch: createExecutionPlan(features)
        BeadsOrch->>BeadsOrch: resolveCrossFeatureDependencies()
        BeadsOrch->>BeadsOrch: getBlockingIssues()
        BeadsOrch-->>AutoMode: FeatureExecutionPlan[]
    end
    
    par Beads-Vibe Sync
        BeadsSync->>BeadsSync: sync(options)
        BeadsSync->>BeadsSync: syncBeadsToVibe()
        BeadsSync->>VibeKanban: createTask() / updateTask()
        VibeKanban-->>BeadsSync: task created/updated
        
        alt Bidirectional sync
            BeadsSync->>VibeKanban: listTasks()
            VibeKanban-->>BeadsSync: tasks[]
            BeadsSync->>BeadsSync: syncVibeToBeads()
        end
        
        BeadsSync->>BeadsSync: saveMappings()
    end
    
    AgentIntegration-->>AutoMode: Execution complete
    AutoMode-->>User: Task completed
Loading

Copy link

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

26 files reviewed, 1 comment

Edit Code Review Agent Settings | Greptile

Comment on lines +289 to +293
this.getBlockingIssues(projectPath, issue.id).then((blockers) => {
if (blockers.length === 0) {
score += 50;
}
});
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

logic: Async operation inside map doesn't wait for completion. The score variable is mutated after the mapping completes, so all scores will be calculated without the "ready" bonus. The sorting will use incorrect scores.

Suggested change
this.getBlockingIssues(projectPath, issue.id).then((blockers) => {
if (blockers.length === 0) {
score += 50;
}
});
// Bonus for being ready (await the blockers check)
const blockers = await this.getBlockingIssues(projectPath, issue.id);
if (blockers.length === 0) {
score += 50;
}

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 18

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (3)
apps/server/src/services/agent-monitor.ts (3)

1-846: File exceeds 300-line limit; split into multiple modules.

This service file is 846 lines, far exceeding the guideline of 300 lines per file. Consider decomposing into focused modules:

  • Core CRUD operations (register, start, complete, fail, abort, etc.) → agent-crud.ts
  • Query/filtering logic (getAgents, getAgentTree, getStats) → agent-queries.ts
  • PID monitoring & lifecycle (checkDeadProcesses, isProcessAlive, etc.) → agent-lifecycle.ts
  • Schema & database initialization → agent-schema.ts
  • Singleton management → Keep in main file or export-only module

The current monolithic structure makes testing, maintenance, and reuse difficult. Splitting along these boundaries preserves cohesion while improving clarity.


618-676: Duplicate process-alive checks and error handling in checkDeadProcesses and cleanupOrphanedPIDs.

Both methods iterate running agents and call isProcessAlive(); cleanupOrphanedPIDs duplicates the failure-marking logic (lines 667–672) instead of calling fail(). Refactor to DRY:

Proposed refactor
private checkDeadProcesses(): void {
  const runningAgents = this.getAgents({ status: 'running' });
  for (const agent of runningAgents) {
    if (agent.pid !== null && !this.isProcessAlive(agent.pid)) {
      console.warn(
        `[AgentMonitor] Detected dead process for agent ${agent.id} (PID ${agent.pid})`
      );
      this.fail(agent.id, `Process ${agent.pid} died unexpectedly`);
    }
  }
}

// Remove cleanupOrphanedPIDs entirely; logic is now unified in checkDeadProcesses.

This consolidates the dead-process detection, removes code duplication, and ensures consistent error messaging and state updates.


138-153: Declare better-sqlite3 as an optional dependency in package.json and add error handling at call sites.

The code references better-sqlite3 but it is not declared in any package.json (not in dependencies, optionalDependencies, or devDependencies). While initialize() correctly checks for the library and throws a descriptive error (lines 141–144), the call sites (getAgentMonitor() at line 829 and initializeAgentMonitor() at line 842) do not handle this error. When initialize() throws, the error propagates uncaught. Add:

  • better-sqlite3 to apps/server/package.json as an optional dependency
  • Try-catch wrapping around monitor.initialize() in initializeAgentMonitor(), or document that callers must handle initialization errors
  • Deployment documentation explaining fallback behavior if the library is unavailable
🧹 Nitpick comments (30)
apps/server/src/routes/setup/get-claude-status.ts (2)

157-167: Consider validating token structure beyond presence.

The code correctly detects both Claude 2.x and legacy OAuth token structures, but the truthy checks allow whitespace-only strings to pass. Additionally, setting oauthTokenValid = true without validating the token's structure or non-empty content may be misleading.

If stricter validation is needed, consider verifying that tokens are non-empty strings after trimming.

🔎 Optional enhancement: validate token structure
       // Claude CLI 2.x uses nested claudeAiOauth.accessToken structure
-      if (credentials.claudeAiOauth?.accessToken) {
+      if (credentials.claudeAiOauth?.accessToken && typeof credentials.claudeAiOauth.accessToken === 'string' && credentials.claudeAiOauth.accessToken.trim()) {
         auth.hasStoredOAuthToken = true;
         auth.oauthTokenValid = true;
         auth.authenticated = true;
         auth.method = 'oauth_token'; // Claude 2.x OAuth token from credentials file
-      } else if (credentials.oauth_token || credentials.access_token) {
+      } else if ((credentials.oauth_token && typeof credentials.oauth_token === 'string' && credentials.oauth_token.trim()) || 
+                 (credentials.access_token && typeof credentials.access_token === 'string' && credentials.access_token.trim())) {
         auth.hasStoredOAuthToken = true;
         auth.oauthTokenValid = true;
         auth.authenticated = true;
         auth.method = 'oauth_token'; // Legacy OAuth token format (CLI 1.x)
       }

157-172: Consider consolidating duplicate auth assignments.

Lines 158-162 and 163-167 assign identical values to the auth object. While the comments distinguish between Claude 2.x and legacy OAuth formats, the auth.method value is the same ('oauth_token').

If downstream code needs to differentiate between these credential formats, consider using distinct method values (e.g., 'oauth_token_v2' vs 'oauth_token_v1'). Otherwise, you could refactor to reduce duplication by extracting the common assignments.

🔎 Example refactor to reduce duplication
       // Check what type of token is in credentials
       // Claude CLI 2.x uses nested claudeAiOauth.accessToken structure
-      if (credentials.claudeAiOauth?.accessToken) {
-        auth.hasStoredOAuthToken = true;
-        auth.oauthTokenValid = true;
-        auth.authenticated = true;
-        auth.method = 'oauth_token'; // Claude 2.x OAuth token from credentials file
-      } else if (credentials.oauth_token || credentials.access_token) {
-        auth.hasStoredOAuthToken = true;
-        auth.oauthTokenValid = true;
-        auth.authenticated = true;
-        auth.method = 'oauth_token'; // Legacy OAuth token format (CLI 1.x)
-      } else if (credentials.api_key) {
+      const hasV2OAuth = !!credentials.claudeAiOauth?.accessToken;
+      const hasLegacyOAuth = !!(credentials.oauth_token || credentials.access_token);
+      const hasApiKey = !!credentials.api_key;
+      
+      if (hasV2OAuth || hasLegacyOAuth) {
+        auth.hasStoredOAuthToken = true;
+        auth.oauthTokenValid = true;
+        auth.authenticated = true;
+        auth.method = hasV2OAuth ? 'oauth_token_v2' : 'oauth_token_v1';
+      } else if (hasApiKey) {
         auth.apiKeyValid = true;
         auth.authenticated = true;
         auth.method = 'api_key'; // Stored API key in credentials file
       }
apps/server/src/services/agent-monitor.ts (3)

123-124: Type-safe alternative to any for database instance.

The db property is typed as any to accommodate the dynamically-loaded better-sqlite3 library. While the eslint-disable comment suppresses the lint warning, consider defining a minimal interface to restore type safety:

interface DatabaseInstance {
  prepare(sql: string): any;
  exec(sql: string): void;
  pragma(pragma: string): void;
  close(): void;
}

Then declare private db!: DatabaseInstance; instead. This improves IDE support and catches typos without sacrificing flexibility.


364-406: Simplify addTelemetry: SQLite JSON updates are repetitive and hard to verify.

The method chains five json_set calls to aggregate telemetry atomically. While the approach avoids read-modify-write races, the SQL is hard to read and maintain. Consider a helper or inline JSON construction:

addTelemetry(id: string, telemetry: ParsedTelemetry): boolean {
  const baseJson = '{"tokensIn":0,"tokensOut":0,"cached":0,"cost":0,"duration":0}';
  const stmt = this.db.prepare(`
    UPDATE agents
    SET telemetry_json = json_patch(
      COALESCE(telemetry_json, ?),
      json_object(
        'tokensIn', json_extract(COALESCE(telemetry_json, ?), '$.tokensIn') + ?,
        'tokensOut', json_extract(COALESCE(telemetry_json, ?), '$.tokensOut') + ?,
        'cached', json_extract(COALESCE(telemetry_json, ?), '$.cached') + ?,
        'cost', json_extract(COALESCE(telemetry_json, ?), '$.cost') + ?,
        'duration', json_extract(COALESCE(telemetry_json, ?), '$.duration') + ?
      )
    )
    WHERE id = ?
  `);
  // ... params
}

However, verify that json_patch is available in your SQLite version; if not, consider parsing/re-serializing in JavaScript within a transaction.


829-834: Singleton pattern has potential initialization race condition.

The getAgentMonitor() function (lines 829–834) checks and creates the singleton without synchronization. If called concurrently from multiple threads/tasks, two instances may be created. While Node.js is single-threaded at the JS level, concurrent awaits can cause reentrancy:

// Scenario: two concurrent calls to getAgentMonitor(dataDir)
const m1 = getAgentMonitor(dataDir); // m1 created, not yet initialized
const m2 = getAgentMonitor(dataDir); // condition passes; m2 created

Consider a lock-like pattern or eager initialization:

let initPromise: Promise<AgentMonitorService> | null = null;

export function getAgentMonitor(dataDir: string): AgentMonitorService {
  if (!agentMonitorInstance) {
    agentMonitorInstance = new AgentMonitorService(dataDir);
  }
  return agentMonitorInstance;
}

export async function initializeAgentMonitor(dataDir: string): Promise<AgentMonitorService> {
  if (!initPromise) {
    initPromise = (async () => {
      const monitor = getAgentMonitor(dataDir);
      await monitor.initialize();
      return monitor;
    })();
  }
  return initPromise;
}
apps/server/src/services/board-hygiene-service.ts (4)

1-381: File exceeds length guideline.

This file contains 381 lines, exceeding the 300-line guideline by 27%. Consider splitting into focused modules:

  • Core validation logic (validateDoneCriteria, enforceDoneCriteria)
  • Task analysis utilities (findStaleTasks, findDuplicates)
  • Reporting/formatting (runHealthCheck, formatHealthCheckAsMarkdown)

Based on coding guidelines: "Keep files under 300 lines to avoid monolithic files"


220-231: Review duplicate detection logic.

Two observations:

  1. Redundant undefined check (Line 227): The check if (task.id === undefined) appears unnecessary. Based on the VibeKanbanTask type, id should always be defined. This check can likely be removed.

  2. Basic similarity matching: The current implementation uses aggressive normalization (removing all non-alphanumeric characters) and substring matching, which may produce many false positives. For example, "Fix login bug" and "Fix bug" would match. Consider:

    • More sophisticated similarity algorithms (e.g., Levenshtein distance)
    • Minimum similarity threshold
    • Filtering out very short titles before matching

328-348: Handle potentially undefined updatedAt in markdown output.

Lines 330, 338, and 346 directly interpolate task.updatedAt into the markdown output. If updatedAt is undefined, the report will display "undefined" which looks unprofessional.

🔎 Proposed fix
     if (healthCheck.staleTasks.todo.length > 0) {
       lines.push('### Stale Todo (14+ days)');
       for (const task of healthCheck.staleTasks.todo) {
-        lines.push(`- ${task.title} (${task.id}) - updated ${task.updatedAt}`);
+        lines.push(`- ${task.title} (${task.id}) - updated ${task.updatedAt || 'unknown'}`);
       }
       lines.push('');
     }

     if (healthCheck.staleTasks.inprogress.length > 0) {
       lines.push('### Stale In Progress (7+ days)');
       for (const task of healthCheck.staleTasks.inprogress) {
-        lines.push(`- ${task.title} (${task.id}) - updated ${task.updatedAt}`);
+        lines.push(`- ${task.title} (${task.id}) - updated ${task.updatedAt || 'unknown'}`);
       }
       lines.push('');
     }

     if (healthCheck.staleTasks.inreview.length > 0) {
       lines.push('### Stale In Review (3+ days)');
       for (const task of healthCheck.staleTasks.inreview) {
-        lines.push(`- ${task.title} (${task.id}) - updated ${task.updatedAt}`);
+        lines.push(`- ${task.title} (${task.id}) - updated ${task.updatedAt || 'unknown'}`);
       }
       lines.push('');
     }

90-141: Consider structured logging for better observability.

The service uses console.log, console.warn, and console.error for logging. While acceptable, consider using a structured logging library (e.g., pino, winston) for production deployments to enable:

  • Log level filtering
  • Structured metadata (task IDs, issue counts)
  • Integration with log aggregation services

Based on coding guidelines: This falls under observability hygiene for operational concerns.

docs/vibe-kanban-mcp-integration.md (1)

235-248: Add language specifiers to fenced code blocks.

The fenced code blocks at lines 235 and 352 are missing language specifiers. For the status flow diagram, use mermaid or text. For the architecture diagram, use text or ascii-art.

Based on static analysis hints.

🔎 Proposed fix

For line 235:

-```
+```text
 todo → inprogress → inreview → done

For line 352:

-```
+```text
 ┌─────────────────────────────────────────────────────────┐

Also applies to: 352-385

apps/server/src/services/review-watcher-service.ts (1)

365-411: Verify the parseTaskDescription implementation handles edge cases.

The parsing logic appears sound, but please ensure it handles:

  • Empty descriptions
  • Multiple headers with #
  • Malformed acceptance criteria sections
  • Tags without **Tags:** prefix

Consider adding tests for these edge cases to ensure robustness.

Do you want me to generate unit tests for this parsing logic?

apps/server/src/services/github-ci-service.ts (2)

265-275: Consider escaping special regex characters in pattern matching.

The matchPattern method converts * to .* for regex matching, but doesn't escape other regex special characters. If a pattern contains characters like ., +, ?, [, etc., they'll be interpreted as regex syntax rather than literals.

🔎 Proposed fix
 private matchPattern(branch: string, pattern: string): boolean {
   if (pattern === branch) return true;

   // Simple wildcard support
   if (pattern.includes('*')) {
-    const regex = new RegExp('^' + pattern.replace(/\*/g, '.*') + '$');
+    // Escape special regex characters except *
+    const escaped = pattern.replace(/[.+?^${}()|[\]\\]/g, '\\$&');
+    const regex = new RegExp('^' + escaped.replace(/\*/g, '.*') + '$');
     return regex.test(branch);
   }

   return false;
 }

311-314: Consider making merge strategy and branch deletion configurable.

The mergePR method hard-codes:

  • Merge strategy as --merge (vs --squash or --rebase)
  • Branch deletion with --delete-branch

These behaviors may not be appropriate for all projects or workflows. Consider adding configuration options for merge strategy and whether to auto-delete branches.

apps/server/src/services/task-state-service.ts (1)

90-115: Consider batching writes for better performance.

The saveState method calls flush() after every update, which writes the entire state to disk. For high-frequency updates (e.g., during active task execution), this could cause performance issues.

Consider implementing:

  • Debounced/throttled flushing
  • Batch updates with manual flush control
  • Write-ahead logging for durability without blocking
apps/server/src/services/orchestrator-service.ts (2)

1-1189: File exceeds 300 line limit.

At 1189 lines, this file significantly exceeds the 300-line guideline. Consider extracting the webhook event handling (lines 956-1157) into a separate service class (e.g., OrchestratorWebhookHandler) to improve maintainability. As per coding guidelines, keep files under 300 lines to avoid monolithic files.


965-986: Consider typed event payloads instead of any casts.

The multiple eslint-disable comments for any casts reduce type safety. Consider defining event payload types in @automaker/types and using a type-safe event subscription pattern.

// Example: Define payload types in @automaker/types
interface GitHubPRMergedPayload {
  prNumber: number;
  sha?: string;
  url?: string;
  repo?: string;
  branch?: string;
}

// Then use type guards or a typed event map
apps/server/src/agents/task-classifier.ts (1)

1-448: File exceeds 300 line limit.

At 448 lines, this file exceeds the 300-line guideline. Consider extracting the constants (AGENT_KEYWORDS, AGENT_FILE_PATTERNS, LANGUAGE_PATTERNS) and helper functions into a separate task-classifier-utils.ts file. As per coding guidelines, keep files under 300 lines.

apps/server/src/services/automode-agent-integration.ts (1)

61-61: Consider adding automatic history trimming.

executionHistory grows unbounded and is only cleared via explicit clearHistory() call. For long-running processes, this could lead to memory growth. Consider adding a max history size with automatic trimming.

private readonly MAX_HISTORY_SIZE = 1000;

// In executeTaskWithSpecializedAgent, after push:
if (this.executionHistory.length > this.MAX_HISTORY_SIZE) {
  this.executionHistory = this.executionHistory.slice(-this.MAX_HISTORY_SIZE);
}

Also applies to: 208-208, 276-278

apps/server/src/agents/agent-prompts.ts (2)

1-820: File exceeds 300-line guideline.

This file is 820 lines. Per the coding guidelines for **/*.{ts,tsx,js,jsx}, files should be kept under 300 lines. Consider splitting into separate modules:

  • agent-system-prompts.ts - system prompt strings
  • agent-capabilities.ts - capability definitions
  • agent-config.ts - configuration accessors

721-813: Consider memoizing getAgentConfigurations().

This function creates new configuration objects on every call. If called frequently (e.g., during task classification loops), this creates unnecessary allocations.

🔎 Proposed memoization
+let cachedConfigs: Record<AgentType, AgentConfig> | null = null;
+
 export function getAgentConfigurations(): Record<AgentType, AgentConfig> {
+  if (cachedConfigs) {
+    return cachedConfigs;
+  }
+
   const configs: Record<AgentType, AgentConfig> = {
     // ... existing config definitions
   };
 
+  cachedConfigs = configs;
   return configs;
 }
apps/server/src/agents/agent-registry.ts (3)

163-166: History trimming drops 50% of data.

The history is capped at 1000 entries but sliced to keep only the last 500. This means half the history is lost on each trim. Consider a more gradual approach.

🔎 Proposed fix for consistent trimming
     // Keep history bounded
     if (this.classificationHistory.length > 1000) {
-      this.classificationHistory = this.classificationHistory.slice(-500);
+      this.classificationHistory = this.classificationHistory.slice(-900);
     }

199-221: Naive similarity matching may produce unreliable results.

The getBestAgentForTask method uses simple word overlap (≥2 common words) to match tasks. This could match unrelated tasks that share common words like "the", "and", "task", etc. Consider filtering out stop words or using a more robust similarity metric.

🔎 Proposed improvement with stop word filtering
+const STOP_WORDS = new Set(['the', 'a', 'an', 'and', 'or', 'but', 'is', 'are', 'was', 'were', 'be', 'been', 'being', 'to', 'of', 'for', 'with', 'on', 'at', 'by', 'from', 'in', 'task', 'this', 'that']);
+
 getBestAgentForTask(taskDescription: string): AgentType | null {
   const lowerTask = taskDescription.toLowerCase();

   const recentHistory = this.classificationHistory
     .filter((entry) => entry.success && Date.now() - entry.timestamp < 7 * 24 * 60 * 60 * 1000)
     .slice(-50);

   for (const entry of recentHistory) {
-    const entryWords = entry.task.toLowerCase().split(/\s+/);
-    const taskWords = lowerTask.split(/\s+/);
+    const entryWords = entry.task.toLowerCase().split(/\s+/).filter(w => !STOP_WORDS.has(w) && w.length > 2);
+    const taskWords = lowerTask.split(/\s+/).filter(w => !STOP_WORDS.has(w) && w.length > 2);
     const commonWords = entryWords.filter((w) => taskWords.includes(w));

-    if (commonWords.length >= 2) {
+    if (commonWords.length >= 2 && commonWords.length / Math.max(entryWords.length, taskWords.length) > 0.3) {
       return entry.classifiedAs;
     }
   }

   return null;
 }

309-318: importState lacks validation, risking runtime errors.

The method directly assigns incoming state without validating the structure. Malformed or tampered state could cause runtime errors elsewhere.

🔎 Proposed validation
 importState(state: {
   agents: Record<string, AgentRegistryEntry>;
   history: typeof this.classificationHistory;
 }): void {
+  if (!state || typeof state !== 'object') {
+    throw new Error('Invalid state: expected object');
+  }
+  if (!state.agents || typeof state.agents !== 'object') {
+    throw new Error('Invalid state: missing agents');
+  }
+  if (!Array.isArray(state.history)) {
+    throw new Error('Invalid state: history must be an array');
+  }
+
   this.agents = new Map();
   for (const [key, value] of Object.entries(state.agents)) {
     this.agents.set(key as AgentType, value);
   }
   this.classificationHistory = state.history;
 }
apps/server/src/routes/github/routes/webhook.ts (1)

174-201: Consider extending GitHubWebhookPayload instead of using @ts-expect-error.

Multiple @ts-expect-error comments indicate the interface is incomplete. Extending the interface would provide better type safety and IDE support.

🔎 Proposed interface extension
export interface GitHubWebhookPayload {
  // ... existing fields ...
  
  // Workflow run events
  workflow_run?: {
    id: number;
    name?: string;
    status?: string;
    conclusion?: string;
    head_branch?: string;
    head_sha?: string;
  };
  
  // Push events
  ref?: string;
  
  // Ping events
  zen?: string;
  
  // Event type (added by middleware)
  githubEvent?: string;
}
apps/server/src/services/beads-orchestrator.ts (2)

359-365: blockingFeatures is always initialized as empty array.

In resolveCrossFeatureDependencies, blockingFeatures is pushed as an empty array (line 362) and then populated later (lines 368-372). While functionally correct, initializing it properly in the first pass would be clearer.


224-239: N+1 query pattern: getBlockingIssues called for each subtask.

For each subtask in an epic, getBlockingIssues is called sequentially. With many subtasks, this creates significant latency. Consider batching or caching.

🔎 Proposed optimization
       // Get ready subtasks
       const readySubtasks: string[] = [];
+      const blockingResults = await Promise.all(
+        subtasks.map((subtask) => this.getBlockingIssues(projectPath, subtask.id))
+      );
-      for (const subtask of subtasks) {
-        const blockers = await this.getBlockingIssues(projectPath, subtask.id);
+      for (let i = 0; i < subtasks.length; i++) {
+        const blockers = blockingResults[i];
         if (blockers.length === 0) {
-          readySubtasks.push(subtask.id);
+          readySubtasks.push(subtasks[i].id);
         }
       }
apps/server/src/agents/specialized-agent-service.ts (2)

103-108: Hardcoded model name should be a constant or configurable.

The model 'claude-sonnet-4-5-20250929' is hardcoded in two places (lines 103 and 108). Consider extracting to a constant or making it configurable via environment variable.

🔎 Proposed fix
+const DEFAULT_MODEL = process.env.DEFAULT_AGENT_MODEL || 'claude-sonnet-4-5-20250929';
+
 // In executeTaskWithAgent:
-      const provider = ProviderFactory.getProviderForModel(model || 'claude-sonnet-4-5-20250929');
+      const provider = ProviderFactory.getProviderForModel(model || DEFAULT_MODEL);

       const executeOptions: ExecuteOptions = {
         prompt: taskPrompt,
-        model: model || 'claude-sonnet-4-5-20250929',
+        model: model || DEFAULT_MODEL,

195-216: Multi-agent workflow may cause token bloat with long outputs.

Each phase appends the full previous output to the next prompt (line 199). For workflows with many phases or verbose outputs, this could exceed token limits. Consider truncating or summarizing previous outputs.

🔎 Proposed improvement
+    const MAX_PREVIOUS_OUTPUT_LENGTH = 2000;
+
     for (const phase of phases) {
       // Pass previous output as context
-      const previousOutput =
+      let previousOutput =
         results.length > 0
           ? `\n\nPrevious phase output:\n${results[results.length - 1].output}`
           : '';
+      
+      // Truncate to prevent token bloat
+      if (previousOutput.length > MAX_PREVIOUS_OUTPUT_LENGTH) {
+        previousOutput = previousOutput.substring(0, MAX_PREVIOUS_OUTPUT_LENGTH) + '\n...[truncated]';
+      }
apps/server/src/services/beads-vibe-sync.ts (2)

203-210: O(n) lookup for getBeadsIssueId - consider reverse index.

This method iterates all mappings to find a Beads issue by VibeKanban task ID. If mappings grow large, this becomes inefficient. Consider maintaining a reverse lookup map.

🔎 Proposed optimization
 export class BeadsVibeSync {
   private syncMappings: Map<string, SyncMapping> = new Map(); // beadsId -> mapping
+  private reverseMapping: Map<string, string> = new Map(); // vibeTaskId -> beadsId

   // Update getBeadsIssueId:
   getBeadsIssueId(vibeTaskId: string): string | undefined {
-    for (const mapping of this.syncMappings.values()) {
-      if (mapping.vibeTaskId === vibeTaskId) {
-        return mapping.beadsIssueId;
-      }
-    }
-    return undefined;
+    return this.reverseMapping.get(vibeTaskId);
   }

   // Update when adding mappings:
   this.syncMappings.set(issue.id, mapping);
+  this.reverseMapping.set(mapping.vibeTaskId, issue.id);

412-420: loadMappings silently ignores all errors.

Catching all errors means permission issues, JSON parse errors with partial data, or disk errors are silently ignored. Consider logging or distinguishing between "file not found" (expected) and other errors.

🔎 Proposed improvement
   private async loadMappings(projectPath: string): Promise<void> {
     const mappingsPath = path.join(projectPath, MAPPINGS_FILE);

     try {
       const content = await fs.readFile(mappingsPath, 'utf-8');
       const data = JSON.parse(content);

       this.syncMappings = new Map(Object.entries(data).map(([k, v]) => [k, v as SyncMapping]));
-    } catch {
-      // File doesn't exist or is invalid - start fresh
+    } catch (error: unknown) {
+      const err = error as NodeJS.ErrnoException;
+      if (err.code === 'ENOENT') {
+        // File doesn't exist - start fresh
+        this.syncMappings = new Map();
+      } else {
+        console.warn('[BeadsVibeSync] Error loading mappings, starting fresh:', err.message);
+        this.syncMappings = new Map();
+      }
-      this.syncMappings = new Map();
     }
   }
📜 Review details

Configuration used: defaults

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 70f8f4d and 467bc5b.

📒 Files selected for processing (26)
  • apps/server/src/agents/agent-prompts.ts
  • apps/server/src/agents/agent-registry.ts
  • apps/server/src/agents/index.ts
  • apps/server/src/agents/specialized-agent-service.ts
  • apps/server/src/agents/task-classifier.ts
  • apps/server/src/index.ts
  • apps/server/src/routes/github/index.ts
  • apps/server/src/routes/github/routes/webhook.ts
  • apps/server/src/routes/setup/get-claude-status.ts
  • apps/server/src/services/agent-monitor-service.ts
  • apps/server/src/services/agent-monitor.ts
  • apps/server/src/services/automode-agent-integration.ts
  • apps/server/src/services/beads-orchestrator.ts
  • apps/server/src/services/beads-vibe-sync.ts
  • apps/server/src/services/board-hygiene-service.ts
  • apps/server/src/services/github-ci-service.ts
  • apps/server/src/services/orchestrator-service.ts
  • apps/server/src/services/review-watcher-service.ts
  • apps/server/src/services/task-state-service.ts
  • apps/server/src/services/telemetry-service.ts
  • apps/server/tests/unit/agents/agent-registry.test.ts
  • apps/server/tests/unit/agents/task-classifier.test.ts
  • docs/vibe-kanban-mcp-integration.md
  • libs/types/src/agent-types.ts
  • libs/types/src/event.ts
  • libs/types/src/index.ts
🧰 Additional context used
📓 Path-based instructions (4)
**/*.{ts,tsx,js,jsx}

📄 CodeRabbit inference engine (AGENTS.md)

**/*.{ts,tsx,js,jsx}: Run type checking with npm run typecheck before syncing the Beads database as part of quality gates
Run linting with npm run lint before syncing the Beads database as part of quality gates

Keep files under 300 lines to avoid monolithic files

Files:

  • apps/server/src/services/telemetry-service.ts
  • libs/types/src/event.ts
  • apps/server/src/services/agent-monitor.ts
  • libs/types/src/index.ts
  • apps/server/src/agents/task-classifier.ts
  • apps/server/tests/unit/agents/agent-registry.test.ts
  • apps/server/src/routes/setup/get-claude-status.ts
  • apps/server/tests/unit/agents/task-classifier.test.ts
  • apps/server/src/agents/agent-prompts.ts
  • apps/server/src/routes/github/routes/webhook.ts
  • apps/server/src/agents/index.ts
  • apps/server/src/services/automode-agent-integration.ts
  • apps/server/src/services/orchestrator-service.ts
  • apps/server/src/services/github-ci-service.ts
  • libs/types/src/agent-types.ts
  • apps/server/src/routes/github/index.ts
  • apps/server/src/agents/agent-registry.ts
  • apps/server/src/services/agent-monitor-service.ts
  • apps/server/src/services/board-hygiene-service.ts
  • apps/server/src/agents/specialized-agent-service.ts
  • apps/server/src/services/review-watcher-service.ts
  • apps/server/src/services/beads-orchestrator.ts
  • apps/server/src/services/beads-vibe-sync.ts
  • apps/server/src/services/task-state-service.ts
  • apps/server/src/index.ts
apps/server/src/services/**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

Place services in apps/server/src/services/, with one service per file

Files:

  • apps/server/src/services/telemetry-service.ts
  • apps/server/src/services/agent-monitor.ts
  • apps/server/src/services/automode-agent-integration.ts
  • apps/server/src/services/orchestrator-service.ts
  • apps/server/src/services/github-ci-service.ts
  • apps/server/src/services/agent-monitor-service.ts
  • apps/server/src/services/board-hygiene-service.ts
  • apps/server/src/services/review-watcher-service.ts
  • apps/server/src/services/beads-orchestrator.ts
  • apps/server/src/services/beads-vibe-sync.ts
  • apps/server/src/services/task-state-service.ts
apps/server/src/**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

apps/server/src/**/*.ts: Run linting and tests after editing server files with npm run lint --workspace=apps/server and npm run test:server
Run TypeScript compilation check for server files with npx tsc -p apps/server/tsconfig.json --noEmit

Files:

  • apps/server/src/services/telemetry-service.ts
  • apps/server/src/services/agent-monitor.ts
  • apps/server/src/agents/task-classifier.ts
  • apps/server/src/routes/setup/get-claude-status.ts
  • apps/server/src/agents/agent-prompts.ts
  • apps/server/src/routes/github/routes/webhook.ts
  • apps/server/src/agents/index.ts
  • apps/server/src/services/automode-agent-integration.ts
  • apps/server/src/services/orchestrator-service.ts
  • apps/server/src/services/github-ci-service.ts
  • apps/server/src/routes/github/index.ts
  • apps/server/src/agents/agent-registry.ts
  • apps/server/src/services/agent-monitor-service.ts
  • apps/server/src/services/board-hygiene-service.ts
  • apps/server/src/agents/specialized-agent-service.ts
  • apps/server/src/services/review-watcher-service.ts
  • apps/server/src/services/beads-orchestrator.ts
  • apps/server/src/services/beads-vibe-sync.ts
  • apps/server/src/services/task-state-service.ts
  • apps/server/src/index.ts
apps/server/src/routes/**/*.ts

📄 CodeRabbit inference engine (CLAUDE.md)

Place API routes in apps/server/src/routes/, with one file per route/resource

Files:

  • apps/server/src/routes/setup/get-claude-status.ts
  • apps/server/src/routes/github/routes/webhook.ts
  • apps/server/src/routes/github/index.ts
🧠 Learnings (3)
📚 Learning: 2025-12-24T19:32:07.586Z
Learnt from: CR
Repo: 0xtsotsi/DevFlow PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-24T19:32:07.586Z
Learning: Applies to **/*.{ts,tsx,js,jsx} : Run type checking with `npm run typecheck` before syncing the Beads database as part of quality gates

Applied to files:

  • libs/types/src/event.ts
  • apps/server/src/services/beads-vibe-sync.ts
📚 Learning: 2025-12-28T19:34:17.064Z
Learnt from: CR
Repo: 0xtsotsi/DevFlow PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-28T19:34:17.064Z
Learning: Applies to libs/automaker/types/src/**/*.ts : Export shared types from `libs/automaker/types/src/`

Applied to files:

  • libs/types/src/index.ts
  • apps/server/src/agents/index.ts
  • libs/types/src/agent-types.ts
📚 Learning: 2025-12-28T19:34:17.064Z
Learnt from: CR
Repo: 0xtsotsi/DevFlow PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-12-28T19:34:17.064Z
Learning: Applies to apps/server/src/routes/**/*.ts : Place API routes in `apps/server/src/routes/`, with one file per route/resource

Applied to files:

  • apps/server/src/index.ts
🧬 Code graph analysis (12)
apps/server/src/agents/task-classifier.ts (3)
apps/server/src/agents/index.ts (4)
  • AgentType (19-19)
  • TaskClassifier (9-9)
  • TaskAnalysis (26-26)
  • TaskClassification (25-25)
libs/types/src/index.ts (4)
  • AgentType (124-124)
  • AgentType (133-133)
  • TaskAnalysis (128-128)
  • TaskClassification (127-127)
libs/types/src/agent-types.ts (2)
  • TaskAnalysis (132-156)
  • TaskClassification (111-127)
apps/server/tests/unit/agents/agent-registry.test.ts (3)
apps/server/src/agents/agent-registry.ts (1)
  • AgentRegistry (20-319)
apps/server/src/agents/index.ts (2)
  • AgentRegistry (8-8)
  • AgentType (19-19)
libs/types/src/index.ts (2)
  • AgentType (124-124)
  • AgentType (133-133)
apps/server/tests/unit/agents/task-classifier.test.ts (1)
apps/server/src/agents/task-classifier.ts (1)
  • TaskClassifier (286-445)
apps/server/src/agents/agent-prompts.ts (1)
libs/types/src/agent-types.ts (1)
  • AgentConfig (73-106)
apps/server/src/routes/github/routes/webhook.ts (1)
apps/server/src/lib/events.ts (1)
  • EventEmitter (10-13)
apps/server/src/services/automode-agent-integration.ts (3)
apps/server/src/lib/events.ts (1)
  • EventEmitter (10-13)
apps/ui/src/store/app-store.ts (1)
  • ParsedTask (295-301)
apps/server/src/agents/specialized-agent-service.ts (1)
  • specializedAgentService (299-299)
apps/server/src/routes/github/index.ts (1)
apps/server/src/routes/github/routes/webhook.ts (1)
  • createGitHubWebhookHandler (311-348)
apps/server/src/agents/agent-registry.ts (2)
libs/types/src/agent-types.ts (3)
  • AgentRegistryEntry (161-172)
  • AgentConfig (73-106)
  • AgentExecutionResult (200-218)
apps/server/src/agents/agent-prompts.ts (1)
  • getAgentConfigurations (721-813)
apps/server/src/services/board-hygiene-service.ts (2)
apps/server/src/services/vibe-kanban-client.ts (1)
  • VibeKanbanClient (139-586)
libs/types/src/index.ts (1)
  • VibeKanbanTask (114-114)
apps/server/src/services/review-watcher-service.ts (1)
apps/server/src/lib/events.ts (1)
  • EventEmitter (10-13)
apps/server/src/services/beads-vibe-sync.ts (3)
libs/types/src/index.ts (2)
  • VibeKanbanTask (114-114)
  • BeadsIssue (87-87)
apps/server/src/lib/events.ts (1)
  • EventEmitter (10-13)
apps/server/src/services/vibe-kanban-client.ts (1)
  • getVibeKanbanClient (593-601)
apps/server/src/index.ts (1)
apps/server/src/routes/github/index.ts (1)
  • createGitHubRoutes (24-44)
🪛 ast-grep (0.40.3)
apps/server/src/services/orchestrator-service.ts

[warning] 1011-1011: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(PR Created:? #?${data.prNumber}\\b, 'i')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html

(regexp-from-variable)

🪛 LanguageTool
docs/vibe-kanban-mcp-integration.md

[uncategorized] ~321-~321: Did you mean the formatting language “Markdown” (= proper noun)?
Context: ...** before passing to MCP tools 2. Use markdown in task descriptions for better forma...

(MARKDOWN_NNP)

🪛 markdownlint-cli2 (0.18.1)
docs/vibe-kanban-mcp-integration.md

235-235: Fenced code blocks should have a language specified

(MD040, fenced-code-language)


352-352: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Greptile Review
🔇 Additional comments (25)
apps/server/src/services/telemetry-service.ts (2)

630-633: LGTM - Cleaner syntax for unused error.

Removing the unused error binding is appropriate when the error isn't being logged or used.


644-647: LGTM - Consistent error handling pattern.

Same cleanup as the previous catch block - appropriate for cases where missing files are expected.

docs/vibe-kanban-mcp-integration.md (1)

1-393: Excellent documentation for MCP integration.

The documentation is comprehensive, well-structured, and provides clear examples for all 8 MCP tools. The inclusion of type definitions, usage examples, state mapping, and best practices makes this a valuable reference.

apps/server/src/services/review-watcher-service.ts (2)

115-140: LGTM - Clean initialization and UUID validation.

The initialization logging is helpful for debugging, and the UUID validation regex correctly matches standard UUID format.


152-354: Well-designed placeholder pattern for MCP tool documentation.

The placeholder methods provide excellent type safety and documentation while clearly directing users to the actual MCP tools. The input validation before throwing is good defensive programming.

apps/server/src/services/github-ci-service.ts (1)

149-200: LGTM - Well-designed CI watching pattern.

The implementation correctly:

  • Only emits events on status changes
  • Automatically stops watching on success
  • Continues watching on failure for potential fixes
  • Returns a cleanup function for manual control
  • Uses configurable polling intervals
apps/server/src/services/task-state-service.ts (1)

189-210: LGTM - Good cleanup logic with safe defaults.

The cleanup implementation correctly:

  • Only removes completed tasks
  • Uses configurable retention period
  • Logs cleanup operations
  • Preserves all active tasks
apps/server/src/services/agent-monitor-service.ts (1)

353-366: LGTM - Minor code style improvements.

The added braces improve code clarity and the ESLint disable comment is appropriate for the event emitter's type-flexible API.

Also applies to: 470-471

apps/server/src/routes/github/index.ts (1)

9-9: LGTM - Clean conditional webhook integration.

The webhook route is correctly registered only when an event emitter is provided, maintaining backward compatibility while enabling new functionality.

Also applies to: 21-21, 31-34

apps/server/src/index.ts (1)

234-234: LGTM - Correct integration with GitHub webhook support.

The events parameter correctly enables webhook functionality in the GitHub routes.

libs/types/src/index.ts (1)

121-133: LGTM! Agent type exports are well-structured.

The dual export pattern for AgentType (as type on line 124 and as value on line 133) correctly handles TypeScript's isolatedModules requirement for enums. This follows the existing export conventions in this barrel file. Based on learnings, this aligns with the guideline to export shared types from libs/automaker/types/src/.

libs/types/src/event.ts (1)

88-97: LGTM! Beads orchestration events are well-defined.

The new event types follow the established namespace:action naming convention and comprehensively cover the beads orchestration lifecycle (execution plans, feature completion/failure, sync operations, and auto-sync state). This aligns with the M2 Beads Integration Enhancement objectives.

apps/server/tests/unit/agents/agent-registry.test.ts (2)

1-192: LGTM! Comprehensive test coverage for AgentRegistry.

The tests thoroughly cover the AgentRegistry public API including agent type retrieval, configuration management, execution recording, statistics tracking, custom agent registration/unregistration, and state export/import. The mock handling in recordExecution tests (lines 165-179) correctly cleans up with mockRestore().

Also applies to: 200-334


193-199: Potential issue with enum key counting.

Object.keys(AgentType).length on a TypeScript string enum returns both the keys and values, effectively doubling the count. For the AgentType enum with 8 members, this would return 16 instead of 8.

Consider using Object.values(AgentType).length instead, which returns only the enum values.

🔎 Proposed fix
   describe('getAllAgentStats', () => {
     it('should return stats for all agents', () => {
       const stats = registry.getAllAgentStats();

-      expect(stats.size).toBe(Object.keys(AgentType).length);
+      expect(stats.size).toBe(Object.values(AgentType).length);
     });
   });

Likely an incorrect or invalid review comment.

apps/server/tests/unit/agents/task-classifier.test.ts (1)

1-170: LGTM! Well-structured tests for TaskClassifier.

The test suite comprehensively covers both analyzeTask and classifyTask methods, including keyword extraction, task-type detection flags, complexity calculation, language/file pattern detection, and classification for all agent types. Edge cases for empty and punctuation-only prompts (lines 156-168) ensure graceful degradation. The tests align well with the TaskAnalysis and TaskClassification interfaces defined in libs/types/src/agent-types.ts.

libs/types/src/agent-types.ts (1)

1-218: LGTM! Well-designed type definitions for the specialized agent system.

The types are comprehensive and well-documented with JSDoc comments. The AgentType enum provides clear categorization, and the supporting interfaces (AgentConfig, TaskClassification, TaskAnalysis, AgentRegistryEntry, AgentExecutionContext, AgentExecutionResult) establish a solid foundation for the agent system's configuration, classification, and execution flow. The complexity scale (1-10) in TaskAnalysis is clearly documented.

apps/server/src/services/orchestrator-service.ts (1)

956-989: Webhook integration lifecycle is well-managed.

The subscription is correctly established on start() (line 225) and cleaned up on stop() (lines 258-260). The event-driven approach enables real-time task state updates without waiting for poll cycles, which improves responsiveness.

Also applies to: 224-226, 258-260

apps/server/src/agents/task-classifier.ts (1)

286-344: Core classification logic is well-designed.

The keyword extraction, complexity calculation, language detection, and priority-based tiebreaking provide a solid foundation for task classification. The confidence calculation based on score gaps (lines 377-380) and alternatives generation (lines 386-393) add useful metadata for downstream decision-making.

Also applies to: 362-401, 403-448

apps/server/src/services/automode-agent-integration.ts (1)

58-86: LGTM! Clean integration between AutoMode and specialized agents.

The AutoModeAgentIntegration class provides a well-structured bridge with flexible configuration, event emissions for observability, and proper delegation to specializedAgentService. The execution flow (lines 134-222) handles context building, agent selection, and result tracking appropriately.

Also applies to: 131-222, 280-305

apps/server/src/agents/agent-registry.ts (1)

30-46: LGTM!

The constructor correctly initializes all agents from configurations with sensible default stats (optimistic 1.0 success rate).

apps/server/src/agents/index.ts (1)

1-30: LGTM!

Clean barrel file with well-organized exports. Correctly separates value exports (AgentType enum) from type-only exports. This follows good module organization practices.

apps/server/src/services/beads-orchestrator.ts (1)

409-422: stopAll doesn't actually cancel running executors.

The method only removes entries from the tracking map but doesn't abort the underlying promises. The comment acknowledges this limitation, but it could lead to resource leaks or unexpected behavior.

Consider implementing proper cancellation using AbortController if the executors support it, or document this limitation in the JSDoc.

apps/server/src/agents/specialized-agent-service.ts (2)

29-57: LGTM!

The classifyTask method nicely combines the TaskClassifier with historical registry data, falling back to historical recommendations when classification confidence is low (< 0.8). Good design pattern for improving accuracy over time.


156-173: Good error handling pattern.

Failed executions are properly recorded in the registry with duration and error details, enabling accurate statistics tracking. This ensures the success rate calculations remain accurate.

apps/server/src/services/beads-vibe-sync.ts (1)

108-152: LGTM!

The sync method properly handles the sync-in-progress guard, measures duration, emits appropriate events, and ensures isSyncing is reset in the finally block even on errors.

## Your Approach
1. **Read the Code**: Understand what it's supposed to do
2. **Identify Cases**: Happy path, edge cases, errors
3. **Write Clear Tests**: Descriptive names,Arrange-Act-Assert structure
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Minor typo: missing space.

-3. **Write Clear Tests**: Descriptive names,Arrange-Act-Assert structure
+3. **Write Clear Tests**: Descriptive names, Arrange-Act-Assert structure
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
3. **Write Clear Tests**: Descriptive names,Arrange-Act-Assert structure
3. **Write Clear Tests**: Descriptive names, Arrange-Act-Assert structure
🤖 Prompt for AI Agents
In apps/server/src/agents/agent-prompts.ts around line 161, there is a minor
typo: the phrase "Descriptive names,Arrange-Act-Assert structure" is missing a
space after the comma; update the text to "Descriptive names, Arrange-Act-Assert
structure" so the comma is followed by a space.

Comment on lines +354 to +360
analysis.prompt,
agentType,
AGENT_KEYWORDS[agentType]
);
const fileScore = calculateFileScore(analysis.filePatterns, agentType);
scores.set(agentType, keywordScore + fileScore);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Bug: calculateFileScore receives formatted strings instead of file paths.

analysis.filePatterns contains strings like "testing:/\.test\.(ts|js|tsx|jsx)$/i" (set at line 310), but calculateFileScore tests these against AGENT_FILE_PATTERNS regexes expecting actual file paths like "src/test.ts". This means file-based scoring will always return 0.

🔎 Proposed fix - Pass original filePaths to classifyTask

Option 1: Store original filePaths in TaskAnalysis and use those:

 export interface TaskAnalysis {
   // ... existing fields
+  /** Original file paths for scoring */
+  originalFilePaths?: string[];
 }

Option 2: Change calculateFileScore to parse the agentType from filePatterns:

 function calculateFileScore(filePatterns: string[], agentType: AgentType): number {
   if (!filePatterns || filePatterns.length === 0) {
     return 0;
   }

   let score = 0;
   for (const pattern of filePatterns) {
-    for (const regex of AGENT_FILE_PATTERNS[agentType]) {
-      if (regex.test(pattern)) {
-        score += 3;
-      }
-    }
+    // filePatterns format: "agentType:patternSource"
+    const [patternAgentType] = pattern.split(':');
+    if (patternAgentType === agentType) {
+      score += 3;
+    }
   }

   return score;
 }

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +61 to +66
// Search for .env in project root (parent of apps/server) for monorepo setup
dotenv.config({ path: path.resolve(process.cwd(), '../../.env') });
// Fallback to default .env locations
if (!process.env.CORS_ORIGIN) {
dotenv.config();
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

cat apps/server/package.json

Repository: 0xtsotsi/DevFlow

Length of output: 1695


🏁 Script executed:

cat apps/server/src/index.ts | head -80

Repository: 0xtsotsi/DevFlow

Length of output: 3865


🏁 Script executed:

# Check package.json in root
cat package.json | grep -A 5 "scripts"

Repository: 0xtsotsi/DevFlow

Length of output: 980


🏁 Script executed:

head -50 init.mjs

Repository: 0xtsotsi/DevFlow

Length of output: 1244


🏁 Script executed:

# Check if there are any github workflows or docs explaining server startup
find . -type f \( -name "*.md" -o -name "*.yml" -o -name "*.yaml" \) -path "*/.github/workflows/*" | head -5

Repository: 0xtsotsi/DevFlow

Length of output: 233


🏁 Script executed:

# Check for any docs about running the server
find . -name "README*" -o -name "SETUP*" | head -5

Repository: 0xtsotsi/DevFlow

Length of output: 186


🏁 Script executed:

grep -n "server\|spawn\|exec" init.mjs | head -30

Repository: 0xtsotsi/DevFlow

Length of output: 1640


🏁 Script executed:

# Get more context around server process spawn
rg -n "spawn.*server|exec.*server" apps/server init.mjs

Repository: 0xtsotsi/DevFlow

Length of output: 42


🏁 Script executed:

# Check the full init.mjs to find server startup
sed -n '1,150p' init.mjs | grep -A 5 -B 5 "server"

Repository: 0xtsotsi/DevFlow

Length of output: 260


🏁 Script executed:

# Check root package.json for dev:server script
grep -A 2 "dev:server" package.json

Repository: 0xtsotsi/DevFlow

Length of output: 777


🏁 Script executed:

# Get full init.mjs section around server startup
sed -n '340,380p' init.mjs

Repository: 0xtsotsi/DevFlow

Length of output: 1226


🏁 Script executed:

# Check what cwd context is used when spawning
sed -n '200,230p' init.mjs

Repository: 0xtsotsi/DevFlow

Length of output: 751


🏁 Script executed:

# Check if there's any issue tracking or discussion about dotenv loading
rg -i "dotenv|\.env" --type md | head -20

Repository: 0xtsotsi/DevFlow

Length of output: 2130


🏁 Script executed:

# Verify the actual behavior: check if CORS_ORIGIN is set in .env
grep -r "CORS_ORIGIN" . --include="*.env*" --include="*.md" | head -10

Repository: 0xtsotsi/DevFlow

Length of output: 996


🏁 Script executed:

# Check what the dotenv.config() fallback actually does by checking dotenv docs/behavior
# Also check if there are any CI/CD workflows that might show how server is started
cat .github/workflows/pr-check.yml | grep -A 10 -B 10 "server\|dotenv"

Repository: 0xtsotsi/DevFlow

Length of output: 42


🏁 Script executed:

# Verify the actual cwd when running tsx from workspace npm command
# Check if there's any documentation or code that explicitly sets cwd
rg -n "cwd.*apps/server|process.cwd" apps/server/src/index.ts

Repository: 0xtsotsi/DevFlow

Length of output: 130


🏁 Script executed:

# Check the tsconfig to understand compilation context
cat apps/server/tsconfig.json | head -30

Repository: 0xtsotsi/DevFlow

Length of output: 540


🏁 Script executed:

# Verify file existence - where is .env actually expected to be?
find . -maxdepth 2 -name ".env*" -type f

Repository: 0xtsotsi/DevFlow

Length of output: 74


🏁 Script executed:

# Verify the workspace behavior - check if npm workspace changes cwd
# Look at how the dev script actually runs by checking if there are any logs or traces
rg -n "process.cwd\|__dirname" apps/server/src/index.ts -A 2 -B 2

Repository: 0xtsotsi/DevFlow

Length of output: 42


🏁 Script executed:

# Check if the file is actually using import.meta or needs __dirname conversion
head -70 apps/server/src/index.ts | tail -20

Repository: 0xtsotsi/DevFlow

Length of output: 1083


🏁 Script executed:

# Verify if there are any other references to __dirname in the server code
rg -n "__dirname" apps/server/src/

Repository: 0xtsotsi/DevFlow

Length of output: 42


Replace process.cwd()-based path resolution with import.meta.url for reliable .env loading.

The code assumes process.cwd() is apps/server when resolving ../../.env, but the server is started from the project root. This causes the initial dotenv.config() call to resolve to an incorrect path (2 levels above the project root). While the fallback mechanism masks this issue in practice, it's unreliable and depends on CORS_ORIGIN being unset.

Use import.meta.url for reliable path resolution relative to the source file:

import { fileURLToPath } from 'url';
import { dirname } from 'path';

const __filename = fileURLToPath(import.meta.url);
const __dirname = dirname(__filename);

// Load from project root (3 levels up from apps/server/src/)
dotenv.config({ path: path.resolve(__dirname, '../../../.env') });

This eliminates the dependency on current working directory and ensures the correct .env file is loaded regardless of where the server process is started.

🤖 Prompt for AI Agents
In apps/server/src/index.ts around lines 61 to 66, replace the
process.cwd()-based dotenv path resolution with a resolution based on
import.meta.url: add imports for fileURLToPath from 'url' and dirname from
'path' near the top, compute __filename = fileURLToPath(import.meta.url) and
__dirname = dirname(__filename), then call dotenv.config({ path:
path.resolve(__dirname, '../../../.env') }) instead of using process.cwd(); keep
the fallback dotenv.config() call if you want the default behavior, but do not
rely on process.cwd() for the explicit project-root path.

Comment on lines +100 to +102
// Use timing-safe comparison to prevent timing attacks
return crypto.timingSafeEqual(Buffer.from(signature), Buffer.from(expected));
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Potential crash if signature lengths differ.

crypto.timingSafeEqual throws if the input buffers have different lengths. If GitHub's signature format changes or is malformed, this will crash the handler instead of returning false.

🔎 Proposed fix
   // Use timing-safe comparison to prevent timing attacks
+  const sigBuffer = Buffer.from(signature);
+  const expectedBuffer = Buffer.from(expected);
+  if (sigBuffer.length !== expectedBuffer.length) {
+    return false;
+  }
-  return crypto.timingSafeEqual(Buffer.from(signature), Buffer.from(expected));
+  return crypto.timingSafeEqual(sigBuffer, expectedBuffer);
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
// Use timing-safe comparison to prevent timing attacks
return crypto.timingSafeEqual(Buffer.from(signature), Buffer.from(expected));
}
// Use timing-safe comparison to prevent timing attacks
const sigBuffer = Buffer.from(signature);
const expectedBuffer = Buffer.from(expected);
if (sigBuffer.length !== expectedBuffer.length) {
return false;
}
return crypto.timingSafeEqual(sigBuffer, expectedBuffer);
}
🤖 Prompt for AI Agents
In apps/server/src/routes/github/routes/webhook.ts around lines 100 to 102, the
call to crypto.timingSafeEqual can throw when the input buffers differ in
length; update the logic to first verify both signature and expected values
produce buffers of equal length (after normalizing encoding and trimming
prefixes like "sha256=") and if lengths differ return false immediately,
otherwise call timingSafeEqual; ensure any malformed input is handled without
throwing so the handler returns false instead of crashing.

repo,
mergedBy: pr.user?.login,
mergedAt: pr.merged_at,
title: pr.title, // Will be added by accessing full PR data
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Reference to undefined property pr.title.

pr.title is referenced in eventData but title is not defined in the pull_request interface within GitHubWebhookPayload.

🔎 Proposed fix - add to interface
   pull_request?: {
     number: number;
+    title?: string;
     merged?: boolean;
     // ...
   };

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In apps/server/src/routes/github/routes/webhook.ts around line 134, the code
uses pr.title when building eventData but the pull_request shape on
GitHubWebhookPayload doesn't declare a title property; update the TypeScript
interface for pull_request in the GitHubWebhookPayload definition to include
title: string (or title?: string if it can be missing) so pr.title is typed, and
ensure any code that constructs pr comes from the full pull_request payload (or
use the already-provided full PR object) to guarantee title is populated.

Comment on lines +369 to +377
export function getBoardHygieneService(
vibeKanban: VibeKanbanClient,
config?: Partial<BoardHygieneConfig>
): BoardHygieneService {
if (!globalInstance) {
globalInstance = new BoardHygieneService(vibeKanban, config);
}
return globalInstance;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Singleton pattern ignores configuration updates.

The singleton implementation only respects the config parameter on the first call. Subsequent calls with different configurations will receive the existing instance with the original config, which could lead to unexpected behavior.

Consider one of these approaches:

  1. Remove singleton pattern: Let callers manage their own instances
  2. Update config on subsequent calls: Merge new config into existing instance
  3. Document the behavior: Clearly state that config is only used once and callers must use resetBoardHygieneService() to apply new config
🔎 Option 1: Remove singleton pattern
-/**
- * Create a singleton instance
- */
-let globalInstance: BoardHygieneService | null = null;
-
 export function getBoardHygieneService(
   vibeKanban: VibeKanbanClient,
   config?: Partial<BoardHygieneConfig>
 ): BoardHygieneService {
-  if (!globalInstance) {
-    globalInstance = new BoardHygieneService(vibeKanban, config);
-  }
-  return globalInstance;
+  return new BoardHygieneService(vibeKanban, config);
 }
-
-export function resetBoardHygieneService(): void {
-  globalInstance = null;
-}

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In apps/server/src/services/board-hygiene-service.ts around lines 369 to 377,
the singleton factory ignores subsequent config updates because it returns the
first-created instance unmodified; update behavior so callers' configs aren't
silently ignored. Fix by either removing the singleton and exporting a plain
constructor (or factory) so callers create instances with their desired config,
or modify getBoardHygieneService to merge the provided config into the existing
globalInstance (e.g., call a merge/updateConfig method on BoardHygieneService or
shallow-assign config into its config property) and document that
getBoardHygieneService mutates the singleton when called with a config; also add
or keep a resetBoardHygieneService() helper for tests or explicit
reinitialization.

Comment on lines +86 to +134
async checkPRStatus(prNumber: number): Promise<PRStatus> {
try {
// Use GitHub CLI to get CI status
const { stdout } = await execFileAsync('gh', [
'pr',
'checks',
String(prNumber),
'--json',
'name,status,conclusion,url,startedAt,completedAt',
]);

const checks = JSON.parse(stdout) as CICheckStatus[];

const completedChecks = checks.filter((c) => c.status === 'completed');
const totalChecks = checks.length;

const failed = completedChecks.some(
(c) => c.conclusion === 'failure' || c.conclusion === 'timed_out'
);
const passed =
totalChecks > 0 &&
completedChecks.length === totalChecks &&
completedChecks.every((c) => c.conclusion === 'success');
const pending =
totalChecks > completedChecks.length || checks.some((c) => c.status === 'in_progress');

return {
prNumber,
passed,
failed,
pending,
totalChecks,
completedChecks,
checks,
};
} catch (error) {
console.error(`[GitHubCI] Failed to check status for PR #${prNumber}:`, error);
// Return pending status on error to avoid false positives
return {
prNumber,
passed: false,
failed: false,
pending: true,
totalChecks: 0,
completedChecks: 0,
checks: [],
};
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

# First, read the full github-ci-service.ts file to understand the class structure
wc -l apps/server/src/services/github-ci-service.ts

Repository: 0xtsotsi/DevFlow

Length of output: 109


🏁 Script executed:

# Read the entire file to see the full context
cat -n apps/server/src/services/github-ci-service.ts

Repository: 0xtsotsi/DevFlow

Length of output: 12158


🏁 Script executed:

# Search for any gh CLI validation patterns in the codebase
rg -n "gh.*--version|which gh|command -v gh|execFileAsync.*gh" --type=ts apps/server/src/

Repository: 0xtsotsi/DevFlow

Length of output: 684


Add GitHub CLI availability validation to GitHubCIService.

The service calls gh CLI directly in checkPRStatus and mergePR without validating it's installed. While error handling logs failures and returns safe pending status, other parts of the codebase (e.g., apps/server/src/routes/setup/routes/gh-status.ts) perform explicit gh validation at startup. Add similar validation in the constructor or an initialization method to catch missing gh CLI early and provide clear warnings.

Comment on lines +1007 to +1014
for (const [taskId] of this.trackedTasks.entries()) {
try {
const task = await this.vibeKanban.getTask(taskId);

// Check if task description contains this PR number
const prPattern = new RegExp(`PR Created:? #?${data.prNumber}\\b`, 'i');
if (task.description && prPattern.test(task.description)) {
console.log(`[Orchestrator] Found task ${taskId} for PR #${data.prNumber}`);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

Validate PR number before using in RegExp.

While prNumber is typed as number, the payload comes from webhook events via payload as any cast. To guard against malformed payloads and address the static analysis warning (CWE-1333), validate or coerce the input before constructing the regex.

🔎 Proposed fix
   private async handlePRMerged(data: {
     prNumber: number;
     sha?: string;
     url?: string;
     repo?: string;
     branch?: string;
   }): Promise<void> {
+    // Validate prNumber is actually a number to prevent ReDoS
+    const prNum = Number(data.prNumber);
+    if (!Number.isFinite(prNum) || prNum <= 0) {
+      console.warn(`[Orchestrator] Invalid PR number: ${data.prNumber}`);
+      return;
+    }
+
     console.log(`[Orchestrator] Handling PR merged: #${data.prNumber}`);

     // Find tracked task with this PR number
     for (const [taskId] of this.trackedTasks.entries()) {
       try {
         const task = await this.vibeKanban.getTask(taskId);

         // Check if task description contains this PR number
-        const prPattern = new RegExp(`PR Created:? #?${data.prNumber}\\b`, 'i');
+        const prPattern = new RegExp(`PR Created:? #?${prNum}\\b`, 'i');
         if (task.description && prPattern.test(task.description)) {
🧰 Tools
🪛 ast-grep (0.40.3)

[warning] 1011-1011: Regular expression constructed from variable input detected. This can lead to Regular Expression Denial of Service (ReDoS) attacks if the variable contains malicious patterns. Use libraries like 'recheck' to validate regex safety or use static patterns.
Context: new RegExp(PR Created:? #?${data.prNumber}\\b, 'i')
Note: [CWE-1333] Inefficient Regular Expression Complexity [REFERENCES]
- https://owasp.org/www-community/attacks/Regular_expression_Denial_of_Service_-_ReDoS
- https://cwe.mitre.org/data/definitions/1333.html

(regexp-from-variable)

🤖 Prompt for AI Agents
In apps/server/src/services/orchestrator-service.ts around lines 1007 to 1014,
the code constructs a RegExp using data.prNumber from an untrusted webhook
payload; validate and coerce that value before using it in a regex. Convert the
incoming value to a safe numeric string (e.g. parseInt/Number and confirm it's a
finite integer), bail out or skip this task if validation fails, then build the
regex from the validated string (or escape it if you choose to treat as string)
and only then test task.description; this prevents malformed payloads and
addresses the static analysis warning.

Comment on lines +1150 to +1157
private findTaskByPRNumber(prNumber: number): string | null {
for (const [taskId, tracked] of this.trackedTasks.entries()) {
if (tracked.metadata.prNumber === prNumber) {
return taskId;
}
}
return null;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

findTaskByPRNumber relies on metadata.prNumber which is never set.

This helper searches tracked.metadata.prNumber, but the PR number is never stored in metadata when a PR is created (line 618 only updates appendToDescription). This means findTaskByPRNumber will always return null, and handleCIPassed/handleCIFailed will silently skip tasks.

🔎 Proposed fix - Store PR number in metadata when created
           if (result.valid) {
             await this.vibeKanban.updateTask(task.id, {
               status: 'pr_created',
               appendToDescription: `\n\n**PR Created:** #${prNumber}`,
             });
-            this.updateTaskState(task.id, 'pr_created', 'monitoring_pr');
+            // Store PR number in metadata for lookup
+            const tracked = this.trackedTasks.get(task.id);
+            if (tracked) {
+              tracked.metadata.prNumber = prNumber;
+            }
+            this.updateTaskState(task.id, 'pr_created', 'monitoring_pr');

Committable suggestion skipped: line range outside the PR's diff.

Comment on lines +215 to +228
private async flush(): Promise<void> {
try {
const records = Array.from(this.states.values());
const dir = path.dirname(this.dbPath);

// Ensure directory exists
await fs.mkdir(dir, { recursive: true });

// Write to file
await fs.writeFile(this.dbPath, JSON.stringify(records, null, 2));
} catch (error) {
console.error('[TaskState] Failed to save state:', error);
}
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion | 🟠 Major

Consider atomic writes to prevent data corruption.

The flush method writes directly to the state file without using an atomic write pattern (write to temporary file + rename). If the process crashes or is killed during write, the state file could be corrupted.

🔎 Proposed atomic write pattern
 private async flush(): Promise<void> {
   try {
     const records = Array.from(this.states.values());
     const dir = path.dirname(this.dbPath);

     // Ensure directory exists
     await fs.mkdir(dir, { recursive: true });

-    // Write to file
-    await fs.writeFile(this.dbPath, JSON.stringify(records, null, 2));
+    // Write to temp file first, then rename (atomic operation)
+    const tempPath = `${this.dbPath}.tmp`;
+    await fs.writeFile(tempPath, JSON.stringify(records, null, 2));
+    await fs.rename(tempPath, this.dbPath);
   } catch (error) {
     console.error('[TaskState] Failed to save state:', error);
   }
 }
🤖 Prompt for AI Agents
In apps/server/src/services/task-state-service.ts around lines 215 to 228, the
flush method currently writes directly to the state file which can corrupt data
if the process crashes mid-write; change it to perform an atomic write by
writing JSON to a temporary file in the same directory (use a unique name like
dbPath + `.tmp-<pid>-<timestamp>`), fsync the temp file (or use
fs.promises.writeFile then open+sync if needed), then rename the temp file to
the final dbPath (fs.rename) to atomically replace the file; make sure to still
create the directory as you do now, handle and log errors, and clean up the temp
file on failure.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces a significant amount of new functionality, including specialized agents, Beads orchestration, and VibeKanban integration. The overall structure is well-designed and modular. I've identified a few areas for improvement, mainly around performance and correctness in the agent configuration and tool handling. Specifically, I've suggested memoizing agent configurations to avoid redundant computations and correcting the logic for determining an agent's allowed tools. I've also pointed out a couple of minor type-safety issues and a potential bug in how telemetry events are handled. Overall, this is a solid contribution that greatly expands the capabilities of the system.

Comment on lines +84 to +103
getAgentsWithTool(toolName: string): AgentType[] {
const result: AgentType[] = [];

for (const [agentType, entry] of this.agents.entries()) {
const { allowedTools } = entry.config;

// If no tools are restricted, agent can use any tool
if (!allowedTools || allowedTools.length === 0) {
result.push(agentType);
continue;
}

// Check if tool is in allowed list
if (allowedTools.includes(toolName)) {
result.push(agentType);
}
}

return result;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The getAgentsWithTool method checks entry.config.allowedTools, but this property is never populated in the agent configurations defined in agent-prompts.ts. The tools an agent can use are defined within entry.config.capabilities. The logic should be updated to check for the tool within the agent's capabilities.

  getAgentsWithTool(toolName: string): AgentType[] {
    const result: AgentType[] = [];

    for (const [agentType, entry] of this.agents.entries()) {
      // An agent can use a tool if any of its capabilities include the tool.
      const canUseTool = entry.config.capabilities.some((capability) =>
        capability.tools.includes(toolName)
      );

      if (canUseTool) {
        result.push(agentType);
      }
    }

    return result;
  }

Comment on lines +94 to +99
// Merge agent-specific tools with any global restrictions
let allowedTools = agentConfig.allowedTools;
if (!allowedTools || allowedTools.length === 0) {
// No tool restrictions for this agent
allowedTools = undefined;
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The allowedTools for an agent are not being correctly determined. The agentConfig.allowedTools property is never set in the configurations. The tools an agent can use are defined within its capabilities. You should collect all tools from the agent's capabilities and pass them to the provider. This ensures that agents are restricted to using only the tools they are designed for.

    // Collect all tools from the agent's capabilities
    const allTools = agentConfig.capabilities.flatMap((cap) => cap.tools);
    const allowedTools = [...new Set(allTools)];

Comment on lines +721 to +820
export function getAgentConfigurations(): Record<AgentType, AgentConfig> {
const configs: Record<AgentType, AgentConfig> = {
[AgentType.PLANNING]: {
type: AgentType.PLANNING,
name: 'Planning Agent',
description: 'Creates specifications and breaks down features into tasks',
systemPrompt: AGENT_SYSTEM_PROMPTS[AgentType.PLANNING],
defaultMaxTurns: 50,
capabilities: AGENT_CAPABILITIES[AgentType.PLANNING],
autoSelectable: true,
priority: 10,
},

[AgentType.IMPLEMENTATION]: {
type: AgentType.IMPLEMENTATION,
name: 'Implementation Agent',
description: 'Writes code and implements features',
systemPrompt: AGENT_SYSTEM_PROMPTS[AgentType.IMPLEMENTATION],
defaultMaxTurns: 100,
capabilities: AGENT_CAPABILITIES[AgentType.IMPLEMENTATION],
autoSelectable: true,
priority: 8,
},

[AgentType.TESTING]: {
type: AgentType.TESTING,
name: 'Testing Agent',
description: 'Writes tests and verifies functionality',
systemPrompt: AGENT_SYSTEM_PROMPTS[AgentType.TESTING],
defaultMaxTurns: 75,
capabilities: AGENT_CAPABILITIES[AgentType.TESTING],
autoSelectable: true,
priority: 7,
},

[AgentType.REVIEW]: {
type: AgentType.REVIEW,
name: 'Review Agent',
description: 'Reviews code for quality, security, and best practices',
systemPrompt: AGENT_SYSTEM_PROMPTS[AgentType.REVIEW],
defaultMaxTurns: 50,
capabilities: AGENT_CAPABILITIES[AgentType.REVIEW],
autoSelectable: true,
priority: 5,
},

[AgentType.DEBUG]: {
type: AgentType.DEBUG,
name: 'Debug Agent',
description: 'Diagnoses and fixes bugs',
systemPrompt: AGENT_SYSTEM_PROMPTS[AgentType.DEBUG],
defaultMaxTurns: 75,
capabilities: AGENT_CAPABILITIES[AgentType.DEBUG],
autoSelectable: true,
priority: 9,
},

[AgentType.DOCUMENTATION]: {
type: AgentType.DOCUMENTATION,
name: 'Documentation Agent',
description: 'Writes and updates documentation',
systemPrompt: AGENT_SYSTEM_PROMPTS[AgentType.DOCUMENTATION],
defaultMaxTurns: 50,
capabilities: AGENT_CAPABILITIES[AgentType.DOCUMENTATION],
autoSelectable: true,
priority: 4,
},

[AgentType.REFACTORING]: {
type: AgentType.REFACTORING,
name: 'Refactoring Agent',
description: 'Improves code structure and maintainability',
systemPrompt: AGENT_SYSTEM_PROMPTS[AgentType.REFACTORING],
defaultMaxTurns: 75,
capabilities: AGENT_CAPABILITIES[AgentType.REFACTORING],
autoSelectable: true,
priority: 6,
},

[AgentType.GENERIC]: {
type: AgentType.GENERIC,
name: 'Generic Agent',
description: 'Handles general-purpose tasks',
systemPrompt: AGENT_SYSTEM_PROMPTS[AgentType.GENERIC],
defaultMaxTurns: 50,
capabilities: AGENT_CAPABILITIES[AgentType.GENERIC],
autoSelectable: true,
priority: 1,
},
};

return configs;
}

/**
* Get agent configuration by type
*/
export function getAgentConfiguration(type: AgentType): AgentConfig {
return getAgentConfigurations()[type];
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The getAgentConfigurations function re-creates a large configuration object on every call. This is inefficient, especially since getAgentConfiguration calls it every time. It's better to create this configuration object once and reuse it.

I've refactored this to create the configuration object once as a constant. The functions now just return the constant or a property from it, which is much more performant.

export const AGENT_CONFIGURATIONS: Record<AgentType, AgentConfig> = {
  [AgentType.PLANNING]: {
    type: AgentType.PLANNING,
    name: 'Planning Agent',
    description: 'Creates specifications and breaks down features into tasks',
    systemPrompt: AGENT_SYSTEM_PROMPTS[AgentType.PLANNING],
    defaultMaxTurns: 50,
    capabilities: AGENT_CAPABILITIES[AgentType.PLANNING],
    autoSelectable: true,
    priority: 10,
  },

  [AgentType.IMPLEMENTATION]: {
    type: AgentType.IMPLEMENTATION,
    name: 'Implementation Agent',
    description: 'Writes code and implements features',
    systemPrompt: AGENT_SYSTEM_PROMPTS[AgentType.IMPLEMENTATION],
    defaultMaxTurns: 100,
    capabilities: AGENT_CAPABILITIES[AgentType.IMPLEMENTATION],
    autoSelectable: true,
    priority: 8,
  },

  [AgentType.TESTING]: {
    type: AgentType.TESTING,
    name: 'Testing Agent',
    description: 'Writes tests and verifies functionality',
    systemPrompt: AGENT_SYSTEM_PROMPTS[AgentType.TESTING],
    defaultMaxTurns: 75,
    capabilities: AGENT_CAPABILITIES[AgentType.TESTING],
    autoSelectable: true,
    priority: 7,
  },

  [AgentType.REVIEW]: {
    type: AgentType.REVIEW,
    name: 'Review Agent',
    description: 'Reviews code for quality, security, and best practices',
    systemPrompt: AGENT_SYSTEM_PROMPTS[AgentType.REVIEW],
    defaultMaxTurns: 50,
    capabilities: AGENT_CAPABILITIES[AgentType.REVIEW],
    autoSelectable: true,
    priority: 5,
  },

  [AgentType.DEBUG]: {
    type: AgentType.DEBUG,
    name: 'Debug Agent',
    description: 'Diagnoses and fixes bugs',
    systemPrompt: AGENT_SYSTEM_PROMPTS[AgentType.DEBUG],
    defaultMaxTurns: 75,
    capabilities: AGENT_CAPABILITIES[AgentType.DEBUG],
    autoSelectable: true,
    priority: 9,
  },

  [AgentType.DOCUMENTATION]: {
    type: AgentType.DOCUMENTATION,
    name: 'Documentation Agent',
    description: 'Writes and updates documentation',
    systemPrompt: AGENT_SYSTEM_PROMPTS[AgentType.DOCUMENTATION],
    defaultMaxTurns: 50,
    capabilities: AGENT_CAPABILITIES[AgentType.DOCUMENTATION],
    autoSelectable: true,
    priority: 4,
  },

  [AgentType.REFACTORING]: {
    type: AgentType.REFACTORING,
    name: 'Refactoring Agent',
    description: 'Improves code structure and maintainability',
    systemPrompt: AGENT_SYSTEM_PROMPTS[AgentType.REFACTORING],
    defaultMaxTurns: 75,
    capabilities: AGENT_CAPABILITIES[AgentType.REFACTORING],
    autoSelectable: true,
    priority: 6,
  },

  [AgentType.GENERIC]: {
    type: AgentType.GENERIC,
    name: 'Generic Agent',
    description: 'Handles general-purpose tasks',
    systemPrompt: AGENT_SYSTEM_PROMPTS[AgentType.GENERIC],
    defaultMaxTurns: 50,
    capabilities: AGENT_CAPABILITIES[AgentType.GENERIC],
    autoSelectable: true,
    priority: 1,
  },
};

/**
 * Get agent configuration for all agent types
 */
export function getAgentConfigurations(): Record<AgentType, AgentConfig> {
  return AGENT_CONFIGURATIONS;
}

/**
 * Get agent configuration by type
 */
export function getAgentConfiguration(type: AgentType): AgentConfig {
  return AGENT_CONFIGURATIONS[type];
}

export interface GitHubWebhookPayload {
action?: string;
pull_request?: {
number: number;

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The pull_request object in the GitHubWebhookPayload interface is missing the title property. The GitHub webhook payload for pull request events includes the title, and it's being used in handlePullRequestEvent (line 134). Adding it to the interface will improve type safety.

Suggested change
number: number;
number: number;
title?: string;

Comment on lines +630 to 633
} catch {
// File doesn't exist yet, that's okay
this.events = [];
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Catching a generic error can hide specific issues. It's better to catch specific error types if possible, or at least inspect the error. Here, a file-not-found error is expected, but other errors (like permission errors) would be silenced. While secureFs might not throw typed errors, you could check error.code if it's a standard Node.js fs error.

Comment on lines +644 to 647
} catch {
// File doesn't exist yet, that's okay
this.sessions = new Map();
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

Catching a generic error can hide specific issues. It's better to catch specific error types if possible, or at least inspect the error. Here, a file-not-found error is expected, but other errors (like permission errors) would be silenced. While secureFs might not throw typed errors, you could check error.code if it's a standard Node.js fs error.

@0xtsotsi
Copy link
Owner Author

This PR has been superseded. The same work was already merged via:

All three milestones (M2, M4, M5) are now in the main branch. Closing this duplicate PR to avoid merge conflicts.

@0xtsotsi 0xtsotsi closed this Dec 29, 2025
@0xtsotsi 0xtsotsi deleted the vk/b41a-m3-t4-vibe-kanba branch December 29, 2025 09:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant