refactor: replace deterministic supervisor lifecycle with AI-first decision engine #2206
Conversation
refactor: replace deterministic supervisor lifecycle with AI-first decision engine

Remove ~1,000 lines of deterministic shell logic (fast_path_decision, Phase 3b2 reconciliation, Phase 3c/3d/3.5/3.6, process_post_pr_lifecycle) that prevented the supervisor from solving problems autonomously.

New architecture: GATHER -> DECIDE -> EXECUTE
- gather_task_state(): collects facts from DB, GitHub, git (pure data)
- ai_decide(): sends state to opus model, gets JSON action back
- execute_action(): runs what the AI decided; complex work dispatches interactive AI workers with full tool access

Key changes:
- AI model (opus) makes ALL lifecycle decisions, no deterministic shortcuts
- New actions: resolve_conflicts, fix_ci, fix_and_push dispatch interactive AI workers that can read code, understand context, and fix problems
- Removed SUPERVISOR_AI_LIFECYCLE toggle (always AI-first now)
- Removed Phase 3b2-3.6 deterministic reconciliation/rebase/escalation
- process_post_pr_lifecycle deprecated (redirects to process_ai_lifecycle)
- Phase 4b2 stale pr_review and 4d stuck deploying simplified
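The GATHER -> DECIDE -> EXECUTE flow described above can be sketched as a plain shell loop. This is a hypothetical, self-contained illustration: gather_task_state, ai_decide, and execute_action are stubbed with canned data here, standing in for the real DB/GitHub/model calls in ai-lifecycle.sh.

```shell
#!/usr/bin/env bash
# Minimal sketch of the GATHER -> DECIDE -> EXECUTE loop (stubbed, runnable).
set -euo pipefail

gather_task_state() {  # GATHER: pure data, no decisions
    printf '{"task_id":"%s","pr_state":"OPEN","ci":"failed:1"}' "$1"
}

ai_decide() {  # DECIDE: in the real engine this sends the state to the opus model
    printf '{"action":"fix_ci","reason":"one failing check"}'
}

execute_action() {  # EXECUTE: simple actions inline, complex ones dispatch workers
    local action
    action=$(printf '%s' "$1" | sed 's/.*"action":"\([^"]*\)".*/\1/')
    printf 'executing %s\n' "$action"
}

for task_id in T-101 T-102; do
    state=$(gather_task_state "$task_id")
    decision=$(ai_decide "$state")
    execute_action "$decision"
done
```

The point of the split is that only ai_decide() holds policy; the gather and execute stages stay mechanical.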
Walkthrough

The pull request restructures the supervisor's task lifecycle orchestration from deterministic branching with hard-coded gates to a fully AI-driven decision engine. It centralizes phase 3 flows into a unified

Changes
Sequence Diagram

```mermaid
sequenceDiagram
    participant Pulse as Pulse Orchestrator
    participant AI as AI Lifecycle Engine
    participant TaskState as Task State Gatherer
    participant AIModel as AI Decision Model
    participant ActionExec as Action Executor
    participant Worker as AI Worker Dispatcher
    Pulse->>AI: process_ai_lifecycle(tasks)
    loop For each task
        AI->>TaskState: gather_task_state(task_id)
        TaskState->>TaskState: enrich with worker status,<br/>CI summary, PR metadata
        TaskState-->>AI: structured task state
        AI->>AIModel: ai_decide(task_state, task_id)
        AIModel->>AIModel: format prompt with<br/>CURRENT STATE &<br/>AVAILABLE ACTIONS
        AIModel-->>AI: action, reason, status_tag (JSON)
        AI->>ActionExec: execute_action(task_id, action,<br/>reason, status_tag)
        alt action == deploy or merge or complex
            ActionExec->>Worker: _dispatch_ai_worker(task_id,<br/>action_type, repo, worktree)
            Worker-->>ActionExec: worker PID & health tracking
        else action == simple (wait, cancel, retry)
            ActionExec->>ActionExec: execute inline
        end
        ActionExec->>ActionExec: update TODO.md status tags,<br/>commit changes
        ActionExec-->>AI: action result
    end
    AI-->>Pulse: all tasks processed
```
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~75 minutes
🚥 Pre-merge checks: ✅ 3 passed
Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request fundamentally rearchitects the supervisor's task lifecycle management by shifting from a rule-based, deterministic system to an AI-driven decision engine. The change aims to enhance the supervisor's autonomy and problem-solving capabilities, allowing it to dynamically respond to various task states and issues without relying on hardcoded heuristics. This refactor streamlines the codebase and empowers the system to handle complex scenarios more intelligently.
🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report
[INFO] Latest Quality Status:
[INFO] Recent monitoring activity:

📈 Current Quality Metrics
Generated on: Tue Feb 24 12:40:47 UTC 2026
Generated by AI DevOps Framework Code Review Monitoring
Code Review
This is an impressive and significant refactoring that replaces a complex, deterministic shell-based lifecycle engine with a much cleaner and more powerful AI-first architecture. The GATHER -> DECIDE -> EXECUTE pattern is well-implemented, and delegating all decisions to the AI model while using shell scripts for data gathering and execution is a solid design. The introduction of specialized AI workers for complex tasks like conflict resolution and CI fixes is a major step forward in autonomous problem-solving. My review comments focus on improving the robustness and debuggability of the new implementation by adhering to the repository's general rules, specifically around not suppressing stderr for critical command-line tools. This will ensure that any underlying issues with gh, jq, or the AI CLIs are visible in the logs, which is crucial for maintaining such a system.
```diff
 if [[ "$ai_cli" == "opencode" ]]; then
-    ai_result=$(portable_timeout "$ai_timeout" opencode run \
+    ai_result=$(portable_timeout "$AI_LIFECYCLE_TIMEOUT" opencode run \
         -m "$ai_model" \
         --format default \
-        --title "lifecycle-decision-$$" \
+        --title "lifecycle-${task_id}-$$" \
         "$prompt" 2>/dev/null || echo "")
     # Strip ANSI codes
     ai_result=$(printf '%s' "$ai_result" | sed 's/\x1b\[[0-9;]*[mGKHF]//g; s/\x1b\[[0-9;]*[A-Za-z]//g; s/\x1b\]//g; s/\x07//g')
 else
     local claude_model="${ai_model#*/}"
-    ai_result=$(portable_timeout "$ai_timeout" claude \
+    ai_result=$(portable_timeout "$AI_LIFECYCLE_TIMEOUT" claude \
         -p "$prompt" \
         --model "$claude_model" \
         --output-format text 2>/dev/null || echo "")
 fi
```
The calls to the AI CLIs (opencode and claude) are suppressing stderr using 2>/dev/null. This is highly risky as it will hide critical errors such as authentication failures (e.g., missing API keys), network issues, or problems with the model endpoint itself. According to the repository's general rules, blanket error suppression should be avoided to maintain debuggability. If the AI decision engine fails silently, the entire lifecycle process will be compromised. Please remove the 2>/dev/null to ensure any errors from the AI CLIs are logged.
Suggested change:

```shell
if [[ "$ai_cli" == "opencode" ]]; then
    ai_result=$(portable_timeout "$AI_LIFECYCLE_TIMEOUT" opencode run \
        -m "$ai_model" \
        --format default \
        --title "lifecycle-${task_id}-$$" \
        "$prompt" || echo "")
    # Strip ANSI codes
    ai_result=$(printf '%s' "$ai_result" | sed 's/\x1b\[[0-9;]*[mGKHF]//g; s/\x1b\[[0-9;]*[A-Za-z]//g; s/\x1b\]//g; s/\x07//g')
else
    local claude_model="${ai_model#*/}"
    ai_result=$(portable_timeout "$AI_LIFECYCLE_TIMEOUT" claude \
        -p "$prompt" \
        --model "$claude_model" \
        --output-format text || echo "")
fi
```
References
- Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.
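An alternative to deleting the fallback entirely is to keep the `|| echo ""` guard but send stderr to the supervisor log instead of /dev/null. A minimal sketch, reusing the SUPERVISOR_LOG name the PR already appends to; the `sh -c` commands below stand in for the real opencode/claude invocations:

```shell
# Route CLI stderr to the supervisor log instead of discarding it.
SUPERVISOR_LOG=$(mktemp)

run_with_logged_stderr() {
    # stdout is captured by the caller; stderr is appended to the log;
    # a non-zero exit still degrades to an empty result
    "$@" 2>>"$SUPERVISOR_LOG" || echo ""
}

ok=$(run_with_logged_stderr sh -c 'echo decision')
bad=$(run_with_logged_stderr sh -c 'echo "auth error: missing API key" >&2; exit 1')

echo "ok=${ok} bad=${bad}"
cat "$SUPERVISOR_LOG"
```

This keeps the engine resilient to CLI failures while leaving authentication and network errors visible in the logs.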
```shell
base_ref=$(gh pr view "$pr_number" --repo "$pr_repo_slug" \
    --json baseRefName --jq '.baseRefName' 2>/dev/null) || base_ref=""
```
The gh pr view command is suppressing stderr with 2>/dev/null. This can hide important errors related to authentication, network connectivity, or the PR not being found. Per the repository's general rules, stderr should not be suppressed for such commands to aid in debugging. Please remove the 2>/dev/null.
Suggested change:

```shell
base_ref=$(gh pr view "$pr_number" --repo "$pr_repo_slug" \
    --json baseRefName --jq '.baseRefName') || base_ref=""
```
References
- Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.
```shell
base_ref=$(gh pr view "$tpr" --repo "$(detect_repo_slug "$trepo" 2>/dev/null || echo "")" \
    --json baseRefName --jq '.baseRefName' 2>/dev/null) || base_ref="main"
```
The gh pr view command here suppresses stderr via 2>/dev/null. This is problematic as it can hide errors from the gh CLI, such as authentication failures or if the PR URL is invalid. The repository's general rules advise against suppressing stderr for such commands to ensure errors are visible for debugging. Please remove 2>/dev/null.
Suggested change:

```shell
base_ref=$(gh pr view "$tpr" --repo "$(detect_repo_slug "$trepo" 2>/dev/null || echo "")" \
    --json baseRefName --jq '.baseRefName') || base_ref="main"
```
References
- Avoid using '2>/dev/null' for blanket suppression of command errors in shell scripts to ensure that authentication, syntax, or system issues remain visible for debugging.
```diff
 pr_state=$(printf '%s' "$pr_json" | jq -r '.state // "UNKNOWN"' 2>/dev/null || echo "UNKNOWN")
 pr_merge_state=$(printf '%s' "$pr_json" | jq -r '.mergeStateStatus // "UNKNOWN"' 2>/dev/null || echo "UNKNOWN")
-pr_review_decision=$(printf '%s' "$pr_json" | jq -r '.reviewDecision // "NONE"' 2>/dev/null || echo "NONE")
 pr_base_ref=$(printf '%s' "$pr_json" | jq -r '.baseRefName // "main"' 2>/dev/null || echo "main")

 # Retry once if UNKNOWN (GitHub lazy-loads mergeStateStatus)
 if [[ "$pr_merge_state" == "UNKNOWN" ]]; then
     sleep 2
     local retry_json
     retry_json=$(gh pr view "$pr_number" --repo "$pr_repo_slug" \
         --json mergeable,mergeStateStatus 2>/dev/null || echo "")
     if [[ -n "$retry_json" ]]; then
         pr_merge_state=$(printf '%s' "$retry_json" | jq -r '.mergeStateStatus // "UNKNOWN"' 2>/dev/null || echo "UNKNOWN")
     fi
 local is_draft
 is_draft=$(printf '%s' "$pr_json" | jq -r '.isDraft // false' 2>/dev/null || echo "false")
 if [[ "$is_draft" == "true" ]]; then
     pr_state="DRAFT"
 fi

+pr_review_decision=$(printf '%s' "$pr_json" | jq -r '.reviewDecision // "NONE"' 2>/dev/null || echo "NONE")

-# Summarize CI status
+# CI summary
 local check_rollup
 check_rollup=$(printf '%s' "$pr_json" | jq -r '.statusCheckRollup // []' 2>/dev/null || echo "[]")
 if [[ "$check_rollup" != "[]" && "$check_rollup" != "null" ]]; then
-    local pending failed passed
+    local pending failed passed total
     pending=$(printf '%s' "$check_rollup" | jq '[.[] | select(.status == "IN_PROGRESS" or .status == "QUEUED" or .status == "PENDING")] | length' 2>/dev/null || echo "0")
     failed=$(printf '%s' "$check_rollup" | jq '[.[] | select((.conclusion | test("FAILURE|TIMED_OUT|ACTION_REQUIRED")) or .state == "FAILURE" or .state == "ERROR")] | length' 2>/dev/null || echo "0")
     passed=$(printf '%s' "$check_rollup" | jq '[.[] | select(.conclusion == "SUCCESS" or .state == "SUCCESS")] | length' 2>/dev/null || echo "0")
-    pr_ci_status="passed:${passed} failed:${failed} pending:${pending}"
-
-    # Extract names of failed checks for fix_ci routing
-    local failed_check_names
-    failed_check_names=$(printf '%s' "$check_rollup" | jq -r '[.[] | select((.conclusion | test("FAILURE|TIMED_OUT|ACTION_REQUIRED")) or .state == "FAILURE" or .state == "ERROR") | .name] | join(",")' 2>/dev/null || echo "")
-    if [[ -n "$failed_check_names" ]]; then
-        pr_ci_failed_checks="$failed_check_names"
+    total=$(printf '%s' "$check_rollup" | jq 'length' 2>/dev/null || echo "0")
+    pr_ci_summary="total:${total} passed:${passed} failed:${failed} pending:${pending}"
+
+    # Names of failed checks
+    local failed_names
+    failed_names=$(printf '%s' "$check_rollup" | jq -r '[.[] | select((.conclusion | test("FAILURE|TIMED_OUT|ACTION_REQUIRED")) or .state == "FAILURE" or .state == "ERROR") | .name] | join(", ")' 2>/dev/null || echo "")
+    if [[ -n "$failed_names" ]]; then
+        pr_ci_failed_names="$failed_names"
     fi
```
Throughout this block, jq is called with 2>/dev/null, which suppresses standard error. According to the repository's general rules, stderr should not be suppressed for commands like jq to ensure that syntax or system errors are visible for debugging. While the || echo ... provides a fallback, hiding the actual error from jq makes it harder to diagnose issues with the JSON processing logic or malformed input from the gh command.
Please remove 2>/dev/null from these jq calls.
```shell
pr_state=$(printf '%s' "$pr_json" | jq -r '.state // "UNKNOWN"' || echo "UNKNOWN")
pr_merge_state=$(printf '%s' "$pr_json" | jq -r '.mergeStateStatus // "UNKNOWN"' || echo "UNKNOWN")
pr_review_decision=$(printf '%s' "$pr_json" | jq -r '.reviewDecision // "NONE"' || echo "NONE")
pr_base_ref=$(printf '%s' "$pr_json" | jq -r '.baseRefName // "main"' || echo "main")
local is_draft
is_draft=$(printf '%s' "$pr_json" | jq -r '.isDraft // false' || echo "false")
if [[ "$is_draft" == "true" ]]; then
    pr_state="DRAFT"
fi
# CI summary
local check_rollup
check_rollup=$(printf '%s' "$pr_json" | jq -r '.statusCheckRollup // []' || echo "[]")
if [[ "$check_rollup" != "[]" && "$check_rollup" != "null" ]]; then
    local pending failed passed total
    pending=$(printf '%s' "$check_rollup" | jq '[.[] | select(.status == "IN_PROGRESS" or .status == "QUEUED" or .status == "PENDING")] | length' || echo "0")
    failed=$(printf '%s' "$check_rollup" | jq '[.[] | select((.conclusion | test("FAILURE|TIMED_OUT|ACTION_REQUIRED")) or .state == "FAILURE" or .state == "ERROR")] | length' || echo "0")
    passed=$(printf '%s' "$check_rollup" | jq '[.[] | select(.conclusion == "SUCCESS" or .state == "SUCCESS")] | length' || echo "0")
    total=$(printf '%s' "$check_rollup" | jq 'length' || echo "0")
    pr_ci_summary="total:${total} passed:${passed} failed:${failed} pending:${pending}"
    # Names of failed checks
    local failed_names
    failed_names=$(printf '%s' "$check_rollup" | jq -r '[.[] | select((.conclusion | test("FAILURE|TIMED_OUT|ACTION_REQUIRED")) or .state == "FAILURE" or .state == "ERROR") | .name] | join(", ")' || echo "")
    if [[ -n "$failed_names" ]]; then
        pr_ci_failed_names="$failed_names"
    fi
```

References
- In shell scripts with 'set -e' enabled, use '|| true' to prevent the script from exiting when a command like 'jq' fails on an optional lookup. Do not suppress stderr with '2>/dev/null' so that actual syntax or system errors remain visible for debugging.
```shell
# Validate required fields
local action
action=$(printf '%s' "$json_block" | jq -r '.action // ""' 2>/dev/null || echo "")
```
This jq call suppresses stderr using 2>/dev/null, which violates the repository's general rule about not hiding errors from commands. If the JSON block is malformed, this will fail silently, making it harder to debug why an action wasn't parsed correctly. Please remove 2>/dev/null to allow potential jq errors to be logged.
Suggested change:

```shell
action=$(printf '%s' "$json_block" | jq -r '.action // ""' || echo "")
```
References
- In shell scripts with 'set -e' enabled, use '|| true' to prevent the script from exiting when a command like 'jq' fails on an optional lookup. Do not suppress stderr with '2>/dev/null' so that actual syntax or system errors remain visible for debugging.
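The required-field check this comment targets can be made an explicit validation step. A hedged sketch of that contract: the field names (action, reason, status_tag) follow the sequence diagram, and json_field below is a sed-based stand-in for the jq lookup so the example is self-contained.

```shell
# Extract a string field from a flat JSON object (stand-in for jq -r '.field').
json_field() {
    printf '%s' "$1" | sed -n "s/.*\"$2\":\"\([^\"]*\)\".*/\1/p"
}

# Reject decisions that are missing the mandatory action field.
validate_decision() {
    local json_block="$1" action
    action=$(json_field "$json_block" action)
    if [ -z "$action" ]; then
        echo "invalid decision: missing action" >&2
        return 1
    fi
    printf '%s\n' "$action"
}

validate_decision '{"action":"fix_ci","reason":"lint check failed","status_tag":"ci-failed"}'
validate_decision '{"reason":"no action field"}' || echo "rejected"
```

Failing loudly on a malformed decision, rather than silently falling back to an empty action, keeps bad model output from being executed as a no-op.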
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.agents/scripts/supervisor/ai-lifecycle.sh:
- Around line 397-406: The rebase_branch case only increments rebase_attempts on
success; move or duplicate the increment logic so rebase_attempts is increased
on every attempt (success or failure). Specifically, ensure the db update that
reads current_attempts and writes rebase_attempts = $((current_attempts + 1))
(the commands using current_attempts, db "$SUPERVISOR_DB" "SELECT ...", and db
"$SUPERVISOR_DB" "UPDATE ...") runs regardless of whether rebase_sibling_pr
succeeds — e.g., perform the SELECT/UPDATE immediately after calling
rebase_sibling_pr (or in a finally-style block) before returning so the counter
always increments.
- Around line 122-131: The current check uses kill -0 on tpid which can be in
remote:host:pid form and will incorrectly report remote workers as dead; update
the block that sets worker_alive to detect remote PID formats (check if tpid
contains ':' or matches a remote pattern) before calling kill -0. If tpid looks
remote (e.g., contains two colons or matches remote:host:pid), set worker_alive
to a remote indicator like "remote (tpid)" or extract the real PID after the
last ':' and handle accordingly; only run kill -0 when tpid is a plain numeric
PID. Ensure changes are applied around the tpid / worker_alive logic that
currently does kill -0.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
- .agents/scripts/supervisor/ai-lifecycle.sh
- .agents/scripts/supervisor/pulse.sh
```shell
# Worker process state
local worker_alive="unknown"
if [[ -n "$tpid" && "$tpid" != "0" ]]; then
    if kill -0 "$tpid" 2>/dev/null; then
        worker_alive="yes"
    else
        worker_alive="no (PID $tpid dead)"
    fi
else
    worker_alive="no worker"
```
Handle remote worker PID formats to avoid premature AI actions.
worker_pid can be stored as remote:host:pid (see pulse.sh remote dispatch handling). kill -0 will mark those as dead and the AI may take corrective actions while the remote worker is still running.
🛠️ Suggested fix to detect remote workers
```diff
 local worker_alive="unknown"
 if [[ -n "$tpid" && "$tpid" != "0" ]]; then
-    if kill -0 "$tpid" 2>/dev/null; then
-        worker_alive="yes"
-    else
-        worker_alive="no (PID $tpid dead)"
-    fi
+    if [[ "$tpid" == remote:* ]]; then
+        local _remote_host _remote_pid remote_helper
+        _remote_host=$(printf '%s' "$tpid" | cut -d: -f2)
+        _remote_pid=$(printf '%s' "$tpid" | cut -d: -f3)
+        remote_helper="${SCRIPT_DIR}/../remote-dispatch-helper.sh"
+        if [[ -x "$remote_helper" ]] && "$remote_helper" status "$task_id" "$_remote_host" >/dev/null 2>&1; then
+            worker_alive="yes (remote)"
+        else
+            worker_alive="no (remote worker dead)"
+        fi
+    elif kill -0 "$tpid" 2>/dev/null; then
+        worker_alive="yes"
+    else
+        worker_alive="no (PID $tpid dead)"
+    fi
 else
     worker_alive="no worker"
 fi
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.agents/scripts/supervisor/ai-lifecycle.sh around lines 122 - 131, The
current check uses kill -0 on tpid which can be in remote:host:pid form and will
incorrectly report remote workers as dead; update the block that sets
worker_alive to detect remote PID formats (check if tpid contains ':' or matches
a remote pattern) before calling kill -0. If tpid looks remote (e.g., contains
two colons or matches remote:host:pid), set worker_alive to a remote indicator
like "remote (tpid)" or extract the real PID after the last ':' and handle
accordingly; only run kill -0 when tpid is a plain numeric PID. Ensure changes
are applied around the tpid / worker_alive logic that currently does kill -0.
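The classification step this comment asks for is mostly string handling, so it can be tested in isolation. A sketch under the assumption (from pulse.sh) that remote workers are stored as remote:host:pid; the real remote liveness probe is out of scope here and is replaced by a labelled status string.

```shell
# Classify a stored worker PID before deciding whether kill -0 applies.
worker_status() {
    local tpid="$1" worker_alive
    if [[ -z "$tpid" || "$tpid" == "0" ]]; then
        worker_alive="no worker"
    elif [[ "$tpid" == remote:* ]]; then
        # remote:host:pid form; a real check would probe the remote host
        local host pid
        host=$(printf '%s' "$tpid" | cut -d: -f2)
        pid=$(printf '%s' "$tpid" | cut -d: -f3)
        worker_alive="remote (${host}, PID ${pid})"
    elif kill -0 "$tpid" 2>/dev/null; then
        worker_alive="yes"
    else
        worker_alive="no (PID $tpid dead)"
    fi
    printf '%s\n' "$worker_alive"
}

worker_status "remote:build-02:4321"
worker_status ""
worker_status "$$"
```

Only plain numeric PIDs ever reach kill -0, so a healthy remote worker can no longer be misreported as dead.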
```diff
 rebase_branch)
     log_info "ai-lifecycle: rebasing branch for $task_id"
     update_task_status_tag "$task_id" "rebasing" "$repo_path"

     if rebase_sibling_pr "$task_id" 2>>"$SUPERVISOR_LOG"; then
         log_success "ai-lifecycle: rebase succeeded for $task_id"
         update_task_status_tag "$task_id" "ci-running" "$repo_path"
         # Increment rebase counter
         local current_attempts
         current_attempts=$(db "$SUPERVISOR_DB" "SELECT rebase_attempts FROM tasks WHERE id = '$escaped_id';" 2>/dev/null || echo "0")
         db "$SUPERVISOR_DB" "UPDATE tasks SET rebase_attempts = $((current_attempts + 1)) WHERE id = '$escaped_id';" 2>/dev/null || true
         return 0
-    else
-        log_warn "ai-lifecycle: rebase failed for $task_id"
-        update_task_status_tag "$task_id" "has-conflicts" "$repo_path"
-        return 1
     fi
+    log_warn "ai-lifecycle: rebase failed for $task_id"
+    return 1
```
Increment rebase_attempts on every try, not just success.
Right now failures don’t increment, so the “rebase_attempts > 3 → resolve_conflicts” guard may never trigger, leading to infinite rebase loops.
🛠️ Suggested fix to count every attempt
```diff
 rebase_branch)
-    if rebase_sibling_pr "$task_id" 2>>"$SUPERVISOR_LOG"; then
-        log_success "ai-lifecycle: rebase succeeded for $task_id"
-        local current_attempts
-        current_attempts=$(db "$SUPERVISOR_DB" "SELECT rebase_attempts FROM tasks WHERE id = '$escaped_id';" 2>/dev/null || echo "0")
-        db "$SUPERVISOR_DB" "UPDATE tasks SET rebase_attempts = $((current_attempts + 1)) WHERE id = '$escaped_id';" 2>/dev/null || true
-        return 0
-    fi
+    local current_attempts
+    current_attempts=$(db "$SUPERVISOR_DB" "SELECT rebase_attempts FROM tasks WHERE id = '$escaped_id';" 2>/dev/null || echo "0")
+    db "$SUPERVISOR_DB" "UPDATE tasks SET rebase_attempts = $((current_attempts + 1)) WHERE id = '$escaped_id';" 2>/dev/null || true
+    if rebase_sibling_pr "$task_id" 2>>"$SUPERVISOR_LOG"; then
+        log_success "ai-lifecycle: rebase succeeded for $task_id"
+        return 0
+    fi
     log_warn "ai-lifecycle: rebase failed for $task_id"
     return 1
     ;;
```

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
```shell
rebase_branch)
    local current_attempts
    current_attempts=$(db "$SUPERVISOR_DB" "SELECT rebase_attempts FROM tasks WHERE id = '$escaped_id';" 2>/dev/null || echo "0")
    db "$SUPERVISOR_DB" "UPDATE tasks SET rebase_attempts = $((current_attempts + 1)) WHERE id = '$escaped_id';" 2>/dev/null || true
    if rebase_sibling_pr "$task_id" 2>>"$SUPERVISOR_LOG"; then
        log_success "ai-lifecycle: rebase succeeded for $task_id"
        return 0
    fi
    log_warn "ai-lifecycle: rebase failed for $task_id"
    return 1
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.agents/scripts/supervisor/ai-lifecycle.sh around lines 397 - 406, The
rebase_branch case only increments rebase_attempts on success; move or duplicate
the increment logic so rebase_attempts is increased on every attempt (success or
failure). Specifically, ensure the db update that reads current_attempts and
writes rebase_attempts = $((current_attempts + 1)) (the commands using
current_attempts, db "$SUPERVISOR_DB" "SELECT ...", and db "$SUPERVISOR_DB"
"UPDATE ...") runs regardless of whether rebase_sibling_pr succeeds — e.g.,
perform the SELECT/UPDATE immediately after calling rebase_sibling_pr (or in a
finally-style block) before returning so the counter always increments.
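The increment-before-attempt pattern the prompt describes is easy to isolate. In this sketch the db-backed rebase_attempts counter is modeled as a plain shell variable, and true/false stand in for rebase_sibling_pr succeeding or failing:

```shell
# Count every rebase attempt, whether or not the rebase itself succeeds.
REBASE_ATTEMPTS=0

attempt_rebase() {
    REBASE_ATTEMPTS=$((REBASE_ATTEMPTS + 1))  # incremented on every try
    if "$@"; then
        echo "rebase succeeded"
        return 0
    fi
    echo "rebase failed"
    return 1
}

attempt_rebase false || true   # a failing rebase still counts
attempt_rebase true
echo "attempts=$REBASE_ATTEMPTS"
```

Because failures are counted too, a guard like "rebase_attempts > 3 implies resolve_conflicts" can actually fire instead of looping forever.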



Summary

- `resolve_conflicts`, `fix_ci`, `fix_and_push` dispatch opus workers with full tool access to actually solve problems instead of just logging and waiting

What was removed
- `fast_path_decision()` `case` statements trying to reconcile blocked/verify_failed tasks
- `cmd_pr_lifecycle`, which has its own deterministic logic
- `process_post_pr_lifecycle()`

What replaced it
- `ai_decide()`: a single function that sends the task's real-world state to opus and gets back a JSON action. The AI sees the same state a human would and picks the next step. No case statements, no fast-paths, no retry counters.
- `_dispatch_ai_worker()`: for complex problems (conflicts, CI failures, unknown blockers), dispatches an interactive AI session with full tool access that can read code, understand context, and fix the actual problem.

Testing
- `process_post_pr_lifecycle` kept as backward-compatible redirect
- `extract_parent_id`, `adopt_untracked_prs`, Phase 3b (verify queue) preserved

Summary by CodeRabbit

Release Notes