
t1139: Add supervisor DB→TODO.md cancelled state consistency check (Phase 0.5c) #1735

Merged
marcusquinn merged 2 commits into main from feature/t1139
Feb 18, 2026

@marcusquinn marcusquinn commented Feb 18, 2026

Add Phase 0.5c to the supervisor pulse cycle that proactively syncs DB-cancelled task states to TODO.md.

Changes

  • supervisor/todo-sync.sh: New update_todo_on_cancelled() function — annotates open TODO.md tasks with a Notes line when DB status is cancelled (does NOT mark [x] — cancelled ≠ done)
  • supervisor/pulse.sh: New Phase 0.5c — runs every pulse, queries DB for cancelled tasks still open in TODO.md, calls update_todo_on_cancelled for each
  • cmd_reconcile_db_todo Gap 1: Extended to include cancelled alongside failed/blocked — Phase 7b now annotates cancelled tasks too
  • cmd_update_todo: Extended to handle cancelled state

Problem solved

When the supervisor cancels tasks (Phase 0.5 dedup, Phase 3b2 obsolete PR, etc.), the DB is updated but TODO.md still shows [ ]. This creates a persistent inconsistency where cancelled tasks appear dispatchable, causing the supervisor to waste reasoning cycles on dead tasks.
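
The drift described above can be detected with a simple scan. This is a minimal sketch, not the supervisor's actual code: it assumes TODO.md task lines look like `- [ ] t1139 description`, and that the cancelled IDs have already been read from the DB. The helper name `find_open_cancelled` is hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print each given task ID that still has an open
# "- [ ]" entry in the TODO file. In practice the IDs would come from the
# supervisor DB; here they are passed as arguments.
find_open_cancelled() {
  local todo_file="$1"
  shift
  local id
  for id in "$@"; do
    # Match an unchecked checkbox followed by the exact task ID.
    if grep -qE "^[[:space:]]*- \[ \] ${id}( |$)" "$todo_file"; then
      printf '%s\n' "$id"
    fi
  done
}

todo_file=$(mktemp)
cat >"$todo_file" <<'EOF'
- [ ] t100 still open but cancelled in DB
- [x] t101 completed
- [ ] t1000 unrelated open task
EOF

# Treat t100 and t101 as cancelled in the DB (assumed input).
open_cancelled=$(find_open_cancelled "$todo_file" t100 t101)
echo "$open_cancelled"
rm -f "$todo_file"
```

Only `t100` is reported: `t101` is already checked off, and `t1000` does not match because the pattern requires the ID to end at a space or end of line.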

Design decisions

  • Phase 0.5c runs every pulse (not gated on idle like Phase 7b) — cancelled tasks should be cleaned up promptly, not deferred until the batch is idle
  • Annotation adds a Notes line rather than marking [x] — cancelled tasks are not completed deliverables
  • Duplicate-annotation guard: skips if Notes line already contains CANCELLED:
  • Zero ShellCheck violations
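
The annotation and duplicate-guard decisions can be sketched as follows. This is an illustration under stated assumptions, not the merged `update_todo_on_cancelled`: the function name `annotate_cancelled`, the fixed two-space indent, and the choice to inspect only the line directly below the task are all assumptions, and reason sanitising is omitted.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: add a Notes line under an open task, leave the
# checkbox as "[ ]", and do nothing if the task is already annotated.
annotate_cancelled() {
  local todo_file="$1" task_id="$2" reason="${3:-cancelled by supervisor}"
  local line_num next_line tmp
  line_num=$(grep -nE "^[[:space:]]*- \[ \] ${task_id}( |$)" "$todo_file" | head -n 1 | cut -d: -f1)
  [ -n "$line_num" ] || return 1 # no open entry for this task

  # Duplicate-annotation guard: look at the line right after the task.
  next_line=$(sed -n "$((line_num + 1))p" "$todo_file")
  case "$next_line" in
  *"Notes:"*"CANCELLED:"*) return 0 ;;
  esac

  # Insert the Notes line below the task, writing through a temp file.
  tmp=$(mktemp)
  NOTE="  - Notes: CANCELLED: ${reason}" awk -v n="$line_num" \
    '{ print } NR == n { print ENVIRON["NOTE"] }' "$todo_file" >"$tmp" && mv "$tmp" "$todo_file"
}

todo_file=$(mktemp)
printf '%s\n' '- [ ] t200 example task' '- [ ] t201 another task' >"$todo_file"
annotate_cancelled "$todo_file" t200 "duplicate of t150"
annotate_cancelled "$todo_file" t200 "duplicate of t150" # second call is a no-op
result=$(cat "$todo_file")
printf '%s\n' "$result"
rm -f "$todo_file"
```

The task line keeps its `[ ]` checkbox, so a cancelled task is never counted as a completed deliverable, and running the function twice leaves a single annotation.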

Ref #1710

Summary by CodeRabbit

  • New Features

    • Added automated consistency checks for cancelled tasks between database and task files.
    • Added task cancellation annotation capability with reason tracking.
  • Chores

    • Updated model catalog with new models and adjusted pricing/tier assignments.
    • Refreshed performance metrics and routing assignments for model selection.
  • Documentation

    • Added planning items for batch task creation, completed-task exclusion, and cost-efficiency optimization.

coderabbitai bot commented Feb 18, 2026

Walkthrough

This PR introduces Phase 0.5c for cancelled task synchronization from the supervisor DB to TODO.md, adds the update_todo_on_cancelled function to annotate cancelled tasks in TODO.md with DB error messages, updates model catalog metadata with new versions and pricing data, and documents new planning items for batch operations and cost optimization.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Cancelled Task Synchronization: .agents/scripts/supervisor/pulse.sh, .agents/scripts/supervisor/todo-sync.sh | Introduces Phase 0.5c to check DB→TODO.md consistency for cancelled tasks; adds update_todo_on_cancelled function to annotate cancelled tasks in TODO.md with reasons; integrates cancellation handling into cmd_update_todo and cmd_reconcile_db_todo flows; expands state coverage to include cancelled alongside failed/blocked in reconciliation logic. |
| Model Catalog Updates: MODELS.md | Updates metadata timestamp and dataset statistics (884→903 pattern data points); renames and adjusts model tiers/costs (claude-opus-4→claude-opus-4-6, claude-sonnet-4→claude-sonnet-4-6, claude-haiku-3.5→claude-haiku-4-5); refreshes performance leaderboard and task type statistics with adjusted success rates. |
| Planning Documentation: TODO.md | Adds three new work items: t1146 (batch-task-creation for TODO.md edits), t1148 (completed-task exclusion to prevent repeated AI actions), t1149 (model tier cost-efficiency checks to flag opus overuse). |

Sequence Diagram(s)

sequenceDiagram
    participant Pulse as Phase 0.5c<br/>(pulse.sh)
    participant DB as Supervisor DB
    participant FS as Filesystem<br/>(TODO.md)
    participant Sync as todo-sync.sh<br/>(update_todo_on_cancelled)
    participant Git as Git Repo

    Pulse->>DB: Query repos with cancelled tasks
    DB-->>Pulse: Return cancelled task list
    
    Pulse->>FS: Check if TODO.md exists for repo
    alt TODO.md Found
        Pulse->>FS: Scan for open task entry
        FS-->>Pulse: Return task line + indentation
        
        Pulse->>Sync: Call update_todo_on_cancelled(task_id, reason)
        Sync->>FS: Locate/append Notes line with CANCELLED annotation
        Sync->>Sync: Sanitize reason, handle duplicates
        FS-->>Sync: Updated TODO.md
        
        Sync->>Git: Commit and push changes
        Git-->>Sync: Success confirmation
        Sync-->>Pulse: Return status (synced)
    else TODO.md Not Found
        Pulse-->>Pulse: Skip repo, log missing file
    end
    
    Pulse->>Pulse: Track sync count, report drift or success
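
The control flow in the diagram can be sketched in shell. This is a minimal illustration under assumptions: the `repo|task_id|reason` row format stands in for the DB query result, and `sync_stub` is a hypothetical placeholder for `update_todo_on_cancelled` (no Git commit or push is performed here).

```shell
#!/usr/bin/env bash
# Hypothetical placeholder for update_todo_on_cancelled.
sync_stub() {
  printf 'sync %s: %s\n' "$1" "$2"
}

# For each "repo|task_id|reason" row, skip repos without a TODO.md,
# otherwise call the sync function and count successful syncs.
run_phase() {
  local rows="$1" synced=0 repo task_id reason
  while IFS='|' read -r repo task_id reason; do
    [ -n "$repo" ] || continue
    if [ ! -f "$repo/TODO.md" ]; then
      printf 'skip %s: TODO.md not found\n' "$repo" >&2
      continue
    fi
    # Log output goes to stderr so stdout carries only the final count.
    sync_stub "$task_id" "$reason" >&2 && synced=$((synced + 1))
  done <<EOF
$rows
EOF
  printf '%s\n' "$synced"
}

repo_a=$(mktemp -d)
repo_b=$(mktemp -d) # deliberately left without a TODO.md
: >"$repo_a/TODO.md"

synced_count=$(run_phase "${repo_a}|t300|obsolete PR
${repo_b}|t301|dedup")
echo "synced: $synced_count"
rm -rf "$repo_a" "$repo_b"
```

Feeding the loop from a here-document rather than a pipe keeps the counter in the current shell, so the sync count survives the loop.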

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Poem

🗄️ Cancelled whispers now find their voice,
synced from the database to markdown's choice,
Phase 0.5c ensures consistency rings,
while models ascend on efficiency's wings! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly corresponds to the primary change: introducing a new Phase 0.5c consistency check between DB and TODO.md for cancelled tasks. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |

@gemini-code-assist

Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a critical inconsistency where tasks cancelled in the database were not accurately reflected in their corresponding TODO.md entries, leading to inefficient supervisor processing. By introducing a new supervisor phase (0.5c) and enhancing existing task management commands, the system now proactively synchronizes the cancelled status from the database to TODO.md. This ensures that tasks are annotated with a cancellation reason without being marked as complete, thereby improving the accuracy of the supervisor's task list and preventing redundant work.

Highlights

  • New update_todo_on_cancelled() function: A new function was introduced to annotate open TODO.md tasks with a 'Notes: CANCELLED' line when their database status is cancelled. This function ensures that cancelled tasks are not marked as completed ([x]).
  • New Phase 0.5c in supervisor/pulse.sh: A new phase, 0.5c, was added to the supervisor's pulse cycle. This phase runs every pulse to proactively identify and synchronize DB-cancelled tasks that are still open in TODO.md, calling update_todo_on_cancelled for each.
  • Extended cmd_reconcile_db_todo (Phase 7b): The existing reconciliation process was updated to include cancelled tasks alongside failed and blocked tasks, ensuring that these tasks are also appropriately annotated in TODO.md.
  • Updated cmd_update_todo: The cmd_update_todo command was extended to handle the cancelled state, utilizing the newly introduced update_todo_on_cancelled function.
Changelog
  • .agents/scripts/supervisor/pulse.sh
    • Added new Phase 0.5c for DB-to-TODO.md cancelled task consistency.
    • Implemented logic to query for cancelled tasks in the database that are still open in TODO.md.
    • Integrated calls to update_todo_on_cancelled for inconsistent tasks.
  • .agents/scripts/supervisor/todo-sync.sh
    • Implemented update_todo_on_cancelled function to add cancellation notes to TODO.md tasks.
    • Updated cmd_update_todo to process tasks with a 'cancelled' status.
    • Extended cmd_reconcile_db_todo to include 'cancelled' tasks in its reconciliation query.
    • Modified cmd_reconcile_db_todo to conditionally call update_todo_on_cancelled for cancelled tasks.
  • MODELS.md
    • Updated model performance data and the 'Last updated' timestamp.
    • Adjusted model names and associated input/output costs.
    • Refreshed performance leaderboard statistics for various models and task types.
  • TODO.md
    • Appended new task t1149 for implementing model tier cost-efficiency checks.
Activity
  • No human activity has been recorded for this pull request.

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request introduces a valuable consistency check for cancelled tasks between the database and TODO.md, which will help prevent the supervisor from wasting cycles. The new update_todo_on_cancelled function is well-implemented and adheres to the repository's style guide. I've identified a couple of minor areas for improvement related to code duplication and a small bug in a generated string, both of which should be straightforward to address. Overall, this is a solid addition that enhances the supervisor's robustness.

    sed_inplace "${next_line_num}s/$/${append_text}/" "$todo_file"
else
    # Insert a new Notes line after the task
    local notes_line="${indent} - Notes: CANCELLED by supervisor: ${safe_reason}"

medium

The hardcoded string by supervisor: in this line is redundant. The reason variable, which is interpolated as ${safe_reason}, already defaults to "cancelled by supervisor". This leads to a note like Notes: CANCELLED by supervisor: cancelled by supervisor. The string should be ${indent} - Notes: CANCELLED: ${safe_reason} to be consistent with the if branch on line 832 and the similar update_todo_on_blocked function.

Suggested change
- local notes_line="${indent} - Notes: CANCELLED by supervisor: ${safe_reason}"
+ local notes_line="${indent} - Notes: CANCELLED: ${safe_reason}"

@marcusquinn (Owner, Author)

Fixed in a509c82. Removed the redundant 'by supervisor:' literal — the else branch now uses CANCELLED: ${safe_reason} consistent with the if branch append path and the update_todo_on_blocked pattern.
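
The redundancy the reviewer flagged is easy to reproduce in isolation. The sketch below reuses the variable names from the quoted snippet; the empty `reason` and the two-space `indent` are assumed sample values.

```shell
#!/usr/bin/env bash
# When no reason is stored, the fallback text already says "by supervisor",
# so the old format string repeats it.
indent="  "
reason="" # assumed: no error text stored for this task
safe_reason="${reason:-cancelled by supervisor}"

old_line="${indent} - Notes: CANCELLED by supervisor: ${safe_reason}"
new_line="${indent} - Notes: CANCELLED: ${safe_reason}"

printf '%s\n' "$old_line" "$new_line"
```

The first printed line ends in "CANCELLED by supervisor: cancelled by supervisor", while the second carries the reason only once.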

Comment on lines 958 to 962
cancelled)
    local terror
    terror=$(db "$SUPERVISOR_DB" "SELECT error FROM tasks WHERE id = '$escaped_id';")
    update_todo_on_cancelled "$task_id" "${terror:-cancelled by supervisor}"
    ;;

medium

This new cancelled case duplicates the database query from the failed case. To improve maintainability and avoid redundancy, consider refactoring to combine the logic for both failed and cancelled statuses. You could use a single case branch failed | cancelled) and then use an if statement inside to call the appropriate update function. This would avoid calling the database twice for similar logic.

References
  1. In shell scripts, extract repeated logic into an internal helper function to improve maintainability. This applies even for standalone scripts where external source dependencies are avoided.

@marcusquinn (Owner, Author)

Fixed in a509c82. Combined failed and cancelled into a single failed | cancelled) branch with one DB query, using an if inside to dispatch to the appropriate update function. Eliminates the duplicate query and keeps the logic co-located.
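
The shape of the combined branch might look like the sketch below. It is not the committed code: `lookup_error`, `update_failed_stub`, and `update_cancelled_stub` are hypothetical stand-ins for the real DB query and update functions, and the `unknown` fallback for failed tasks is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical stubs standing in for the DB lookup and the update functions.
lookup_error() { printf '%s' "$STUB_ERROR"; }
update_failed_stub() { printf 'failed:%s:%s\n' "$1" "$2"; }
update_cancelled_stub() { printf 'cancelled:%s:%s\n' "$1" "$2"; }

dispatch_update() {
  local task_id="$1" status="$2" terror
  case "$status" in
  failed | cancelled)
    # Single lookup shared by both states.
    terror=$(lookup_error "$task_id")
    if [ "$status" = "cancelled" ]; then
      update_cancelled_stub "$task_id" "${terror:-cancelled by supervisor}"
    else
      update_failed_stub "$task_id" "${terror:-unknown}"
    fi
    ;;
  *)
    return 0 # other states are out of scope for this sketch
    ;;
  esac
}

STUB_ERROR=""
out_cancelled=$(dispatch_update t400 cancelled)
STUB_ERROR="tests failed"
out_failed=$(dispatch_update t401 failed)
printf '%s\n' "$out_cancelled" "$out_failed"
```

Keeping the lookup above the `if` means each task triggers one query regardless of which update path it takes.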

…139)

- Add update_todo_on_cancelled() to todo-sync.sh: annotates open TODO.md tasks
  with a Notes line when DB status is cancelled (does not mark [x])
- Add Phase 0.5c to pulse cycle: runs every pulse, finds DB-cancelled tasks
  still open in TODO.md, calls update_todo_on_cancelled for each
- Extend cmd_reconcile_db_todo Gap 1 to include 'cancelled' state alongside
  failed/blocked — Phase 7b now annotates cancelled tasks too
- Extend cmd_update_todo to handle 'cancelled' state via update_todo_on_cancelled
- Zero ShellCheck violations

Eliminates the persistent inconsistency where supervisor-cancelled tasks
(Phase 0.5 dedup, Phase 3b2 obsolete PR, etc.) remain open in TODO.md,
causing the supervisor to waste reasoning cycles on dead tasks.
@github-actions

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 24 code smells

[INFO] Recent monitoring activity:
Wed Feb 18 18:36:18 UTC 2026: Code review monitoring started
Wed Feb 18 18:36:18 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 24

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 24
  • VULNERABILITIES: 0

Generated on: Wed Feb 18 18:36:20 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@github-actions

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 24 code smells

[INFO] Recent monitoring activity:
Wed Feb 18 18:49:57 UTC 2026: Code review monitoring started
Wed Feb 18 18:49:58 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 24

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 24
  • VULNERABILITIES: 0

Generated on: Wed Feb 18 18:50:00 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

marcusquinn added a commit that referenced this pull request Feb 18, 2026
…ith dead workers (t1145)

Phase 0.7 (pulse.sh): when a stale 'evaluating' task has a pr_url, route to
'pr_review' instead of re-queuing — the work is done, only the evaluation
process died. Previously, tasks with completed PRs were wastefully re-run.

supervisor-helper.sh: add 'evaluating:pr_review' to VALID_TRANSITIONS to
support the new Phase 0.7 routing path.

cleanup.sh: include 'evaluating' in stale PID file detection alongside
'running' and 'dispatched' for consistent cleanup.

Manual recovery applied to stale tasks found at time of fix:
- t1138: evaluating (dead PID 63748) + PR #1736 → pr_review
- t1139: running (dead PID 19930) + PR #1735 → pr_review (via evaluating)
- t1146: running (dead PID 12745) no PR → queued (retry)
- t1148: running (dead PID 14125) no PR → queued (retry)
- t1149: running (dead PID 15457) no PR → queued (retry)

DB now consistent: 1 running (t1145/self), 0 evaluating, 0 stale entries.
Supersedes t1140 and t1132.
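
The Phase 0.7 routing rule in this commit message can be sketched as below. This is an assumption-laden illustration: the function names are hypothetical, the `VALID_TRANSITIONS` value is an illustrative subset (only `evaluating:pr_review` is confirmed by the message), and the URL is a placeholder.

```shell
#!/usr/bin/env bash
# Illustrative subset of allowed "from:to" transitions (assumed format).
VALID_TRANSITIONS="evaluating:pr_review evaluating:queued"

# Choose the next state for a stale 'evaluating' task.
next_state_for_stale() {
  local pr_url="$1"
  if [ -n "$pr_url" ]; then
    printf 'pr_review\n' # a PR exists; only the evaluation process died
  else
    printf 'queued\n' # nothing to review; retry the task
  fi
}

# Check a transition against the allow-list.
is_valid_transition() {
  local from="$1" to="$2" t
  for t in $VALID_TRANSITIONS; do
    [ "$t" = "$from:$to" ] && return 0
  done
  return 1
}

with_pr=$(next_state_for_stale "https://example.invalid/pr/1")
without_pr=$(next_state_for_stale "")
is_valid_transition evaluating "$with_pr" && echo "evaluating -> $with_pr allowed"
echo "no PR -> $without_pr"
```

Routing on the presence of `pr_url` avoids re-running a task whose work already exists as a PR.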
@marcusquinn marcusquinn merged commit 9cbc757 into main Feb 18, 2026
19 checks passed
@marcusquinn marcusquinn deleted the feature/t1139 branch February 18, 2026 19:00
marcusquinn added a commit that referenced this pull request Feb 18, 2026
… with dead workers (#1771)

* chore: regenerate MODELS.md leaderboard (t1012, t1129)

* fix: resolve supervisor DB inconsistency — stale running/evaluating with dead workers (t1145)
