t1284: fix create_improvement dedup window + add recent fixes context#2040
t1284: fix create_improvement dedup window + add recent fixes context#2040marcusquinn merged 1 commit intomainfrom
Conversation
…t (t1284) Investigation findings: - Feb 11-12 hangs (t311.2 x2, t311.3, t303) were caused by missing heartbeat (t1196), 30min hung timeout (t314), no per-task timeout scaling (t1199), and no graceful SIGTERM (t1222) — all fixed by Feb 18, 6 days before t1284 was created. - t1284 was created by the supervisor's AI reasoner looking at stale failure data (8-9 days old) because the create_improvement dedup only checked tasks completed within 24h, missing t314 which was merged 9 days earlier. Changes: 1. ai-actions.sh: extend completed-task dedup window from 24h to 30 days (configurable via AI_IMPROVEMENT_DEDUP_DAYS, default 30). Uses date comparison instead of today/yesterday exact match. This prevents re-creating improvement tasks for already-fixed systemic issues. 2. ai-context.sh: add Section 12 'Recently Fixed Systemic Issues' — shows the AI reasoner what bugfix/improvement tasks were completed in the last 30 days so it can avoid proposing investigations for already-fixed problems. Filters for #bugfix, #self-improvement, #automation, and hang/timeout/worker keywords.
Summary of ChangesHello @marcusquinn, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request addresses an issue where the AI supervisor was generating redundant improvement tasks for problems that had already been resolved. This occurred because the AI's deduplication mechanism only considered fixes from the last 24 hours, causing it to miss older, but still relevant, resolutions. The changes extend this deduplication window to 30 days and introduce a new context section for the AI, ensuring it is better informed about recent systemic fixes before proposing new work. Highlights
Changelog
Activity
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
WalkthroughThis pull request introduces a configurable Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Possibly related PRs
Poem
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
🔍 Code Quality Report�[0;35m[MONITOR]�[0m Code Review Monitoring Report �[0;34m[INFO]�[0m Latest Quality Status: �[0;34m[INFO]�[0m Recent monitoring activity: 📈 Current Quality Metrics
Generated on: Sat Feb 21 00:27:17 UTC 2026 Generated by AI DevOps Framework Code Review Monitoring |
|
There was a problem hiding this comment.
Code Review
The pull request effectively addresses the issue of redundant improvement task creation by extending the deduplication window to 30 days and providing the AI reasoner with a 'Recently Fixed Issues' context section. The implementation correctly handles date comparisons and follows established portability patterns for the date command. However, several grep operations require || true guards to prevent script termination under set -e and pipefail when no matches are found. Additionally, some redundant error suppressions should be removed to align with the repository's style guide, and head -1 should be replaced with tail -1 for more robust metadata extraction.
| # Only include if completed: timestamp is recent (today or yesterday) | ||
| # Only include if completed: timestamp is within the dedup window | ||
| local completed_date | ||
| completed_date=$(printf '%s' "$task_line" | grep -oE 'completed:[0-9]{4}-[0-9]{2}-[0-9]{2}' | head -1 | sed 's/completed://') |
There was a problem hiding this comment.
In a script with set -euo pipefail enabled, this command substitution will cause the script to exit if grep fails to find a match (returning exit code 1). Since the completed: field is optional or may not match the pattern, a || true guard is required to ensure the script continues. Additionally, use tail -1 instead of head -1 for robustness when extracting metadata fields, as the field might appear multiple times or be at the end of the line.
| completed_date=$(printf '%s' "$task_line" | grep -oE 'completed:[0-9]{4}-[0-9]{2}-[0-9]{2}' | head -1 | sed 's/completed://') | |
| completed_date=$(printf '%s' "$task_line" | grep -oE 'completed:[0-9]{4}-[0-9]{2}-[0-9]{2}' | tail -1 | sed 's/completed://' || true) |
References
- Use || true guards for commands that may fail under set -e (grep, arithmetic) (link)
- In shell scripts with 'set -e' enabled, use '|| true' to prevent the script from exiting when a command like 'jq' fails on an optional lookup.
- When extracting metadata fields from a line using 'grep' and 'sed' in shell scripts, use 'tail -1' instead of 'head -1' to ensure robustness, especially when the field might appear multiple times or be at the end of the line.
| while IFS= read -r task_line; do | ||
| # Only include if completed: timestamp is within the dedup window | ||
| local completed_date | ||
| completed_date=$(printf '%s' "$task_line" | grep -oE 'completed:[0-9]{4}-[0-9]{2}-[0-9]{2}' | head -1 | sed 's/completed://') |
There was a problem hiding this comment.
This pipeline will trigger a script exit under set -e and pipefail if grep returns no matches. Add || true to allow the script to proceed with an empty completed_date variable, which is already handled by the subsequent null check. Additionally, use tail -1 instead of head -1 for robustness when extracting metadata fields, as the field might appear multiple times or be at the end of the line.
| completed_date=$(printf '%s' "$task_line" | grep -oE 'completed:[0-9]{4}-[0-9]{2}-[0-9]{2}' | head -1 | sed 's/completed://') | |
| completed_date=$(printf '%s' "$task_line" | grep -oE 'completed:[0-9]{4}-[0-9]{2}-[0-9]{2}' | tail -1 | sed 's/completed://' || true) |
References
- Use || true guards for commands that may fail under set -e (grep, arithmetic) (link)
- When extracting metadata fields from a line using 'grep' and 'sed' in shell scripts, use 'tail -1' instead of 'head -1' to ensure robustness, especially when the field might appear multiple times or be at the end of the line.
| fi | ||
|
|
||
| local task_id | ||
| task_id=$(printf '%s' "$task_line" | grep -oE 't[0-9]+(\.[0-9]+)?' | head -1) |
There was a problem hiding this comment.
Similar to the completed_date assignment, this grep operation needs a || true guard to prevent the script from crashing if a task line somehow lacks a valid ID pattern. Additionally, use tail -1 instead of head -1 for robustness when extracting metadata fields, as the field might appear multiple times or be at the end of the line.
| task_id=$(printf '%s' "$task_line" | grep -oE 't[0-9]+(\.[0-9]+)?' | head -1) | |
| task_id=$(printf '%s' "$task_line" | grep -oE 't[0-9]+(\.[0-9]+)?' | tail -1 || true) |
References
- Use || true guards for commands that may fail under set -e (grep, arithmetic) (link)
- When extracting metadata fields from a line using 'grep' and 'sed' in shell scripts, use 'tail -1' instead of 'head -1' to ensure robustness, especially when the field might appear multiple times or be at the end of the line.
| if [[ -n "$completed_date" && "$completed_date" > "$cutoff_date" ]]; then | ||
| _score_task_line "$task_line" | ||
| fi | ||
| done < <(grep -E '^\s*- \[x\] t[0-9]' "$todo_file" 2>/dev/null) |
There was a problem hiding this comment.
The use of 2>/dev/null here is redundant because the existence of $todo_file is verified at the start of the function. Furthermore, it violates the style guide's restriction on blanket error suppression. An explicit || true guard should be used instead to handle the case where no completed tasks are found.
| done < <(grep -E '^\s*- \[x\] t[0-9]' "$todo_file" 2>/dev/null) | |
| done < <(grep -E '^\s*- \[x\] t[0-9]' "$todo_file" || true) |
References
- 2>/dev/null is acceptable ONLY when redirecting to log files, not blanket suppression (link)
- Avoid using 2>/dev/null to suppress errors on file operations if the file's existence has already been verified by a preceding check.
|
|
||
| found_fixes+="- **$task_id** (completed: $completed_date${pr_ref:+, $pr_ref}): $excerpt\n" | ||
| fix_count=$((fix_count + 1)) | ||
| done < <(grep -E '^\s*- \[x\] t[0-9]' "$todo_file" 2>/dev/null) |
There was a problem hiding this comment.
Remove the redundant 2>/dev/null suppression and add an explicit || true guard to handle cases where no completed tasks match the pattern, ensuring compliance with the style guide and robustness under set -e.
| done < <(grep -E '^\s*- \[x\] t[0-9]' "$todo_file" 2>/dev/null) | |
| done < <(grep -E '^\s*- \[x\] t[0-9]' "$todo_file" || true) |
References
- 2>/dev/null is acceptable ONLY when redirecting to log files, not blanket suppression (link)
- Avoid using 2>/dev/null to suppress errors on file operations if the file's existence has already been verified by a preceding check.
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (2)
.agents/scripts/supervisor/ai-context.sh (1)
1536-1540: Keyword filter may be overly broad — common words like "fix", "worker", "dispatch" will match many non-systemic tasks.Terms like
fix,worker,supervisor, anddispatchare prevalent in this codebase's task vocabulary. A completed task like"t999 Add worker health dashboard"would match onworkereven though it's not a systemic fix. This could inflate the section with noise, diluting the signal for the AI reasoner and consuming context tokens.Consider tightening the filter to require at least one tag match (
#bugfix|#self-improvement|#automation) OR a combination of symptom keywords (e.g.,hang.*timeout|timeout.*hang|worker.*hang), rather than a flat OR across all terms.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.agents/scripts/supervisor/ai-context.sh around lines 1536 - 1540, The current keyword filter in the lower_line check (using task_line -> lower_line and grep -qE '#bugfix|#self-improvement|#automation|fix|investigate|hang|timeout|worker|supervisor|dispatch') is too broad; tighten it by changing the condition to accept a task only if it matches at least one explicit tag (`#bugfix`|#self-improvement|#automation) OR a symptom-combination pattern (e.g., hang.*timeout|timeout.*hang|worker.*hang|supervisor.*hang|dispatch.*hang), and also make common single-word matches use word boundaries (e.g., \bfix\b) so substrings don't trigger. Update the grep/regex logic that references lower_line to implement these two alternative checks (tag OR combined-symptom) and drop the flat OR across common words..agents/scripts/supervisor/ai-actions.sh (1)
487-499: Date cutoff logic is solid; minor boundary note.The dual-fallback date computation (BSD
-v→ GNU-d) is correct and portable. The empty-string guard at line 490 gracefully skips the block if neither works.One subtlety: the
>comparison on line 495 is strictly greater-than, so a task completed exactly on the cutoff date is excluded. For a 30-day window this is effectively 29 days + fraction. In practice this is unlikely to matter, but worth being aware of if you ever reduce the window to a small number of days.If you want an inclusive boundary (≥), change to:
Optional: inclusive cutoff
- if [[ -n "$completed_date" && "$completed_date" > "$cutoff_date" ]]; then + if [[ -n "$completed_date" && ! "$completed_date" < "$cutoff_date" ]]; then🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In @.agents/scripts/supervisor/ai-actions.sh around lines 487 - 499, The current date comparison excludes tasks completed exactly on the cutoff; to make the cutoff inclusive, change the test in the loop that checks completed_date vs cutoff_date from a strict greater-than to a greater-or-equal comparison (replace if [[ -n "$completed_date" && "$completed_date" > "$cutoff_date" ]]; then with an inclusive test). For a quick fix use the Bash string-comparison operator: if [[ -n "$completed_date" && "$completed_date" >= "$cutoff_date" ]]; then which will include tasks completed on the cutoff date; alternatively, for robust chronological comparison convert completed_date and cutoff_date to epoch seconds with date -d/-v and compare numerically. Ensure the change is applied where completed_date is extracted and _score_task_line is called.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In @.agents/scripts/supervisor/ai-context.sh:
- Around line 1500-1512: build_recent_fixes_context currently only scans the
primary repo; update callers in build_ai_context to iterate over all registered
repos (use the existing all_context_repos variable) and invoke
build_recent_fixes_context for each repo, aggregating their outputs into Section
12 so TODO.md entries from secondary repos are included; ensure
build_recent_fixes_context still accepts a repo_path parameter (defaulting to
REPO_PATH) and preserve dedup_days handling when called multiple times.
- Around line 104-111: Update the stale inline comment in ai-context.sh (the
lines describing the create_improvement dedup window) to reflect that dedup is
no longer fixed at 24h but uses the configurable AI_IMPROVEMENT_DEDUP_DAYS
(default 30 days) as implemented in _keyword_prefilter_open_tasks in
ai-actions.sh; change the wording to mention the configurable dedup window
and/or default 30 days so the comment accurately matches the code behavior.
---
Nitpick comments:
In @.agents/scripts/supervisor/ai-actions.sh:
- Around line 487-499: The current date comparison excludes tasks completed
exactly on the cutoff; to make the cutoff inclusive, change the test in the loop
that checks completed_date vs cutoff_date from a strict greater-than to a
greater-or-equal comparison (replace if [[ -n "$completed_date" &&
"$completed_date" > "$cutoff_date" ]]; then with an inclusive test). For a quick
fix use the Bash string-comparison operator: if [[ -n "$completed_date" &&
"$completed_date" >= "$cutoff_date" ]]; then which will include tasks completed
on the cutoff date; alternatively, for robust chronological comparison convert
completed_date and cutoff_date to epoch seconds with date -d/-v and compare
numerically. Ensure the change is applied where completed_date is extracted and
_score_task_line is called.
In @.agents/scripts/supervisor/ai-context.sh:
- Around line 1536-1540: The current keyword filter in the lower_line check
(using task_line -> lower_line and grep -qE
'#bugfix|#self-improvement|#automation|fix|investigate|hang|timeout|worker|supervisor|dispatch')
is too broad; tighten it by changing the condition to accept a task only if it
matches at least one explicit tag (`#bugfix`|#self-improvement|#automation) OR a
symptom-combination pattern (e.g.,
hang.*timeout|timeout.*hang|worker.*hang|supervisor.*hang|dispatch.*hang), and
also make common single-word matches use word boundaries (e.g., \bfix\b) so
substrings don't trigger. Update the grep/regex logic that references lower_line
to implement these two alternative checks (tag OR combined-symptom) and drop the
flat OR across common words.
| # Section 12: Recently Fixed Systemic Issues (t1284) | ||
| # Shows improvement/bugfix tasks completed in the last 30 days so the AI | ||
| # reasoner doesn't re-create investigation tasks for already-fixed problems. | ||
| # The create_improvement dedup only checks 24h; this section provides broader | ||
| # awareness of recent fixes to prevent stale-data false positives. | ||
| if [[ "$scope" == "full" ]]; then | ||
| context+="$(build_recent_fixes_context "$repo_path")\n\n" | ||
| fi |
There was a problem hiding this comment.
Stale comment: dedup window is no longer 24h.
Lines 107–108 state "The create_improvement dedup only checks 24h" — but this very PR updates _keyword_prefilter_open_tasks in ai-actions.sh to use AI_IMPROVEMENT_DEDUP_DAYS (default 30 days). The comment is now inaccurate.
Suggested fix
# Section 12: Recently Fixed Systemic Issues (t1284)
# Shows improvement/bugfix tasks completed in the last 30 days so the AI
# reasoner doesn't re-create investigation tasks for already-fixed problems.
- # The create_improvement dedup only checks 24h; this section provides broader
- # awareness of recent fixes to prevent stale-data false positives.
+ # Complements the create_improvement dedup window (AI_IMPROVEMENT_DEDUP_DAYS)
+ # by giving the AI reasoner visibility into recent fixes before proposing actions.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.agents/scripts/supervisor/ai-context.sh around lines 104 - 111, Update the
stale inline comment in ai-context.sh (the lines describing the
create_improvement dedup window) to reflect that dedup is no longer fixed at 24h
but uses the configurable AI_IMPROVEMENT_DEDUP_DAYS (default 30 days) as
implemented in _keyword_prefilter_open_tasks in ai-actions.sh; change the
wording to mention the configurable dedup window and/or default 30 days so the
comment accurately matches the code behavior.
| build_recent_fixes_context() { | ||
| local repo_path="${1:-$REPO_PATH}" | ||
| local dedup_days="${AI_IMPROVEMENT_DEDUP_DAYS:-30}" | ||
| local output="## Recently Fixed Systemic Issues (last ${dedup_days}d)\n\n" | ||
| output+="**IMPORTANT**: Do NOT create \`create_improvement\` tasks for problems listed here — they were already fixed.\n" | ||
| output+="If the same symptom recurs, check whether the fix was effective before proposing new work.\n\n" | ||
|
|
||
| local todo_file="${repo_path}/TODO.md" | ||
| if [[ ! -f "$todo_file" ]]; then | ||
| output+="*TODO.md not found*\n" | ||
| printf '%b' "$output" | ||
| return 0 | ||
| fi |
There was a problem hiding this comment.
build_recent_fixes_context only scans the primary repo — misses fixes in other registered repos.
build_ai_context passes only $repo_path (the primary repo) on line 110. However, Section 3 (lines 50–65) iterates over all registered repos from the supervisor DB to build TODO context. A systemic fix completed in a secondary repo's TODO.md won't appear in Section 12, creating a dedup blind spot — the exact class of problem this PR aims to fix.
Consider iterating over all registered repos the same way Section 3 does:
Sketch: multi-repo support
In build_ai_context (around line 110):
if [[ "$scope" == "full" ]]; then
- context+="$(build_recent_fixes_context "$repo_path")\n\n"
+ # Include recent fixes from all registered repos (like Section 3)
+ local _fixes_ctx=""
+ if [[ -n "$all_context_repos" ]]; then
+ while IFS= read -r _fix_repo; do
+ [[ -z "$_fix_repo" || ! -d "$_fix_repo" ]] && continue
+ _fixes_ctx+="$(build_recent_fixes_context "$_fix_repo")\n\n"
+ done <<<"$all_context_repos"
+ if ! echo "$all_context_repos" | grep -qF "$repo_path"; then
+ _fixes_ctx+="$(build_recent_fixes_context "$repo_path")\n\n"
+ fi
+ else
+ _fixes_ctx="$(build_recent_fixes_context "$repo_path")\n\n"
+ fi
+ context+="$_fixes_ctx"
fiNote: $all_context_repos is already populated on line 51 and would need to be accessible here (it's currently in scope since build_ai_context is one function).
Also applies to: 104-111
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In @.agents/scripts/supervisor/ai-context.sh around lines 1500 - 1512,
build_recent_fixes_context currently only scans the primary repo; update callers
in build_ai_context to iterate over all registered repos (use the existing
all_context_repos variable) and invoke build_recent_fixes_context for each repo,
aggregating their outputs into Section 12 so TODO.md entries from secondary
repos are included; ensure build_recent_fixes_context still accepts a repo_path
parameter (defaulting to REPO_PATH) and preserve dedup_days handling when called
multiple times.
Auto-dismissed: bot review does not block autonomous pipeline



Summary
create_improvementdedup only checked tasks completed within 24h, missing t314 which was merged 9 days earlierInvestigation Findings
Why t1284 was created despite all fixes being in place: The supervisor's AI reasoner saw stale failure data (8-9 days old) and the
create_improvementdedup only checked tasks completed within 24h — missing t314 (merged 9 days earlier).Changes
ai-actions.shAI_IMPROVEMENT_DEDUP_DAYS)>) instead of exact today/yesterday matchai-context.shTesting
bash -n: PASS (both files)Summary by CodeRabbit
Release Notes