
feat: enhance TODO.md parser with full field extraction and rich issue body (t020.1) #804

Merged
marcusquinn merged 2 commits into main from feature/t020.1 on Feb 9, 2026

Conversation

@marcusquinn
Owner

@marcusquinn marcusquinn commented Feb 9, 2026

Summary

  • Enhance parse_task_line() to extract 7 new fields: assignee, started, completed, actual, blocked_by, blocks, verified
  • Fix description extraction for indented subtask lines — use POSIX [[:space:]] instead of \s for macOS sed compatibility
  • Fix \s portability across 3 other sed calls (extract_notes, extract_plan_section, subtask rendering)
  • Enhance compose_issue_body() with structured metadata: status, assignee @mention, date fields, dependency links
  • Add assignee auto-assignment when pushing issues via cmd_push
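The `\s` fixes listed above matter because `\s` is a GNU extension that BSD sed on macOS does not reliably treat as whitespace, whereas the POSIX class `[[:space:]]` is understood by both. A minimal sketch of the portable form follows; the helper name `strip_task_prefix` and the sample line are assumptions for illustration, not code from the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: strip leading indentation and the "- [ ]" style
# checkbox from a TODO.md task line, using only POSIX character classes.
strip_task_prefix() {
    # [[:space:]] matches spaces and tabs on both GNU sed and BSD sed.
    # \[[^]]*\] matches any checkbox such as "[ ]", "[x]" or "[-]".
    printf '%s\n' "$1" | sed -E 's/^[[:space:]]*- \[[^]]*\][[:space:]]+//'
}

strip_task_prefix '    - [ ] t020.1 Build core TODO.md parser'
```

The same pattern handles tab-indented subtasks, which is the case an unsupported `\s` would silently miss.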

Testing

Verified against real TODO.md data with issue-sync-helper.sh parse:

| Task | Fields Tested |
|------|---------------|
| t020.1 | Indented subtask, assignee, started timestamp |
| t001 | Top-level with subtasks, GH ref, estimate breakdown |
| t001.1 | Subtask with blocked-by dependency |
| t001.2 | Subtask with blocks dependency |
| t002 | Completed task with actual time and completed date |
| t003 | Declined task ([-] checkbox) |
| t135 | Plan-linked task with many subtasks and notes |

ShellCheck: zero warnings/errors (only pre-existing SC1091 info + SC2016 false positive).

Task

t020.1: Build core TODO.md parser + rich issue body composer in issue-sync-helper.sh

Summary by CodeRabbit

  • New Features
    • Enhanced task metadata parsing to extract additional fields including assignee, start/completion dates, time tracking, and dependency relationships.
    • Improved issue metadata display with structured headers and dedicated sections for blocked-by and blocking dependencies.
    • Better synchronization of assignments and metadata between task files and GitHub issues.

@coderabbitai
Contributor

coderabbitai bot commented Feb 9, 2026

Warning

Rate limit exceeded

@marcusquinn has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 25 minutes and 22 seconds before requesting another review.

Walkthrough

The issue-sync-helper.sh script is enhanced to extract and propagate additional TODO.md metadata fields (assignee, started, completed, actual, blocked_by, blocks, verified). These fields are parsed from task lines, composed into enriched GitHub issue bodies with a structured two-line metadata header, and synchronized through push/enrich/close workflows to maintain assignee and status consistency.

Changes

| Cohort / File(s) | Summary |
|------------------|---------|
| Task parsing & metadata extraction (.agents/scripts/issue-sync-helper.sh) | Enhanced parse_task_line() function now extracts seven additional metadata fields (assignee, started, completed, actual, blocked_by, blocks, verified) via dedicated regex captures alongside existing fields. Description extraction broadened to strip trailing metadata tokens comprehensively. |
| Issue body composition (.agents/scripts/issue-sync-helper.sh) | compose_issue_body() now reads and renders new metadata fields with a two-line header: Task ID + Status/Estimate/Actual on line 1; Assignee, Logged, Started, Completed, Verified on line 2. Dedicated sections added for Blocked by and Blocks relationships when present. |
| GitHub synchronization workflows (.agents/scripts/issue-sync-helper.sh) | Push/enrich/close flows now integrate assignee metadata parsing and propagation: GitHub assignee field set when available, TODO.md lines enriched with assignee if missing, and dry-run/actual calls updated to reflect new metadata in issue composition. |
| Subtask & anchor handling (.agents/scripts/issue-sync-helper.sh) | Subtask line transformation adjusted for consistent whitespace normalization in GitHub checkbox conversion. Anchor extraction for PLANS.md headings and plan sections improved via refined regex patterns and spacing robustness. |
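
The dedicated "Blocked by" and "Blocks" sections described above are built from comma-separated task IDs such as `blocked-by:t001.2,t003`. The sketch below shows one way such a value could be split and rendered; the function name `render_dependencies`, the heading text, and the bullet layout are all assumptions, since the actual output format of compose_issue_body() is not shown in this conversation.

```shell
#!/usr/bin/env bash
# Hypothetical renderer: heading text and bullet layout are assumptions.
render_dependencies() {
    local heading="$1" csv="$2" id
    local -a ids
    # Emit nothing when the field is absent, so the section is omitted.
    [[ -n "$csv" ]] || return 0
    # Split the comma-separated IDs into an array (works on Bash 3.2 too).
    IFS=',' read -r -a ids <<< "$csv"
    printf '**%s:**\n' "$heading"
    for id in "${ids[@]}"; do
        [[ -n "$id" ]] && printf -- '- %s\n' "$id"
    done
    return 0
}

render_dependencies "Blocked by" "t001.2,t003"
```

Returning early on an empty value keeps the section out of the issue body entirely, which matches the "when present" behaviour the walkthrough describes.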

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related issues

  • t020: Issue Sync Enrichment #795 — Directly implements core parser enhancements, PLANS.md extraction, and push/enrich/pull/close integration wiring for extended metadata field handling in TODO.md synchronization.

Possibly related PRs

Poem

🔄 Metadata flows through the DevOps stream,
Parse, compose, sync—a well-oiled dream,
Assignees and blocks in harmony dance,
GitHub and TODO in sweet circumstance,
Shell scripts that whisper automation's sweet tale.

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|------------|--------|-------------|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and specifically describes the main enhancement: expanding the TODO.md parser to extract additional metadata fields and improve the GitHub issue body composition. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |

@gemini-code-assist

Summary of Changes

Hello @marcusquinn, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly upgrades the issue-sync-helper.sh script by enhancing its ability to parse detailed metadata from TODO.md files and subsequently generate more comprehensive and structured GitHub issue bodies. The changes improve data extraction, ensure broader compatibility across different sed implementations, and automate the assignment of issues, leading to richer issue tracking and a more efficient workflow.

Highlights

  • Enhanced TODO.md Parser: The parse_task_line() function now extracts seven new metadata fields: assignee, started, completed, actual, blocked_by, blocks, and verified, significantly enriching the data parsed from TODO.md tasks.
  • Improved Cross-Platform Compatibility: Fixed sed regular expression portability issues by replacing \s with POSIX [[:space:]] in description extraction, note extraction, plan section extraction, and subtask rendering, ensuring consistent behavior across different operating systems like macOS.
  • Rich GitHub Issue Body Composition: The compose_issue_body() function has been updated to include structured metadata for status, assignee mentions, various date fields (logged, started, completed, verified), and dependency links (blocked-by, blocks), making generated issues more informative.
  • Automated Assignee Assignment: The cmd_push command now automatically assigns the extracted assignee to the newly created GitHub issue, streamlining the issue creation workflow.
Changelog
  • .agents/scripts/issue-sync-helper.sh
    • Updated parse_task_line() to extract new fields: assignee, started, completed, actual, blocked_by, blocks, verified.
    • Modified parse_task_line() description extraction to use [[:space:]] for portability and include new metadata fields in the exclusion list.
    • Replaced \s with [[:space:]] in sed commands within extract_notes(), extract_plan_section(), and subtask rendering logic in compose_issue_body() for improved portability.
    • Enhanced compose_issue_body() to display status, actual, assignee (with @mention), logged, started, completed, verified dates, and blocked_by/blocks dependencies.
    • Added logic to cmd_push() to extract the assignee field and use it for auto-assigning GitHub issues.
Activity
  • No human activity has been recorded on this pull request yet.

@github-actions

github-actions bot commented Feb 9, 2026

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 60 code smells

[INFO] Recent monitoring activity:
Mon Feb 9 20:21:22 UTC 2026: Code review monitoring started
Mon Feb 9 20:21:22 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 60

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 60
  • VULNERABILITIES: 0

Generated on: Mon Feb 9 20:21:25 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

This pull request significantly enhances the TODO.md parser by extracting several new fields and creating a richer issue body for GitHub synchronization. The changes also include important portability fixes by replacing \s with [[:space:]] in sed commands for macOS compatibility.

My review focuses on improving the efficiency of the parsing logic. While the new functionality is correct, the implementation introduces a large number of external process calls (grep, sed, cut) within loops, which can be very inefficient, especially when parsing large TODO.md files. I've provided suggestions to refactor these sections using shell built-ins like while read loops and regex matching, which will significantly improve performance and maintainability. I also identified a variable scope bug in cmd_push that could lead to incorrect issue titles.

Comment on lines +146 to +172
# Extract assignee
local assignee
assignee=$(echo "$line" | grep -oE 'assignee:[A-Za-z0-9._@-]+' | head -1 | sed 's/assignee://' || echo "")

# Extract started timestamp
local started
started=$(echo "$line" | grep -oE 'started:[0-9T:Z-]+' | head -1 | sed 's/started://' || echo "")

# Extract completed date
local completed
completed=$(echo "$line" | grep -oE 'completed:[0-9-]+' | head -1 | sed 's/completed://' || echo "")

# Extract actual time
local actual
actual=$(echo "$line" | grep -oE 'actual:[0-9.]+[hmd]' | head -1 | sed 's/actual://' || echo "")

# Extract blocked-by dependencies
local blocked_by
blocked_by=$(echo "$line" | grep -oE 'blocked-by:[A-Za-z0-9.,]+' | head -1 | sed 's/blocked-by://' || echo "")

# Extract blocks (downstream dependencies)
local blocks
blocks=$(echo "$line" | grep -oE 'blocks:[A-Za-z0-9.,]+' | head -1 | sed 's/blocks://' || echo "")

# Extract verified date
local verified
verified=$(echo "$line" | grep -oE 'verified:[0-9-]+' | head -1 | sed 's/verified://' || echo "")

high

While this correctly extracts the new fields, the approach of using echo | grep | head | sed for each field is highly inefficient. For each task line parsed, this will spawn 28 new processes just for these 7 fields. This can significantly slow down the script, especially with a large TODO.md file.

A much more performant approach is to use Bash's built-in regex matching. This avoids creating any external processes for parsing.

    # Extract assignee
    local assignee=""
    if [[ "$line" =~ assignee:([A-Za-z0-9._@-]+) ]]; then
        assignee="${BASH_REMATCH[1]}"
    fi

    # Extract started timestamp
    local started=""
    if [[ "$line" =~ started:([0-9T:Z-]+) ]]; then
        started="${BASH_REMATCH[1]}"
    fi

    # Extract completed date
    local completed=""
    if [[ "$line" =~ completed:([0-9-]+) ]]; then
        completed="${BASH_REMATCH[1]}"
    fi

    # Extract actual time
    local actual=""
    if [[ "$line" =~ actual:([0-9.]+[hmd]) ]]; then
        actual="${BASH_REMATCH[1]}"
    fi

    # Extract blocked-by dependencies
    local blocked_by=""
    if [[ "$line" =~ blocked-by:([A-Za-z0-9.,]+) ]]; then
        blocked_by="${BASH_REMATCH[1]}"
    fi

    # Extract blocks (downstream dependencies)
    local blocks=""
    if [[ "$line" =~ blocks:([A-Za-z0-9.,]+) ]]; then
        blocks="${BASH_REMATCH[1]}"
    fi

    # Extract verified date
    local verified=""
    if [[ "$line" =~ verified:([0-9-]+) ]]; then
        verified="${BASH_REMATCH[1]}"
    fi
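
The seven near-identical blocks above could also be folded into one helper. The following is a sketch only: `extract_field` is a hypothetical name, and the added `(^|[[:space:]])` anchor is an assumption that goes beyond the original patterns by requiring the key to start a whitespace-delimited token. The value patterns are the ones quoted in the review.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the value of "key:value" found in a task line.
# $1 = key (literal text), $2 = ERE for the value, $3 = the task line.
extract_field() {
    local key="$1" value_pattern="$2" line="$3"
    # Keep the regex in a variable so Bash does not treat quotes as literals.
    local re="(^|[[:space:]])${key}:(${value_pattern})"
    if [[ "$line" =~ $re ]]; then
        # Group 1 is the leading anchor, group 2 is the value itself.
        printf '%s\n' "${BASH_REMATCH[2]}"
    fi
    return 0
}

line='  - [ ] t001.1 Write parser assignee:alice started:2026-02-09T20:21:22Z blocked-by:t001.2,t003 actual:1.5h'
assignee=$(extract_field 'assignee' '[A-Za-z0-9._@-]+' "$line")
started=$(extract_field 'started' '[0-9T:Z-]+' "$line")
blocked_by=$(extract_field 'blocked-by' '[A-Za-z0-9.,]+' "$line")
verified=$(extract_field 'verified' '[0-9-]+' "$line")   # absent, so empty
```

Each call still costs one subshell for the command substitution, but no external `grep`, `head`, or `sed` process is started.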

Comment on lines 595 to +609
local description tags estimate plan_link status logged
local assignee started completed actual blocked_by blocks verified
description=$(echo "$parsed" | grep '^description=' | cut -d= -f2-)
tags=$(echo "$parsed" | grep '^tags=' | cut -d= -f2-)
estimate=$(echo "$parsed" | grep '^estimate=' | cut -d= -f2-)
plan_link=$(echo "$parsed" | grep '^plan_link=' | cut -d= -f2-)
status=$(echo "$parsed" | grep '^status=' | cut -d= -f2-)
logged=$(echo "$parsed" | grep '^logged=' | cut -d= -f2-)
assignee=$(echo "$parsed" | grep '^assignee=' | cut -d= -f2-)
started=$(echo "$parsed" | grep '^started=' | cut -d= -f2-)
completed=$(echo "$parsed" | grep '^completed=' | cut -d= -f2-)
actual=$(echo "$parsed" | grep '^actual=' | cut -d= -f2-)
blocked_by=$(echo "$parsed" | grep '^blocked_by=' | cut -d= -f2-)
blocks=$(echo "$parsed" | grep '^blocks=' | cut -d= -f2-)
verified=$(echo "$parsed" | grep '^verified=' | cut -d= -f2-)

high

Parsing the key=value output from parse_task_line by repeatedly calling echo | grep | cut for each of the 13+ fields is inefficient. It spawns multiple processes for every single field.

A more efficient and maintainable way to parse this is to use a single while read loop. This reads all variables in one go without spawning any external processes.

Suggested change
-    local description tags estimate plan_link status logged
-    local assignee started completed actual blocked_by blocks verified
-    description=$(echo "$parsed" | grep '^description=' | cut -d= -f2-)
-    tags=$(echo "$parsed" | grep '^tags=' | cut -d= -f2-)
-    estimate=$(echo "$parsed" | grep '^estimate=' | cut -d= -f2-)
-    plan_link=$(echo "$parsed" | grep '^plan_link=' | cut -d= -f2-)
-    status=$(echo "$parsed" | grep '^status=' | cut -d= -f2-)
-    logged=$(echo "$parsed" | grep '^logged=' | cut -d= -f2-)
-    assignee=$(echo "$parsed" | grep '^assignee=' | cut -d= -f2-)
-    started=$(echo "$parsed" | grep '^started=' | cut -d= -f2-)
-    completed=$(echo "$parsed" | grep '^completed=' | cut -d= -f2-)
-    actual=$(echo "$parsed" | grep '^actual=' | cut -d= -f2-)
-    blocked_by=$(echo "$parsed" | grep '^blocked_by=' | cut -d= -f2-)
-    blocks=$(echo "$parsed" | grep '^blocks=' | cut -d= -f2-)
-    verified=$(echo "$parsed" | grep '^verified=' | cut -d= -f2-)
+    local description="" tags="" estimate="" plan_link="" status="" logged=""
+    local assignee="" started="" completed="" actual="" blocked_by="" blocks="" verified=""
+    while IFS='=' read -r key value; do
+        case "$key" in
+            description) description="$value" ;;
+            tags) tags="$value" ;;
+            estimate) estimate="$value" ;;
+            plan_link) plan_link="$value" ;;
+            status) status="$value" ;;
+            logged) logged="$value" ;;
+            assignee) assignee="$value" ;;
+            started) started="$value" ;;
+            completed) completed="$value" ;;
+            actual) actual="$value" ;;
+            blocked_by) blocked_by="$value" ;;
+            blocks) blocks="$value" ;;
+            verified) verified="$value" ;;
+        esac
+    done <<< "$parsed"
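
One detail of the suggested loop worth checking is how it treats values that themselves contain `=`, which the `cut -d= -f2-` form preserved. The sketch below runs the same loop shape on made-up sample data (the `parsed` contents are an assumption, not real output of parse_task_line()).

```shell
#!/usr/bin/env bash
# Sample key=value lines; the values are invented for illustration.
parsed=$'description=Fix a=b handling in parser\ntags=\nassignee=alice'

description="" tags="" assignee=""
# With two variables, read splits at the first "=" only; the remainder of
# the line, including any later "=", lands in $value.
while IFS='=' read -r key value; do
    case "$key" in
        description) description="$value" ;;
        tags)        tags="$value" ;;
        assignee)    assignee="$value" ;;
    esac
done <<< "$parsed"

printf 'description=[%s] tags=[%s] assignee=[%s]\n' "$description" "$tags" "$assignee"
```

Because the here-string feeds the loop in the current shell rather than a pipeline subshell, the assignments remain visible after `done`.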

Comment on lines 813 to +817
description=$(echo "$parsed" | grep '^description=' | cut -d= -f2-)
local tags
tags=$(echo "$parsed" | grep '^tags=' | cut -d= -f2-)
local assignee
assignee=$(echo "$parsed" | grep '^assignee=' | cut -d= -f2-)

high

This block has two issues:

  1. Bug: The description variable is not declared as local within the loop. This can cause it to retain its value from a previous iteration if a task has no description, leading to incorrect issue titles.
  2. Inefficiency: The same inefficient parsing pattern of echo | grep | cut is used here.

I suggest declaring all variables as local and using a while read loop for efficient and correct parsing.

Suggested change
-    description=$(echo "$parsed" | grep '^description=' | cut -d= -f2-)
-    local tags
-    tags=$(echo "$parsed" | grep '^tags=' | cut -d= -f2-)
-    local assignee
-    assignee=$(echo "$parsed" | grep '^assignee=' | cut -d= -f2-)
+    local description="" tags="" assignee=""
+    while IFS='=' read -r key value; do
+        case "$key" in
+            description) description="$value" ;;
+            tags) tags="$value" ;;
+            assignee) assignee="$value" ;;
+        esac
+    done <<< "$parsed"
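
To make the scoping point concrete, the sketch below shows two behaviours under stated assumptions (all function names and sample records are hypothetical). In Bash, `local` is function-scoped rather than loop-scoped, so a variable assigned without it leaks into the caller. Separately, with the `while read` form a key that is absent from the input leaves the variable untouched, so an explicit reset per record is what keeps a stale value from carrying over.

```shell
#!/usr/bin/env bash
# Hypothetical loop over several parsed records inside one function.
titles_for() {
    local record key value description
    for record in "$@"; do
        description=""   # reset per record; the read loop only sets it if the key exists
        while IFS='=' read -r key value; do
            case "$key" in
                description) description="$value" ;;
            esac
        done <<< "$record"
        printf '[%s]\n' "$description"
    done
}

# Omitting `local` makes the assignment visible to the caller.
leaky()  { description="set inside leaky"; }
scoped() { local description="set inside scoped"; }

description="caller value"
scoped; after_scoped="$description"   # caller's value is untouched
leaky;  after_leaky="$description"    # caller's value was overwritten

# Second record has no description line, so its title is empty.
titles_for $'id=t1\ndescription=First task' 'id=t2'
```

Removing the `description=""` reset in `titles_for` would make the second record print the first record's title, which is the failure mode to guard against when adopting the loop.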

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @.agents/scripts/issue-sync-helper.sh:
- Line 670: The sed substitution producing gh_line for declined subtasks leaves an opening "~~" but never closes it; update the transformation that handles "^- \[-\]" (the sed fragment applied when building gh_line from subtask_line) so it wraps the subtask text in matching "~~...~~" (i.e., produce "- [x] ~~<original-subtask-text>~~") rather than just prefixing "~~"; ensure the replacement captures the rest of the line and inserts the closing "~~" after that text so declined items render as strikethrough on GitHub.
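
A minimal sketch of the kind of substitution described above, assuming the target format "- [x] ~~text~~" from the comment. The function name `to_gh_checkbox` and the exact sed expressions are illustrative assumptions; the script's real sed fragment is not shown in this conversation.

```shell
#!/usr/bin/env bash
# Hypothetical converter from a TODO.md subtask line to a GitHub checkbox.
to_gh_checkbox() {
    printf '%s\n' "$1" | sed -E \
        -e 's/^[[:space:]]*- \[-\] (.*)$/- [x] ~~\1~~/' \
        -e 's/^[[:space:]]*- \[ \] /- [ ] /' \
        -e 's/^[[:space:]]*- \[[xX]\] /- [x] /'
}

to_gh_checkbox '  - [-] t003 Declined task'
to_gh_checkbox '  - [ ] t001.1 Open subtask'
```

Capturing the rest of the line with `(.*)$` is what allows the closing `~~` to be placed after the text rather than only prefixing it.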
🧹 Nitpick comments (2)
.agents/scripts/issue-sync-helper.sh (2)

1066-1081: sed_inplace replacement with unescaped $new_line is fragile with special characters.

On Line 1078, sed_inplace "${line_num}s|.*|${new_line}|" injects the raw TODO.md line content into a sed substitution. If the line contains | (the delimiter), & (sed's match-backreference), or \, the substitution will produce corrupt output or fail silently. This is a pre-existing pattern in the file, but this new code path adds another instance.

Consider escaping sed-special characters in $new_line before substitution, or using a different approach (e.g., awk with NR==line_num for line replacement).

🛡️ Proposed fix — escape sed-special characters
+            # Escape sed-special chars in replacement string
+            local escaped_line
+            escaped_line=$(printf '%s\n' "$new_line" | sed -e 's/[&\|/]/\\&/g')
-            sed_inplace "${line_num}s|.*|${new_line}|" "$todo_file"
+            sed_inplace "${line_num}s|.*|${escaped_line}|" "$todo_file"

Or, for a more robust alternative:

# Use awk for exact line replacement (no escaping needed)
awk -v ln="$line_num" -v replacement="$new_line" 'NR==ln{print replacement; next}1' "$todo_file" > "${todo_file}.tmp" && mv "${todo_file}.tmp" "$todo_file"
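
One caveat with the awk variant: values passed through `-v` have backslash escape sequences interpreted, so a line containing `\n` or `\t` would still be altered. A sketch that sidesteps this by passing the text through the environment is shown below; `replace_line` and `NEW_LINE` are hypothetical names, not part of the script.

```shell
#!/usr/bin/env bash
# Hypothetical helper: replace line $2 of file $1 with the literal text $3.
replace_line() {
    local file="$1" line_num="$2" new_line="$3" tmp rc
    tmp=$(mktemp) || return 1
    # ENVIRON hands the text to awk verbatim, so |, & and \ need no escaping.
    NEW_LINE="$new_line" awk -v ln="$line_num" \
        'NR == ln { print ENVIRON["NEW_LINE"]; next } { print }' \
        "$file" > "$tmp" && cat "$tmp" > "$file"   # write back in place to keep file permissions
    rc=$?
    rm -f "$tmp"
    return "$rc"
}
```

Usage would look like `replace_line "$todo_file" "$line_num" "$new_line"`, with the file only overwritten when awk succeeds.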

922-938: Consider syncing assignee during enrich as well.

cmd_push sets the GitHub issue assignee via --assignee (Line 849), and cmd_pull syncs assignees back to TODO.md (Line 1027+). However, cmd_enrich only updates the issue body — if the assignee: field changes in TODO.md after initial creation, running enrich won't update the actual GitHub assignee. The body will display the updated @mention, but the issue's assigned user remains stale.

If bi-directional assignee consistency is a goal, consider adding a gh issue edit --add-assignee call in the enrich flow when the parsed assignee differs from the current GitHub assignee.
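
As a rough illustration of that suggestion, the sketch below compares the parsed assignee with the issue's current assignees and only calls `gh issue edit --add-assignee` when they differ. The function names and argument order are assumptions, the script's real enrich flow is not shown here, and the `gh` wrapper requires an authenticated GitHub CLI.

```shell
#!/usr/bin/env bash
# Hypothetical check: succeed when $1 is non-empty and not already among
# the newline-separated logins in $2 (compared case-insensitively).
assignee_needs_sync() {
    local wanted="${1#@}" current="$2"
    [[ -n "$wanted" ]] || return 1
    ! grep -Fxqi -- "$wanted" <<< "$current"
}

# Hypothetical wrapper around the GitHub CLI; not invoked in this sketch.
sync_issue_assignee() {
    local repo="$1" issue_number="$2" wanted="${3#@}" current
    current=$(gh issue view "$issue_number" --repo "$repo" \
        --json assignees --jq '.assignees[].login') || return 1
    if assignee_needs_sync "$wanted" "$current"; then
        gh issue edit "$issue_number" --repo "$repo" --add-assignee "$wanted"
    fi
}
```

Keeping the comparison in its own function means the decision logic can be exercised without network access, and the extra API call is skipped when the assignee is already in place.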

marcusquinn added a commit that referenced this pull request Feb 9, 2026
The sed substitution for declined subtasks (- [-]) was adding opening ~~
but never closing it, producing broken strikethrough on GitHub. Now
captures the rest of the line and wraps it in matching ~~...~~.

Addresses CodeRabbit review feedback on PR #804.
…e body (t020.1)

- Add 7 new parsed fields: assignee, started, completed, actual, blocked_by, blocks, verified
- Fix description extraction for indented subtask lines (use POSIX [[:space:]] instead of \s for macOS sed compatibility)
- Fix sed \s portability in extract_notes, extract_plan_section, and subtask rendering
- Enhance compose_issue_body with status, assignee @mention, dates, and dependency metadata
- Add assignee auto-assignment when pushing issues via cmd_push
- All fields render in structured GitHub issue body with clear metadata sections
@github-actions

github-actions bot commented Feb 9, 2026

🔍 Code Quality Report

[MONITOR] Code Review Monitoring Report

[INFO] Latest Quality Status:
SonarCloud: 0 bugs, 0 vulnerabilities, 61 code smells

[INFO] Recent monitoring activity:
Mon Feb 9 20:46:41 UTC 2026: Code review monitoring started
Mon Feb 9 20:46:42 UTC 2026: SonarCloud - Bugs: 0, Vulnerabilities: 0, Code Smells: 61

📈 Current Quality Metrics

  • BUGS: 0
  • CODE SMELLS: 61
  • VULNERABILITIES: 0

Generated on: Mon Feb 9 20:46:44 UTC 2026


Generated by AI DevOps Framework Code Review Monitoring

@marcusquinn marcusquinn merged commit 656e3e7 into main Feb 9, 2026
8 of 9 checks passed
@sonarqubecloud

sonarqubecloud bot commented Feb 9, 2026

marcusquinn added a commit that referenced this pull request Feb 11, 2026
Mark as done with PR references:
- t020 Issue Sync Enrichment (6/6 subtasks, pr:#804)
- t109 Parallel Agents & Headless Dispatch (5/5 subtasks, pr:#348)
- t135 Code Quality Hardening (59/59 subtasks, pr:#422)
- t199 Content Creation Agent Architecture (11/11 subtasks, pr:#840)
- t236 Higgsfield Automator Production Hardening (6/6 subtasks, pr:#965)
- t245 Unified Quality Debt Sweep (4/4 subtasks, pr:#984)