feat(core): add edge case hunter review task #1790
bmadcode merged 1 commit into bmad-code-org:main
Conversation
Method-driven review that exhaustively walks branching paths and boundary conditions, reporting only unhandled gaps. Orthogonal to the attitude-driven adversarial review for complementary coverage.
📝 Walkthrough

Introduces a new XML task definition for "Edge Case Hunter Review" that performs exhaustive path-tracing analysis on code changes to identify and report unhandled edge cases as structured JSON output.
Estimated code review effort: 🎯 2 (Simple) | ⏱️ ~10 minutes

Suggested reviewers
🚥 Pre-merge checks | ✅ Passed checks (3 passed)
Actionable comments posted: 11
🧹 Nitpick comments (1)
src/core/tasks/review-edge-case-hunter.xml (1)
1-64: No example input/output provided for LLM alignment.

Complex structured output tasks benefit significantly from few-shot examples. Without a concrete example showing:
- Sample input (a real diff or function)
- Expected output (correctly formatted JSON array)
Different LLMs will interpret the spec differently, leading to inconsistent output formats (e.g., some might add markdown code fences despite the instruction, some might use single quotes instead of double quotes in JSON).
Add an `<example>` section demonstrating expected behavior:

```xml
<example>
  <example-input><![CDATA[
function divide(a, b) { return a / b; }
]]></example-input>
  <example-output><![CDATA[
[{"location":"unknown:2","trigger_condition":"b is zero causing division by zero","guard_snippet":"if (b === 0) throw new Error('Division by zero')","potential_consequence":"Runtime error or Infinity result"}]
]]></example-output>
</example>
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/core/tasks/review-edge-case-hunter.xml` around lines 1 - 64, The task XML lacks an <example> demonstrating expected input/output which causes inconsistent LLM behavior; add an <example> element inside the <task id="_bmad/core/tasks/review-edge-case-hunter.xml" name="Edge Case Hunter Review"> block (e.g., before </task>) containing an <example-input> CDATA with a minimal sample diff/function and an <example-output> CDATA showing the exact JSON array format required (use double quotes, no markdown, single-line JSON array), ensuring the example adheres to the declared output-format rules; keep the example concise and clearly aligned with the existing output-format so the LLM can follow it.
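A quick way to see why the concrete example matters: a consumer-side check that rejects anything other than the exact four-field JSON array the spec demands. The field names come from the task's output-format; the helper itself is purely illustrative.

```javascript
// Illustrative validator for the four-field finding objects.
// Field names are from the task spec; the helper is hypothetical.
const REQUIRED_FIELDS = ["location", "trigger_condition", "guard_snippet", "potential_consequence"];

function validateFindings(raw) {
  let findings;
  try {
    findings = JSON.parse(raw); // bare JSON only - markdown fences fail here
  } catch (err) {
    return { ok: false, error: "not valid JSON: " + err.message };
  }
  if (!Array.isArray(findings)) {
    return { ok: false, error: "top level is not an array" };
  }
  const expected = [...REQUIRED_FIELDS].sort().join(",");
  for (const finding of findings) {
    if (Object.keys(finding).sort().join(",") !== expected) {
      return { ok: false, error: "wrong fields: " + Object.keys(finding).join(",") };
    }
  }
  return { ok: true, findings };
}
```

Outputs wrapped in code fences or using the wrong field set fail this check, which is exactly the inconsistency the missing few-shot example invites.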
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/core/tasks/review-edge-case-hunter.xml`:
- Line 32: Change the absolute claim on line 32 ("exhaustive path enumeration —
mechanically walk every branch") to a tempered statement and add a short
disclaimer that the tool provides best-effort heuristic exploration rather than
formal or runtime execution; e.g., replace the phrase with wording that
acknowledges LLMs estimate branches via pattern-based token prediction and may
miss dynamic or runtime-only paths, and include a one-sentence disclaimer that
results are not formal verification but heuristics to help find likely edge
cases.
- Around line 40-44: Consolidate the duplicate/contradictory empty-content
handling by choosing a single behavior and applying it to both places: update
the Step n="1" action "If content to review is empty, ask for clarification and
abort task" and the separate rule "HALT if content is empty or unreadable" so
they match (either both prompt for clarification then abort, or both immediately
HALT); ensure the canonical behavior is documented once (e.g., keep it in Step
n="1") and remove or replace the other occurrence so there is no conflicting
instruction across the flow.
- Around line 17-24: The output-format rule currently lacks limits and
truncation rules which can cause massive or truncated JSON; update the
<output-format> XML specification to (1) add a mandatory "severity" field to
each finding and a global "max_findings" integer to cap results, (2) define a
deterministic sorting/prioritization (e.g., sort by severity then location) so
truncation is predictable, (3) specify a truncation strategy that outputs a
valid JSON array when the number of findings exceeds max_findings (e.g., include
top N by priority and a final summary object if needed), and (4) document these
new constraints wherever the <output-format> element is defined so consumers
(and code that emits objects) adhere to the new fields and limit.
- Around line 17-24: Update the output-format spec (the <output-format> element)
to define fallbacks and escaping: require "location" to be "file:line" when
available and otherwise use "unknown" or "snippet" (e.g., "unknown" or
"snippet:<context>"), state that "guard_snippet" must be returned in a neutral
pseudo-code/plaintext style when language is unspecified, explicitly allow an
empty array only when a separate "halt_on_zero_findings" flag is set (otherwise
require at least one object), and add a rule that all JSON string values must be
JSON-escaped (quotes/backslashes/newlines) so code snippets remain valid JSON;
make these rules part of the output-format text so consumers and generators can
unambiguously implement them.
- Around line 4-5: The task's XML id uses a leading underscore
("id=\"_bmad/core/tasks/review-edge-case-hunter.xml\"") which implies
internal/private but the task is being registered publicly as
"bmad-review-edge-case-hunter" in module-help.csv; either remove the leading
underscore from the task id to match the public name (and update any code that
references the underscored id, e.g., task registration/lookup logic) or add a
clear comment/docs entry explaining why a private-style id is intentionally
exposed and ensure module-help.csv and the task id are consistent; make the
change in the task XML and any registration/listing locations so names align.
- Around line 6-9: Update the review-edge-case-hunter rule to avoid guaranteed
false positives by changing the hard requirements "scan only the diff hunks and
list boundaries that lack an explicit guard in the diff" and "Ignore the rest of
the codebase unless the provided content explicitly references external
functions": allow optional surrounding context input to be considered (e.g.,
accept a "context" blob) or soften the language to produce "potential gaps"
warnings that explicitly note when guards may exist outside the diff scope or be
enforced by the type system (refer to phrases "scan only the diff hunks" and
"Ignore the rest of the codebase" in the objective), and update the rule text to
mention checking for external/shared validation layers, pre-existing null
checks, or type-level guarantees before reporting missing guards.
- Around line 26-30: Update the <llm> instructional text that currently reads
"MANDATORY: Execute ALL steps in the flow section IN EXACT ORDER" to remove the
absolute "ALL" / "IN EXACT ORDER" wording and instead state that steps should be
executed sequentially but may be halted or skipped when specified
halt-conditions occur; specifically modify the string "MANDATORY: Execute ALL
steps in the flow section IN EXACT ORDER" to something like "Execute steps
sequentially; respect halt-conditions which may stop or skip remaining steps"
and ensure the surrounding directives mentioning "HALT immediately when
halt-conditions are met" and "Each action xml tag within step xml tag is a
REQUIRED action to complete that step" remain intact so the parser still
enforces step-level requirements while allowing halt-conditions to preempt
subsequent steps.
- Around line 59-62: The "HALT if zero findings - this is suspicious, re-analyze
or ask for guidance" condition in the halt-conditions block is contradictory;
change it to a single deterministic behavior: either replace that condition in
the <halt-conditions> with a clear halt-only rule (e.g., "HALT if zero findings;
report 'suspicious: zero findings' and stop") or replace it with a bounded
re-analysis rule (e.g., "RE-ANALYZE up to N times if zero findings; after N
attempts HALT and report 'suspicious: zero findings'"), and update the condition
text for the element containing that phrase so the system uses the chosen
deterministic policy consistently.
- Around line 11-15: The current task inputs (the <inputs> block with <input
name="content"> and <input name="also_consider">) lack explicit language and
content type, causing unreliable auto-detection and incorrect guard_snippet
generation; update the inputs to add a required <input name="language"> (e.g.,
"python", "c", "js") and a required <input name="content_type"> (e.g.,
"full_file", "function_snippet", "unified_diff", "git_patch") and also document
the expected format for <input name="also_consider"> (specify whether it accepts
plain prose, newline-separated bullets, or JSON) so code paths that generate
guard_snippet, boundary checks, and idiomatic null/overflow handling in
functions like the review task parser will use the explicit values instead of
auto-detection.
- Around line 46-52: Update the "Exhaustive Path Analysis" step (step n="2") to
deduplicate and group repeated edge-case findings instead of emitting one
finding per occurrence: when collecting unhandled paths (the action "Collect
only the unhandled paths as findings - discard handled ones silently"),
aggregate identical patterns (e.g., "unchecked array access in loop") into a
single finding that includes a count, representative examples, and optional
locations; additionally add validation and error handling for the also_consider
input (referenced in the action "If also_consider input was provided") so that
malformed or unparseable also_consider is detected, logged/reported as an input
parsing error and either ignored or requested for clarification, rather than
silently hallucinated.
- Around line 32-34: The boundary-condition list in the review-edge-case-hunter
XML is incomplete and overstates static-analysis capabilities; update the
enumerated boundary items (the <i> elements that contain the boundary list) to
include Encoding/BOM and invalid sequences, Locale/Timezone edge cases,
Floating-point issues (NaN, Infinity, -0), File-system path edge cases (relative
vs absolute, symlinks, permissions), Resource exhaustion (memory/file handles),
and Unicode normalization (NFC vs NFD), and mark or remove "concurrency" and
"timing" from that list (or add a clear note that these require dynamic/runtime
analysis) so the document accurately reflects what can be detected by exhaustive
static path enumeration.
---
Nitpick comments:
In `@src/core/tasks/review-edge-case-hunter.xml`:
- Around line 1-64: The task XML lacks an <example> demonstrating expected
input/output which causes inconsistent LLM behavior; add an <example> element
inside the <task id="_bmad/core/tasks/review-edge-case-hunter.xml" name="Edge
Case Hunter Review"> block (e.g., before </task>) containing an <example-input>
CDATA with a minimal sample diff/function and an <example-output> CDATA showing
the exact JSON array format required (use double quotes, no markdown,
single-line JSON array), ensuring the example adheres to the declared
output-format rules; keep the example concise and clearly aligned with the
existing output-format so the LLM can follow it.
ℹ️ Review info
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
src/core/module-help.csv is excluded by !**/*.csv
📒 Files selected for processing (1)
src/core/tasks/review-edge-case-hunter.xml
| <task id="_bmad/core/tasks/review-edge-case-hunter.xml" name="Edge Case Hunter Review" | ||
| description="Walk every branching path and boundary condition in content, report only unhandled edge cases. Orthogonal to adversarial review - method-driven not attitude-driven."> |
Task ID uses underscore prefix suggesting internal/private naming, but is being publicly registered.
The id="_bmad/core/tasks/..." uses a leading underscore which conventionally indicates internal/private entities. However, per the PR description, this is being added to module-help.csv as bmad-review-edge-case-hunter for public use. The naming inconsistency could confuse maintainers about whether this is meant to be internal or external.
Either remove the underscore from the id or document why internal-named tasks are exposed publicly.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/tasks/review-edge-case-hunter.xml` around lines 4 - 5, The task's
XML id uses a leading underscore
("id=\"_bmad/core/tasks/review-edge-case-hunter.xml\"") which implies
internal/private but the task is being registered publicly as
"bmad-review-edge-case-hunter" in module-help.csv; either remove the leading
underscore from the task id to match the public name (and update any code that
references the underscored id, e.g., task registration/lookup logic) or add a
clear comment/docs entry explaining why a private-style id is intentionally
exposed and ensure module-help.csv and the task id are consistent; make the
change in the task XML and any registration/listing locations so names align.
| <objective>You are a pure path tracer. Never comment on whether code is good or bad; only list missing handling. | ||
| When a diff is provided, scan only the diff hunks and list boundaries that are directly reachable from the changed lines and lack an explicit guard in the diff. | ||
| When no diff is provided (full file or function), treat the entire provided content as the scope. | ||
| Ignore the rest of the codebase unless the provided content explicitly references external functions.</objective> |
Scope rule creates false positives: guards outside diff are ignored.
Line 7 says scan only diff hunks and report boundaries "that lack an explicit guard in the diff." But defensive code often has guards in existing (unchanged) code that protect newly added paths. This rule guarantees false positives when:
- New code relies on a pre-existing null check earlier in the function
- A shared validation layer guards the input before this code runs
- Type system already enforces constraints (e.g., TypeScript NonNullable)
Line 9 says "Ignore the rest of the codebase unless the provided content explicitly references external functions" — this is too restrictive for accurate analysis.
Consider either:
- Accepting surrounding context as an optional input to check for existing guards
- Softening the instruction to "report potential gaps, noting when guards may exist outside the diff scope"
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/tasks/review-edge-case-hunter.xml` around lines 6 - 9, Update the
review-edge-case-hunter rule to avoid guaranteed false positives by changing the
hard requirements "scan only the diff hunks and list boundaries that lack an
explicit guard in the diff" and "Ignore the rest of the codebase unless the
provided content explicitly references external functions": allow optional
surrounding context input to be considered (e.g., accept a "context" blob) or
soften the language to produce "potential gaps" warnings that explicitly note
when guards may exist outside the diff scope or be enforced by the type system
(refer to phrases "scan only the diff hunks" and "Ignore the rest of the
codebase" in the objective), and update the rule text to mention checking for
external/shared validation layers, pre-existing null checks, or type-level
guarantees before reporting missing guards.
| <inputs> | ||
| <input name="content" desc="Content to review - diff, full file, or function" /> | ||
| <input name="also_consider" required="false" | ||
| desc="Optional areas to keep in mind during review alongside normal edge-case analysis" /> | ||
| </inputs> |
Missing critical inputs: language and explicit content type.
The task auto-detects content type (line 43) but this is unreliable—unified diffs vs context diffs vs git diffs all look different, and a function snippet might be mistaken for a full file. Without explicit language input:
- The LLM cannot reliably generate guard_snippet in the correct syntax
- Boundary conditions differ by language (e.g., integer overflow in C vs Python)
- Idiomatic patterns for null-checking vary wildly
The also_consider input has no format specification—is it prose, bullet points, code?
Proposed fix: add language and content_type inputs
<inputs>
<input name="content" desc="Content to review - diff, full file, or function" />
+ <input name="content_type" required="false"
+ desc="Explicit content type: 'diff', 'file', or 'function'. If omitted, auto-detected." />
+ <input name="language" required="false"
+ desc="Programming language of the content (e.g., 'typescript', 'python'). Required for accurate guard_snippet generation." />
<input name="also_consider" required="false"
- desc="Optional areas to keep in mind during review alongside normal edge-case analysis" />
+ desc="Optional areas to keep in mind during review, as prose or bullet points" />
</inputs>

📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| <inputs> | |
| <input name="content" desc="Content to review - diff, full file, or function" /> | |
| <input name="also_consider" required="false" | |
| desc="Optional areas to keep in mind during review alongside normal edge-case analysis" /> | |
| </inputs> | |
| <inputs> | |
| <input name="content" desc="Content to review - diff, full file, or function" /> | |
| <input name="content_type" required="false" | |
| desc="Explicit content type: 'diff', 'file', or 'function'. If omitted, auto-detected." /> | |
| <input name="language" required="false" | |
| desc="Programming language of the content (e.g., 'typescript', 'python'). Required for accurate guard_snippet generation." /> | |
| <input name="also_consider" required="false" | |
| desc="Optional areas to keep in mind during review, as prose or bullet points" /> | |
| </inputs> |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/tasks/review-edge-case-hunter.xml` around lines 11 - 15, The current
task inputs (the <inputs> block with <input name="content"> and <input
name="also_consider">) lack explicit language and content type, causing
unreliable auto-detection and incorrect guard_snippet generation; update the
inputs to add a required <input name="language"> (e.g., "python", "c", "js") and
a required <input name="content_type"> (e.g., "full_file", "function_snippet",
"unified_diff", "git_patch") and also document the expected format for <input
name="also_consider"> (specify whether it accepts plain prose, newline-separated
bullets, or JSON) so code paths that generate guard_snippet, boundary checks,
and idiomatic null/overflow handling in functions like the review task parser
will use the explicit values instead of auto-detection.
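To illustrate the auto-detection problem this comment describes, here is a naive detector (purely illustrative, not part of the task) that can recognize unified-diff markers but has no way to distinguish a full file from a function snippet:

```javascript
// Naive content-type detector: diff headers are recognizable, but a bare
// function and a full file are indistinguishable - the failure mode that
// motivates an explicit content_type input.
function detectContentType(content) {
  if (/^diff --git /m.test(content) || /^@@ .* @@/m.test(content)) {
    return "unified_diff";
  }
  if (/^(---|\+\+\+) /m.test(content)) {
    return "unified_diff"; // header-only patches
  }
  return "unknown"; // could be full_file or function_snippet - no way to tell
}
```

An explicit `content_type` input removes this ambiguity entirely; auto-detection can then remain as a documented fallback.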
| <output-format>Return ONLY a valid JSON array of objects. Each object must contain exactly these four fields and nothing else: | ||
| { | ||
| "location": "file:line", | ||
| "trigger_condition": "one-line description (max 15 words)", | ||
| "guard_snippet": "minimal code sketch that closes the gap", | ||
| "potential_consequence": "what could actually go wrong (max 15 words)" | ||
| } | ||
| No extra text, no explanations, no markdown wrapping.</output-format> |
No output size constraint or truncation strategy.
If a large file has 50+ unhandled edge cases, the JSON output could become massive. There's no guidance on:
- Maximum number of findings to report
- Priority/severity field to filter by importance
- Truncation behavior if output exceeds token limits
This could cause output truncation mid-JSON, resulting in invalid output that violates the "valid JSON array" requirement.
Proposed fix: add severity and limit guidance
<output-format>Return ONLY a valid JSON array of objects. Each object must contain exactly these four fields and nothing else:
{
"location": "file:line",
"trigger_condition": "one-line description (max 15 words)",
"guard_snippet": "minimal code sketch that closes the gap",
- "potential_consequence": "what could actually go wrong (max 15 words)"
+ "potential_consequence": "what could actually go wrong (max 15 words)",
+ "severity": "high|medium|low"
}
-No extra text, no explanations, no markdown wrapping.</output-format>
+No extra text, no explanations, no markdown wrapping. Limit to 25 most severe findings if count exceeds 25.</output-format>

Also applies to: 54-56
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/tasks/review-edge-case-hunter.xml` around lines 17 - 24, The
output-format rule currently lacks limits and truncation rules which can cause
massive or truncated JSON; update the <output-format> XML specification to (1)
add a mandatory "severity" field to each finding and a global "max_findings"
integer to cap results, (2) define a deterministic sorting/prioritization (e.g.,
sort by severity then location) so truncation is predictable, (3) specify a
truncation strategy that outputs a valid JSON array when the number of findings
exceeds max_findings (e.g., include top N by priority and a final summary object
if needed), and (4) document these new constraints wherever the <output-format>
element is defined so consumers (and code that emits objects) adhere to the new
fields and limit.
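A sketch of the proposed deterministic cap; the severity ranking, the `max_findings` handling, and the trailing summary-object shape are assumptions about how a consumer might implement the rule, not part of the task spec yet:

```javascript
// Sort by severity then location so truncation is predictable, keep the top
// max_findings, and append a summary object noting what was dropped.
// The serialized result is still one valid JSON array.
const SEVERITY_RANK = { high: 0, medium: 1, low: 2 };

function capFindings(findings, maxFindings) {
  const sorted = [...findings].sort((a, b) =>
    (SEVERITY_RANK[a.severity] - SEVERITY_RANK[b.severity]) ||
    a.location.localeCompare(b.location));
  if (sorted.length <= maxFindings) return sorted;
  const kept = sorted.slice(0, maxFindings);
  kept.push({ summary: `truncated: ${sorted.length - maxFindings} lower-priority findings omitted` });
  return kept;
}
```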
Output format has multiple undefined edge cases.
- "location": "file:line" — What if file info is unavailable (e.g., piped function snippet, stdin)? What if line numbers don't exist in the input format? No fallback specified.
- "guard_snippet" — What programming language should this be written in? The task doesn't receive a language input.
- Is an empty array [] valid output? Line 60 says zero findings is "suspicious" and to halt, but the format implies JSON array output. Contradicts itself.
- No escaping rules for JSON strings containing code with quotes/backslashes.
Proposed fix: add explicit handling for unknowns
<output-format>Return ONLY a valid JSON array of objects. Each object must contain exactly these four fields and nothing else:
{
- "location": "file:line",
+ "location": "file:line (use 'unknown:N' if file unavailable, where N is the relative position in content)",
"trigger_condition": "one-line description (max 15 words)",
- "guard_snippet": "minimal code sketch that closes the gap",
+ "guard_snippet": "minimal code sketch in the same language as the input content",
"potential_consequence": "what could actually go wrong (max 15 words)"
}
-No extra text, no explanations, no markdown wrapping.</output-format>
+No extra text, no explanations, no markdown wrapping. If zero findings after thorough analysis, return empty array [].</output-format>

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/tasks/review-edge-case-hunter.xml` around lines 17 - 24, Update the
output-format spec (the <output-format> element) to define fallbacks and
escaping: require "location" to be "file:line" when available and otherwise use
"unknown" or "snippet" (e.g., "unknown" or "snippet:<context>"), state that
"guard_snippet" must be returned in a neutral pseudo-code/plaintext style when
language is unspecified, explicitly allow an empty array only when a separate
"halt_on_zero_findings" flag is set (otherwise require at least one object), and
add a rule that all JSON string values must be JSON-escaped
(quotes/backslashes/newlines) so code snippets remain valid JSON; make these
rules part of the output-format text so consumers and generators can
unambiguously implement them.
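On the escaping rule: serializing the whole findings array with `JSON.stringify` (rather than concatenating strings by hand) already produces the required escaping, as this small demonstration shows. The snippet text is arbitrary example data.

```javascript
// The guard_snippet below is arbitrary text containing quotes and newlines -
// exactly the characters that break hand-built JSON.
const finding = {
  location: "unknown",
  trigger_condition: "snippet contains quotes and newlines",
  guard_snippet: 'if (s === "") {\n  throw new Error("empty");\n}',
  potential_consequence: "invalid JSON if emitted unescaped",
};
const serialized = JSON.stringify([finding]); // escapes ", \ and newlines
const roundTripped = JSON.parse(serialized)[0];
```

Stating this in the output-format text ("serialize the array as a whole; never splice raw code into a JSON template") is enough to make the rule implementable.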
| <llm critical="true"> | ||
| <i>MANDATORY: Execute ALL steps in the flow section IN EXACT ORDER</i> | ||
| <i>DO NOT skip steps or change the sequence</i> | ||
| <i>HALT immediately when halt-conditions are met</i> | ||
| <i>Each action xml tag within step xml tag is a REQUIRED action to complete that step</i> |
Mandatory step execution contradicts halt conditions.
Line 27 says "Execute ALL steps in the flow section IN EXACT ORDER" but:
- Halt conditions can trigger mid-flow (e.g., empty content aborts at step 1)
- Zero findings at step 2 triggers a halt before step 3
The word "ALL" is misleading—steps can be skipped due to halts. This creates cognitive dissonance for the executing LLM.
Proposed fix
<llm critical="true">
- <i>MANDATORY: Execute ALL steps in the flow section IN EXACT ORDER</i>
+ <i>MANDATORY: Execute steps in the flow section IN EXACT ORDER unless a halt-condition is triggered</i>
<i>DO NOT skip steps or change the sequence</i>

📝 Committable suggestion
| <llm critical="true"> | |
| <i>MANDATORY: Execute ALL steps in the flow section IN EXACT ORDER</i> | |
| <i>DO NOT skip steps or change the sequence</i> | |
| <i>HALT immediately when halt-conditions are met</i> | |
| <i>Each action xml tag within step xml tag is a REQUIRED action to complete that step</i> | |
| <llm critical="true"> | |
| <i>MANDATORY: Execute steps in the flow section IN EXACT ORDER unless a halt-condition is triggered</i> | |
| <i>DO NOT skip steps or change the sequence</i> | |
| <i>HALT immediately when halt-conditions are met</i> | |
| <i>Each action xml tag within step xml tag is a REQUIRED action to complete that step</i> |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/tasks/review-edge-case-hunter.xml` around lines 26 - 30, Update the
<llm> instructional text that currently reads "MANDATORY: Execute ALL steps in
the flow section IN EXACT ORDER" to remove the absolute "ALL" / "IN EXACT ORDER"
wording and instead state that steps should be executed sequentially but may be
halted or skipped when specified halt-conditions occur; specifically modify the
string "MANDATORY: Execute ALL steps in the flow section IN EXACT ORDER" to
something like "Execute steps sequentially; respect halt-conditions which may
stop or skip remaining steps" and ensure the surrounding directives mentioning
"HALT immediately when halt-conditions are met" and "Each action xml tag within
step xml tag is a REQUIRED action to complete that step" remain intact so the
parser still enforces step-level requirements while allowing halt-conditions to
preempt subsequent steps.
| <i>HALT immediately when halt-conditions are met</i> | ||
| <i>Each action xml tag within step xml tag is a REQUIRED action to complete that step</i> | ||
| <i>Your method is exhaustive path enumeration — mechanically walk every branch, not hunt by intuition</i> |
"Mechanically walk every branch" is aspirational, not achievable.
Line 32 states: "Your method is exhaustive path enumeration — mechanically walk every branch"
LLMs don't execute code—they predict tokens based on patterns. They cannot truly "mechanically walk" paths; they estimate and may miss subtle branches (especially in dynamic languages with metaprogramming, eval, or runtime reflection). This instruction sets unrealistic expectations and could lead to overconfidence in the results.
Consider softening to acknowledge limitations or adding a disclaimer that results are best-effort heuristics, not formal verification.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/tasks/review-edge-case-hunter.xml` at line 32, Change the absolute
claim on line 32 ("exhaustive path enumeration — mechanically walk every
branch") to a tempered statement and add a short disclaimer that the tool
provides best-effort heuristic exploration rather than formal or runtime
execution; e.g., replace the phrase with wording that acknowledges LLMs estimate
branches via pattern-based token prediction and may miss dynamic or runtime-only
paths, and include a one-sentence disclaimer that results are not formal
verification but heuristics to help find likely edge cases.
| <i>Your method is exhaustive path enumeration — mechanically walk every branch, not hunt by intuition</i> | ||
| <i>Trace each branching path: conditionals, switches, early returns, guard clauses, loops, error handlers</i> | ||
| <i>Trace each boundary condition: null, undefined, empty, zero, negative, overflow, max-length, type coercion, concurrency, timing</i> |
Boundary condition list is incomplete and some items are unrealistic for static analysis.
Line 34 lists: null, undefined, empty, zero, negative, overflow, max-length, type coercion, concurrency, timing
Missing common boundaries:
- Encoding issues (UTF-8 BOM, invalid sequences)
- Locale/timezone edge cases
- Floating-point precision (NaN, Infinity, -0)
- File system paths (relative vs absolute, symlinks, permissions)
- Resource exhaustion (memory, file handles)
- Unicode normalization (NFC vs NFD)
Unrealistic expectations:
- Concurrency and timing issues are notoriously difficult to detect via static path enumeration—they require understanding runtime scheduling, which an LLM reading code cannot reliably determine. This sets false expectations.
Proposed improvement
- <i>Trace each boundary condition: null, undefined, empty, zero, negative, overflow, max-length, type coercion, concurrency, timing</i>
+ <i>Trace each boundary condition: null, undefined, empty, zero, negative, overflow, max-length, type coercion, floating-point edge values (NaN, Infinity), encoding, locale/timezone</i>
+ <i>Note: concurrency and timing issues are difficult to detect statically—flag obvious races but do not guarantee detection</i>

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/tasks/review-edge-case-hunter.xml` around lines 32 - 34, The
boundary-condition list in the review-edge-case-hunter XML is incomplete and
overstates static-analysis capabilities; update the enumerated boundary items
(the <i> elements that contain the boundary list) to include Encoding/BOM and
invalid sequences, Locale/Timezone edge cases, Floating-point issues (NaN,
Infinity, -0), File-system path edge cases (relative vs absolute, symlinks,
permissions), Resource exhaustion (memory/file handles), and Unicode
normalization (NFC vs NFD), and mark or remove "concurrency" and "timing" from
that list (or add a clear note that these require dynamic/runtime analysis) so
the document accurately reflects what can be detected by exhaustive static path
enumeration.
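The floating-point items this comment proposes are easy to motivate: NaN and -0 both slip past ordinary equality guards, so a boundary list that omits them misses real gaps.

```javascript
// NaN never equals itself, and -0 is conflated with 0 by strict equality;
// only Object.is (or Number.isNaN for NaN) catches these edge values.
const nanLeaks = NaN === NaN;            // a plain equality guard misses NaN
const negZeroHidden = (-0 === 0);        // strict equality conflates -0 and 0
const negZeroCaught = Object.is(-0, 0);  // Object.is tells them apart
```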
| <step n="1" title="Receive Content"> | ||
| <action>Load the content to review from provided input or context</action> | ||
| <action>If content to review is empty, ask for clarification and abort task</action> | ||
| <action>Identify content type (diff, full file, or function) to determine scope rules</action> | ||
| </step> |
Duplicate and contradictory handling for empty content.
Line 42 says: If content to review is empty, ask for clarification and abort task
Line 61 says: HALT if content is empty or unreadable
One says "ask for clarification," the other just says "HALT." Which takes precedence? The flow step implies interaction; the halt condition implies termination. This inconsistency will confuse the LLM executor.
Proposed fix: consolidate empty-content handling in one place
<step n="1" title="Receive Content">
<action>Load the content to review from provided input or context</action>
- <action>If content to review is empty, ask for clarification and abort task</action>
+ <action>If content to review is empty or unreadable, HALT immediately per halt-conditions</action>
<action>Identify content type (diff, full file, or function) to determine scope rules</action>
</step>

Also applies to: 59-62
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/tasks/review-edge-case-hunter.xml` around lines 40 - 44, Consolidate
the duplicate/contradictory empty-content handling by choosing a single behavior
and applying it to both places: update the Step n="1" action "If content to
review is empty, ask for clarification and abort task" and the separate rule
"HALT if content is empty or unreadable" so they match (either both prompt for
clarification then abort, or both immediately HALT); ensure the canonical
behavior is documented once (e.g., keep it in Step n="1") and remove or replace
the other occurrence so there is no conflicting instruction across the flow.
<step n="2" title="Exhaustive Path Analysis" critical="true">
  <mandate>Walk every branching path and boundary condition within scope - report only unhandled ones</mandate>
  <action>If also_consider input was provided, incorporate those areas into the analysis</action>
  <action>Enumerate all branching paths and boundary conditions within scope: conditionals, switches, early returns, guard clauses, loops, error handlers, null/empty states, overflow, type edges, concurrency, timing</action>
  <action>For each path: determine whether the content handles it</action>
  <action>Collect only the unhandled paths as findings - discard handled ones silently</action>
</step>
No deduplication or grouping guidance for repeated patterns.
If the same edge case pattern appears 10 times (e.g., unchecked array access in a loop), should the LLM report it 10 times or once with a note? The current spec implies 10 separate findings, which adds noise without value.
Similarly, if also_consider is malformed or unparseable, there's no error handling—the LLM might silently ignore it or hallucinate interpretations.
Proposed fix
<step n="2" title="Exhaustive Path Analysis" critical="true">
<mandate>Walk every branching path and boundary condition within scope - report only unhandled ones</mandate>
- <action>If also_consider input was provided, incorporate those areas into the analysis</action>
+ <action>If also_consider input was provided and parseable, incorporate those areas into the analysis; if malformed, note in findings and continue</action>
<action>Enumerate all branching paths and boundary conditions within scope: conditionals, switches, early returns, guard clauses, loops, error handlers, null/empty states, overflow, type edges, concurrency, timing</action>
<action>For each path: determine whether the content handles it</action>
- <action>Collect only the unhandled paths as findings - discard handled ones silently</action>
+ <action>Collect only the unhandled paths as findings - discard handled ones silently. Group identical patterns under a single finding with location listing all occurrences.</action>
</step>

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/tasks/review-edge-case-hunter.xml` around lines 46 - 52, Update the
"Exhaustive Path Analysis" step (step n="2") to deduplicate and group repeated
edge-case findings instead of emitting one finding per occurrence: when
collecting unhandled paths (the action "Collect only the unhandled paths as
findings - discard handled ones silently"), aggregate identical patterns (e.g.,
"unchecked array access in loop") into a single finding that includes a count,
representative examples, and optional locations; additionally add validation and
error handling for the also_consider input (referenced in the action "If
also_consider input was provided") so that malformed or unparseable
also_consider is detected, logged/reported as an input parsing error and either
ignored or requested for clarification, rather than silently hallucinated.
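The grouping the prompt describes can be sketched as follows, assuming findings arrive as dicts carrying the task's four output fields. The `group_findings` helper is hypothetical:

```python
def group_findings(raw_findings):
    """Collapse findings that share the same trigger condition and guard
    into one entry whose location lists every occurrence."""
    grouped = {}
    for finding in raw_findings:
        key = (finding["trigger_condition"], finding["guard_snippet"])
        if key in grouped:
            # Same pattern seen again: merge the location, keep one finding.
            grouped[key]["location"] += ", " + finding["location"]
        else:
            grouped[key] = dict(finding)  # copy so the input stays untouched
    return list(grouped.values())
```

Ten occurrences of the same unchecked array access then collapse into one finding whose `location` field enumerates all ten sites, keeping the output signal-dense.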
<halt-conditions>
  <condition>HALT if zero findings - this is suspicious, re-analyze or ask for guidance</condition>
  <condition>HALT if content is empty or unreadable</condition>
</halt-conditions>
Contradictory halt condition: "HALT" vs "re-analyze" creates ambiguous behavior.
Line 60 says HALT if zero findings but then immediately says re-analyze or ask for guidance. These are mutually exclusive instructions. Does the LLM stop (HALT) or continue (re-analyze)? This will cause unpredictable behavior—some models will infinite-loop re-analyzing, others will halt immediately.
Pick one: either halt with a message explaining zero findings is suspicious, or specify a maximum re-analysis count before halting.
Proposed fix
<halt-conditions>
- <condition>HALT if zero findings - this is suspicious, re-analyze or ask for guidance</condition>
+ <condition>If zero findings after initial analysis, re-analyze once with increased scrutiny. If still zero, HALT and output empty array with a single-line comment explaining the content appears fully guarded.</condition>
<condition>HALT if content is empty or unreadable</condition>
</halt-conditions>
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/core/tasks/review-edge-case-hunter.xml` around lines 59 - 62, The "HALT
if zero findings - this is suspicious, re-analyze or ask for guidance" condition
in the halt-conditions block is contradictory; change it to a single
deterministic behavior: either replace that condition in the <halt-conditions>
with a clear halt-only rule (e.g., "HALT if zero findings; report 'suspicious:
zero findings' and stop") or replace it with a bounded re-analysis rule (e.g.,
"RE-ANALYZE up to N times if zero findings; after N attempts HALT and report
'suspicious: zero findings'"), and update the condition text for the element
containing that phrase so the system uses the chosen deterministic policy
consistently.
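The bounded re-analysis option can be sketched as caller-side driver logic (hypothetical code, not part of the XML task itself):

```python
def run_review(analyze, max_reanalyses=1):
    """Deterministic zero-findings policy: re-run analyze at most
    max_reanalyses extra times; an empty list after that budget is a
    valid final answer meaning 'appears fully guarded', not a halt."""
    findings = analyze()
    retries = 0
    while not findings and retries < max_reanalyses:
        retries += 1
        findings = analyze()
    return findings
```

With a fixed retry budget the behavior is the same on every run: at most `1 + max_reanalyses` analysis passes, then the result is returned as-is, with no infinite re-analysis loop and no interactive clarification step.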
🤖 Augment PR Summary

Summary: Adds a new reusable “Edge Case Hunter Review” task that mechanically traces branching paths/boundaries and reports missing handling.
desc="Optional areas to keep in mind during review alongside normal edge-case analysis" />
</inputs>

<output-format>Return ONLY a valid JSON array of objects. Each object must contain exactly these four fields and nothing else:
<output-format> says to return a JSON array, but the example shown is a single JSON object (no surrounding []), which can lead to non-array output. Consider explicitly showing [ { ... } ] (and stating that [] is valid when there are no findings).
Severity: medium
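A caller-side sketch of the strict contract this finding implies, where `[]` parses as a valid empty result and a bare object is rejected. Field names are taken from the task's output format; the parser itself is hypothetical:

```python
import json

REQUIRED_FIELDS = {"location", "trigger_condition",
                   "guard_snippet", "potential_consequence"}

def parse_review_output(text):
    """Strict parse: the top level must be a JSON array, and each element
    must be an object with exactly the four required fields. [] is valid."""
    data = json.loads(text)
    if not isinstance(data, list):
        raise ValueError("expected a JSON array of findings")
    for item in data:
        if not isinstance(item, dict) or set(item) != REQUIRED_FIELDS:
            raise ValueError("each finding needs exactly the four required fields")
    return data
```

Under this parser, a model that copies the single-object example verbatim fails immediately, which is why showing `[ { ... } ]` in the spec matters.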
</flow>

<halt-conditions>
  <condition>HALT if zero findings - this is suspicious, re-analyze or ask for guidance</condition>
This HALT if zero findings condition conflicts with the JSON-only contract—[] should be a valid completion state and can be common on small diffs. Otherwise the task may halt/ask questions and break callers that strictly expect JSON output.
Severity: high
but no context except the content to review -->

<task id="_bmad/core/tasks/review-edge-case-hunter.xml" name="Edge Case Hunter Review"
  description="Walk every branching path and boundary condition in content, report only unhandled edge cases. Orthogonal to adversarial review - method-driven not attitude-driven.">
The description says “report only unhandled edge cases”, but the objective intentionally limits analysis to diff hunks (and may ignore guards outside the diff). Consider clarifying that findings are “missing guards within the provided scope” so consumers don’t interpret results as globally unhandled.
Severity: low
Review Response

Thanks for the thorough review. Addressed 4 findings, dismissed the rest with rationale.

Fixed (commit 5103be4)
Dismissed
What
New reusable review task (`review-edge-case-hunter.xml`) that exhaustively traces branching paths and boundary conditions in code, reporting only unhandled gaps. Registered in `module-help.csv` as `bmad-review-edge-case-hunter`.

Why
The existing adversarial review is attitude-driven (cynical persona). Pairing it with a method-driven reviewer that mechanically walks every path produces orthogonal findings with minimal overlap. This second perspective is proven useful but had no dedicated task definition.
How
- `src/core/tasks/review-edge-case-hunter.xml` following the same XML task structure as the adversarial review
- `src/core/module-help.csv`

Testing
- `npm test`