Description
Problem
When chunks fail to analyze or finding extractions fail, the user sees only an aggregate count in the summary:
⚠ 1 chunk failed to analyze
⚠ 10 finding extractions failed
There is no way to see which chunks or extractions failed, why they failed, or what the raw output was. The failure details exist internally (the SDK tracks them, Sentry gets breadcrumbs and metrics) but none of it is surfaced to the user.
This makes it impossible to tell whether failures are:
- Transient API issues (retries exhausted)
- Malformed skill output (extraction regex + LLM fallback both failed)
- Prompt too large for context window
- Something else entirely
Current State
What's tracked internally:
- analyzeHunk() logs to console.error() on failure (SDK returned no result, execution error, retries exhausted)
- parseHunkOutput() returns extractionFailed: true with extractionError and extractionPreview (first 200 chars of raw output)
- onExtractionFailure callback exists in analyze.ts but is not wired to CLI output in tasks.ts
- Sentry gets spans with hunk.failed and extraction.failed_count attributes
What the user sees: Just the count. No file paths, no error messages, no raw output.
Proposed Behavior
Verbose mode (-v)
Show per-failure details inline as they occur:
⚠ Chunk failed: src/utils/parser.ts:45-82 — SDK execution failed: context length exceeded
⚠ Extraction failed: src/cli/main.ts:120-155 — Could not parse findings from output
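A small formatter could produce the verbose lines shown above. This is only a sketch: the FailureDetail shape and the formatFailure name are hypothetical and do not reflect the SDK's actual types.

```typescript
// Hypothetical shape for one failure; the SDK's real types may differ.
interface FailureDetail {
  kind: "chunk" | "extraction";
  file: string;
  startLine: number;
  endLine: number;
  message: string;
}

// Separator used in the proposed output (U+2014 with a space on each side).
const SEPARATOR = " \u2014 ";

// Render one failure as a single line suitable for -v output.
function formatFailure(f: FailureDetail): string {
  const label = f.kind === "chunk" ? "Chunk failed" : "Extraction failed";
  return `⚠ ${label}: ${f.file}:${f.startLine}-${f.endLine}${SEPARATOR}${f.message}`;
}
```

Keeping the formatter free of I/O means the terminal reporter and a future log writer could share it.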
Debug mode (-vv)
Additionally show the raw output preview (first ~200 chars) and extraction error details for extraction failures.
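One way the extra -vv detail might be rendered is sketched below. The field names follow those the issue attributes to parseHunkOutput(), but their types, the debugDetailLines helper, and the numeric verbosity levels (0, 1, 2) are assumptions made for illustration.

```typescript
// Field names mirror extractionError / extractionPreview from the issue;
// the exact types are assumed.
interface ExtractionFailure {
  extractionError: string;
  extractionPreview: string; // first ~200 chars of the raw output
}

// Extra lines to print beneath an extraction failure when -vv is set.
// verbosity: 0 = default, 1 = -v, 2 = -vv (assumed encoding).
function debugDetailLines(f: ExtractionFailure, verbosity: number): string[] {
  if (verbosity < 2) return [];
  return [
    `    error: ${f.extractionError}`,
    // JSON.stringify escapes newlines so the preview stays on one line.
    `    preview: ${JSON.stringify(f.extractionPreview)}`,
  ];
}
```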
Normal mode (default)
Keep current behavior (aggregate counts in summary) but add a hint:
⚠ 1 chunk failed to analyze (use -v for details)
⚠ 10 finding extractions failed (use -v for details)
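The summary with the hint could be built roughly as follows. The failureSummary function and its parameters are hypothetical; the actual rendering lives in reporter.ts and may be structured differently.

```typescript
// Build the aggregate warning lines for the run summary.
// The hint is only appended in default mode (verbosity 0), since -v
// already shows the per-failure details.
function failureSummary(
  failedChunks: number,
  failedExtractions: number,
  verbosity: number,
): string[] {
  const hint = verbosity === 0 ? " (use -v for details)" : "";
  const lines: string[] = [];
  if (failedChunks > 0) {
    const noun = failedChunks === 1 ? "chunk" : "chunks";
    lines.push(`⚠ ${failedChunks} ${noun} failed to analyze${hint}`);
  }
  if (failedExtractions > 0) {
    const noun = failedExtractions === 1 ? "extraction" : "extractions";
    lines.push(`⚠ ${failedExtractions} finding ${noun} failed${hint}`);
  }
  return lines;
}
```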
Log file (optional, future)
Consider writing failure details to a debug log file (e.g. .warden/debug.log) that persists across runs, so users can inspect failures after the fact without needing to re-run in verbose mode.
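If the log file is pursued, appending one JSON object per line would keep the file easy to tail and parse. The sketch below uses Node's built-in fs module. The LoggedFailure shape and both function names are assumptions, and the JSON-lines format is one option rather than anything the issue specifies.

```typescript
import { appendFileSync, mkdirSync, readFileSync } from "node:fs";
import { dirname } from "node:path";

// Hypothetical record shape for one persisted failure.
interface LoggedFailure {
  kind: "chunk" | "extraction";
  location: string; // e.g. "src/utils/parser.ts:45-82"
  message: string;
  preview?: string; // raw output preview, when available
}

// Append one failure as a single JSON line, creating the directory if needed.
function appendFailureLog(logPath: string, failure: LoggedFailure): void {
  mkdirSync(dirname(logPath), { recursive: true });
  const record = { time: new Date().toISOString(), ...failure };
  appendFileSync(logPath, JSON.stringify(record) + "\n", "utf8");
}

// Read the log back as an array of records, skipping empty lines.
function readFailureLog(logPath: string): Array<LoggedFailure & { time: string }> {
  return readFileSync(logPath, "utf8")
    .split("\n")
    .filter((line) => line.length > 0)
    .map((line) => JSON.parse(line));
}
```

Because entries are appended, the file grows across runs; some rotation or size cap would likely be needed before shipping this.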
Relevant Code
- Failure tracking: src/sdk/analyze.ts (lines 384-417 for hunk failures, 433-438 for extraction callback)
- Extraction logic: src/sdk/extract.ts (two-tier: regex then LLM fallback)
- Summary rendering: src/cli/output/reporter.ts (lines 210-231)
- Per-skill CI output: src/cli/terminal.ts (lines 199-203)
- Progress callbacks: src/cli/output/tasks.ts (onExtractionFailure callback exists but unused)
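Wiring the unused callback to CLI output might look something like the sketch below. The real onExtractionFailure signature is not shown in this issue, so the ExtractionFailureEvent payload, the FailureReporter interface, and createFailureReporter are all hypothetical.

```typescript
// Hypothetical payload; the real callback in analyze.ts may pass different data.
interface ExtractionFailureEvent {
  location: string; // e.g. "src/cli/main.ts:120-155"
  error: string;
}

interface FailureReporter {
  onExtractionFailure: (event: ExtractionFailureEvent) => void;
  failedCount: () => number;
}

// Create a callback that always counts failures and, at -v or higher,
// also writes a per-failure line through the supplied writer.
function createFailureReporter(
  verbosity: number,
  write: (line: string) => void,
): FailureReporter {
  let failed = 0;
  return {
    onExtractionFailure(event) {
      failed += 1;
      if (verbosity >= 1) {
        write(`⚠ Extraction failed: ${event.location} \u2014 ${event.error}`);
      }
    },
    failedCount: () => failed,
  };
}
```

Injecting the writer keeps the reporter testable; in the CLI it could plausibly be a thin wrapper around whatever tasks.ts already uses for progress output.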