Commit ec60d5c

It worked better before this change.
1 parent 6b17851 commit ec60d5c

1 file changed

+74
-36
lines changed

.claude/commands/summarize_ci.md

Lines changed: 74 additions & 36 deletions
@@ -11,8 +11,10 @@ Steps:
 - **If no failed runs found:** Check if tests are still in progress
 - If in progress: Report "Tests still running - wait for completion"
 - If all passed: Report "No failures - all tests passed!" and exit
-5. For each failed run, list artifacts: `gh api repos/galaxyproject/galaxy/actions/runs/<RUN_ID>/artifacts --jq '.artifacts[] | {name: .name, id: .id, size_in_bytes: .size_in_bytes}'`
-- **If run has no artifacts:** Report "Run <RUN_ID> has no artifacts - may be too old (artifacts expire after 90 days)"
+5. For each failed run, categorize by artifact availability:
+- List artifacts: `gh api repos/galaxyproject/galaxy/actions/runs/<RUN_ID>/artifacts --jq '.artifacts[] | {name: .name, id: .id, size_in_bytes: .size_in_bytes}'`
+- **If run has test artifacts (HTML/JSON):** Mark for download (test failures)
+- **If run has no artifacts:** Mark for log extraction (likely linting, build, or startup failures)
 6. **Download all test artifacts to review directory**:
 - Prefer JSON artifacts (e.g., "Playwright test results JSON", "Integration test results JSON")
 - Download to `database/pr_reviews/{{arg}}/`
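
The new step 5 in the hunk above splits failed runs into "download" and "log extraction" buckets. A minimal Python sketch of that decision follows, assuming the input is the list of `{name, id, size_in_bytes}` objects produced by the `--jq` filter. The function name `categorize_run`, the `TEST_ARTIFACT_HINTS` tuple, and the rule that a run with only non-test artifacts also falls back to log extraction are illustrative assumptions, not part of the commit.

```python
from typing import Any

# Assumption: test-report artifacts can be recognised by this substring,
# as in "Playwright test results JSON" or "Integration test results JSON".
TEST_ARTIFACT_HINTS = ("test results",)


def categorize_run(artifacts: list[dict[str, Any]]) -> str:
    """Return 'download' if any test artifact exists, else 'extract_logs'."""
    for artifact in artifacts:
        name = artifact.get("name", "").lower()
        if any(hint in name for hint in TEST_ARTIFACT_HINTS):
            return "download"
    # No artifacts at all, or none that look like test reports (assumed rule).
    return "extract_logs"


if __name__ == "__main__":
    sample = [{"name": "Playwright test results JSON", "id": 1, "size_in_bytes": 1024}]
    print(categorize_run(sample))
    print(categorize_run([]))
```
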
@@ -26,51 +28,71 @@ Steps:
 - If yes, retry with longer timeout (300s)
 - If no or second failure, STOP and report incomplete analysis
 - DO NOT proceed with partial data
-7. **Validate downloads succeeded:**
-- Check if `database/pr_reviews/{{arg}}/` has artifact directories
-- If empty: STOP and report "No artifacts downloaded - download may have failed silently"
+7. **Extract logs from runs without artifacts:**
+- For each run marked for log extraction:
+- Get failed job IDs: `gh api repos/galaxyproject/galaxy/actions/runs/<RUN_ID>/jobs --jq '.jobs[] | select(.conclusion == "failure") | {id: .id, name: .name}'`
+- For each failed job, extract relevant error info:
+- Get job logs: `gh api repos/galaxyproject/galaxy/actions/jobs/<JOB_ID>/logs`
+- Parse for common failure patterns:
+- Python linting: Look for "isort", "flake8", "black", "ruff" errors
+- TypeScript: Look for "tsc", "eslint", "prettier" errors
+- Build failures: Look for "error:", "failed", compilation errors
+- Extract last 20-50 lines of relevant errors
+- Save to `database/pr_reviews/{{arg}}/<RUN_ID>_<JOB_NAME>.log`
+- Include job name and extracted errors in summary
+
+8. **Validate downloads succeeded:**
+- Check if `database/pr_reviews/{{arg}}/` has artifact directories OR log files
+- If completely empty: STOP and report "No artifacts or logs extracted - analysis failed"
 - Count expected vs actual artifact directories
 - If mismatch: WARN user about missing artifacts

-8. Parse test results from all downloaded artifacts:
+9. Parse test results from all downloaded artifacts:
 - Find all JSON files: `find database/pr_reviews/{{arg}}/ -name "*.json" -type f`
 - For each JSON file:
 ```python
 data = json.load(open(json_file))
 failures = [
-{'test': test_id, 'duration': run['duration'], 'log': run.get('log', ''), 'artifact': artifact_name}
+{'test': test_id, 'duration': run['duration'], 'log': run.get('log', ''), 'artifact': artifact_name, 'result': run['result']}
 for test_id, runs in data['tests'].items()
-for run in runs if run['result'] == 'Failed'
+for run in runs if run['result'] in ['Failed', 'Error']
 ]
 ```
 - Fall back to HTML if no JSON found:
 - Find HTML files in artifact directories
 - Extract embedded JSON from `data-jsonblob="..."`
-- Parse and extract failures
+- Parse and extract failures (both 'Failed' and 'Error' results)
 - **If no JSON or HTML found:** STOP and report "No test result files found in artifacts"
+- **Note:** pytest distinguishes 'Failed' (assertion failed) from 'Error' (exception during setup/execution) - both are test failures

-9. **Categorize failures** by checking error messages:
+10. **Categorize failures** by checking error messages:
 - **Transient**: Look for `TRANSIENT FAILURE [Issue #` in error log/message
 - Extract issue number from pattern
 - **New**: All other failures

-10. Generate markdown summary with:
+11. Generate markdown summary with:
 - Run IDs
-- Artifact names and sizes (indicate JSON vs HTML)
-- List artifacts by name
-- **Known transient failures** (✅):
-- Test name
-- Artifact/test type
-- Issue number (with link)
-- Duration
-- **New failures requiring investigation** (❌):
-- Test name
-- Artifact/test type
-- Duration
-- Error preview
-- Total counts
-
-11. **Write summary to file** `database/pr_reviews/{{arg}}/summary`:
+- **For runs with artifacts:**
+- Artifact names and sizes (indicate JSON vs HTML)
+- **Known transient failures** (✅):
+- Test name
+- Artifact/test type
+- Issue number (with link)
+- Duration
+- **New test failures requiring investigation** (❌):
+- Test name
+- Artifact/test type
+- Result type (Failed vs Error)
+- Duration
+- Error preview
+- **For runs without artifacts (linting/build):**
+- Job name (e.g., "Python linting", "client / build-client")
+- Failure type (isort, eslint, build error, etc.)
+- Error count or preview of first few errors
+- Indicate these are NOT test failures
+- Total counts (separate test failures from linting/build failures)
+
+12. **Write summary to file** `database/pr_reviews/{{arg}}/summary`:
 - Write the complete markdown summary
 - This file is used by `/summarize_ci_post` to post to PR
 - Format: Same markdown as displayed to user
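
The parsing and categorization steps in the hunk above (collect `Failed` and `Error` results, then flag anything whose log contains `TRANSIENT FAILURE [Issue #`) can be combined in one pass. The sketch below assumes the JSON layout implied by the snippet in the diff: `data['tests']` maps a test id to a list of runs with `result`, `duration`, and `log` keys. The helper names `collect_failures` and `load_failures`, and the choice to take the artifact name from the parent directory, are hypothetical.

```python
import json
import re
from pathlib import Path

# Marker quoted in the categorization step; the capture group is the issue number.
TRANSIENT_RE = re.compile(r"TRANSIENT FAILURE \[Issue #(\d+)")
FAILING_RESULTS = {"Failed", "Error"}


def collect_failures(data: dict, artifact_name: str) -> list[dict]:
    """Return one record per failing run, tagged 'transient' or 'new'."""
    failures = []
    for test_id, runs in data.get("tests", {}).items():
        for run in runs:
            if run.get("result") not in FAILING_RESULTS:
                continue
            match = TRANSIENT_RE.search(run.get("log", ""))
            failures.append({
                "test": test_id,
                "result": run["result"],
                "duration": run.get("duration"),
                "artifact": artifact_name,
                "category": "transient" if match else "new",
                "issue": int(match.group(1)) if match else None,
            })
    return failures


def load_failures(json_file: Path) -> list[dict]:
    """Read one results file; the artifact name is assumed to be its directory name."""
    with json_file.open() as handle:
        return collect_failures(json.load(handle), json_file.parent.name)
```

Keeping `result` on each record is what lets the summary report "Failed" and "Error" separately, as the updated summary step asks.
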
@@ -79,47 +101,63 @@ Steps:
 ```
 Analyzing PR #21218...
 Backed up previous review to 21218_backup_20251031_143022
-Found 2 failed workflow run(s)
+Found 3 failed workflow run(s)

-Run 18975780470:
+Run 18975780470 (test artifacts):
 - Playwright test results JSON (0.1 MB) ⚡
 - Playwright test results JSON (shard 2) (0.1 MB) ⚡

-Run 18975780416:
+Run 18975780416 (test artifacts):
 - Integration test results JSON (0.5 MB) ⚡

+Run 18975780500 (no artifacts - extracted logs):
+- Python linting
+
 ================================================================================
 FAILURE SUMMARY
 ================================================================================

-✅ Known transient failures (2):
-• test_history_sharing.py::test_sharing_private_history - Issue #12345
+🔧 **Linting/Build failures (1):**
+• Python linting
+Type: isort import ordering
+Files affected: 3
+Example: lib/galaxy/managers/users.py - imports not sorted
+
+**Known transient test failures (2):**
+• test_history_sharing.py::test_sharing_private_history
 From: Playwright test results JSON
+Issue: https://github.com/galaxyproject/galaxy/issues/12345
 Duration: 00:01:30
-• test_tool_discovery.py::test_tool_discovery_landing - Issue #67890
+• test_tool_discovery.py::test_tool_discovery_landing
 From: Integration test results JSON
+Issue: https://github.com/galaxyproject/galaxy/issues/67890
 Duration: 00:00:54

-❌ New failures requiring investigation (1):
+**New test failures requiring investigation (1):**
 • test_workflow.py::test_save_workflow
 From: Playwright test results JSON (shard 2)
+Type: Failed
 Duration: 00:01:15
 Error: AssertionError: Expected element to be visible

-Total: 2 transient, 1 new (requires attention)
+**Total:** 1 linting/build failure, 2 transient tests, 1 new test failure

 Summary and artifacts saved to database/pr_reviews/21218/
 ```

-12. **Display and save:**
+13. **Display and save:**
 - Print summary to user
 - Write same content to `database/pr_reviews/{{arg}}/summary`
 - Create/update symlink: `ln -sfn {{arg}} database/pr_reviews/latest`
 - Notify user: "Summary and artifacts saved to database/pr_reviews/{{arg}}/"

 Output concise summary showing categorized failures. Transient failures indicate "safe to re-run", new failures indicate "requires investigation".

-**Note:** The summary and downloaded artifacts are saved to `database/pr_reviews/{{arg}}/` for use by `/summarize_ci_post`.
+**Notes:**
+- The summary and downloaded artifacts are saved to `database/pr_reviews/{{arg}}/` for use by `/summarize_ci_post`
+- Linting/build failures are extracted from job logs since these jobs don't produce test artifacts
+- Common patterns: isort, black, flake8, ruff, eslint, prettier, tsc, build errors
+- Log extraction focuses on last 20-50 lines and specific error markers to keep output concise
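
The log-extraction notes above name the markers to look for and a 20-50 line budget, but not how to combine them. One possible reading, sketched in Python: keep only lines that hit a known marker, fall back to the raw tail when nothing matches, and cap the excerpt length. The `FAILURE_PATTERNS` table, the `extract_errors` name, and the fallback rule are assumptions made for illustration; the commit itself only lists the tool names.

```python
import re

# Hypothetical grouping of the markers listed in the notes.
FAILURE_PATTERNS = {
    "python-lint": re.compile(r"\b(isort|flake8|black|ruff)\b", re.IGNORECASE),
    "typescript": re.compile(r"\b(tsc|eslint|prettier)\b", re.IGNORECASE),
    "build": re.compile(r"error:|failed", re.IGNORECASE),
}


def extract_errors(log_text: str, max_lines: int = 50) -> dict:
    """Return the matched failure kinds and a bounded excerpt of the log."""
    lines = log_text.splitlines()
    hits: list[str] = []
    kinds: set[str] = set()
    for line in lines:
        matched = [kind for kind, pattern in FAILURE_PATTERNS.items() if pattern.search(line)]
        if matched:
            hits.append(line)
            kinds.update(matched)
    # Assumed fallback: with no marker hits, keep the tail of the raw log.
    relevant = hits if hits else lines
    return {"kinds": sorted(kinds), "excerpt": relevant[-max_lines:]}
```

The broad `failed` pattern will also match benign lines such as test tallies, so a real implementation would likely need tighter, per-tool expressions.
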

 **Marking tests as transient failures:**
 To mark a test as a known transient failure, manually add the `@transient_failure(issue=N)` decorator:
