fix: scope cleanup-legacy.py skill discovery and preserve config.yaml#2205
fix: scope cleanup-legacy.py skill discovery and preserve config.yaml#2205don-petry wants to merge 2 commits intobmad-code-org:mainfrom
Conversation
…ig.yaml
The cleanup-legacy.py script used an overly broad rglob("SKILL.md") that
matched template and asset files nested deep in the directory tree (e.g.
bmad-module-builder/assets/setup-skill-template/SKILL.md). This caused
cleanup to abort when it couldn't verify non-installable templates at the
skills directory.
Scopes find_skill_dirs() to only match SKILL.md at recognized installable
positions: direct children ({name}/SKILL.md) and skills subfolder
(skills/{name}/SKILL.md). Also adds config.yaml backup/restore around
shutil.rmtree() so per-module configs needed by bmad-init are preserved.
Fixes bmad-code-org#2175
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
Adds a shared cleanup-legacy.py utility under tools/scripts/ to safely remove legacy _bmad/ module directories after config/help migrations, with optional verification that skills are already installed elsewhere and preservation of per-module config.yaml.
Changes:
- Introduces scoped skill discovery to only treat
SKILL.mdin installable locations as skills. - Adds an optional safety check (
--skills-dir) to validate discovered skills exist at the installed location before deletion. - Preserves
config.yamlacross directory removal by backing it up beforeshutil.rmtree()and restoring it afterward.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| # Preserve config.yaml if present (bmad-init needs per-module configs) | ||
| config_path = target / "config.yaml" | ||
| config_backup = None | ||
| if config_path.exists(): | ||
| config_backup = config_path.read_bytes() | ||
| if verbose: | ||
| print(f"Preserving config.yaml in {dirname}/", file=sys.stderr) | ||
|
|
||
| file_count = count_files(target) | ||
| if config_backup: | ||
| file_count -= 1 # Don't count the preserved file | ||
| if verbose: |
There was a problem hiding this comment.
config_backup uses truthiness checks (if config_backup:). If config.yaml exists but is empty, read_bytes() returns b'' which is falsy, so the script will not restore the file and will miscount removed files. Track preservation with an explicit boolean (e.g., config_exists = config_path.is_file()) or check config_backup is not None instead of truthiness.
| for dirname in dirs_to_remove: | ||
| target = Path(bmad_dir) / dirname | ||
| if not target.exists(): |
There was a problem hiding this comment.
dirname values come from CLI (--module-code / --also-remove) and are directly appended to bmad_dir before shutil.rmtree(). A crafted value like ../ could delete directories outside _bmad/. Consider validating names (reject path separators / ..) or resolving target and asserting it is within Path(bmad_dir).resolve() before removal.
tools/scripts/cleanup-legacy.py
Outdated
| for skill_md in root.rglob("SKILL.md"): | ||
| rel = skill_md.parent.relative_to(root) | ||
| parts = rel.parts | ||
| # Direct child: {name}/SKILL.md | ||
| if len(parts) == 1: | ||
| skills.append(parts[0]) | ||
| # Skills subfolder: skills/{name}/SKILL.md | ||
| elif len(parts) == 2 and parts[0] == "skills": | ||
| skills.append(parts[1]) |
There was a problem hiding this comment.
find_skill_dirs() still walks the entire tree via root.rglob('SKILL.md') and then filters by depth. To avoid unnecessary deep traversal (and speed up large modules), consider using targeted globs for the only supported installable locations (e.g., root.glob('*/SKILL.md') and root.glob('skills/*/SKILL.md')) instead of scanning everything.
| for skill_md in root.rglob("SKILL.md"): | |
| rel = skill_md.parent.relative_to(root) | |
| parts = rel.parts | |
| # Direct child: {name}/SKILL.md | |
| if len(parts) == 1: | |
| skills.append(parts[0]) | |
| # Skills subfolder: skills/{name}/SKILL.md | |
| elif len(parts) == 2 and parts[0] == "skills": | |
| skills.append(parts[1]) | |
| # Direct child: {name}/SKILL.md | |
| for skill_md in root.glob("*/SKILL.md"): | |
| skills.append(skill_md.parent.name) | |
| # Skills subfolder: skills/{name}/SKILL.md | |
| skills_root = root / "skills" | |
| if skills_root.exists(): | |
| for skill_md in skills_root.glob("*/SKILL.md"): | |
| skills.append(skill_md.parent.name) |
| file_count = count_files(target) | ||
| if config_backup: | ||
| file_count -= 1 # Don't count the preserved file | ||
| if verbose: | ||
| print( | ||
| f"Removing {target} ({file_count} files)", | ||
| file=sys.stderr, | ||
| ) |
There was a problem hiding this comment.
count_files() is called unconditionally for every target directory. For large legacy trees this adds a full extra traversal even when --verbose is off. If files_removed_count is not strictly required, consider only counting when verbose (or make it opt-in) to keep cleanup fast.
📝 WalkthroughWalkthroughAdds a new executable Python script that removes legacy module directories from a BMad installation path. The script scans for installable skills at recognized locations, verifies they exist in the target skills directory, removes legacy directories while preserving Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~40 minutes 🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
tools/scripts/cleanup-legacy.py (1)
61-87: Prefer a single source of truth for skill discovery.
tools/installer/core/manifest-generator.js:collectSkills()still discovers installableSKILL.mdentrypoints recursively, while this helper hardcodes only direct children andskills/{name}. That makes the pre-delete safety check easy to drift from install behavior and can silently under-verify nested skill layouts if they ever appear in a legacy_bmadtree. Reusing the installer's discovery rules here, or explicitly failing on deeperSKILL.mdpaths, would make this guard safer.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tools/scripts/cleanup-legacy.py` around lines 61 - 87, The current find_skill_dirs function only accepts direct children and skills/{name} layouts, which diverges from the installer's recursive discovery in collectSkills; update find_skill_dirs to use the same discovery rules as the installer by either calling the installer function collectSkills (from tools/installer/core/manifest-generator.js or its Python wrapper) or by making the function reject/deem unsafe any SKILL.md nested deeper than the installer's allowed entrypoints (i.e., explicitly fail or raise an error on deeper paths), ensuring the guard's behavior exactly matches collectSkills' logic so pre-delete checks can't drift from install behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@tools/scripts/cleanup-legacy.py`:
- Around line 188-220: The code incorrectly treats zero-byte preserved configs
as absent and allows restoration errors to escape handling; update the guards
around config_backup in the cleanup routine to use "if config_backup is not
None:" (affecting the file_count adjustment and the restore condition) so empty
b"" files are preserved and restored, and move the restoration steps
(target.mkdir(...) and config_path.write_bytes(...)) into the same try/except
that wraps shutil.rmtree(target) so any errors during restore are caught and
reported via the existing error_result (which references removed and dirname)
and trigger sys.exit(2).
---
Nitpick comments:
In `@tools/scripts/cleanup-legacy.py`:
- Around line 61-87: The current find_skill_dirs function only accepts direct
children and skills/{name} layouts, which diverges from the installer's
recursive discovery in collectSkills; update find_skill_dirs to use the same
discovery rules as the installer by either calling the installer function
collectSkills (from tools/installer/core/manifest-generator.js or its Python
wrapper) or by making the function reject/deem unsafe any SKILL.md nested deeper
than the installer's allowed entrypoints (i.e., explicitly fail or raise an
error on deeper paths), ensuring the guard's behavior exactly matches
collectSkills' logic so pre-delete checks can't drift from install behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 41b08d6c-cbc6-49db-95c7-6c48ddbfece1
📒 Files selected for processing (1)
tools/scripts/cleanup-legacy.py
tools/scripts/cleanup-legacy.py
Outdated
| # Preserve config.yaml if present (bmad-init needs per-module configs) | ||
| config_path = target / "config.yaml" | ||
| config_backup = None | ||
| if config_path.exists(): | ||
| config_backup = config_path.read_bytes() | ||
| if verbose: | ||
| print(f"Preserving config.yaml in {dirname}/", file=sys.stderr) | ||
|
|
||
| file_count = count_files(target) | ||
| if config_backup: | ||
| file_count -= 1 # Don't count the preserved file | ||
| if verbose: | ||
| print( | ||
| f"Removing {target} ({file_count} files)", | ||
| file=sys.stderr, | ||
| ) | ||
|
|
||
| try: | ||
| shutil.rmtree(target) | ||
| except OSError as e: | ||
| error_result = { | ||
| "status": "error", | ||
| "error": f"Failed to remove {target}: {e}", | ||
| "directories_removed": removed, | ||
| "directories_failed": dirname, | ||
| } | ||
| print(json.dumps(error_result, indent=2)) | ||
| sys.exit(2) | ||
|
|
||
| # Restore preserved config.yaml | ||
| if config_backup: | ||
| target.mkdir(parents=True, exist_ok=True) | ||
| config_path.write_bytes(config_backup) |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
printf 'Truthiness of empty bytes in Python:\n'
python - <<'PY'
print(bool(b""))
PY
printf '\nCurrent config preservation block:\n'
sed -n '188,220p' tools/scripts/cleanup-legacy.pyRepository: bmad-code-org/BMAD-METHOD
Length of output: 1325
🏁 Script executed:
# Get the full function containing these lines to understand complete control flow
grep -n "def " tools/scripts/cleanup-legacy.py | tail -5Repository: bmad-code-org/BMAD-METHOD
Length of output: 233
🏁 Script executed:
# View the complete function from the beginning to lines 220 plus context
sed -n '150,225p' tools/scripts/cleanup-legacy.pyRepository: bmad-code-org/BMAD-METHOD
Length of output: 2450
🏁 Script executed:
# Check if there's any error handling or try-except wrapper around the function call
grep -B5 -A10 "for dirname in" tools/scripts/cleanup-legacy.py | head -40Repository: bmad-code-org/BMAD-METHOD
Length of output: 1041
Fix empty-byte config handling and restore error safety.
read_bytes() returns b"" for zero-byte files, which is falsy, so both if config_backup: guards (lines 196 and 207) skip empty configs during file counting and restoration. Additionally, the restore operations (lines 208–209) run outside the try block wrapping rmtree(), so mkdir() or write_bytes() errors escape error handling and produce a traceback instead of the documented JSON error response and sys.exit(2).
Fix: Move preservation and restoration into the try block and replace if config_backup: with if config_backup is not None: to safely handle empty files.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tools/scripts/cleanup-legacy.py` around lines 188 - 220, The code incorrectly
treats zero-byte preserved configs as absent and allows restoration errors to
escape handling; update the guards around config_backup in the cleanup routine
to use "if config_backup is not None:" (affecting the file_count adjustment and
the restore condition) so empty b"" files are preserved and restored, and move
the restoration steps (target.mkdir(...) and config_path.write_bytes(...)) into
the same try/except that wraps shutil.rmtree(target) so any errors during
restore are caught and reported via the existing error_result (which references
removed and dirname) and trigger sys.exit(2).
- Fix config restoration for zero-byte files by using `is not None`
instead of truthiness checks on `config_backup` (empty bytes `b""` is
falsy, causing silent loss of empty config.yaml files)
- Move config restore into the try/except block so mkdir/write_bytes
errors are caught and reported as structured JSON instead of tracebacks
- Add logging.error() call on failure for observability
- Replace rglob("SKILL.md") with targeted glob() calls to avoid
unnecessary deep traversal — only the two canonical installable
layouts are checked
- Add docstring note explaining why find_skill_dirs() is intentionally
stricter than the installer's recursive collectSkills()
- Add path traversal validation rejecting "..", "/", "\\" in dir names
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
|
All review feedback from CodeRabbit and Copilot has been addressed in the latest push (
All reviews above were submitted against commit |
Summary
cleanup-legacy.pyscript with two fixes from bmad-suno-band-manager commiteb226a7:find_skill_dirs()to only match SKILL.md at installable positions: direct children ({name}/SKILL.md) and skills subfolder (skills/{name}/SKILL.md). Deeply nested SKILL.md files (e.g.bmad-module-builder/assets/setup-skill-template/SKILL.md) are no longer incorrectly matched.config.yamlbeforeshutil.rmtree()and restores it after, so per-module configs needed bybmad-initsurvive cleanup.Fixes #2175
Test plan
find_skill_dirs()with direct child, skills/ subfolder, and deeply nested SKILL.md -- only installable positions matchednpm run qualitypasses (204 tests, lint, format)🤖 Generated with Claude Code