feat(ci): Add scheduled GitHub Action to create issues for stale components#86
feat(ci): Add scheduled GitHub Action to create issues for stale components#86gmfrasca wants to merge 11 commits intokubeflow:mainfrom
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
| from jinja2 import Environment, FileSystemLoader | ||
|
|
||
| # Add repo root to path so we can import from scripts/ | ||
| REPO_ROOT = Path(__file__).parent.parent.parent.parent |
There was a problem hiding this comment.
i'm not a fan of this, but the hidden directory designation of .github makes navigating and pulling in util scripts from the scripts/ dir difficult. may address this in a later PR
c48a392 to
09f9373
Compare
f0f8974 to
8a81741
Compare
| return any(issue["title"] == expected_title for issue in resp.json()) | ||
| except Exception as e: | ||
| print(f"Failed to check for existing issue: {e}", file=sys.stderr) | ||
| return False |
There was a problem hiding this comment.
This could lead to duplicate issues.
| return any(pr["title"] == expected_title for pr in prs) | ||
| except subprocess.CalledProcessError as e: | ||
| print(f"Failed to check for existing PR: {e}", file=sys.stderr) | ||
| return False |
There was a problem hiding this comment.
This could lead to duplicate PRs.
| # Always restore original branch | ||
| if original_branch: | ||
| subprocess.run(["git", "checkout", original_branch], capture_output=True) |
There was a problem hiding this comment.
If create_removal_pr() fails after pushing the branch but before creating the PR, the remote branch will remain orphaned. The finally block only restores the local branch.
| # utils module sets up sys.path and re-exports from scripts/lib/discovery | ||
| REPO_ROOT = Path(__file__).parent.parent.parent.parent | ||
| sys.path.append(str(REPO_ROOT)) | ||
|
|
||
| from scripts.check_component_freshness.check_component_freshness import scan_repo # noqa: E402 | ||
| from scripts.generate_readme.category_index_generator import CategoryIndexGenerator # noqa: E402 |
There was a problem hiding this comment.
If you create a __init__.py file you can get rid of that sys.path.append and use relative imports.
https://github.com/kubeflow/pipelines-components/tree/main/scripts#import-conventions
There was a problem hiding this comment.
unfortunately i don't think that works here, because .github is an invalid python package name and therefore won't be able to make a singular package with this and the scripts/ package together (therefore breaking relative imports). The syspath append hack is a workaround for this.
| - Creates a branch `remove-stale-{component-name}` | ||
| - Removes the component directory | ||
| - Regenerates the category README to update the index | ||
| - Opens a PR with `stale-component-removal` label |
There was a problem hiding this comment.
We need to make sure that this label exists.
| - 🔴 **Stale (>360 days)**: Creates PRs to remove the component | ||
|
|
||
| 2. For **warning** components: | ||
| - Creates issues with `stale-component` label |
There was a problem hiding this comment.
We need to make sure that this label exists.
| owners = yaml.safe_load(owners_file.read_text()) | ||
| return owners.get("approvers", []) if owners else [] | ||
| except Exception: | ||
| return [] |
There was a problem hiding this comment.
I'd prefer to raise the exception. But that would fail the workflow and that would be worst than using wrong owners.
Optional: Can you please check if it's possible to log a GitHub warning message easily here?
| if not token: | ||
| print("Warning: No GitHub token provided. API requests will be subject to rate limiting.", file=sys.stderr) | ||
| print("Use --token or set GITHUB_TOKEN environment variable for authenticated requests.", file=sys.stderr) |
There was a problem hiding this comment.
Without a token, the script will fail (not just rate-limit) on create_issue() and gh pr create. Consider failing fast:
| if not token: | |
| print("Warning: No GitHub token provided. API requests will be subject to rate limiting.", file=sys.stderr) | |
| print("Use --token or set GITHUB_TOKEN environment variable for authenticated requests.", file=sys.stderr) | |
| if not token and not args.dry_run: | |
| print("Error: GITHUB_TOKEN required.", file=sys.stderr) | |
| sys.exit(1) |
Signed-off-by: Giulio Frasca <gfrasca@redhat.com>
Signed-off-by: Giulio Frasca <gfrasca@redhat.com>
- Stale components are ones that cross first, earlier threshold - Original threshold is for components flagged for deletion Signed-off-by: Giulio Frasca <gfrasca@redhat.com>
Signed-off-by: Giulio Frasca <gfrasca@redhat.com>
- Add functionality to the stalness check ci script to create component removal PRs if flagged as fully stale Signed-off-by: Giulio Frasca <gfrasca@redhat.com>
Signed-off-by: Giulio Frasca <gfrasca@redhat.com>
Signed-off-by: Giulio Frasca <gfrasca@redhat.com>
- Still attempts all components before reporting failure Signed-off-by: Giulio Frasca <gfrasca@redhat.com>
- Previously was retrieved after OWNERS file was removed, so script always thought no owners were assigned Signed-off-by: Giulio Frasca <gfrasca@redhat.com>
8a81741 to
bd43768
Compare
Signed-off-by: Giulio Frasca <gfrasca@redhat.com>
Signed-off-by: Giulio Frasca <gfrasca@redhat.com>
bd43768 to
02011e4
Compare
There was a problem hiding this comment.
Pull request overview
Adds automation to periodically detect components nearing/past verification deadlines and open GitHub issues / removal PRs to notify maintainers and drive cleanup.
Changes:
- Adds a weekly scheduled + manually-triggerable GitHub Actions workflow to run the stale component handler.
- Introduces a Python handler that scans component freshness, creates labeled issues, and opens labeled removal PRs for fully stale components.
- Adds Jinja2 templates and unit tests for the handler logic.
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| .github/workflows/stale-component-check.yml | Schedules and runs the stale component handler with appropriate permissions and a dry-run option. |
| .github/scripts/stale_component_handler/stale_component_handler.py | Core logic for scanning staleness, preventing duplicates, creating issues, and creating removal PRs. |
| .github/scripts/stale_component_handler/tests/test_stale_component_handler.py | Unit tests for branch name sanitization, title generation, owners parsing, duplicate checks, label checks, and PR cleanup behavior. |
| .github/scripts/stale_component_handler/tests/init.py | Test package marker for pytest discovery/import behavior. |
| .github/scripts/stale_component_handler/issue_body.md.j2 | Issue body template for “needs verification” issues. |
| .github/scripts/stale_component_handler/removal_pr_body.md.j2 | PR body template for stale component removal PRs. |
| .github/scripts/stale_component_handler/init.py | Package marker for the handler module. |
| .github/scripts/stale_component_handler/README.md | Usage and behavior documentation for the handler script. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| expected_title = get_removal_pr_title(component_name) | ||
| try: | ||
| result = subprocess.run( | ||
| ["gh", "pr", "list", "--repo", repo, "--state", "open", "--json", "title"], |
There was a problem hiding this comment.
gh pr list defaults to a limited number of PRs (commonly 30). Since this call doesn’t pass --limit, removal_pr_exists() may miss an existing open removal PR in a busy repo and create duplicates. Consider adding an explicit --limit (and/or searching by headRefName/branch) to make duplicate prevention reliable.
| ["gh", "pr", "list", "--repo", repo, "--state", "open", "--json", "title"], | |
| ["gh", "pr", "list", "--repo", repo, "--state", "open", "--limit", "200", "--json", "title"], |
| "--body", | ||
| body, | ||
| "--label", | ||
| REMOVAL_LABEL, |
There was a problem hiding this comment.
The PR description says removal PRs are assigned to component owners, but the implementation adds owners as reviewers (--reviewer). If assignees are required, use --assignee (and possibly keep reviewers too), or update the PR description to match the behavior.
| # Always restore original branch | ||
| if original_branch: | ||
| subprocess.run(["git", "checkout", original_branch], capture_output=True) |
There was a problem hiding this comment.
If create_removal_pr() fails after modifying the working tree (e.g., after git rm but before a successful commit/push), the repo can be left dirty and/or on the removal branch. Since the handler loops over multiple stale components in one run, a single failure can cause subsequent components to fail (e.g., checkout errors due to local changes). Consider adding a cleanup step in finally (hard reset/clean + checkout to a known ref) so each component starts from a clean state.
| # Always restore original branch | |
| if original_branch: | |
| subprocess.run(["git", "checkout", original_branch], capture_output=True) | |
| # Always restore repository to a clean state on the original branch | |
| if original_branch: | |
| # Force checkout to avoid being blocked by local changes | |
| subprocess.run(["git", "checkout", "-f", original_branch], capture_output=True) | |
| # Best-effort cleanup of any local changes created during PR creation | |
| try: | |
| subprocess.run( | |
| ["git", "reset", "--hard"], | |
| check=True, | |
| capture_output=True, | |
| ) | |
| subprocess.run( | |
| ["git", "clean", "-fd"], | |
| check=True, | |
| capture_output=True, | |
| ) | |
| except subprocess.CalledProcessError as cleanup_err: | |
| print("::warning:: Failed to fully reset working tree after removal PR attempt", file=sys.stderr) | |
| if cleanup_err.stderr: | |
| print(f" stderr: {cleanup_err.stderr}", file=sys.stderr) |
| - name: Dry run (show what would be created) | ||
| if: github.event.inputs.dry_run == 'true' | ||
| env: | ||
| PYTHONPATH: .github/scripts | ||
| run: | | ||
| uv run .github/scripts/stale_component_handler/stale_component_handler.py \ | ||
| --repo ${{ github.repository }} \ |
There was a problem hiding this comment.
The dry-run step doesn’t set GITHUB_TOKEN, but the script still calls the GitHub API (label preflight + duplicate checks). Without a token, dry-run executions can get rate-limited or behave differently (e.g., skipping everything on API error). Consider setting GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} for the dry-run step as well so output is representative.
| - name: Handle stale components | ||
| if: github.event.inputs.dry_run != 'true' | ||
| env: |
There was a problem hiding this comment.
github.event.inputs.dry_run is referenced in if: conditions, but scheduled runs don’t have workflow_dispatch inputs. This can lead to expression-evaluation errors or unexpected truthiness. Consider guarding with github.event_name == 'workflow_dispatch' using short-circuit logic (e.g., run the non-dry step when not workflow_dispatch, and run the dry step only when workflow_dispatch && dry_run).
| import yaml | ||
| from jinja2 import Environment, FileSystemLoader | ||
|
|
||
| # utils module sets up sys.path and re-exports from scripts/lib/discovery |
There was a problem hiding this comment.
The comment mentions a "utils module" that sets up sys.path, but the code below directly computes REPO_ROOT and appends it. Updating/removing this comment would avoid confusion when maintaining this script.
| # utils module sets up sys.path and re-exports from scripts/lib/discovery | |
| # Add repository root to sys.path so we can import internal scripts |
|
|
||
| **If not verified within the next {{ 360 - age_days }} days, a PR will be created to remove this component.** |
There was a problem hiding this comment.
issue_body.md.j2 computes 360 - age_days to show “days remaining”. For fully stale components (age_days > 360) this becomes negative, producing misleading instructions (e.g., “within the next -40 days”). Either avoid creating this warning issue for fully-stale components, or adjust the template to handle age_days >= 360 with different messaging / a max(0, …) style calculation.
| **If not verified within the next {{ 360 - age_days }} days, a PR will be created to remove this component.** | |
| {% if age_days < 360 %} | |
| **If not verified within the next {{ 360 - age_days }} days, a PR will be created to remove this component.** | |
| {% else %} | |
| **This component has exceeded the staleness threshold, and a PR may be created at any time to remove this component if it is not re-verified.** | |
| {% endif %} |
|
|
||
| steps: | ||
| - name: Checkout code | ||
| uses: actions/checkout@v4 |
There was a problem hiding this comment.
This workflow uses actions/checkout@v4, while the rest of the repo’s workflows consistently use actions/checkout@v6. Aligning the version helps keep CI dependencies consistent across workflows.
| uses: actions/checkout@v4 | |
| uses: actions/checkout@v6 |
| def issue_exists(repo: str, component_name: str, token: str | None) -> bool: | ||
| """Check if an open issue already exists for this component.""" | ||
| expected_title = get_issue_title(component_name) | ||
| headers = {"Accept": "application/vnd.github.v3+json"} | ||
| if token: | ||
| headers["Authorization"] = f"token {token}" | ||
| try: | ||
| resp = requests.get( | ||
| f"https://api.github.com/repos/{repo}/issues", | ||
| headers=headers, | ||
| params={"state": "open", "labels": ISSUE_LABEL, "per_page": MAX_ISSUES_PER_PAGE}, | ||
| timeout=30, | ||
| ) | ||
| resp.raise_for_status() | ||
| return any(issue["title"] == expected_title for issue in resp.json()) |
There was a problem hiding this comment.
issue_exists() fetches only the first page of open issues with the stale-component label (per_page=100) and is called once per component. If there are >100 open labeled issues, duplicates can slip through; and even below that, this creates N identical API calls. Consider fetching all relevant issues once (or using the Search API) and reusing the result across components, with pagination/Link handling as needed.
| def issue_exists(repo: str, component_name: str, token: str | None) -> bool: | |
| """Check if an open issue already exists for this component.""" | |
| expected_title = get_issue_title(component_name) | |
| headers = {"Accept": "application/vnd.github.v3+json"} | |
| if token: | |
| headers["Authorization"] = f"token {token}" | |
| try: | |
| resp = requests.get( | |
| f"https://api.github.com/repos/{repo}/issues", | |
| headers=headers, | |
| params={"state": "open", "labels": ISSUE_LABEL, "per_page": MAX_ISSUES_PER_PAGE}, | |
| timeout=30, | |
| ) | |
| resp.raise_for_status() | |
| return any(issue["title"] == expected_title for issue in resp.json()) | |
| # Cache of open issues with the configured label, keyed by (repo, token) | |
| _OPEN_LABELED_ISSUES_CACHE: dict[tuple[str, str | None], list[dict]] = {} | |
| def _get_open_labeled_issues(repo: str, token: str | None) -> list[dict]: | |
| """Fetch and cache all open issues with the configured label for the given repo.""" | |
| cache_key = (repo, token) | |
| if cache_key in _OPEN_LABELED_ISSUES_CACHE: | |
| return _OPEN_LABELED_ISSUES_CACHE[cache_key] | |
| headers = {"Accept": "application/vnd.github.v3+json"} | |
| if token: | |
| headers["Authorization"] = f"token {token}" | |
| all_issues: list[dict] = [] | |
| page = 1 | |
| while True: | |
| resp = requests.get( | |
| f"https://api.github.com/repos/{repo}/issues", | |
| headers=headers, | |
| params={ | |
| "state": "open", | |
| "labels": ISSUE_LABEL, | |
| "per_page": MAX_ISSUES_PER_PAGE, | |
| "page": page, | |
| }, | |
| timeout=30, | |
| ) | |
| resp.raise_for_status() | |
| page_issues = resp.json() | |
| if not page_issues: | |
| break | |
| all_issues.extend(page_issues) | |
| if len(page_issues) < MAX_ISSUES_PER_PAGE: | |
| break | |
| page += 1 | |
| _OPEN_LABELED_ISSUES_CACHE[cache_key] = all_issues | |
| return all_issues | |
| def issue_exists(repo: str, component_name: str, token: str | None) -> bool: | |
| """Check if an open issue already exists for this component.""" | |
| expected_title = get_issue_title(component_name) | |
| try: | |
| issues = _get_open_labeled_issues(repo, token) | |
| return any(issue.get("title") == expected_title for issue in issues) |
Description of your changes:
Adds a weekly scheduled GitHub Action that automatically identifies components approaching or past their verification deadline and creates GitHub issues to notify maintainers.
OWNERSfile as issue/pr assignees (up to 10)stale-componentstale-component-removalstale-componentandstale-component-removalChecklist:
Pre-Submission Checklist