
CI: add host-side stale containerlab topology sweep #188

@lance0

Context

PR #185 added bounded retry for containerlab interop jobs, and the shared .github/actions/run-interop-test action now destroys the named topology before each attempt and on exit. That protects the normal job flow, but it does not cover a runner reboot, a cancelled job, or a process crash, any of which can leave stale clab-* containers and networks behind, outside any job's control.

Expected direction

Add a host-side cleanup mechanism for the self-hosted/kernel-dataplane runner, preferably a small systemd service + timer documented in docs/kernel-dataplane-runner.md, that periodically detects and removes stale rustbgpd-owned containerlab topologies without touching active jobs.
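As a starting point for discussion, here is a minimal Python sketch of the detection step that a timer-driven service could run. It rests on several assumptions that are not settled in this issue: that rustbgpd topologies share a lab-name prefix (`rustbgpd-` here is hypothetical), that age is an acceptable staleness signal (the 6-hour threshold is a placeholder), and that containers carry the `containerlab` label holding the lab name. The helper names are illustrative only.

```python
import json
import subprocess
from datetime import datetime, timedelta, timezone

# Assumption: rustbgpd interop topologies use lab names with this prefix.
LAB_PREFIX = "rustbgpd-"
# Assumption: no interop job legitimately keeps a topology alive this long.
MAX_AGE = timedelta(hours=6)


def list_clab_containers():
    """Return [{'id', 'lab', 'created'}] for containers carrying the containerlab label."""
    ids = subprocess.run(
        ["docker", "ps", "--all", "--quiet", "--filter", "label=containerlab"],
        check=True, capture_output=True, text=True,
    ).stdout.split()
    if not ids:
        return []
    raw = subprocess.run(
        ["docker", "inspect", *ids],
        check=True, capture_output=True, text=True,
    ).stdout
    containers = []
    for item in json.loads(raw):
        labels = item["Config"].get("Labels") or {}
        # Docker reports Created in UTC; drop the fractional seconds before parsing.
        created = datetime.strptime(item["Created"][:19], "%Y-%m-%dT%H:%M:%S")
        containers.append({
            "id": item["Id"],
            "lab": labels.get("containerlab", ""),
            "created": created.replace(tzinfo=timezone.utc),
        })
    return containers


def find_stale_labs(containers, now, max_age=MAX_AGE, prefix=LAB_PREFIX):
    """Return lab names whose newest container is older than max_age."""
    newest = {}
    for c in containers:
        if not c["lab"].startswith(prefix):
            continue  # not ours: leave other containerlab users alone
        if c["lab"] not in newest or c["created"] > newest[c["lab"]]:
            newest[c["lab"]] = c["created"]
    return sorted(lab for lab, ts in newest.items() if now - ts > max_age)
```

On the runner this would be called as `find_stale_labs(list_clab_containers(), datetime.now(timezone.utc))`. Judging a lab by its newest container means a topology that a retry has just recreated is not treated as stale. Age alone is a blunt heuristic, so long-running soaks would need a different prefix, an explicit exclusion, or a check against active runner jobs.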

Acceptance criteria

  • Document the timer/service install path and operator commands.
  • Scope cleanup to rustbgpd/containerlab-owned stale resources only; avoid broad Docker pruning.
  • Include a dry-run or inspect mode so the operator can verify what would be removed.
  • Preserve live soak/cloudbox safety: this issue covers runner hygiene only, and the cleanup must not touch currently running soaks.
  • Link the cleanup guidance from the CI/kernel dataplane runner docs.
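The scoping and dry-run criteria above could be met along the following lines. This is a sketch under assumptions, not a proposed final implementation: the function names are hypothetical, the input is assumed to be a mapping from stale lab name to container IDs that an earlier detection step has already produced, and network cleanup is left out.

```python
import shlex
import subprocess


def build_removal_commands(stale):
    """Map {lab_name: [container_id, ...]} to one narrowly scoped `docker rm` per lab."""
    commands = []
    for lab in sorted(stale):
        ids = stale[lab]
        if ids:  # never emit a bare `docker rm` with no targets
            commands.append(["docker", "rm", "--force", *ids])
    return commands


def execute(commands, apply=False, runner=subprocess.run):
    """Print every command; run it only when apply is True. Returns the printed lines."""
    lines = []
    for cmd in commands:
        prefix = "running" if apply else "would run"
        line = f"{prefix}: {shlex.join(cmd)}"
        print(line)
        lines.append(line)
        if apply:
            runner(cmd, check=True)
    return lines


# Dry run is the default, so an operator has to opt in to deletion.
plan = build_removal_commands({"rustbgpd-frr": ["0123abcd", "4567ef01"]})
execute(plan)  # prints: would run: docker rm --force 0123abcd 4567ef01
```

Removing explicit container IDs, rather than calling any `docker ... prune` command, keeps the blast radius limited to what the detection step selected. Making dry run the default means a misconfigured timer unit only logs what it would have removed.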

Related
