Skip to content

Conversation

@yzewei
Copy link

@yzewei yzewei commented Oct 16, 2025

Background

Currently, the runc delete command only supports deleting a single container. In high-frequency container creation and destruction environments, manually deleting stopped containers one by one is inefficient and may cause resource contention (e.g., cgroup locks or DBus delays).

The community has discussed the need for batch deletion of stopped containers in runc discussion #4935.

Changes in this PR

  • Modified delete.go to support deleting multiple containers at once.
  • Updated utils_linux.go to handle batch deletion safely.
  • Updated tests/integration/delete.bats to include tests for batch deletion.
  • Added delete multi test case to ensure the feature works as expected.

OCI Runtime Specification Alignment

Crucially, this change aligns with the updated OCI Runtime Specification. I have updated the delete operation in the specification to officially support multiple container IDs, clarifying that each container is deleted independently, and errors are reported per container without affecting others.

Comprehensive Batch Test Coverage

The integration test suite (tests/integration/delete.bats) has been significantly expanded to ensure the batch delete feature is robust across all supported container states and Cgroup environments:

  1. Container States: Added tests for batch deletion of containers that are stopped, running, and paused (the latter two using --force).
  2. Complex Cleanup Scenarios: Included tests for batch deletion of containers in the Host PID Namespace where the init process is already gone, ensuring all remaining processes are correctly killed and reaped.
  3. Cgroup Recursion: Added tests for batch deletion and recursive cleanup of containers that have created their own sub-Cgroups in both Cgroup V1 and Cgroup V2 environments.
    • Note on Stability: This required stabilizing the V2 cleanup tests by ensuring Cgroup paths are validated accurately on the host and that non-forced delete tests correctly kill running containers beforehand.

Notes

  • The new feature only affects containers in the "stopped" state for non-forced deletion.
  • Existing single-container delete behavior is unchanged.
  • This PR references community discussion for context: #4935

@yzewei
Copy link
Author

yzewei commented Oct 16, 2025

@lifubang
Copy link
Member

@yzewei Perhaps we should first update the description of the Delete operation in the runtime-spec: https://github.com/opencontainers/runtime-spec/blob/main/runtime.md#delete

@yzewei
Copy link
Author

yzewei commented Oct 16, 2025

@yzewei Perhaps we should first update the description of the Delete operation in the runtime-spec: https://github.com/opencontainers/runtime-spec/blob/main/runtime.md#delete

Thanks! I’ve updated the runtime-spec delete operation to support multiple container IDs.
Each container is deleted independently, and errors are reported per container without affecting others.
1299

@yzewei
Copy link
Author

yzewei commented Oct 16, 2025

Test Coverage Update

I have supplemented this PR with comprehensive multi-container (batch) integration tests for runc delete, ensuring the reliability and stability of the batch deletion feature across various scenarios.

Key areas of enhanced testing include:

  • Batch deletion of containers in running, paused, and stopped states.
  • Batch handling of complex scenarios involving Host PIDNS where the container's init process has already exited.
  • Batch recursive cleanup for containers with subordinate Cgroups (in both Cgroup V1 and V2 environments).

Two issues regarding test stability were resolved during this process:

  • Fixed an inaccuracy in the Cgroup V2 test where the host validation of the sub-Cgroup path was incorrect.
  • Corrected the logic for non-forced runc delete, which now executes runc kill on running containers before attempting deletion (as required by runc behavior).

@cyphar
Copy link
Member

cyphar commented Oct 16, 2025

@lifubang I don't think we should do that -- the runtime-spec stuff is describing the more general operations, and supporting multiple arguments doesn't really belong there IMHO.

@AkihiroSuda
Copy link
Member

In high-frequency container creation and destruction environments, manually deleting stopped containers one by one is inefficient and may cause resource contention (e.g., cgroup locks or DBus delays).

Do you have a benchmark result?

Signed-off-by: yzewei <yangzewei@loongson.cn>
Copy link
Member

@cyphar cyphar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of interest, was this code LLM-generated?

path := filepath.Join(context.GlobalString("root"), id)
if e := os.RemoveAll(path); e != nil {
fmt.Fprintf(os.Stderr, "remove %s: %v\n", path, e)
errs = append(errs, e)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This removes the "--force ignores errors" case.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I used AI to help me get this done. I still made sure to review and test everything myself.

}
if force {
return nil
errs = append(errs, err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ditto.

fmt.Fprintf(os.Stderr, "remove %s: %v\n", path, e)
var errs []error

for _, id := range context.Args() {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the body of this loop should be a separate function and just have this loop do the error collection logic -- it's very easy to miss a continue and accidentally return an error.

Comment on lines -68 to -72
// When --force is given, we kill all container processes and
// then destroy the container. This is done even for a stopped
// container, because (in case it does not have its own PID
// namespace) there may be some leftover processes in the
// container's cgroup.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment has been removed, please add it back.

Comment on lines +81 to 92
switch s {
case libcontainer.Stopped:
if err := container.Destroy(); err != nil {
errs = append(errs, err)
}
case libcontainer.Created:
if err := killContainer(container); err != nil {
errs = append(errs, err)
}
default:
errs = append(errs, fmt.Errorf("cannot delete container %s that is not stopped: %s", id, s))
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think something more like

Suggested change
switch s {
case libcontainer.Stopped:
if err := container.Destroy(); err != nil {
errs = append(errs, err)
}
case libcontainer.Created:
if err := killContainer(container); err != nil {
errs = append(errs, err)
}
default:
errs = append(errs, fmt.Errorf("cannot delete container %s that is not stopped: %s", id, s))
}
switch s {
case libcontainer.Stopped:
err = container.Destroy()
case libcontainer.Created:
err = killContainer(container)
default:
err = fmt.Errorf("cannot delete container %s that is not stopped: %s", id, s)
}
if err != nil {
errs = append(errs, err)
}

would be nicer.

Comment on lines -56 to -57
// if there was an aborted start or something of the sort then the container's directory could exist but
// libcontainer does not see it because the state.json file inside that directory was never created.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment has been changed, please just keep the old one if there isn't a good reason to change it.

@cyphar
Copy link
Member

cyphar commented Oct 16, 2025

@AkihiroSuda I also find that claim implausible without seeing benchmarks -- the only thing this PR is really saving is the overhead of fork+exec. This whole PR feels like it was generated by an LLM.

@yzewei
Copy link
Author

yzewei commented Oct 16, 2025

Hi @cyphar,

Thanks for the feedback. I might not be fully ready to address this yet. Please allow me some time to review the AI-generated code before I follow up.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants