Fallback to bind mount for sysfs in user namespaces #5016
This PR is the runc implementation for KEP kubernetes/enhancements#5607. The feature was discussed with runc maintainer @rata in the PR for that KEP and has @rata's acknowledgment.
When running a container within a user namespace (userns), `runc` currently fails to mount `/sys`. This is because a standard mount of the `sysfs` filesystem is a privileged operation. For a process that is root only within a new user namespace but is unprivileged on the host, the kernel correctly denies this request. This prevents containers from starting successfully in environments that rely on user namespaces, such as rootless containers.
This change implements a fallback mechanism to address this issue, aligning `runc`'s behavior with other runtimes like `crun` and improving support for user namespaces.

The corresponding code path in `crun`: