|
| 1 | +# RawProc Option |
| 2 | + |
| 3 | +## Background |
| 4 | + |
| 5 | +Currently, Docker and most other container runtimes mask certain paths in |
| 6 | +`/proc` and set others as read-only. This prevents data that should not be |
| 7 | +visible from being exposed inside a container. However, there are certain |
| 8 | +use cases where it is necessary to turn this off. |
| 9 | + |
| 10 | +## Motivation |
| 11 | + |
| 12 | +For end users who would like to run unprivileged containers using user namespaces |
| 13 | +_nested inside_ CRI containers, we need a `RawProc` option: a way to |
| 14 | +explicitly turn off the masking and read-only settings on paths so that we can |
| 15 | +mount `/proc` in the nested container as an unprivileged user. |
| 16 | + |
| 17 | +Please see the following issues and pull request for more information: |
| 18 | +- [opencontainers/runc#1658](https://github.com/opencontainers/runc/issues/1658#issuecomment-373122073) |
| 19 | +- [moby/moby#36597](https://github.com/moby/moby/issues/36597) |
| 20 | +- [moby/moby#36644](https://github.com/moby/moby/pull/36644) |
| 21 | + |
| 22 | +Please also see the [use case for building images securely in Kubernetes](https://github.com/jessfraz/blog/blob/master/content/post/building-container-images-securely-on-kubernetes.md). |
| 23 | + |
| 24 | +This option only makes sense when a user is nesting unprivileged containers |
| 25 | +with user namespaces, because it exposes more information than is otherwise |
| 26 | +necessary to the program running in the container spawned by |
| 27 | +Kubernetes. |
| 28 | + |
| 29 | +The main use case for this option is to run |
| 30 | +[genuinetools/img](https://github.com/genuinetools/img) inside a Kubernetes |
| 31 | +container. That program then launches sub-containers that take advantage of |
| 32 | +user namespaces, re-mask `/proc`, and set `/proc` as read-only. Therefore |
| 33 | +there is no concern with having a raw proc open in the top-level container. |
| 34 | + |
| 35 | +Since the only use case for this option is to run unprivileged nested |
| 36 | +containers, |
| 37 | +this option should only be allowed if the user in the container is not `root`. |
| 38 | +Because the user inside is still unprivileged, |
| 39 | +modifying `/proc` would be off limits regardless; Linux user |
| 40 | +support already prevents this. |
| 41 | + |
| 42 | +## Existing SecurityContext objects |
| 43 | + |
| 44 | +Kubernetes defines `SecurityContext` for `Container` and `PodSecurityContext` |
| 45 | +for `PodSpec`. `SecurityContext` objects define the related security options |
| 46 | +for Kubernetes containers, e.g. SELinux options. |
| 47 | + |
| 48 | +To support the `rawProc` option in Kubernetes, the following changes are |
| 49 | +proposed: |
| 50 | + |
| 51 | +## Changes to SecurityContext objects |
| 52 | + |
| 53 | +Add a new `bool` type field named `rawProc` to the `SecurityContext` |
| 54 | +definition. |
| 55 | + |
| 56 | +By default, `rawProc` is `false`. |
| 57 | + |
| 58 | +The API will reject as invalid the combination of `rawProc=true` and `user=0/root`, since `rawProc` |
| 59 | +only makes sense if you want to nest unprivileged user namespaces. |
| 60 | + |
| 61 | +This means that no root user can exploit the unmasked, read-write paths in |
| 62 | +`/proc`; access is instead restricted by the already implemented Linux user |
| 63 | +support. |
| 64 | + |
| 65 | +This requires changes to the CRI runtime integrations so that the |
| 66 | +kubelet will set the corresponding runtime option (`raw_access`, or whatever it is ultimately named). |
| 67 | + |
| 68 | +## Pod Security Policy changes |
| 69 | + |
| 70 | +A new `bool` field named `allowRawProc` will also be added to the Pod |
| 71 | +Security Policy to gate whether a user is allowed to set the |
| 72 | +security context to `rawProc=true`. This field will default to |
| 73 | +`false`. |
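
The validation and policy rules proposed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the actual Kubernetes implementation: the struct types are trimmed-down, hypothetical stand-ins for the real API types, and the function names `ValidateRawProc` and `AdmitRawProc` are invented for this example. The proposal does not say how an unset user should be treated, so the sketch only rejects an explicit UID of 0.

```go
package main

import "errors"

// SecurityContext is a hypothetical, trimmed-down stand-in for the real
// Kubernetes type. Only the fields relevant to this proposal are shown.
type SecurityContext struct {
	RunAsUser *int64 // nil means the user was not set explicitly
	RawProc   bool   // proposed field; the zero value (false) is the default
}

// PodSecurityPolicySpec is likewise a hypothetical stand-in.
type PodSecurityPolicySpec struct {
	AllowRawProc bool // proposed field; the zero value (false) is the default
}

var (
	errRawProcRoot      = errors.New("rawProc=true is invalid when the container user is 0 (root)")
	errRawProcForbidden = errors.New("rawProc=true is not allowed by the pod security policy")
)

// ValidateRawProc rejects the combination of rawProc=true and user 0.
// Assumption: an unset RunAsUser is left to other checks, because the
// proposal does not specify that case.
func ValidateRawProc(sc SecurityContext) error {
	if !sc.RawProc {
		return nil
	}
	if sc.RunAsUser != nil && *sc.RunAsUser == 0 {
		return errRawProcRoot
	}
	return nil
}

// AdmitRawProc applies the policy gate first and then the API validation.
func AdmitRawProc(psp PodSecurityPolicySpec, sc SecurityContext) error {
	if sc.RawProc && !psp.AllowRawProc {
		return errRawProcForbidden
	}
	return ValidateRawProc(sc)
}
```

Under these assumptions, a container running as UID 1000 with `rawProc=true` passes only when the policy sets `allowRawProc=true`, while the same request as UID 0 is rejected regardless of the policy.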