-
Notifications
You must be signed in to change notification settings - Fork 5.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add support for no_new_privs
via allowPrivilegeEscalation
#639
Conversation
Create no-new-privs support proposal doc under contributors/design-proposals.
a77485c
to
2fc349c
Compare
no_new_privs
no_new_privs
via allowPrivilegeEscalation
Add a new bool type field named `allowPrivilegeEscalation` to both `SecurityContext` | ||
definition and `PodSecurityContext` definition: | ||
|
||
* For example, `allowPrivilegeEscalation=true` in `PodSecurityContext` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you see the pod level option being used when a container level option is also available?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As a fast override for all pods of they know they will need it, not sure how common that will be, is it standard to have both pod and container options or can we just do one, seems a bit inconsistent currently so open to whatever feels best
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, it's a bit inconsistent currently. More knobs confuses users though IMHO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I think just container override is good
|
||
- when a container is `privileged` | ||
- when `CAP_SYS_ADMIN` is added to a container | ||
- when a container is not run as root, uid `0` (to prevent breaking suid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bit unfortunate. We want users to not run as uid 0
ideally. Will cluster admins have an option to break setuid binaries if they want to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah they could explicitly set to false I guess but that behavior feels
weird since false is a default...
Maybe a Boolean for this is wrong
I thought about adding another option like "allowSetuid" I think you had mentioned it in the other thread as well, maybe there is a way to find one config toggle to cover everything that will also allow a sane default
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wish we knew how many people use suid binaries lol because I'd like to just have that be explicit, and not only set it when it's 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed. Maybe we can have a less secure default, but a more secure option that can be added using a single config change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could it be a kubelet config?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So I feel like you are combining "am I allowed to do this" and "what is the default " all into allowPrivilegeEscalation and I think that is going to cause a lot of confusion for users and add a lot of complexity to one Boolean causing us to potentially shoot ourselves in the foot trying to get a certain behavior. I suggest the following if we absolutely need a gate for "am I allowed to set the securityContext" then we should have a defaultAllowPrivilegeEscalation field AND a allowPrivilegeEscalation field, of course these could be named differently but I don't think we should bundle all this behavior into one field
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no problem with that if folks like it. We've done similar things with the annotation fields that were added. Gives control at both levels which is the only thing I care about.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with defaultAllow* and allow* (two fields, one to control the default, and one to describe whether it is allowed). Naming wise, the double allow is confusing - I don't have a strong opinion, but we should ensure users don't get confused about which field does what. Since "privileged" as defined in PSP as "user is allowed to request privileged", we can probably be consistent and use allowPrivilegeEscalation
in the same sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
updated, thanks for your patience! I will be sure to include docs that make it explicit which field does what :) but makes sense to be consistent with privileged
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Don't have to thank me, you're the one who has to suffer through writing this :)
|
||
## Changes of SecurityContext objects | ||
|
||
Add a new bool type field named `allowPrivilegeEscalation` to the `SecurityContext` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*bool
, and then we can let PSP manage the defaulting behavior. Generally that avoids the problem you have below of introducing the new field - PSP is by definition opt in and allows us to control this.
On the kubelet, we can default allowPrivilegeEscalation to on for uid = 0 and uid unset, while on the apiserver the PSP rules can control defaulting. Generally, we say that PSP should always be defaulting so the API is as accurate as possible, which leaves the door open for someone not using PSP who doesn't specify those fields to have the kubelet impose the legacy default.
a way to override the default behavior so users _can_ set `no_new_privs` for | ||
uids that are not `0`. | ||
|
||
TODO: should this be another option? or should `allowPrivilegeEscalation` not |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
*bool
is probably required (user didn't set, vs user said false explicitly)
I think we can say "if you want this secure defaulting, use PSP" which we want to do anyway (regardless if we tighten the default policy for the cluster).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All other comments considered, LGTM.
for that container. | ||
|
||
`allowPrivilegeEscalation` in `SecurityContext` provides container level | ||
control of the `no_new_privs` flag and can override the default |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
override in both directions?
Addressed comments and added information on setting the default via Pod Security Policy, although it might need to be more verbose... thanks for the review!! EDIT: I made a table to show the behavior... hopefully it's clearer now :) |
adf8f66
to
7e86f0e
Compare
LGTM - @smarterclayton had most comments so he gets to merge when he's LGTM |
in a Pod Security Policy. | ||
This would allow users to set `allowPrivilegeEscalation=false`, overriding the | ||
kubelet default behavior of `allowPrivilegeEscalation=true` for containers | ||
whose uids are not 0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Most (all?) fields covered by the pod security policy have a setting to control the default, and also a setting to control the allowed values. Do we want to add something to say pods must run with allowPrivilegeEscalation=false
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry a but confused, the pod security policy does control the default at this point, maybe I did not explain well enough
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we must have been typing at the same time 😄 #639 (comment) is my take
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pweil-'s explanation cleared it up for me. Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
in a Pod Security Policy. | ||
This would allow users to set `allowPrivilegeEscalation=false`, overriding the | ||
kubelet default behavior of `allowPrivilegeEscalation=true` for containers | ||
whose uids are not 0. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pweil-'s explanation cleared it up for me. Thanks!
- seccomp2 as a non root user: requires `no_new_privs` | ||
- seccomp2 with dropped `CAP_SYS_ADMIN`: requires `no_new_privs` | ||
- ambient capabilities: requires `no_new_privs` | ||
- selinux transactions: bugs that were fixed documented [here](https://github.com/moby/moby/issues/23981#issuecomment-233121969) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
s/transactions/transitions/
@php-coder FYI |
3d96e91
to
ee9c7e7
Compare
|
||
- when a container is `privileged` | ||
- when `CAP_SYS_ADMIN` is added to a container | ||
- when a container is not run as root, uid `0` (to prevent breaking suid |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so, PSP has some mixed responsibilities (I deserve all the blame for that one). It is both "am I allowed to do it" and "what is the default value". Fields are governed by a boolean, an allowable set of values, or a strategy for complex actions. For booleans it takes on the behavior of "can you set it or not" while allowing the boolean field to default as normal (to false).
I liken this to the Privileged
setting. If you're allowed to use priv containers by PSP.Privileged
it will let you specifically request privileged=true
but will never default that value to true.
So in that vein, in the scenario of PSP.allowPrivilegeEscalation
being true would mean you can set the value in SecurityContext
if you want (or leave it nil to get kubelet defaulting). If PSP.allowPrivilegeEscalation
is false then it should deny any attempt to set it and ensure the kubelet never defaults it to allow escalation. It probably also means we should validate the PSP for conflicting settings: you can run priv or add the cap but not escalate...whammy!
I will also caveat that if I'm the lone hold out here I should be overridden. I'm very used to the way it works now and it's obviously confusing.
Signed-off-by: Jess Frazelle <acidburn@google.com>
No further comments from me, @pweil- you want to sign off? |
LGTM |
definition. | ||
|
||
By default, ie when `allowPrivilegeEscalation=nil`, we will set `no_new_privs=true` | ||
with the following exceptions: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Which cases are not covered by these exceptions? It seems like they cover all the cases (uid = 0, uid != 0).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if uid=0 then no_new_privs is true, if uid!=0 the no_new_privs is not set, does that tanble at the bottom clear it up for you... confused as to what you mean
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
confused as to what you mean
Perhaps I got confused because there are 2 fields named allowPrivilegeEscalation
and table bellow explains the 2nd in the PSP while here we're talking about SC.
By default, ie when
allowPrivilegeEscalation=nil
, we will setno_new_privs=true
with the following exceptions:
In other words, if allowPrivilegeEscalation=nil
, we will set no_new_privs=true
when SecurityContext.RunAsUser=nil
or SecurityContext.RunAsUser=0
.
All other cases are exceptions and no_new_privs
will be set to false
.
Is this correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes except if the container is run as privileged or adds CAP_SYS_ADMIN
It is the first case in the table... when allowPrivilegeEscalation=nil
Ok, LGTM. We may want to discuss in the API PR whether a strategy works better, but this is good enough to merge. Thanks Jess |
\o/ code timeeee
…On Thu, May 25, 2017 at 2:10 PM, Clayton Coleman ***@***.***> wrote:
Merged #639 <#639>.
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#639 (comment)>, or mute
the thread
<https://github.com/notifications/unsubscribe-auth/ABYNbLwfLeaIoCUciFtWgJKOgCzUkQjJks5r9cQrgaJpZM4Nf-a1>
.
--
Jessie Frazelle
4096R / D4C4 DD60 0D66 F65A 8EFC 511E 18F3 685C 0022 BFF3
pgp.mit.edu <http://pgp.mit.edu/pks/lookup?op=get&search=0x18F3685C0022BFF3>
|
Automatic merge from submit-queue (batch tested with PRs 49651, 49707, 49662, 47019, 49747) Add support for `no_new_privs` via AllowPrivilegeEscalation **What this PR does / why we need it**: Implements kubernetes/community#639 Fixes #38417 Adds `AllowPrivilegeEscalation` and `DefaultAllowPrivilegeEscalation` to `PodSecurityPolicy`. Adds `AllowPrivilegeEscalation` to container `SecurityContext`. Adds the proposed behavior to `kuberuntime`, `dockershim`, and `rkt`. Adds a bunch of unit tests to ensure the desired default behavior and that when `DefaultAllowPrivilegeEscalation` is explicitly set. Tests pass locally with docker and rkt runtimes. There are also a few integration tests with a `setuid` binary for sanity. **Release note**: ```release-note Adds AllowPrivilegeEscalation to control whether a process can gain more privileges than it's parent process ```
## Existing SecurityContext objects | ||
|
||
Kubernetes defines `SecurityContext` for `Container` and `PodSecurityContext` | ||
for `PodSpec`. `SecurityContext` objects define the related security options |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BTW why we didn't implement allowPrivilegeEscalation
for PodSecurityContext
?
|
||
The API will reject as invalid `privileged=true` and | ||
`allowPrivilegeEscalation=false`, as well as `capAdd=CAP_SYS_ADMIN` and | ||
`allowPrivilegeEscalation=false.` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jessfraz I don't see that this was implemented. Do we still need this check or we don't?
Automatic merge from submit-queue (batch tested with PRs 49651, 49707, 49662, 47019, 49747) Add support for `no_new_privs` via AllowPrivilegeEscalation **What this PR does / why we need it**: Implements kubernetes/community#639 Fixes #38417 Adds `AllowPrivilegeEscalation` and `DefaultAllowPrivilegeEscalation` to `PodSecurityPolicy`. Adds `AllowPrivilegeEscalation` to container `SecurityContext`. Adds the proposed behavior to `kuberuntime`, `dockershim`, and `rkt`. Adds a bunch of unit tests to ensure the desired default behavior and that when `DefaultAllowPrivilegeEscalation` is explicitly set. Tests pass locally with docker and rkt runtimes. There are also a few integration tests with a `setuid` binary for sanity. **Release note**: ```release-note Adds AllowPrivilegeEscalation to control whether a process can gain more privileges than it's parent process ```
Add support for `no_new_privs` via allowPrivilegeEscalation
Carried from #180 by @xingzhou closes #180
Addresses kubernetes/kubernetes#38417
Addresses comments left on #180 with regard to interactions of
no_new_privs
and other existing linux primitives, including suid binaries.Changes the proposal to default to
no_new_privs
being enabled, with exceptions documented so suid binaries and existing privileged/cap_sys_admin containers do not break.It renames the initial proposal to
allowPrivilegeEscalation
Also if anyone is worried about the whole "defaulting to true" thing, when we turned on seccomp by default in docker, I tested all the public Dockerfiles on github, we can do the same for this.... just to see what breaks ;)