adding docs for node allocatable #2649
Conversation
Signed-off-by: Vishnu kannan <vishnuk@google.com>
docs/admin/node-allocatable.md
Outdated
### Kube Reserved

**Kubelet Flag**: `--kube-reserved=[cpu=100mi][,][memory=100Mi]`
100m,100Mi
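As the comment points out, CPU quantities use the `m` suffix (millicores), not `mi`. A hypothetical corrected invocation (values are illustrative, not defaults):

```shell
# Corrected syntax: "100m" = 100 millicores of CPU, "100Mi" = 100 mebibytes
# of memory. These amounts are example values, not kubelet defaults.
kubelet --kube-reserved=cpu=100m,memory=100Mi
```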
docs/admin/node-allocatable.md
Outdated
### Kube Reserved

**Kubelet Flag**: `--kube-reserved=[cpu=100mi][,][memory=100Mi]`
**Kubelet Flag**: `--kube-reserved-cgroup=/runtime.slice`
is this the name you are using in your images?
we should make clear that /runtime.slice is not the kubelet default value.
Good point on the defaults. I hope I have made the defaults clear this time around. PTAL
docs/admin/node-allocatable.md
Outdated
[This performance dashboard](http://node-perf-dash.k8s.io/#/builds) exposes `cpu` and `memory` usage profiles of `kubelet` and `docker engine` at multiple levels of pod density.
[This blog post](http://blog.kubernetes.io/2016/11/visualize-kubelet-performance-with-node-dashboard.html) explains how the dashboard can be interpreted to come up with a suitable `kube-reserved` reservation.

It is recommended that the kubernetes system daemons are placed under a top level control group (`system.slice` on systemd machines for example).
i think this text should be in the system reserved section.
you should have text specific to kube daemons here.
docs/admin/node-allocatable.md
Outdated
### System Reserved

**Kubelet Flag**: `--system-reserved=[cpu=100mi][,][memory=100Mi]`
**Kubelet Flag**: `--system-reserved-cgroup=/system.slice`
make clear this flag has no default.
we need to make clear that the kubelet doesn't create either of these two cgroups.
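Picking up on the comment above: since the kubelet does not create these cgroups, an admin would have to ensure they exist before pointing the flags at them. A rough sketch for a cgroup v1 node (paths and values are assumptions, not defaults):

```shell
# The kubelet does NOT create /system.slice (or /runtime.slice) itself.
# On a cgroup v1 node, the cgroup must exist under every enforced
# controller before the kubelet starts (illustrative setup):
sudo mkdir -p /sys/fs/cgroup/cpu/system.slice /sys/fs/cgroup/memory/system.slice

# Only then can --system-reserved-cgroup point at it (example values):
kubelet --system-reserved=cpu=100m,memory=100Mi \
        --system-reserved-cgroup=/system.slice
```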
docs/admin/node-allocatable.md
Outdated
Evictions are supported for `memory` and `storage` only.
By reserving some memory via `--eviction-hard` flag, the `kubelet` attempts to `evict` pods whenever memory availability on the node drops below the reserved value.
Hypothetically, if system daemons did not exist on a node, pods cannot use more than `capacity - eviction-hard`.
For this reason, resources reserved for evictions will not be available for pods.
to schedule against?
Scheduling is meant to be implicit since pods can be placed directly on nodes bypassing the scheduler
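The quoted text describes how reservations and eviction thresholds reduce what pods can use. A small sketch of the resulting arithmetic (all quantities are illustrative assumptions, in Mi):

```shell
# Sketch of the Node Allocatable computation described above:
#   Allocatable = Capacity - kube-reserved - system-reserved - eviction-hard
# All values below are made-up examples, not defaults.
capacity=32000
kube_reserved=1000
system_reserved=500
eviction_hard=100

allocatable=$((capacity - kube_reserved - system_reserved - eviction_hard))
echo "Allocatable memory: ${allocatable}Mi"   # prints: Allocatable memory: 30400Mi
```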
docs/admin/node-allocatable.md
Outdated
**Kubelet Flag**: `--enforce-node-allocatable=[pods][,][system-reserved][,][kube-reserved]`

The scheduler will treat `Allocatable` as the available `capacity` for pods.
i would remove the use of "will"-style phrasing in the document as we are describing the present in this doc.
The scheduler treats `Allocatable`...
ack
comments throughout.
docs/admin/node-allocatable.md
Outdated
The scheduler will treat `Allocatable` as the available `capacity` for pods.

`kubelet` will enforce `Allocatable` across pods by default.
This enforcement is controlled by specifying `pods` value to the kubelet flag `--enforce-node-allocatable`.
note that this is the default value.
Ack
docs/admin/node-allocatable.md
Outdated
However, Kubelet cannot burst and use up all available Node resources if `kube-reserved` is enforced.

Be extra careful while enforcing `system-reserved` reservation since it can lead to critical system services being CPU starved or OOM killed on the node.
The recommendation is to enforce `system-reserved` only if a user has profiled their nodes exhaustively to come up with precise estimates.
and is confident in their ability to recover if any item in that group is oom_killed.
docs/admin/node-allocatable.md
Outdated
* To begin with enforce `Allocatable` on `pods`.
* Once adequate monitoring and alerting is in place to track kube system daemons, attempt to enforce `kube-reserved` based on usage heuristics.
* If aboslutely necessary, enforce `system-reserved` over time.
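The phased rollout in the list above could translate into kubelet invocations roughly like these (flag values and cgroup names are illustrative assumptions, not defaults):

```shell
# Phase 1: enforce Allocatable only at the pods level ("pods" is the default).
kubelet --enforce-node-allocatable=pods

# Phase 2: once kube daemon usage is well understood, also enforce kube-reserved.
kubelet --enforce-node-allocatable=pods,kube-reserved \
        --kube-reserved=cpu=500m,memory=1Gi \
        --kube-reserved-cgroup=/runtime.slice

# Phase 3: only if absolutely necessary, additionally enforce system-reserved.
kubelet --enforce-node-allocatable=pods,kube-reserved,system-reserved \
        --system-reserved=cpu=200m,memory=500Mi \
        --system-reserved-cgroup=/system.slice
```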
typo on absolutely
@derekwaynecarr PTAL
Force-pushed from 94c3ef3 to a617a42
Signed-off-by: Vishnu kannan <vishnuk@google.com>
@kubernetes/sig-docs-maintainers this PR is meant for v1.6. Can I get a docs review?
one typo, and maybe some more clarifying text.
while not needed on this pr, i feel like this should be linked to from a central doc on how to administer a kubernetes node. maybe a doc team member can assist there.
docs/admin/node-allocatable.md
Outdated
Memory pressure at the node level leads to System OOMs which affects the entire node and all pods running on it.
Nodes can go offline temporarily until memory has been reclaimed.
To avoid (or reduce the probabilty) system OOMs kubelet provides [`Out of Resource`](./out-of-resource.md) management.
typo: probability
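The out-of-resource management mentioned in the quoted text is driven by eviction thresholds. A hypothetical example (the threshold value is an assumption; note the quoting, since `<` would otherwise be shell redirection):

```shell
# Evict pods when available memory drops below 500Mi (illustrative threshold).
kubelet --eviction-hard='memory.available<500Mi'
```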
docs/admin/node-allocatable.md
Outdated
The scheduler treats `Allocatable` as the available `capacity` for pods.

`kubelet` enforce `Allocatable` across pods by default.
This enforcement is controlled by specifying `pods` value to the kubelet flag `--enforce-node-allocatable`.
maybe explain what enforcement means? for example, by enforcing at this level, we ensure pods cannot consume more memory and cpu time than allocated?
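One way to answer the question above: enforcement means the kubelet writes `Allocatable` as resource limits on the pods-level cgroup, so pods in aggregate cannot consume more than that. A hypothetical inspection on a cgroup v1 node (cgroup paths are assumptions and vary by setup):

```shell
# With --enforce-node-allocatable=pods, the limits on the top-level pods
# cgroup reflect Allocatable; pods collectively cannot exceed them.
cat /sys/fs/cgroup/memory/kubepods/memory.limit_in_bytes
cat /sys/fs/cgroup/cpu/kubepods/cpu.shares
```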
Signed-off-by: Vishnu kannan <vishnuk@google.com>
@derekwaynecarr PTAL
/lgtm
User facing documentation for https://github.com/kubernetes/community/blob/master/contributors/design-proposals/node-allocatable.md#node-allocatable-resources
cc @derekwaynecarr @dashpole @dchen1107