
Cluster Autoscaler: align workload-level APIs with Karpenter #6648

Open · towca opened this issue Mar 21, 2024 · 13 comments

@towca (Collaborator) commented Mar 21, 2024

Which component are you using?: Cluster Autoscaler

Is your feature request designed to solve a problem? If so describe the problem this feature should solve.:

Recently, Karpenter officially joined sig-autoscaling, and we now have 2 Node autoscalers officially supported by Kubernetes.
Both autoscalers provide workload-level APIs that a workload owner can use to influence autoscaling behavior related to the workloads. Some of these APIs have identical semantics, but different naming. Because of this, workloads taking advantage of such APIs aren't portable between clusters using different autoscalers (e.g. in a multi-cloud setting).

Cluster Autoscaler provides the following workload-level APIs (an example manifest using them follows the list):

  • Configure a pod not to be disrupted by scale-down: cluster-autoscaler.kubernetes.io/safe-to-evict: false
  • Configure a pod to never block scale-down (even if it normally would): cluster-autoscaler.kubernetes.io/safe-to-evict: true
  • Configure a pod to not block scale-down because of specific local volumes (while other blocking conditions still apply): cluster-autoscaler.kubernetes.io/safe-to-evict-local-volumes: "volume-1,volume-2,.."
  • Configure a pod to delay triggering scale-up by some duration (e.g. to allow the scheduler more time to schedule the pod): cluster-autoscaler.kubernetes.io/pod-scale-up-delay: <duration>
  • Configure a DaemonSet pod to be/not be evicted during scale-down (regardless of the global CA setting controlling this behavior): cluster-autoscaler.kubernetes.io/enable-ds-eviction: true/false
  • Configure a non-DaemonSet pod to be treated like a DaemonSet pod by CA: cluster-autoscaler.kubernetes.io/daemonset-pod: true
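
For illustration, here is a minimal pod manifest using two of the annotations above. The annotation keys are the real ones from the list; the pod name, volume name, and chosen values are made up for the example:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-worker  # hypothetical name for the example
  annotations:
    # Don't let the "scratch" local volume block scale-down of this pod's node
    # (other blocking conditions still apply).
    cluster-autoscaler.kubernetes.io/safe-to-evict-local-volumes: "scratch"
    # Delay triggering a scale-up for this pod by 2 minutes.
    cluster-autoscaler.kubernetes.io/pod-scale-up-delay: "2m"
spec:
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9
      volumeMounts:
        - name: scratch
          mountPath: /scratch
  volumes:
    - name: scratch
      emptyDir: {}
```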

To my knowledge, right now Karpenter only provides the following workload-level API:

  • Configure a pod not to be disrupted by consolidation (i.e. scale-down): karpenter.sh/do-not-disrupt: true
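
This means that today, a workload owner who wants "don't disrupt this pod" semantics in both kinds of clusters has to set both vendor-specific annotations, e.g. (hypothetical pod name, annotation keys as listed above):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: stateful-worker  # hypothetical name for the example
  annotations:
    # Only Cluster Autoscaler understands this one.
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    # Only Karpenter understands this one.
    karpenter.sh/do-not-disrupt: "true"
spec:
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9
```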

Describe the solution you'd like.:

  1. Introduce a new API prefix for concepts related specifically to Node autoscaling: node-autoscaling.kubernetes.io. Going forward, any new APIs using this prefix would have to be approved by both CA and Karpenter owners. Note that this doesn't prevent the autoscalers from adding new autoscaler-specific APIs, but the goal should be to use the common prefix if possible.
  2. Add support for node-autoscaling.kubernetes.io/do-not-disrupt: true to CA and Karpenter (sketched after this list), while still honoring cluster-autoscaler.kubernetes.io/safe-to-evict: false and karpenter.sh/do-not-disrupt: true for backwards compatibility.
  3. Ideally we'd also add support for node-autoscaling.kubernetes.io/do-not-disrupt: false, mapping to safe-to-evict: true in CA. I'm not sure what the semantics would be in Karpenter (we'd need to check whether it has any consolidation-blocking conditions triggered by a pod).
  4. Align with Karpenter on whether they're interested in implementing any of the other workload-level APIs that CA uses, and if so, migrate them to the common API prefix as well.
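
As a sketch of point 2 (assuming the annotation name proposed above), the same intent could then be expressed portably with the common prefix, with the existing autoscaler-specific annotations still honored for backwards compatibility:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: stateful-worker  # hypothetical name for the example
  annotations:
    # Proposed common annotation, honored by both CA and Karpenter.
    node-autoscaling.kubernetes.io/do-not-disrupt: "true"
    # Existing autoscaler-specific annotations, kept for backwards compatibility.
    cluster-autoscaler.kubernetes.io/safe-to-evict: "false"
    karpenter.sh/do-not-disrupt: "true"
spec:
  containers:
    - name: app
      image: registry.k8s.io/pause:3.9
```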

Describe any alternative solutions you've considered.:

The CA/Karpenter alignment AEP also mentions aligning on Node-level APIs related to scale-down/consolidation. However, the scope of these APIs will likely be Node lifecycle altogether, not just Node autoscaling. IMO we shouldn't mix the two API prefixes together, and the Node-level API migration should be handled separately. Taking do-not-disrupt: true as an example: if we put it under the node-autoscaling.kubernetes.io prefix, all we need to guarantee is that the two supported autoscalers handle it correctly. If we were to put it under a broader node-lifecycle.kubernetes.io prefix, every component interacting with Node lifecycle through this API would have to honor it going forward, or break those expectations. Honoring do-not-disrupt: true might not be an option for certain components (e.g. a component upgrading Nodes under strict FedRAMP requirements has to violate it at some point), limiting the usefulness of that broader node-lifecycle API.

Additional context.:

@towca towca added area/cluster-autoscaler kind/feature Categorizes issue or PR as related to a new feature. area/core-autoscaler Denotes an issue that is related to the core autoscaler and is not specific to any provider. labels Mar 21, 2024
@towca towca self-assigned this Mar 21, 2024
@towca (Collaborator, Author) commented Mar 21, 2024

@MaciekPytel @gjtempleton @jonathan-innis I want to discuss this during the next sig meeting if possible, could you take a look?

@sftim (Contributor) commented Mar 25, 2024

BTW, if you're looking at the key prefix for annotations and / or labels, these things aren't called “API groups”. We use the term API group purely for resource kinds that you find within Kubernetes' HTTP API.

@sftim (Contributor) commented Mar 25, 2024

This feels accurate:
/retitle Align annotations and labels between Cluster Autoscaler and Karpenter

@k8s-ci-robot k8s-ci-robot changed the title Cluster Autoscaler: align workload-level APIs with Karpenter Align annotations and labels between Cluster Autoscaler and Karpenter Mar 25, 2024
@towca (Collaborator, Author) commented Mar 25, 2024

Good point about "API group" being a precisely defined term; I've changed it to "API prefix". Unless we have a name for such a concept as well?

I'm struggling to understand how the new title is accurate. "Aligning labels and annotations" could mean many things in the Cluster Autoscaler/Karpenter context, since labels and annotations are important for various parts of the logic (e.g. something around node templates would probably be my first guess for something related to "aligning labels and annotations"). "Workload-level APIs", on the other hand, should be pretty clear in Cluster Autoscaler/Karpenter context.

/retitle Cluster Autoscaler: align workload-level APIs with Karpenter

@k8s-ci-robot k8s-ci-robot changed the title Align annotations and labels between Cluster Autoscaler and Karpenter Cluster Autoscaler: align workload-level APIs with Karpenter Mar 25, 2024
@sftim (Contributor) commented Mar 25, 2024

If you mean the cluster-autoscaler.kubernetes.io in cluster-autoscaler.kubernetes.io/safe-to-evict-local-volumes @towca, I tend to call that a label name prefix or annotation name prefix. See https://kubernetes.io/docs/concepts/overview/working-with-objects/labels/#syntax-and-character-set for some more detail.

@elmiko (Contributor) commented Mar 25, 2024

thanks for bringing this up @towca, i think that if we make an api around this it will be beneficial to the wider community. i don't have super strong opinions on the naming part, but i think de-emphasizing the "autoscaling" part of it would be nice. that said, i like the distinction you call out about what would be expected from something with the node-lifecycle as opposed to the node-autoscaling part of its prefix. i would have thought node-lifecycle would be a little better, but i like your point about other lifecycle tooling then having to obey them.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 23, 2024
@towca (Collaborator, Author) commented Jun 24, 2024

The kubernetes/kubernetes#124800 PR defining the first common annotation is in review. The review has stalled a bit, so I bumped it for the reviewers during the sig-autoscaling meeting today.

@towca (Collaborator, Author) commented Jun 24, 2024

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 24, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 22, 2024
@k8s-triage-robot

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Oct 22, 2024
@towca (Collaborator, Author) commented Oct 29, 2024

/remove-lifecycle stale

@towca (Collaborator, Author) commented Oct 29, 2024

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Oct 29, 2024