@macsko macsko commented Nov 18, 2025

Description

This PR adds feature gates docs and a new Workload Aware Scheduling tab to the scheduling docs based on KEP-4671.

Issue

KEP: kubernetes/enhancements#4671

@k8s-ci-robot k8s-ci-robot added this to the 1.35 milestone Nov 18, 2025

@k8s-ci-robot k8s-ci-robot added the language/en Issues or PRs related to English language label Nov 18, 2025
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. labels Nov 18, 2025
@macsko macsko mentioned this pull request Nov 18, 2025

Contributor

@erictune erictune left a comment

Text looks good.

Should the new concepts page be linked from somewhere?

Member

@lmktfy lmktfy left a comment

Thanks for the PR.

Because Pod is a stable API, you also need to update the Pod documentation. You need to do this work even though the new APIs are only alpha.
Explain that the behavior of Pod depends on whether the reader, a cluster administrator, has or has not enabled the relevant feature gates.


Watch out for putting new documentation in one page. It's tempting to do that because what you are documenting is part of one package of improvements; however, readers learn about different elements of Kubernetes in different pages, and these improvements touch on several of those (not just scheduling).

I would put most of the new content into the Workloads section of the docs, for example by adding a section about Pod groups, at one of:
https://kubernetes.io/docs/concepts/workloads/pod-groups/
https://kubernetes.io/docs/concepts/workloads/pods/groups/

(I prefer the former, personally; PodGroup is an API separate from Pod).

Gang scheduling, however, I would place at
https://kubernetes.io/docs/concepts/scheduling-eviction/gang-scheduling/

You can also, either for alpha or beta, work with SIG Docs to add a new tutorial. If you do, various other pages can and should link there.

spec:
  # controllerRef provides a link to the object that manages this Workload,
  # such as a Kubernetes Job. This is for tooling and observability.
  controllerRef:

Member

we may need to explain the difference between "the Job controller" (which is a controller) and "a Job" (which represents a desired and observed state that the Job controller operates on)

because no single node has enough capacity for them. The job cannot run,
but the scheduled Pods waste expensive resources that other applications could use.

Workload Aware Scheduling introduces a mechanism for the scheduler to identify and manage a group of Pods as a single, atomic workload.

Member

Aim to write the documentation mostly as if the feature is already generally available, and then garnish it with caveats about it actually being alpha.

Member

Good documentation is often timeless

Member

I wouldn't add this file at all.

Member

lmktfy commented Nov 18, 2025

/sig scheduling node

@k8s-ci-robot k8s-ci-robot added sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. sig/node Categorizes an issue or PR as relevant to SIG Node. labels Nov 18, 2025

## What is Workload Aware Scheduling?

The default Kubernetes scheduler makes decisions for one Pod at a time. This model works well enough for stateless applications,

Member

This isn't exactly true. The default scheduler's behavior, at the time this doc is live, depends on whether you have enabled the GangScheduling feature gate.

v1.35 K8s will, of course, support gang scheduling (as alpha), in-tree.

@helayoty helayoty moved this to Needs Review in SIG Scheduling Nov 19, 2025
Member Author

macsko commented Nov 20, 2025

@lmktfy thank you for your valuable review. Just to be on the same page:

Because Pod is a stable API, you also need to update the Pod documentation.

What Pod documentation are you referring to? Are you talking about mentioning WorkloadReference somewhere in the “https://kubernetes.io/docs/concepts/workloads/pods/” section, or somewhere else?

I would put most of the new content into the Workloads section of the docs

So I should split the documentation page into two parts: move the part about the PodGroups to https://kubernetes.io/docs/concepts/workloads/pods-groups/, and the part about Gang Scheduling to https://kubernetes.io/docs/concepts/scheduling-eviction/gang-scheduling/, right? Should I describe the part about (whole) Workload API in the PodGroups docs or somewhere else?

You can also, either for alpha or beta, work with SIG Docs to add a new tutorial.

Good idea, let's do that for the beta.

Member

lmktfy commented Nov 20, 2025

What Pod documentation are you referring to? Are you talking about mentioning WorkloadReference somewhere in the “https://kubernetes.io/docs/concepts/workloads/pods/” section, or somewhere else?

Yes, when I talk about the documentation for the Pod API, I mean https://kubernetes.io/docs/concepts/workloads/pods/ and contents. There is also an API reference, but we generate that from the OpenAPI.

You will need to update Pod to tell people that Pods can be put into groups.


So I should split the documentation page into two parts: move the part about the PodGroups to https://kubernetes.io/docs/concepts/workloads/pods-groups/, and the part about Gang Scheduling to https://kubernetes.io/docs/concepts/scheduling-eviction/gang-scheduling/, right? Should I describe the part about (whole) Workload API in the PodGroups docs or somewhere else?

Yes, that's the split, but I might (only might) document the Workload API in its own section / page, somewhere within https://kubernetes.io/docs/concepts/workloads/
You need to make a call on that. I don't have a sense of whether https://kubernetes.io/docs/concepts/workloads/pods-groups/ is a good home for describing Workload, but once there is a new draft, we can offer feedback.

@Urvashi0109
Contributor

Hello @macsko 👋! I'm reaching out from the Docs team. Just checking in as we approach Docs Freeze on 3rd December 2025, 12:00 UTC.
This documentation appears to still be under review. To meet the Docs Freeze, this PR must have a technical review as well as lgtm and approve labels applied, without any unaddressed comments or concerns from SIG Docs. The status of this enhancement is marked as at risk for docs freeze. Thank you!

Member Author

macsko commented Nov 26, 2025

@lmktfy I've updated the docs based on your comments. PTAL and let me know whether the current structure makes sense.

@macsko macsko force-pushed the gang_scheduling_docs branch 2 times, most recently from bb4f3f2 to bdb4b70 Compare November 26, 2025 14:28
Member

@wojtek-t wojtek-t left a comment

Just one minor comment - other than that it LGTM from technical POV.

@macsko macsko force-pushed the gang_scheduling_docs branch from bdb4b70 to fda060d Compare November 26, 2025 15:44
@wojtek-t
Member

/lgtm

LGTM from technical POV.

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 27, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: f7a8e4c3bbe59bde46e52c28e0686c51db66d358

Member

@dom4ha dom4ha left a comment

Maciek, very well written, so LGTM from me.
I have just a few minor suggestions.

---

Enables the support for [Workload API](/docs/concepts/workloads/workload-api/) to express scheduling requirements
at the workload level. Pods can now reference a specific Workload PodGroup using the spec.workloadRef field.

Member

Suggested change
- at the workload level. Pods can now reference a specific Workload PodGroup using the spec.workloadRef field.
+ at the workload level. Pods can now reference a specific Workload PodGroup they belong to using the spec.workloadRef field.

The [Workload API](/docs/concepts/workloads/workload-api/) allows you to define a group of Pods
and apply advanced scheduling policies to them, such as [gang scheduling](/docs/concepts/scheduling-eviction/gang-scheduling/).
This is particularly useful for batch processing and machine learning workloads
where "all-or-nothing" placement is required.

Member

Suggested change
- where "all-or-nothing" placement is required.
+ where "all-or-nothing" scheduling is required.

Member

I think placement is OK, TBH.

### Gang policy
The `gang` policy enforces "all-or-nothing" scheduling. This is essential for tightly-coupled workloads
where partial startup results in deadlocks or wasted resources.

Member

Suggested change
- where partial startup results in deadlocks or wasted resources.
+ need a group of Pods to be scheduled simultaneously to function correctly. Partial startup results in resource waste and may even lead to deadlocks.

Member

We can still update the merged docs, even after docs freeze

The key thing about the deadline is that we must have docs that are at least good enough ahead of the upcoming release.

2. Once the quorum is met, the scheduler attempts to find placements for all Pods in the group.
All assigned Pods wait at the `WaitOnPermit` gate during this process.
Note that in the Alpha phase of this feature, finding a placement is based on pod-by-pod scheduling,
rather than a single-cycle approach.
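
As an illustration of the step quoted above, here is a minimal Python sketch of "all-or-nothing" placement with a quorum check and pod-by-pod placement. It is not code from this PR or from kube-scheduler: the function name `schedule_gang`, the `min_count` quorum parameter, the CPU-only capacity model, and the rule that reservations are dropped when the quorum cannot be placed are all assumptions made for the example.

```python
def schedule_gang(pods, min_count, node_capacity):
    """Illustrative all-or-nothing placement; not the real kube-scheduler logic.

    pods: dict of pod name -> CPU request (hypothetical units)
    min_count: hypothetical quorum size for the gang
    node_capacity: dict of node name -> free CPU
    Returns {pod: node} if at least min_count Pods found a placement, else {}.
    """
    # Step 1: do nothing until enough Pods of the group exist (quorum).
    if len(pods) < min_count:
        return {}

    free = dict(node_capacity)  # work on a copy so a failed attempt reserves nothing
    waiting = {}                # Pods holding a tentative placement at the permit gate

    # Step 2: pod-by-pod placement, as described for the alpha phase.
    for pod, cpu in pods.items():
        node = next((n for n, c in free.items() if c >= cpu), None)
        if node is None:
            continue            # this Pod does not fit on any node right now
        free[node] -= cpu
        waiting[pod] = node

    # Assumption: the gate opens only if the quorum obtained placements.
    return waiting if len(waiting) >= min_count else {}
```

With three Pods requesting 2 CPUs each and `min_count=3`, the sketch places all of them on nodes with 4 and 2 free CPUs, but places none of them when only one 4-CPU node is available.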

Member

Suggested change
- rather than a single-cycle approach.
+ rather than a more sophisticated logic capable of scheduling all required pods at once.

Member

We can still update the merged docs, even after docs freeze

The key thing about the deadline is that we must have docs that are at least good enough ahead of the upcoming release.


If a Pod references a Workload that does not exist, or a pod group that is not defined within that Workload,
the Pod will remain pending. It is not considered for placement until you create the missing Workload object
or recreate it to include the missing `PodGroup` definition.
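
The rule quoted above can be sketched in Python as a simple reference check. This is a hedged illustration, not the scheduler's implementation: the thread only confirms the `spec.workloadRef` field, so the sub-fields `name` and `podGroup`, the `spec.podGroups` list, and the helper name `is_considered_for_placement` are assumptions for the example.

```python
def is_considered_for_placement(pod, workloads):
    """Return False while the Pod's Workload or pod group is missing.

    pod: dict shaped like a Pod manifest (field names below are assumed)
    workloads: dict of Workload name -> dict shaped like a Workload manifest
    """
    ref = pod.get("spec", {}).get("workloadRef")
    if ref is None:
        return True  # an ordinary Pod with no Workload reference

    workload = workloads.get(ref["name"])
    if workload is None:
        return False  # referenced Workload does not exist: Pod stays pending

    # Assumed layout: each pod group is an entry with a "name" key.
    group_names = {g["name"] for g in workload["spec"]["podGroups"]}
    return ref["podGroup"] in group_names


workloads = {"training": {"spec": {"podGroups": [{"name": "workers"}]}}}
pod = {"spec": {"workloadRef": {"name": "training", "podGroup": "workers"}}}
print(is_considered_for_placement(pod, workloads))
```

In this sketch the Pod becomes eligible again as soon as the `workloads` mapping contains the missing Workload or pod group, which mirrors the "create or recreate the Workload" remedy in the quoted text.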

Member

@lmktfy lmktfy Nov 29, 2025

For beta, try for this:

Suggested change
- or recreate it to include the missing `PodGroup` definition.
+ or recreate it to include the missing pod group definition.

Member

We can still update the merged docs, even after docs freeze

The key thing about the deadline is that we must have docs that are at least good enough ahead of the upcoming release.

@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 29, 2025

Member

nit: should update https://kubernetes.io/docs/concepts/policy/ to hyperlink here

Member

@lmktfy lmktfy left a comment

/lgtm
/approve

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 29, 2025
@k8s-ci-robot
Contributor

LGTM label has been added.

Git tree hash: 739c4b88c894c8d06cfe33d52e02f5f5444fe469

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: erictune, lmktfy

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 29, 2025
@k8s-ci-robot k8s-ci-robot merged commit 2a40dd2 into kubernetes:dev-1.35 Nov 29, 2025
2 checks passed
@github-project-automation github-project-automation bot moved this from Needs Review to Done in SIG Scheduling Nov 29, 2025