KEP-4671 Add docs for Workload API and Gang scheduling #53296

Merged

k8s-ci-robot merged 3 commits into kubernetes:dev-1.35 from macsko:gang_scheduling_docs

Nov 29, 2025

Member

macsko commented Nov 18, 2025

Description

This PR adds feature gates docs and a new Workload Aware Scheduling tab to the scheduling docs based on KEP-4671.

Issue

KEP: kubernetes/enhancements#4671

k8s-ci-robot added this to the 1.35 milestone

netlify bot commented Nov 18, 2025 (edited)

👷 Deploy Preview for kubernetes-io-vnext-staging processing.

Name	Link
🔨 Latest commit	`451e915`
🔍 Latest deploy log	https://app.netlify.com/projects/kubernetes-io-vnext-staging/deploys/692b4aaca680ec00086761ea

k8s-ci-robot added the language/en label

k8s-ci-robot requested review from katcosgrove and shannonxtreme

November 18, 2025 11:51

k8s-ci-robot added cncf-cla: yes size/L labels

macsko mentioned this pull request

Placeholder for KEP4671 #52898

Closed

Member Author

macsko commented Nov 18, 2025

/cc @dom4ha @sanposhiho @wojtek-t @erictune

k8s-ci-robot requested review from dom4ha, erictune, sanposhiho and wojtek-t

November 18, 2025 11:51

netlify bot commented Nov 18, 2025 (edited)

✅ Pull request preview available for checking

Built without sensitive environment variables

Name	Link
🔨 Latest commit	`451e915`
🔍 Latest deploy log	https://app.netlify.com/projects/kubernetes-io-main-staging/deploys/692b4aac8be0c90008cb505f
😎 Deploy Preview	https://deploy-preview-53296--kubernetes-io-main-staging.netlify.app

erictune approved these changes

Contributor

erictune left a comment

Text looks good.

Should the new concepts page be linked from somewhere?

lmktfy reviewed

Member

lmktfy left a comment (edited)

Thanks for the PR.

Because Pod is a stable API, you also need to update the Pod documentation. You need to do this work even though the new APIs are only alpha.
Explain that the behavior of Pod depends on whether the reader, a cluster administrator, has or has not enabled the relevant feature gates.

Watch out for putting new documentation in one page. It's tempting to do that because what you are documenting is part of one package of improvements; however, readers learn about different elements of Kubernetes in different pages, and these improvements touch on several of those (not just scheduling).

I would put most of the new content into the Workloads section of the docs, for example by adding a section about Pod groups, at one of:
• https://kubernetes.io/docs/concepts/workloads/pod-groups/
• https://kubernetes.io/docs/concepts/workloads/pods/groups/

(I prefer the former, personally; PodGroup is an API separate from Pod).

Gang scheduling, however, I would place at
• https://kubernetes.io/docs/concepts/scheduling-eviction/gang-scheduling/

You can also, either for alpha or beta, work with SIG Docs to add a new tutorial. If you do, various other pages can and should link there.

Resolved review threads (all marked outdated):
• content/en/docs/concepts/scheduling-eviction/workload-aware-scheduling.md (3 threads)
• content/en/docs/reference/command-line-tools-reference/feature-gates/GangScheduling.md (1 thread)
• content/en/docs/reference/command-line-tools-reference/feature-gates/GenericWorkload.md (2 threads)

content/en/docs/concepts/scheduling-eviction/workload-aware-scheduling.md Outdated

    
              spec:
                # controllerRef provides a link to the object that manages this Workload,
                # such as a Kubernetes Job. This is for tooling and observability.
                controllerRef:

Member

lmktfy Nov 18, 2025

we may need to explain the difference between "the Job controller" (which is a controller) and "a Job" (which represents a desired and observed state that the Job controller operates on)

content/en/docs/concepts/scheduling-eviction/workload-aware-scheduling.md Outdated

    
              because no single node has enough capacity for them. The job cannot run,
              but the scheduled Pods waste expensive resources that other applications could use.
              Workload Aware Scheduling introduces a mechanism for the scheduler to identify and manage a group of Pods as a single, atomic workload.

Member

lmktfy Nov 18, 2025

Aim to write the documentation mostly as if the feature is already generally available, and then garnish it with caveats about it actually being alpha.

Member

lmktfy Nov 18, 2025

Good documentation is often timeless

lmktfy reviewed

content/en/docs/concepts/scheduling-eviction/workload-aware-scheduling.md Outdated

Member

lmktfy Nov 18, 2025

I wouldn't add this file at all.

Member

lmktfy commented Nov 18, 2025

/sig scheduling node

k8s-ci-robot added sig/scheduling sig/node labels

github-project-automation bot added this to SIG Scheduling and SIG Node: code and documentation PRs

github-project-automation bot moved this to Triage in SIG Node: code and documentation PRs

lmktfy reviewed

content/en/docs/concepts/scheduling-eviction/workload-aware-scheduling.md Outdated

    
              ## What is Workload Aware Scheduling?

              The default Kubernetes scheduler makes decisions for one Pod at a time. This model works sufficiently well for stateless applications,

Member

lmktfy Nov 18, 2025

This isn't exactly true. The default scheduler's behavior, at the time this doc is live, depends on whether you have enabled the GangScheduling feature gate.

v1.35 K8s will, of course, support gang scheduling (as alpha), in-tree.

helayoty moved this to Needs Review in SIG Scheduling

Member Author

macsko commented Nov 20, 2025

@lmktfy thank you for your valuable review. Just to be on the same page:

> Because Pod is a stable API, you also need to update the Pod documentation.

What Pod documentation are you referring to? Are you talking about mentioning WorkloadReference somewhere in the “https://kubernetes.io/docs/concepts/workloads/pods/” section, or somewhere else?

> I would put most of the new content into the Workloads section of the docs

So I should split the documentation page into two parts: move the part about the PodGroups to https://kubernetes.io/docs/concepts/workloads/pods-groups/, and the part about Gang Scheduling to https://kubernetes.io/docs/concepts/scheduling-eviction/gang-scheduling/, right? Should I describe the part about (whole) Workload API in the PodGroups docs or somewhere else?

> You can also, either for alpha or beta, work with SIG Docs to add a new tutorial.

Good idea, let's do that for the beta.

Member

lmktfy commented Nov 20, 2025 (edited)

> What Pod documentation are you referring to? Are you talking about mentioning WorkloadReference somewhere in the “https://kubernetes.io/docs/concepts/workloads/pods/” section, or somewhere else?

Yes, when I talk about the documentation for the Pod API, I mean https://kubernetes.io/docs/concepts/workloads/pods/ and contents. There is also an API reference, but we generate that from the OpenAPI.

You will need to update Pod to tell people that Pods can be put into groups.

> So I should split the documentation page into two parts: move the part about the PodGroups to https://kubernetes.io/docs/concepts/workloads/pods-groups/, and the part about Gang Scheduling to https://kubernetes.io/docs/concepts/scheduling-eviction/gang-scheduling/, right? Should I describe the part about (whole) Workload API in the PodGroups docs or somewhere else?

Yes, that's the split, but I might (only might) document the Workload API in its own section / page, somewhere within https://kubernetes.io/docs/concepts/workloads/.
You need to make a call on that. I don't have a sense of whether https://kubernetes.io/docs/concepts/workloads/pods-groups/ is a good home for describing Workload, but once there is a new draft, we can offer feedback.

Contributor

Urvashi0109 commented Nov 21, 2025

Hello @macsko 👋! I'm reaching out from the Docs team. Just checking in as we approach Docs Freeze on 3rd December 2025, 12:00 UTC.
This documentation appears to still be under review. To meet the Docs Freeze, this PR must have a technical review as well as lgtm and approve labels applied, without any unaddressed comments or concerns from SIG Docs. The status of this enhancement is marked as at risk for docs freeze. Thank you!

macsko mentioned this pull request

Gang Scheduling Support in Kubernetes kubernetes/enhancements#4671

Open

13 tasks

macsko force-pushed the gang_scheduling_docs branch from 65f0e63 to 71fb0d6 Compare

November 26, 2025 13:54

Member Author

macsko commented Nov 26, 2025

@lmktfy I've updated the docs based on your comments. PTAL whether the current structure makes sense.

macsko force-pushed the gang_scheduling_docs branch 2 times, most recently from bb4f3f2 to bdb4b70 Compare

November 26, 2025 14:28

wojtek-t reviewed

Member

wojtek-t left a comment

Just one minor comment - other than that it LGTM from technical POV.

content/en/docs/concepts/workloads/pods/workload-reference.md (outdated, resolved)


          KEP-4671 Add docs for Workload API and Gang scheduling

fda060d

macsko force-pushed the gang_scheduling_docs branch from bdb4b70 to fda060d Compare

November 26, 2025 15:44

Member

wojtek-t commented Nov 27, 2025

/lgtm

LGTM from technical POV.

k8s-ci-robot assigned wojtek-t

k8s-ci-robot added the lgtm label

Contributor

k8s-ci-robot commented Nov 27, 2025

LGTM label has been added.

Details

Git tree hash: f7a8e4c3bbe59bde46e52c28e0686c51db66d358

dom4ha reviewed

Member

dom4ha left a comment

Maciek, very well written, so LGTM from me.
I have just a few minor suggestions.

content/en/docs/reference/command-line-tools-reference/feature-gates/GenericWorkload.md Outdated

    
              ---
              Enables the support for [Workload API](/docs/concepts/workloads/workload-api/) to express scheduling requirements
              at the workload level. Pods can now reference a specific Workload PodGroup using the spec.workloadRef field.

Member

dom4ha Nov 27, 2025

Suggested change

            - at the workload level. Pods can now reference a specific Workload PodGroup using the spec.workloadRef field.
            + at the workload level. Pods can now reference a specific Workload PodGroup they belong to using the spec.workloadRef field.

content/en/docs/concepts/workloads/_index.md

    
              The [Workload API](/docs/concepts/workloads/workload-api/) allows you to define a group of Pods
              and apply advanced scheduling policies to them, such as [gang scheduling](/docs/concepts/scheduling-eviction/gang-scheduling/).
              This is particularly useful for batch processing and machine learning workloads
              where "all-or-nothing" placement is required.

Member

dom4ha Nov 27, 2025

Suggested change

            - where "all-or-nothing" placement is required.
            + where "all-or-nothing" scheduling is required.

Member

lmktfy Nov 29, 2025

I think placement is OK, TBH.

content/en/docs/concepts/workloads/workload-api/policies.md

    
              ### Gang policy

              The `gang` policy enforces "all-or-nothing" scheduling. This is essential for tightly-coupled workloads
              where partial startup results in deadlocks or wasted resources.

Member

dom4ha Nov 28, 2025

Suggested change

            - where partial startup results in deadlocks or wasted resources.
            + need a group of Pods to be scheduled simultaneously to function correctly. Partial startup results in resource waste and may even lead to deadlocks.

Member

lmktfy Nov 29, 2025

We can still update the merged docs, even after docs freeze

The key thing about the deadline is that we must have docs that are at least good enough ahead of the upcoming release.

content/en/docs/concepts/scheduling-eviction/gang-scheduling.md

    
              2. Once the quorum is met, the scheduler attempts to find placements for all Pods in the group.
                 All assigned Pods wait at the `WaitOnPermit` gate during this process.
                 Note that in the Alpha phase of this feature, finding a placement is based on pod-by-pod scheduling,
                 rather than a single-cycle approach.

Member

dom4ha Nov 28, 2025

Suggested change

            -    rather than a single-cycle approach.
            +    rather than a more sophisticated logic capable of scheduling all required pods at once.

Member

lmktfy Nov 29, 2025

We can still update the merged docs, even after docs freeze

The key thing about the deadline is that we must have docs that are at least good enough ahead of the upcoming release.
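
For readers following the gang-scheduling step quoted in this thread, the all-or-nothing idea can be sketched as a toy Python model. This is not the kube-scheduler implementation: `Node`, `try_place_gang`, `min_count`, and the first-fit loop over CPU units are hypothetical names and simplifications chosen for illustration. It only mirrors the behavior described in the quoted text: placements are searched pod by pod, and nothing is kept unless the whole quorum fits.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    free_cpu: int  # free CPU units (hypothetical, simplified resource model)


def try_place_gang(pod_cpu_requests, nodes, min_count):
    """Return {pod_index: node_name} if at least min_count pods fit, else None.

    Pods are placed one at a time (first fit) against a scratch copy of the
    free capacity, so a failed attempt reserves nothing on the real nodes.
    """
    if len(pod_cpu_requests) < min_count:
        return None  # quorum not met: too few pods exist yet
    free = {n.name: n.free_cpu for n in nodes}  # scratch copy of capacity
    placement = {}
    for i, cpu in enumerate(pod_cpu_requests):
        for name in free:
            if free[name] >= cpu:
                free[name] -= cpu
                placement[i] = name
                break
    if len(placement) < min_count:
        return None  # all-or-nothing: discard the partial placement
    return placement


nodes = [Node("node-a", 4), Node("node-b", 4)]
ok = try_place_gang([2, 2, 2, 2], nodes, min_count=4)
too_big = try_place_gang([3, 3, 3], nodes, min_count=3)
```

In this sketch the second call fails because only two of the three 3-CPU pods fit, so no pod is placed at all, which is the waste-avoiding property the reviewers discuss.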

lmktfy reviewed

content/en/docs/concepts/workloads/pods/workload-reference.md

    
              If a Pod references a Workload that does not exist, or a pod group that is not defined within that Workload,
              the Pod will remain pending. It is not considered for placement until you create the missing Workload object
              or recreate it to include the missing `PodGroup` definition.

Member

lmktfy Nov 29, 2025 (edited)

For beta, try for this:

Suggested change

            - or recreate it to include the missing `PodGroup` definition.
            + or recreate it to include the missing pod group definition.

Member

lmktfy Nov 29, 2025

We can still update the merged docs, even after docs freeze

The key thing about the deadline is that we must have docs that are at least good enough ahead of the upcoming release.
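
The pending rule quoted in this thread (a Pod stays pending while its Workload, or the pod group inside it, is missing) can be sketched as a small lookup. This is an illustrative Python model only: the dict shapes, the `name` and `podGroup` keys, and the function `workload_ref_status` are assumptions made for the example, not the API schema or scheduler code. The "no-reference" branch is likewise an assumption that a Pod without a reference is handled as an ordinary Pod.

```python
def workload_ref_status(pod, workloads):
    """Classify a Pod's workload reference.

    pod: dict with an optional "workloadRef" of {"name": ..., "podGroup": ...}
         (hypothetical shape for illustration).
    workloads: {workload_name: set of pod group names defined in it}.
    """
    ref = pod.get("workloadRef")
    if ref is None:
        return "no-reference"  # assumed: treated as an ordinary Pod
    groups = workloads.get(ref["name"])
    if groups is None:
        return "pending: workload missing"
    if ref["podGroup"] not in groups:
        return "pending: pod group missing"
    return "eligible"


workloads = {"training": {"workers"}}
status_ok = workload_ref_status(
    {"workloadRef": {"name": "training", "podGroup": "workers"}}, workloads
)
status_no_group = workload_ref_status(
    {"workloadRef": {"name": "training", "podGroup": "drivers"}}, workloads
)
status_no_workload = workload_ref_status(
    {"workloadRef": {"name": "inference", "podGroup": "workers"}}, workloads
)
```

Creating the missing entry in `workloads` (or adding the missing group name) is what moves a Pod from one of the pending states to "eligible" in this model.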

lmktfy reviewed

content/en/docs/reference/command-line-tools-reference/feature-gates/GenericWorkload.md (outdated, resolved)


          Tweak wording for GenericWorkload feature gate

b15d84a

k8s-ci-robot removed the lgtm label

k8s-ci-robot requested a review from wojtek-t

November 29, 2025 19:31

lmktfy reviewed

content/en/docs/concepts/workloads/workload-api/policies.md

Member

lmktfy Nov 29, 2025

nit: should update https://kubernetes.io/docs/concepts/policy/ to hyperlink here

lmktfy reviewed

content/en/docs/concepts/workloads/workload-api/_index.md (outdated, resolved)


          Tweak advice about API group for Workload

451e915

lmktfy reviewed

Member

lmktfy left a comment

/lgtm
/approve

k8s-ci-robot assigned lmktfy

k8s-ci-robot added the lgtm label

Contributor

k8s-ci-robot commented Nov 29, 2025

LGTM label has been added.

Details

Git tree hash: 739c4b88c894c8d06cfe33d52e02f5f5444fe469

Contributor

k8s-ci-robot commented Nov 29, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: erictune, lmktfy

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details

Needs approval from an approver in each of these files:

~~content/en/docs/OWNERS~~ [lmktfy]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

k8s-ci-robot added the approved label

k8s-ci-robot merged commit 2a40dd2 into kubernetes:dev-1.35

2 checks passed

github-project-automation bot moved this from Needs Review to Done in SIG Scheduling

github-project-automation bot moved this from Triage to Done in SIG Node: code and documentation PRs

Reviewers

• lmktfy: left review comments
• katcosgrove: awaiting requested review
• shannonxtreme: awaiting requested review
• sanposhiho: awaiting requested review
• wojtek-t: awaiting requested review
• dom4ha: left review comments
• erictune: approved these changes

Labels

approved cncf-cla: yes language/en lgtm sig/node sig/scheduling size/L