Skip to content

Conversation

@helayoty
Copy link
Member

/sig scheduling
/sig apps
/stage alpha

  • Other comments:

@k8s-ci-robot k8s-ci-robot added stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. sig/apps Categorizes an issue or PR as relevant to SIG Apps. labels Sep 19, 2025
@github-project-automation github-project-automation bot moved this to Needs Triage in SIG Apps Sep 19, 2025
@k8s-ci-robot k8s-ci-robot added cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory labels Sep 19, 2025
@k8s-ci-robot k8s-ci-robot added the size/L Denotes a PR that changes 100-499 lines, ignoring generated files. label Sep 19, 2025
@helayoty helayoty moved this from Needs Triage to In Progress in SIG Apps Sep 19, 2025
@helayoty helayoty moved this to In Progress in SIG Scheduling Sep 19, 2025
@kannon92
Copy link
Contributor

To save paperwork, why would we not include this as part of the gang scheduling KEP?

These KEPs would need to go in tandem and if one goes without the other we would have to revert.

@helayoty
Copy link
Member Author

helayoty commented Sep 19, 2025

To save paperwork, why would we not include this as part of the gang scheduling KEP?

These KEPs would need to go in tandem and if one goes without the other we would have to revert.

After discussion with SIG Apps, they explicitly requested the changes to the Job API (or any other API) be handled through a dedicated KEP. This aligns with how API changes are typically tracked, even if the field is optional and tied to a feature gate shared with the broader gang scheduling effort.

To keep the implementations in sync and avoid divergence, we’re planning to use the same feature gate as the gang scheduling KEP. This way, the feature behaves atomically at runtime (one can’t go live without the other), even though it’s tracked in two KEPs for process and ownership clarity.

I've added link to both KEPs to each other and clearly document the dependency if that helps reviewers track the connection.

@helayoty helayoty force-pushed the helayoty/workload-job-api branch from 61fdfb2 to 6e6e5fb Compare September 19, 2025 20:21
@helayoty helayoty force-pushed the helayoty/workload-job-api branch 2 times, most recently from c33947a to a22d09c Compare September 30, 2025 12:12
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: helayoty
Once this PR has been reviewed and has the lgtm label, please assign soltysh for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@helayoty helayoty force-pushed the helayoty/workload-job-api branch 4 times, most recently from 635181f to fe7e012 Compare September 30, 2025 12:32
Signed-off-by: Heba Elayoty <heelayot@microsoft.com>
@helayoty helayoty force-pushed the helayoty/workload-job-api branch from fe7e012 to b9f9156 Compare September 30, 2025 14:47
@helayoty helayoty changed the title [WIP] KEP-5547: Add workloadRef in the Job API KEP-5547: Add workloadRef in the Job API Sep 30, 2025
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 30, 2025
Copy link
Contributor

@soltysh soltysh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've left several comments, but the biggest one that's currently blocking the further progress on this document is the one about workloadRef vs the workload field in pod.spec. I believe before going further with this proposal we should identify the direction in the light of future evolution of [KEP-4671](https://github.com/kubernetes/enhancements/pull/5558]

@@ -0,0 +1,3 @@
kep-number: 5547
alpha:
approver: "@soltysh"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on PRR dashboard this will be done by @johnbelamaric . Ideally we should avoid having SIG approver and PRR approver be the same person. So please change accordingly.

- [ ] (R) Production readiness review approved
- [ ] "Implementation History" section is up-to-date for milestone
- [ ] User-facing documentation has been created in [kubernetes/website], for publication to [kubernetes.io]
- [ ] Supporting documentation—e.g., additional design documents, links to mailing list discussions/SIG meetings, relevant PRs/issues, release notes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check the following boxes:

  • Design details are appropriately documented
  • Test plan is in place...
  • Production readiness review completed
  • Implementation History...


### Goals

- Introduce a new optional `workloadRef` field in the `JobSpec`, allowing a Job to declare an explicit association with a higher-level workload object.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've asked similar question in the gang scheduling KEP, where there's a proposal to expand pod.spec with workload information. This means Job's pod template will have the ability to set this field as well, why do we want to introduce workloadRef in that case?

Basically with that new workload field in pod.spec an example Job could look like:

spec:
  template:
    spec:
+      workload:        
+        name: job-1    
...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that other doc, there is a statement discussing potential future evolution where a controller would infer intentions from a workload and create the Workload resource accordingly. Although I've expressed some concerns around that idea.

Copy link
Member Author

@helayoty helayoty Oct 1, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Linking to my reply on the main KEP so we have one place for the whole conversation.
#5558 (comment)

- Will enabling / disabling the feature require downtime of the control
plane?
- Will enabling / disabling the feature require downtime or reprovisioning
of a node?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmmm... I don't think re-using the same feature gate as KEP-4671 is advisable. I'd expect separate one just for that field. You can express that this FG relies on the other ones, we've done that in the past (see most of DRA).

owning-sig: sig-apps
participating-sigs:
- sig-scheduling
- sig-apps
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: no need to mention sig-apps again.

disable-supported: true

metrics:
- TDB
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

job_sync_duration_seconds as mentioned in SLI section?

components:
- kube-apiserver
- kube-scheduler
- kube-controller-manager
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section must match what you have in the main document.

milestone:
alpha: "v1.35"
beta: TDB
stable: TDB
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

typo: TBD


## Summary

Introduce a new optional field in the Job API spec to explicitly associate a Job with a Workload object, enabling safe coordination between workload-aware (Gang) scheduling and job controllers without introducing race conditions or forcing the scheduler to perform controller duties.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that Workload is widely used in SIG Apps, i.e. core workloads API, suggest briefly explaining and clarifying what you mean by Workload here.


NA

[^1]: The Kubernetes community uses the term "gang scheduling" to mean "all-or-nothing scheduling of a set of pods" [1,2,3,4,5,6,7,8,9,10,11,12,13]. In the Kubernetes context, it does not imply time-multiplexing (in contrast to prior academic work such as [Feitelson and Rudolph](https://doi.org/10.1016/0743-7315(92)90014-E), and in contrast to [Slurm Gang Scheduling](https://slurm.schedmd.com/gang_scheduling.html)).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does "1,2,3,4,5,6,7,8,9,10,11,12,13" mean?

@janetkuo
Copy link
Member

janetkuo commented Oct 1, 2025

/hold

Holding to ensure that this is merged after #5558 is merged

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Oct 1, 2025
@soltysh
Copy link
Contributor

soltysh commented Oct 8, 2025

@helayoty after all the discussions that happened in https://github.com/kubernetes/enhancements/pull/5558/files#r2410167467 I believe we can remove this KEP from 1.35. Eventual future changes to Job API will happen at a later stage, but we're not sure what those fields are, yet. So either we can close #5547 entirely, or you'll rename it to something more appropriate like "WAS: extend Job API to support workload" and we'll pick it up once we know what we actually want to achieve.

@soltysh
Copy link
Contributor

soltysh commented Oct 13, 2025

Based on my previous comment, and discussions with @helayoty we're not going to pursuit this option for 1.35. So I'm going to close this PR and will opt-out the feature from 1.35 entirely. We can revisit it at a later point in time, and based on the work happening in #4671 we'll reconsider our options.

/close

@k8s-ci-robot
Copy link
Contributor

@soltysh: Closed this PR.

Details

In response to this:

Based on my previous comment, and discussions with @helayoty we're not going to pursuit this option for 1.35. So I'm going to close this PR and will opt-out the feature from 1.35 entirely. We can revisit it at a later point in time, and based on the work happening in #4671 we'll reconsider our options.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.

@github-project-automation github-project-automation bot moved this from In Progress to Closed in SIG Apps Oct 13, 2025
@helayoty helayoty moved this from In Progress to Closed in SIG Scheduling Oct 16, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/apps Categorizes an issue or PR as relevant to SIG Apps. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. stage/alpha Denotes an issue tracking an enhancement targeted for Alpha status

Projects

Archived in project
Archived in project

Development

Successfully merging this pull request may close these issues.

5 participants