fix(operator): Use Patch to update TrainJob status#3009

Merged
google-oss-prow[bot] merged 1 commit into kubeflow:master from astefanutti:pr-41
Nov 27, 2025

@astefanutti
Contributor

What this PR does / why we need it:

Some status updates for a TrainJob can be missed when the owned JobSet updates rapidly, because the TrainJob status updates fail with conflicts.

This PR patches the TrainJob status instead of updating it, which avoids those conflicts.
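
To illustrate why a patch avoids the conflict, here is a minimal, self-contained sketch. It does not use the real Kubernetes or controller-runtime APIs: `store`, `object`, `updateStatus`, `patchStatus`, and `touch` are hypothetical stand-ins that simulate an API server enforcing optimistic concurrency through a resource version. It assumes the patch carries no resource version precondition, which is the usual behavior of a plain merge patch.

```go
package main

import (
	"errors"
	"fmt"
)

// errConflict stands in for the 409 Conflict an API server returns
// when an update carries a stale resourceVersion.
var errConflict = errors.New("conflict: object has been modified")

// object is a hypothetical, heavily simplified stand-in for a TrainJob.
type object struct {
	resourceVersion int
	status          string
}

// store simulates the API server holding a single object.
type store struct{ current object }

// get returns a copy of the stored object, as a controller would read it.
func (s *store) get() object { return s.current }

// updateStatus replaces the status, but only if the caller's copy is
// still the latest version (optimistic concurrency).
func (s *store) updateStatus(o object) error {
	if o.resourceVersion != s.current.resourceVersion {
		return errConflict
	}
	s.current.status = o.status
	s.current.resourceVersion++
	return nil
}

// patchStatus applies only the changed status field and carries no
// resourceVersion precondition, so a stale read does not make it fail.
func (s *store) patchStatus(status string) {
	s.current.status = status
	s.current.resourceVersion++
}

// touch simulates another writer modifying the object in between.
func (s *store) touch() { s.current.resourceVersion++ }

// isConflict reports whether err is the simulated conflict error.
func isConflict(err error) bool { return errors.Is(err, errConflict) }

func main() {
	s := &store{current: object{resourceVersion: 1, status: "Pending"}}

	stale := s.get() // the controller reads the object
	s.touch()        // a concurrent write bumps the version
	stale.status = "Running"

	// The update is rejected because the controller's copy is stale.
	fmt.Println("update conflict:", isConflict(s.updateStatus(stale)))

	// The patch is applied regardless of the stale read.
	s.patchStatus("Running")
	fmt.Println("status after patch:", s.get().status)
}
```

The trade-off is that a patch without a version precondition can overwrite a field another writer has just changed. That is usually acceptable for a status that the controller alone owns and recomputes on each reconcile.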

Checklist:

  • Docs included if any changes are user facing

Signed-off-by: Antonin Stefanutti <antonin@stefanutti.fr>
@coveralls
coveralls commented Nov 27, 2025

Pull Request Test Coverage Report for Build 19738925925

Details

  • 0 of 5 (0.0%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.04%) to 51.435%

Changes Missing Coverage                 Covered Lines   Changed/Added Lines   %
pkg/controller/trainjob_controller.go    0               5                     0.0%

Totals
Change from base Build 19735962968: -0.04%
Covered Lines: 1237
Relevant Lines: 2405

💛 - Coveralls

@astefanutti changed the title from "fix(trainer): Use Patch to update TrainJob status" to "fix(operator): Use Patch to update TrainJob status" on Nov 27, 2025
Comment on lines +140 to +141
// TODO(astefanutti): Consider using SSA once controller-runtime client has SSA support
// for sub-resources. See: https://github.com/kubernetes-sigs/controller-runtime/issues/3183
Member

AFAIK, SSA support for sub-resources has already been added: kubernetes-sigs/controller-runtime#3321

Is this comment still accurate?

Member

Oh, this has not been released yet.

@tenzen-y
Member

Thank you!

/lgtm
/approve

@google-oss-prow

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: tenzen-y

@google-oss-prow bot merged commit 588f2ae into kubeflow:master on Nov 27, 2025
28 of 30 checks passed
@google-oss-prow bot added this to the v2.2 milestone on Nov 27, 2025
@tenzen-y
Member

/cherry-pick release-2.1

@google-oss-robot

@tenzen-y: new pull request created: #3012

@astefanutti deleted the pr-41 branch on November 27, 2025 16:01
mahdikhashan pushed a commit to mahdikhashan/trainer that referenced this pull request Dec 29, 2025
Signed-off-by: Antonin Stefanutti <antonin@stefanutti.fr>