Support more actions for volcano job failure scenario #3813

Open: wants to merge 1 commit into base: master

bibibox
Contributor

@bibibox bibibox commented Nov 11, 2024

Implements issue #3812.

Fix the failed CI pipeline after the related PR in the apis repo (volcano-sh/apis#140) is merged.

@volcano-sh-bot volcano-sh-bot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Nov 11, 2024
@bibibox bibibox force-pushed the add_job_policy_timeout branch from 8206c69 to dfb4a4f Compare November 11, 2024 07:55
@bibibox bibibox changed the title support more actions for volcano job failure scenario [WIP] support more actions for volcano job failure scenario Nov 11, 2024
@volcano-sh-bot volcano-sh-bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Nov 11, 2024
@bibibox bibibox force-pushed the add_job_policy_timeout branch from dfb4a4f to fc3fa27 Compare November 11, 2024 09:20
@william-wang william-wang added this to the v2.0 milestone Nov 12, 2024
for podName, delayAct := range taskMap {
	shouldCancel := false

	if podName == req.PodName {

Why iterate with `for key := range taskMap` and check `if key == xxx`, rather than looking the entry up directly with `val, exists := taskMap[key]`?
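To illustrate the suggestion, here is a minimal sketch that contrasts the two patterns. The `delayAction` type and both function names are hypothetical stand-ins, not the types used in this PR; only the map-access pattern is the point.

```go
package main

// delayAction is a hypothetical stand-in for the controller's per-pod entry.
type delayAction struct {
	cancelled bool
}

// cancelByScan mirrors the reviewed pattern: walk every entry and compare keys.
func cancelByScan(taskMap map[string]*delayAction, podName string) bool {
	for name, act := range taskMap {
		if name == podName {
			act.cancelled = true
			return true
		}
	}
	return false
}

// cancelByLookup does the same work with a single comma-ok map lookup.
func cancelByLookup(taskMap map[string]*delayAction, podName string) bool {
	act, exists := taskMap[podName]
	if !exists {
		return false
	}
	act.cancelled = true
	return true
}
```

Both functions behave the same for a single key, but the lookup avoids visiting unrelated entries and makes the intent explicit.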

Contributor Author

updated

}
}
}
cc.delayActionMapLock.Unlock()

Refactor this into a function and release the lock with `defer`.
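A rough sketch of what that refactor could look like. The `delayActionStore` type, its fields, and `cancelPodAction` are assumptions made for illustration; the PR's actual controller fields may differ.

```go
package main

import "sync"

// delayActionStore is a hypothetical container for pending delayed actions.
type delayActionStore struct {
	mu      sync.Mutex
	actions map[string]map[string]func() // jobKey -> podName -> cancel callback
}

// cancelPodAction cancels and removes the pending action for one pod.
// The deferred Unlock runs on every return path, including the early ones.
func (s *delayActionStore) cancelPodAction(jobKey, podName string) bool {
	s.mu.Lock()
	defer s.mu.Unlock()

	taskMap, ok := s.actions[jobKey]
	if !ok {
		return false
	}
	cancel, ok := taskMap[podName]
	if !ok {
		return false
	}
	cancel()
	delete(taskMap, podName)
	return true
}
```

Keeping the critical section inside one small function means no return path can forget to release the lock.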

Contributor Author

updated

req.Namespace, req.JobName, err)
}

func (cc *jobcontroller) cleanupDelayActions(jobKey string, excludePod string) {

The parameter name `excludePod` can be misunderstood. Read literally, it suggests "clean up actions, excluding this pod," but in reality it merely skips the execution of that pod's CancelFunc.
Additionally, a CancelFunc can safely be called multiple times, so the check is unnecessary.
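As a sketch of the simplification, assuming the stored cancel functions are `context.CancelFunc` values: per the `context` package documentation, calls to a CancelFunc after the first do nothing, so the cleanup can cancel every entry without an exclusion parameter. The `actionMap` type and `newPendingAction` helper are hypothetical, and this `cleanupDelayActions` is a free function with an illustrative signature rather than the method in the PR.

```go
package main

import "context"

// actionMap is a hypothetical layout: jobKey -> podName -> cancel function.
type actionMap map[string]map[string]context.CancelFunc

// newPendingAction returns a context standing in for a delayed action,
// together with the function that cancels it.
func newPendingAction() (context.Context, context.CancelFunc) {
	return context.WithCancel(context.Background())
}

// cleanupDelayActions cancels every pending action for a job and drops the
// job's entry. Entries that were already cancelled are simply cancelled
// again, which is a no-op. It returns the number of entries visited.
func cleanupDelayActions(actions actionMap, jobKey string) int {
	n := 0
	for _, cancel := range actions[jobKey] {
		cancel()
		n++
	}
	delete(actions, jobKey)
	return n
}
```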

Contributor Author

updated

@bibibox bibibox force-pushed the add_job_policy_timeout branch 2 times, most recently from b45b6c4 to a09a6bb Compare December 5, 2024 03:40
@volcano-sh-bot volcano-sh-bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 19, 2024
@bibibox bibibox force-pushed the add_job_policy_timeout branch from a09a6bb to fd4f3b0 Compare December 26, 2024 13:06
@volcano-sh-bot volcano-sh-bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 26, 2024
@bibibox bibibox force-pushed the add_job_policy_timeout branch 2 times, most recently from ca96c2b to 60aa5f1 Compare December 27, 2024 09:09
@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign thor-wl
You can assign the PR to them by writing /assign @thor-wl in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@bibibox bibibox force-pushed the add_job_policy_timeout branch 4 times, most recently from 5baff31 to 573b09a Compare December 27, 2024 14:27
Signed-off-by: Box Zhang <wszwbsddbk@gmail.com>
@bibibox bibibox force-pushed the add_job_policy_timeout branch from 573b09a to af5dbf7 Compare December 27, 2024 15:39
@bibibox bibibox changed the title [WIP] support more actions for volcano job failure scenario Support more actions for volcano job failure scenario Dec 28, 2024
@volcano-sh-bot volcano-sh-bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Dec 28, 2024
Labels
size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
4 participants