Add enqueue-action.md doc #2664

Open: wants to merge 1 commit into base: master

zbbkeepgoing
Contributor

The enqueue action is not documented in the design docs, so this PR adds it.

Signed-off-by: Binbin Zou <binbin.zou@kyligence.io>

@volcano-sh-bot volcano-sh-bot added the size/M Denotes a PR that changes 30-99 lines, ignoring generated files. label Feb 1, 2023

### 1. QueueOrderFn:
#### Priority:
Compares the queuePriority set in the Spec (using PriorityClass) of two queues and returns the result of the comparison.
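As an illustration of the comparison described above, here is a minimal Go sketch. The `Queue` type, the `comparePriority` name, and the convention that a higher priority sorts first (returning -1) are assumptions made for this example; they are not taken from the Volcano source.

```go
package main

import "fmt"

// Queue is a hypothetical stand-in for a scheduler queue; only the
// priority field matters for this sketch.
type Queue struct {
	Name     string
	Priority int32
}

// comparePriority returns -1 if l should be ordered before r, 1 if r
// should be ordered before l, and 0 when the priorities are equal, in
// which case another plugin may break the tie.
// Assumption: a higher priority value means "schedule first".
func comparePriority(l, r Queue) int {
	switch {
	case l.Priority > r.Priority:
		return -1
	case l.Priority < r.Priority:
		return 1
	default:
		return 0
	}
}

func main() {
	high := Queue{Name: "prod", Priority: 100}
	low := Queue{Name: "batch", Priority: 10}
	// Prints: -1 1 0
	fmt.Println(comparePriority(high, low), comparePriority(low, high), comparePriority(low, low))
}
```

Returning 0 on a tie keeps the function composable: the caller can fall through to the next ordering function instead of forcing a decision here.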
Contributor Author

related to #2656


When the minimum resource request of a job cannot be met, pods under the Job will not be scheduled even if the
scheduling action is performed for them, because the "Gang" constraint is not satisfied.
state of job only allowed change from `Pending` to `Inqueue` if the minimum resource size of the job is met.
Member

Suggested change
- state of job only allowed change from `Pending` to `Inqueue` if the minimum resource size of the job is met.
+ State of job only allowed change from `Pending` to `Inqueue` if the minimum resource size of the job is met.

Contributor Author

done, thks.

Enqueue action is the preparatory stage in the scheduling process. it can prevent a large number of
Member

Suggested change
- Enqueue action is the preparatory stage in the scheduling process. it can prevent a large number of
+ Enqueue action is the preparatory stage in the scheduling process. It can prevent a large number of

Contributor Author

done, thks.

If every resource type in the minResource of the current job is less than or equal to the resourceQuota of the namespace, the job status is allowed to change from `Pending` to `Inqueue`; otherwise it will be rejected.
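The per-resource comparison described above can be sketched roughly as follows. This is a simplified illustration, not the plugin's code: the `Resources` map, the `quotaVote` function, and the `Permit`/`Reject` constants are hypothetical names, and the sketch assumes that a resource type not listed in the quota is unconstrained.

```go
package main

import "fmt"

// Hypothetical vote values for this sketch.
const (
	Reject = -1
	Permit = 1
)

// Resources maps a resource name (for example "cpu") to a quantity.
type Resources map[string]int64

// quotaVote permits the job only if every resource type in minResource
// is less than or equal to the matching entry in quota.
// Assumption: resource types absent from quota are not limited.
func quotaVote(minResource, quota Resources) int {
	for name, want := range minResource {
		limit, limited := quota[name]
		if limited && want > limit {
			return Reject
		}
	}
	return Permit
}

func main() {
	quota := Resources{"cpu": 8, "memory": 16}
	fmt.Println(quotaVote(Resources{"cpu": 4, "memory": 16}, quota)) // fits the quota
	fmt.Println(quotaVote(Resources{"cpu": 9}, quota))               // cpu exceeds the quota
}
```

A single resource type over its limit is enough to reject, which matches the "each resource type" wording: the check is a conjunction over all requested types.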

#### SLA:
If the job's waiting time in the pending state has timed out, the job status is allowed to change from `Pending` to `Inqueue`, otherwise it will be rejected.
Contributor

Actually, the SLA plugin does not REJECT a job if its sla-waiting-time has not been reached; it just abstains and leaves the decision to other plugins such as overcommit:

// JobEnqueueable invoke jobEnqueueableFns function of the plugins
func (ssn *Session) JobEnqueueable(obj interface{}) bool {
	var hasFound bool
	for _, tier := range ssn.Tiers {
		for _, plugin := range tier.Plugins {
			if !isEnabled(plugin.EnabledJobEnqueued) {
				continue
			}
			fn, found := ssn.jobEnqueueableFns[plugin.Name]
			if !found {
				continue
			}
			// res < 0 is a reject vote, res > 0 a permit vote, res == 0 an abstention
			res := fn(obj)
			if res < 0 {
				return false
			}
			if res > 0 {
				hasFound = true
			}
		}
		// if plugin exists that votes permit, meanwhile other plugin votes abstention,
		// permit job to be enqueueable, do not check next tier
		if hasFound {
			return true
		}
	}

	return true
}
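To make the voting rule in the excerpt above easier to try out in isolation, here is a self-contained sketch. It is a simplification under stated assumptions: each tier is reduced to a plain slice of votes, and the names `tier`, `jobEnqueueable`, `Reject`, `Abstain`, and `Permit` are hypothetical and chosen only for this example.

```go
package main

import "fmt"

// Hypothetical vote values: negative rejects, zero abstains, positive permits.
const (
	Reject  = -1
	Abstain = 0
	Permit  = 1
)

// tier is a simplification: the votes already returned by the plugins of one tier.
type tier []int

// jobEnqueueable mirrors the rule from the excerpt: any reject vote fails
// immediately; a tier with at least one permit vote and no reject vote
// succeeds without consulting later tiers; if every plugin abstains,
// the job is allowed.
func jobEnqueueable(tiers []tier) bool {
	for _, votes := range tiers {
		permitted := false
		for _, v := range votes {
			if v < 0 {
				return false
			}
			if v > 0 {
				permitted = true
			}
		}
		if permitted {
			return true
		}
	}
	return true
}

func main() {
	// One plugin abstains (for example SLA before its waiting time), another permits.
	fmt.Println(jobEnqueueable([]tier{{Abstain, Permit}}))
	// One plugin abstains, another rejects.
	fmt.Println(jobEnqueueable([]tier{{Abstain, Reject}}))
	// The first tier permits, so the reject vote in the second tier is never consulted.
	fmt.Println(jobEnqueueable([]tier{{Permit}, {Reject}}))
}
```

The third case shows why tier order matters: a permit in an earlier tier short-circuits the decision before a later tier can reject.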

Contributor Author

@zbbkeepgoing zbbkeepgoing Feb 3, 2023

I think this description covers only the behavior of the SLA plugin. For the overall control flow of jobEnqueueable, I added a separate description.

Contributor

Yes, I mean that the SLA plugin behaves differently from the other plugins: it never rejects a podgroup's transition from pending to inqueue. So the description "otherwise it will be rejected" should be modified.

Contributor Author

Yes, it will abstain, thanks.


1. QueueOrderFn(Plugin: Priority, DRF, Proportion),
2. JobOrderFn(Plugin: Priority, DRF, Gang),
3. JobEnqueueable(Plugin: OverCommit, Proportion, ResourceQuota, SLA),
Contributor

  1. JobEnqueued(Plugin: OverCommit)

This interface is used to update resource information after a job becomes inqueue; the overcommit plugin needs it to update the inqueue resources of each queue and of the whole cluster.
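A rough sketch of the bookkeeping described above, under stated assumptions: the `inqueueTracker` type, its fields, and the `jobEnqueued` method are hypothetical names for this example, and resources are modeled as a plain map rather than the scheduler's own resource type.

```go
package main

import "fmt"

// Resources maps a resource name (for example "cpu") to a quantity.
type Resources map[string]int64

// add accumulates other into r in place.
func (r Resources) add(other Resources) {
	for name, qty := range other {
		r[name] += qty
	}
}

// inqueueTracker is a hypothetical holder for the resources of jobs that
// have already turned inqueue, per queue and for the whole cluster.
type inqueueTracker struct {
	byQueue map[string]Resources
	total   Resources
}

func newInqueueTracker() *inqueueTracker {
	return &inqueueTracker{byQueue: map[string]Resources{}, total: Resources{}}
}

// jobEnqueued records the minimum resources of a job that just became
// inqueue, updating both the queue-level and cluster-level totals.
func (t *inqueueTracker) jobEnqueued(queue string, minResources Resources) {
	if t.byQueue[queue] == nil {
		t.byQueue[queue] = Resources{}
	}
	t.byQueue[queue].add(minResources)
	t.total.add(minResources)
}

func main() {
	tracker := newInqueueTracker()
	tracker.jobEnqueued("q1", Resources{"cpu": 4})
	tracker.jobEnqueued("q1", Resources{"cpu": 2, "memory": 8})
	tracker.jobEnqueued("q2", Resources{"cpu": 1})
	// Prints: 6 1 7 8
	fmt.Println(tracker.byQueue["q1"]["cpu"], tracker.byQueue["q2"]["cpu"], tracker.total["cpu"], tracker.total["memory"])
}
```

Keeping the totals up to date at enqueue time means a later enqueue check in the same cycle sees the resources already promised to earlier jobs.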

Contributor Author

done, thks.

@zbbkeepgoing zbbkeepgoing force-pushed the enqueue_action_doc branch 2 times, most recently from dea085f to 75c5c38 Compare February 3, 2023 09:41
Member

@hwdef hwdef left a comment

/lgtm

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Mar 8, 2023
Signed-off-by: Binbin Zou <binbin.zou@kyligence.io>

update doc

Signed-off-by: Binbin Zou <binbin.zou@kyligence.io>

update doc

Signed-off-by: Binbin Zou <binbin.zou@kyligence.io>
@volcano-sh-bot volcano-sh-bot removed the lgtm Indicates that a PR is ready to be merged. label Mar 28, 2023

stale bot commented Jun 10, 2023

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 10, 2023
Member

@hwdef hwdef left a comment

/lgtm

@volcano-sh-bot volcano-sh-bot added the lgtm Indicates that a PR is ready to be merged. label Jun 11, 2023
@stale stale bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 11, 2023
Member

hwdef commented Jun 11, 2023

Please close and reopen this PR to retrigger the CI


stale bot commented Aug 12, 2023

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Aug 12, 2023
@stale stale bot closed this Sep 17, 2023
Member

hwdef commented Sep 17, 2023

/reopen

@volcano-sh-bot
Contributor

@hwdef: Reopened this PR.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@volcano-sh-bot
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign k82cn
You can assign the PR to them by writing /assign @k82cn in a comment when ready.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@stale stale bot closed this Oct 15, 2023
@lowang-bh
Member

/reopen

@volcano-sh-bot
Contributor

@lowang-bh: Reopened this PR.

In response to this:

/reopen


@stale stale bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 22, 2023

stale bot commented Mar 17, 2024

Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward?

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

@stale stale bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 17, 2024
@lowang-bh
Member

/remove lifecycle-stale

@stale stale bot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label May 1, 2024
Labels
lgtm Indicates that a PR is ready to be merged. retest-not-required-docs-only size/M Denotes a PR that changes 30-99 lines, ignoring generated files.
5 participants