Add enqueue-action.md doc #2664
base: master
### 1. QueueOrderFn:
#### Priority:
Compares queuePriority set in Spec(using PriorityClass) and returns the decision of comparison between two priorities.
related to #2656
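For readers of this thread, the Priority comparison described in the hunk above can be pictured with a minimal sketch. The `queueInfo` type, the `comparePriority` name, and the -1/0/1 return convention are assumptions made for illustration; they are not taken from the Volcano source.

```go
package main

import "fmt"

// queueInfo is a hypothetical stand-in for the scheduler's queue object.
// Only the priority matters for this sketch.
type queueInfo struct {
	Name     string
	Priority int32
}

// comparePriority returns -1 when l should be ordered before r (l has the
// higher priority), 1 when l should come after r, and 0 when the two
// priorities are equal and the decision is left to another ordering rule.
func comparePriority(l, r queueInfo) int {
	switch {
	case l.Priority > r.Priority:
		return -1
	case l.Priority < r.Priority:
		return 1
	default:
		return 0
	}
}

func main() {
	high := queueInfo{Name: "prod", Priority: 100}
	low := queueInfo{Name: "batch", Priority: 10}
	fmt.Println(comparePriority(high, low))
	fmt.Println(comparePriority(low, high))
	fmt.Println(comparePriority(high, high))
}
```

Returning 0 on a tie keeps the sketch composable: another ordering function could break the tie without this one taking a position.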
docs/design/enqueue-action.md
Outdated
When the minimum number of resource requests under a job can not be met, even if the scheduling action is
performed for pod under a Job, pod will not be schedule because the "Gang" constraint is not reached.
state of job only allowed change from `Pending` to `Inqueue` if the minimum resource size of the job is met.
Suggested change:
- state of job only allowed change from `Pending` to `Inqueue` if the minimum resource size of the job is met.
+ State of job only allowed change from `Pending` to `Inqueue` if the minimum resource size of the job is met.
Done, thanks.
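The `Pending` to `Inqueue` rule in the sentence under review can be sketched as a small state check. This is an illustration only: the `resources` map, the `lessEqual` helper, and `nextPhase` are hypothetical names, and the real scheduler tracks resources with richer types.

```go
package main

import "fmt"

// resources maps a resource name (for example "cpu") to a quantity.
// Hypothetical simplification for this sketch.
type resources map[string]int64

// lessEqual reports whether every quantity in need fits within have.
// A resource missing from have counts as zero.
func lessEqual(need, have resources) bool {
	for name, qty := range need {
		if qty > have[name] {
			return false
		}
	}
	return true
}

type phase string

const (
	pending phase = "Pending"
	inqueue phase = "Inqueue"
)

// nextPhase moves a Pending job to Inqueue only when its minimum resource
// request fits into the idle resources; otherwise the phase is unchanged.
func nextPhase(current phase, minResources, idle resources) phase {
	if current == pending && lessEqual(minResources, idle) {
		return inqueue
	}
	return current
}

func main() {
	idle := resources{"cpu": 8, "memory": 16}
	fits := resources{"cpu": 4, "memory": 8}
	tooBig := resources{"cpu": 4, "memory": 32}
	fmt.Println(nextPhase(pending, fits, idle))
	fmt.Println(nextPhase(pending, tooBig, idle))
}
```

Keeping the job in `Pending` when the check fails means no pods are created for a job that could not satisfy its "Gang" constraint anyway.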
docs/design/enqueue-action.md
Outdated
performed for pod under a Job, pod will not be schedule because the "Gang" constraint is not reached.
state of job only allowed change from `Pending` to `Inqueue` if the minimum resource size of the job is met.

Enqueue action is the preparatory stage in the scheduling process. it can prevent a large number of
Suggested change:
- Enqueue action is the preparatory stage in the scheduling process. it can prevent a large number of
+ Enqueue action is the preparatory stage in the scheduling process. It can prevent a large number of
Done, thanks.
docs/design/enqueue-action.md
Outdated
Each resource type of the minResource of the current job lessEqual the resourceQuota of the namespace, the job status is allowed to change from `Pending` to `Inqueue`, otherwise it will be rejected.

#### SLA:
Job pending state waiting timeout, the job status is allowed to change from `Pending` to `Inqueue`, otherwise it will be rejected.
Actually, the `SLA` plugin would not REJECT a job if its `sla-waiting-time` has not been reached. It would just abstain and leave the decision to other plugins such as `overcommit`:
volcano/pkg/scheduler/framework/session_plugins.go
Lines 393 to 419 in ac21d08
```go
// JobEnqueueable invoke jobEnqueueableFns function of the plugins
func (ssn *Session) JobEnqueueable(obj interface{}) bool {
	var hasFound bool
	for _, tier := range ssn.Tiers {
		for _, plugin := range tier.Plugins {
			if !isEnabled(plugin.EnabledJobEnqueued) {
				continue
			}
			fn, found := ssn.jobEnqueueableFns[plugin.Name]
			if !found {
				continue
			}
			res := fn(obj)
			if res < 0 {
				return false
			}
			if res > 0 {
				hasFound = true
			}
		}
		// if plugin exists that votes permit, meanwhile other plugin votes abstention,
		// permit job to be enqueueable, do not check next tier
		if hasFound {
			return true
		}
	}
```
I think this description covers only the behavior of the SLA plugin. For the overall control of jobEnqueueable, I added a description of it as a whole.
Yes, I mean that the `SLA` plugin behaves differently from the other plugins: it would never reject a podgroup moving from `pending` to `inqueue`. So the description `otherwise it will be rejected` should be modified.
Yes, it will abstain, thanks.
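The permit / reject / abstain voting discussed in this thread can be illustrated with a self-contained sketch. It mirrors the loop in the quoted excerpt, but `job`, `voteFn`, `slaVote`, and `capacityVote` are hypothetical names, waiting time is reduced to whole minutes, and the final `return true` when every plugin abstains is an assumption, since the excerpt ends before that point.

```go
package main

import "fmt"

// Vote values, following the sign convention of the quoted excerpt:
// negative rejects, positive permits, zero abstains.
const (
	reject  = -1
	abstain = 0
	permit  = 1
)

// job is a hypothetical, simplified job description.
type job struct {
	Name          string
	WaitedMinutes int
	MinCPU        int64
}

type voteFn func(j job) int

// slaVote permits a job once it has waited long enough and abstains
// otherwise. It never rejects.
func slaVote(limitMinutes int) voteFn {
	return func(j job) int {
		if j.WaitedMinutes >= limitMinutes {
			return permit
		}
		return abstain
	}
}

// capacityVote is a hypothetical capacity-based plugin: it permits the job
// if its minimum CPU request fits into the idle CPU and rejects otherwise.
func capacityVote(idleCPU int64) voteFn {
	return func(j job) int {
		if j.MinCPU <= idleCPU {
			return permit
		}
		return reject
	}
}

// jobEnqueueable walks the tiers in order. Any reject ends the check with
// false; a tier containing at least one permit (and no reject) ends it
// with true, so later tiers are not consulted.
func jobEnqueueable(tiers [][]voteFn, j job) bool {
	for _, tier := range tiers {
		permitted := false
		for _, vote := range tier {
			res := vote(j)
			if res < 0 {
				return false
			}
			if res > 0 {
				permitted = true
			}
		}
		if permitted {
			return true
		}
	}
	// Assumption: a job on which every plugin abstains is admitted.
	return true
}

func main() {
	tiers := [][]voteFn{{slaVote(10)}, {capacityVote(4)}}
	fmt.Println(jobEnqueueable(tiers, job{Name: "waited-long", WaitedMinutes: 20, MinCPU: 8}))
	fmt.Println(jobEnqueueable(tiers, job{Name: "too-big", WaitedMinutes: 1, MinCPU: 8}))
	fmt.Println(jobEnqueueable(tiers, job{Name: "fits", WaitedMinutes: 1, MinCPU: 2}))
}
```

With the SLA vote in an earlier tier, a job that has waited past the limit is admitted without the capacity vote being asked. A job that has not waited long enough is simply passed on to the next tier, which is the abstain behavior described above.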
docs/design/enqueue-action.md
Outdated
1. QueueOrderFn(Plugin: Priority, DRF, Proportion),
2. JobOrderFn(Plugin: Priority, DRF, Gang),
3. JobEnqueueable(Plugin: OverCommit, Proportion, ResourceQuota, SLA),
- JobEnqueued(Plugin: OverCommit)
This interface is used to update resource information after a job turns `inqueue`. The `overcommit` plugin needs it to update the `inqueue` resources in each queue and in the whole cluster.
Done, thanks.
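The bookkeeping role of JobEnqueued described in this thread can be sketched as a callback that adds a job's minimum request to per-queue and cluster-wide totals. The `inqueueTracker` type and its methods are hypothetical and only illustrate the idea; they are not the `overcommit` plugin's actual data structures.

```go
package main

import "fmt"

// resources maps a resource name to a quantity (hypothetical simplification).
type resources map[string]int64

// inqueueTracker keeps the resources claimed by jobs that have turned
// inqueue, both per queue and for the whole cluster.
type inqueueTracker struct {
	perQueue map[string]resources
	cluster  resources
}

func newInqueueTracker() *inqueueTracker {
	return &inqueueTracker{
		perQueue: map[string]resources{},
		cluster:  resources{},
	}
}

// jobEnqueued is meant to be called once a job has been moved to inqueue.
// It adds the job's minimum request to the queue total and the cluster total.
func (t *inqueueTracker) jobEnqueued(queue string, minRequest resources) {
	q, ok := t.perQueue[queue]
	if !ok {
		q = resources{}
		t.perQueue[queue] = q
	}
	for name, qty := range minRequest {
		q[name] += qty
		t.cluster[name] += qty
	}
}

func main() {
	tracker := newInqueueTracker()
	tracker.jobEnqueued("team-a", resources{"cpu": 4})
	tracker.jobEnqueued("team-b", resources{"cpu": 2})
	tracker.jobEnqueued("team-a", resources{"cpu": 1})
	fmt.Println(tracker.perQueue["team-a"]["cpu"], tracker.cluster["cpu"])
}
```

Separating this update from the enqueueable vote means a later vote sees totals that already include jobs admitted earlier in the same cycle.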
dea085f to 75c5c38
/lgtm
Signed-off-by: Binbin Zou <binbin.zou@kyligence.io> update doc Signed-off-by: Binbin Zou <binbin.zou@kyligence.io> update doc Signed-off-by: Binbin Zou <binbin.zou@kyligence.io>
2562431 to 6195ce9
Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward? This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
/lgtm
Please close and reopen this PR to retrigger the CI
Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward? This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
/reopen
@hwdef: Reopened this PR.
[APPROVALNOTIFIER] This PR is NOT APPROVED
/reopen
@lowang-bh: Reopened this PR.
Is this still relevant? If so, what is blocking it? Is there anything you can do to help move it forward? This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
/remove lifecycle-stale
The enqueue action has no design doc yet, so this PR adds one.
Signed-off-by: Binbin Zou binbin.zou@kyligence.io