Implemented batch processing for check capacity provisioning class #7283
Conversation
Hi @Duke0404. Thanks for your PR. I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test. Once the patch is verified, the new status will be reflected by the ok-to-test label. I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
Force-pushed from 5dddce4 to bca8b93
Force-pushed from bca8b93 to bbc758d
/ok-to-test
Force-pushed from bbc758d to 11703e9
Please fix the tests.
Force-pushed from 11703e9 to 28f4c57
I think the tests were failing due to an issue with GitHub Actions; they are passing now.
FYI @MaciekPytel @mwielgus: this is the change we've been discussing over the past couple of weeks.
I don't like the way that checkcapacity is implemented, and I'm not super happy about doubling down on it. Checking capacity is not really similar to scale-up; it is conceptually pretty much the same as fitting existing pods on upcoming nodes in FilterOutSchedulable. Both cases involve just binpacking on existing nodes and don't require using Estimator, Expander, etc. to make scale-up decisions, or any of the logic related to actuating such decisions. The PodListProcessor interface was designed for exactly this use case, while the scale-up orchestrator is intended for more complex operations that you don't actually need here.
The architectural / maintenance downside is inconsistency with the rest of the codebase and the related maintenance problems (anyone debugging CA must be aware that provreq works differently from other, similar extensions to CA; our steep learning curve is likely the sum total of small gotchas like that). I'm not going to block this PR, but I'd really like to look into aligning the provisioning request implementation with the expected CA architecture, and migrating the logic to PLP would be an obvious first step.
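For readers less familiar with the extension point referenced here, below is a rough, self-contained sketch of what a PodListProcessor-based check-capacity pass could look like. The types and names are simplified stand-ins for illustration, not the actual cluster-autoscaler interfaces.

package main

import "fmt"

// Pod and AutoscalingContext are simplified stand-ins for the real
// cluster-autoscaler types (assumption, for illustration only).
type Pod struct{ Name string }
type AutoscalingContext struct{}

// PodListProcessor approximates the extension point discussed above: it sees
// the unschedulable pod list once per loop and may filter it.
type PodListProcessor interface {
	Process(ctx *AutoscalingContext, unschedulablePods []*Pod) ([]*Pod, error)
	CleanUp()
}

// checkCapacityProcessor sketches how check-capacity could be expressed as a
// PodListProcessor: binpack the pods backing a ProvisioningRequest onto the
// current snapshot and drop the ones that fit, without touching the scale-up
// orchestrator, Estimator, or Expander.
type checkCapacityProcessor struct {
	fitsOnExistingNodes func(p *Pod) bool // placeholder for a snapshot binpacking check
}

func (c *checkCapacityProcessor) Process(_ *AutoscalingContext, pods []*Pod) ([]*Pod, error) {
	remaining := make([]*Pod, 0, len(pods))
	for _, p := range pods {
		if c.fitsOnExistingNodes(p) {
			// Capacity found; here the ProvisioningRequest would be marked Provisioned.
			continue
		}
		remaining = append(remaining, p)
	}
	return remaining, nil
}

func (c *checkCapacityProcessor) CleanUp() {}

func main() {
	proc := &checkCapacityProcessor{fitsOnExistingNodes: func(p *Pod) bool { return p.Name != "too-big" }}
	left, _ := proc.Process(&AutoscalingContext{}, []*Pod{{Name: "small"}, {Name: "too-big"}})
	fmt.Println(len(left)) // 1: only the pod that did not fit stays unschedulable
}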
I agree with the general point about CA architecture. I would go even further and say we should probably explicitly group "capacity booking" (CRs, PRs, and a possible future overprovisioning API) to make the implementation more consistent. While it will be slightly different for each case (e.g. all-or-nothing for PRs), they're logically very similar, and if there's ever a substantial change to how capacity needs to be booked for CA to recognize it, it will have to be re-implemented everywhere.
I think this argument essentially boils down to "this could be optimized further". With improvements to the frequent-loops logic (meaning we'll skip the scan interval), this change will still significantly improve performance compared to the current state, even in large clusters. Yes, it's possible to improve it even more by skipping the cluster state refresh, but that isn't necessarily an argument against doing a partial optimization now.
I expect batch processing to remain an experimental feature for 1.31 (meaning it's turned off by default). I agree we may want to solve this before it becomes enabled by default.
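For orientation, here is a minimal sketch of the batching shape under discussion, with hypothetical names mirroring the maxBatchSize and batchTimebox flags below; the real Provision path is considerably more involved.

package main

import (
	"fmt"
	"time"
)

// processBatch sketches the batching idea: handle up to maxBatchSize
// check-capacity requests in one autoscaling iteration, stopping early once
// the timebox is exhausted (checked only after the first request, so at least
// one request is always handled). Names are illustrative, not the real API.
func processBatch(requests []string, maxBatchSize int, batchTimebox time.Duration, handle func(string)) int {
	start := time.Now()
	processed := 0
	for i, req := range requests {
		if i > 0 && (processed >= maxBatchSize || time.Since(start) > batchTimebox) {
			break // leave the rest for the next autoscaling loop
		}
		handle(req)
		processed++
	}
	return processed
}

func main() {
	reqs := []string{"provreq-1", "provreq-2", "provreq-3"}
	n := processBatch(reqs, 2, 10*time.Second, func(string) {})
	fmt.Println(n) // 2: capped by maxBatchSize, the third request waits for the next loop
}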
cluster-autoscaler/main.go (outdated)
proactiveScaleupEnabled = flag.Bool("enable-proactive-scaleup", false, "Whether to enable/disable proactive scale-ups, defaults to false")
podInjectionLimit = flag.Int("pod-injection-limit", 5000, "Limits total number of pods while injecting fake pods. If unschedulable pods already exceeds the limit, pod injection is disabled but pods are not truncated.")
checkCapacityBatchProcessing = flag.Bool("check-capacity-batch-processing", false, "Whether to enable batch processing for check capacity requests.")
maxBatchSize = flag.Int("max-batch-size", 10, "Maximum number of provisioning requests to process in a single batch.")
I think the initial value for maxBatchSize should be 1, which means no batching; with that we don't need the checkCapacityBatchProcessing flag.
If we remove the checkCapacityBatchProcessing flag, then we should rename maxBatchSize and batchTimebox to have checkCapacityProvisioningRequest in the name.
TBH I'd rather have a single top-level flag controlling the feature. It also allows us to ship reasonable, tested defaults for max batch size and processing time.
I agree with renaming the remaining flags to be clearly associated with check-capacity.
I agree that it would be a much better user experience to just enable the feature with predefined parameters, but I'm not sure we have enough data to recommend a max batch size.
However, I think we can drop the max processing time flag.
I think having maxBatchSize default to a value other than 1 makes sense if we keep a top-level flag, for clarity.
If we keep the checkCapacityBatchProcessing flag, I'd leave the maxBatchSize default at 10. Or we can remove checkCapacityBatchProcessing and have maxBatchSize default to 1.
Force-pushed from 28f4c57 to a6f4fc2
Force-pushed from 1ff5348 to 6976fe9
context                      *context.AutoscalingContext
client                       *provreqclient.ProvisioningRequestClient
schedulingSimulator          *scheduling.HintingSimulator
checkCapacityBatchProcessing bool
Having a checkCapacityBatchProcessing bool flag makes sense so that we can have a default max batch size other than 1, but I would remove the bool field here and instead set maxBatchSize to 1 when the checkCapacityBatchProcessing flag is false.
Updated.
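A tiny sketch of that suggestion, with hypothetical names: keep the top-level bool flag, but resolve it into an effective batch size up front so the provisioning class only carries one value.

package main

import "fmt"

// effectiveMaxBatchSize sketches the suggestion above: resolve the top-level
// bool flag into a single batch-size value once, so the provisioning class
// does not need to carry the bool itself (1 means "no batching").
func effectiveMaxBatchSize(batchProcessingEnabled bool, maxBatchSize int) int {
	if !batchProcessingEnabled || maxBatchSize < 1 {
		return 1
	}
	return maxBatchSize
}

func main() {
	fmt.Println(effectiveMaxBatchSize(false, 10)) // 1: feature off, behaves like today
	fmt.Println(effectiveMaxBatchSize(true, 10))  // 10: feature on, tested default applies
}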
if scaleUpIsSuccessful {
	combinedStatus.Add(&status.ScaleUpStatus{Result: status.ScaleUpSuccessful})
} else {
	combinedStatus.Add(&status.ScaleUpStatus{Result: status.ScaleUpNoOptionsAvailable})
}
I suppose an else statement is missing; otherwise, in case err != nil, we append to combinedStatus twice.
Fixed.
Could you also add a test case that will catch this?
Even without the else it would not cause any problematic behavior, so I do not think we can test for this either: combinedStatusSet does not store the number of statuses that were recorded, and ScaleUpNoOptionsAvailable is overwritten by ScaleUpSuccessful in the result.
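For clarity, the branching the thread converged on looks roughly like the sketch below (simplified local types; the real statuses come from the status package).

package main

import (
	"errors"
	"fmt"
)

type scaleUpResult string

const (
	scaleUpError              scaleUpResult = "Error"
	scaleUpSuccessful         scaleUpResult = "Successful"
	scaleUpNoOptionsAvailable scaleUpResult = "NoOptionsAvailable"
)

// recordResult sketches the fixed branching: exactly one status is added per
// request, so an error no longer also records a NoOptionsAvailable entry.
func recordResult(err error, scaleUpIsSuccessful bool, add func(scaleUpResult)) {
	if err != nil {
		add(scaleUpError)
	} else if scaleUpIsSuccessful {
		add(scaleUpSuccessful)
	} else {
		add(scaleUpNoOptionsAvailable)
	}
}

func main() {
	var recorded []scaleUpResult
	recordResult(errors.New("provisioning failed"), false, func(r scaleUpResult) { recorded = append(recorded, r) })
	fmt.Println(recorded) // [Error]: a single entry, not two
}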
if len(st) < len(unschedulablePods) || err != nil {
	conditions.AddOrUpdateCondition(provReq, v1.Provisioned, metav1.ConditionFalse, conditions.CapacityIsNotFoundReason, "Capacity is not found, CA will try to find it later.", metav1.Now())
	capacityAvailable = false
} else {
	commitErr := o.context.ClusterSnapshot.Commit()
	if commitErr != nil {
		commit = true
Shouldn't we revert the snapshot in case of an error during Commit? By the way, we can use the helper function WithForkedSnapshot for this case.
Fixed and updated.
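A sketch of the revert-on-error pattern being suggested, using a toy snapshot type; the real ClusterSnapshot and the WithForkedSnapshot helper live in the cluster-autoscaler codebase and may differ in detail.

package main

import (
	"errors"
	"fmt"
)

// snapshot stands in for the cluster snapshot: Fork starts a scratch layer,
// Commit merges it into the base, Revert discards it.
type snapshot interface {
	Fork()
	Commit() error
	Revert()
}

// withForkedSnapshot mimics the shape of the suggested helper: run fn on a
// forked snapshot, commit only if fn asks for it, and revert both when fn
// declines and when the commit itself fails (the point raised above).
func withForkedSnapshot(s snapshot, fn func() (commit bool, err error)) error {
	s.Fork()
	commit, err := fn()
	if err != nil || !commit {
		s.Revert()
		return err
	}
	if commitErr := s.Commit(); commitErr != nil {
		s.Revert()
		return commitErr
	}
	return nil
}

// fakeSnapshot lets the sketch run standalone; its Commit always fails.
type fakeSnapshot struct{ reverts int }

func (f *fakeSnapshot) Fork()         {}
func (f *fakeSnapshot) Commit() error { return errors.New("commit failed") }
func (f *fakeSnapshot) Revert()       { f.reverts++ }

func main() {
	s := &fakeSnapshot{}
	err := withForkedSnapshot(s, func() (bool, error) { return true, nil })
	fmt.Println(err != nil, s.reverts) // true 1: a failed commit triggers a revert
}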
Force-pushed from 6976fe9 to 93d5c1e
Force-pushed from 93d5c1e to 68d3224
}

// Export converts the combinedStatusSet into a ScaleUpStatus.
func (c *combinedStatusSet) Export() (*status.ScaleUpStatus, errors.AutoscalerError) {
Let's add a few tests for combinedStatusSet.
Added.
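A sketch of what such tests could look like, table-driven in the usual Go style; it uses simplified local types rather than the real combinedStatusSet and ScaleUpStatus.

package main

import "testing"

type scaleUpResult int

const (
	scaleUpNoOptionsAvailable scaleUpResult = iota
	scaleUpSuccessful
)

// statusSet is a toy stand-in for combinedStatusSet: it only remembers the
// strongest result seen so far, which is why the order of Add calls must not
// affect what the exported result looks like.
type statusSet struct{ best scaleUpResult }

func (s *statusSet) Add(r scaleUpResult) {
	if r > s.best {
		s.best = r
	}
}

func TestStatusSet(t *testing.T) {
	testCases := []struct {
		name string
		add  []scaleUpResult
		want scaleUpResult
	}{
		{name: "only no options", add: []scaleUpResult{scaleUpNoOptionsAvailable}, want: scaleUpNoOptionsAvailable},
		{name: "success wins regardless of order", add: []scaleUpResult{scaleUpNoOptionsAvailable, scaleUpSuccessful, scaleUpNoOptionsAvailable}, want: scaleUpSuccessful},
	}
	for _, tc := range testCases {
		tc := tc
		t.Run(tc.name, func(t *testing.T) {
			s := &statusSet{}
			for _, r := range tc.add {
				s.Add(r)
			}
			if s.best != tc.want {
				t.Errorf("got %v, want %v", s.best, tc.want)
			}
		})
	}
}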
@Duke0404 could you address the comments in a separate commit, please? It will make the review process easier and faster.
scaleUpResult:   status.ScaleUpSuccessful,
batchProcessing: true,
maxBatchSize:    5,
batchTimebox:    1 * time.Nanosecond,
Let's make it 0 to avoid flakiness in case of fast execution (which is very unlikely, but possible in theory).
Updated.
}
for _, tc := range testCases {
	tc := tc
	allNodes := allNodes
You need it, because we modify the cluster snapshot in each test.
My bad, this should be removed. We need to have a deep copy of nodes in each case.
Ack.
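A sketch of the per-case deep copy agreed on here, with a toy node type standing in for *apiv1.Node and the snapshot setup.

package main

import "fmt"

// node is a toy stand-in for *apiv1.Node.
type node struct{ Name string }

// deepCopyNodes gives each test case its own node objects, so mutations made
// while building one case's cluster snapshot cannot leak into the next case.
func deepCopyNodes(nodes []*node) []*node {
	out := make([]*node, 0, len(nodes))
	for _, n := range nodes {
		c := *n // copy the struct, not just the pointer
		out = append(out, &c)
	}
	return out
}

func main() {
	shared := []*node{{Name: "n1"}}
	for _, tcName := range []string{"case-a", "case-b"} {
		nodes := deepCopyNodes(shared)
		nodes[0].Name = tcName // safe: the shared fixture is untouched
	}
	fmt.Println(shared[0].Name) // n1
}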
Force-pushed from 2a6d391 to 4b19bbc
Force-pushed from 0f6aa3a to 2e1135e
Force-pushed from 2e1135e to d73bdb1
/lgtm
/lgtm
/approve
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: Duke0404, mwielgus. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing /approve in a comment.
What type of PR is this?
/kind feature
What this PR does / why we need it:
Implements batch processing so that users can configure CA to process multiple CheckCapacity ProvisioningRequests in a single autoscaling iteration.
Which issue(s) this PR fixes:
Fixes #
Special notes for your reviewer:
Does this PR introduce a user-facing change?
Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:
cc: @yaroslava-serdiuk @aleksandra-malinowska @kawych