[WIP]: Add Kruize Autoscaler #1801
Draft: dinogun wants to merge 5 commits into openshift:master from dinogun:add_kruize_autoscalar
---
title: Autoscaling with Kruize
authors:
  - "@dinogun"
reviewers:
  - "@mrunalp"
approvers:
  - "@mrunalp"
creation-date: 2025-05-19
last-updated: 2025-05-19
status: provisional
see-also:
replaces:
superseded-by:
---

# Autoscaling with Kruize

## Release Signoff Checklist

- [ ] Enhancement is `implementable`
- [ ] Design details are appropriately documented from clear requirements
- [ ] Test plan is defined
- [ ] Graduation criteria for dev preview, tech preview, GA
- [ ] User-facing documentation is created in [openshift-docs](https://github.com/openshift/openshift-docs/)
**Summary**

This enhancement proposes integrating Kruize as a new autoscaling and recommendation engine in OpenShift. Kruize will assist users not only with container right-sizing and autoscaling, but also with namespace quota recommendations and with GPU right-sizing and autoscaling through integration with the Instaslice project. Additionally, it will offer a bulk API for managing multiple workloads simultaneously, box plots for improved visualization of recommendations, and notifications for various resource usage events.
**Motivation**

Vertical autoscaling is currently handled by the VPA project. However, VPA has several drawbacks: it scales only compute resources (CPU and memory), and it has to be applied individually to each container of interest. It also does not provide GPU-related recommendations or handle GPU autoscaling.

SREs and developers alike would like to see resource usage recommendations at the namespace level. This helps in setting up namespace quotas that align with the needs of the set of applications that will be deployed together.

SREs typically need to resize several workloads in any given cluster. Currently, the only way to accomplish this is to write scripts that create VPA objects for each workload individually. Several customers have expressed interest in a bulk API that can simplify this experience.

Efficient use of GPU resources is a top priority given how costly these resources are. AI workloads dominate GPU usage, and inference servers account for a majority of these, so it would be valuable to autoscale GPU resources for inference workloads in the same way we do for CPU workloads.

With AI agents, it would be easy to automate complex workflows such as "shut down workloads that have been idle for more than 10 days on my staging cluster". Having notifications for idle workloads and exposing them through MCP can help enable these workflows.
**User Stories**

* As an SRE, I would like to right-size both my application container and the namespace it is deployed in, so that all of my workloads run efficiently.

* As an SRE, I would like to right-size the use of GPU resources, so that my GPU resources are used efficiently and the overall costs for my workload are reduced.

* As an SRE, I would like to right-size all or a subset of the containers in my cluster with a single API call, so that I can reduce the time needed to right-size resources.

* As a developer, I would like to better understand the resources my workload needs, so that I can get the best performance with the right amount of resources.

* As an SRE, I want to list all of the workloads that have been idle for more than a week on my staging cluster, so that I can shut them down.
**Goals**

* Help both developers and SREs right-size both containers and namespaces
* Provide a way to monitor GPU resources and right-size them
* Provide a way to handle scaling of multiple workloads in an easy-to-use manner
* Provide notifications for specific resource usage events, such as idle workloads
**Non-Goals**

* Replace VPA. Kruize currently uses VPA under the covers to apply CPU and memory recommendations.
**Proposal**

I. **CPU Autoscaler**

The following outlines the approach Kruize takes for autoscaling CPU and memory resources by using the custom recommender feature of the VPA.

1. The user creates a Kruize experiment of mode "auto" for a specific container.
2. Kruize monitors the metrics of the container and produces recommendations.
3. If the mode is set to "auto", Kruize creates a VPA object with a custom VPA recommender named "kruize".
4. It then pushes the recommendation to the VPA object.
5. VPA then applies the recommendation as per the rules specified in the VPA object.
6. Kruize continues to monitor the usage of the container and updates the recommendations based on observed changes.
7. These updated recommendations are pushed to the VPA object if they differ from the current ones by more than a user-specified threshold.
8. VPA applies the changes whenever it notices a new recommendation.
9. This loop continues for each experiment that has the "auto" mode specified.
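As a sketch, the VPA object created in step 3 could look like the following. This uses the upstream `VerticalPodAutoscaler` API and its `spec.recommenders` field; the workload name and namespace are illustrative placeholders, not part of this proposal.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa         # illustrative name
  namespace: my-namespace  # illustrative namespace
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app           # the workload covered by the Kruize experiment
  recommenders:
    - name: kruize         # custom recommender; the default VPA recommender ignores this object
  updatePolicy:
    updateMode: "Auto"     # corresponds to the Kruize "auto" experiment mode
```

Because a non-default recommender is named, the stock VPA recommender leaves this object alone, and Kruize is free to patch the recommendation status itself while the VPA updater and admission controller apply it.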
 | ||
|
||
**Fig 1\. Kruize \- VPA Integration Block Diagram** | ||
|
||
**Workflow Description**

Here is a more detailed description of the workflow:

1. The user creates a regular Kruize "experiment". This is currently possible through REST APIs, either as an individual experiment or through the bulk API.
2. At the time of experiment creation, the user specifies the "mode" associated with the experiment. The currently supported Kruize modes are "monitor" and "autotune".
3. Two new Kruize modes, "auto" and "recreate", will be added that map to the VPA "updateModes" of the same names. They will also map to the same functionality as that of the VPA.
4. When Kruize needs to apply recommendations, it does the following:
   1. Creates a `VerticalPodAutoscaler` object with the recommender set to `kruize`
   2. Patches the `recommendation` into the appropriate VPA terms
5. Kruize will continue to update the `recommendation` on a regular basis to match the application's usage.
6. The VPA object will be deleted under two conditions:
   1. The Kruize experiment mode is updated from "auto" or "recreate" to "monitor"
   2. The Kruize experiment is deleted
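For illustration, a request body for step 1 of the workflow might look like the fragment below. The field names loosely follow the existing Kruize `createExperiment` REST API, but the experiment name, cluster, and workload values are invented for this sketch, and `"mode": "auto"` is the new mode this proposal introduces; the exact schema should be confirmed against the Kruize API documentation.

```json
[
  {
    "version": "v2.0",
    "experiment_name": "my-app-auto",
    "cluster_name": "staging",
    "mode": "auto",
    "target_cluster": "local",
    "kubernetes_objects": [
      {
        "type": "deployment",
        "name": "my-app",
        "namespace": "my-namespace",
        "containers": [
          { "container_name": "my-app" }
        ]
      }
    ]
  }
]
```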
 | ||
**Fig 2\. Experiment Workflow** | ||
|
||
II. **GPU Autoscaler**

Kruize can monitor the resource usage of NVIDIA accelerators and provide right-sizing recommendations in the form of Multi-Instance GPU (MIG) partitions for supported cards. Kruize uses Instaslice under the covers to scale NVIDIA GPU resources using the right size of MIG partition.
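For context, a MIG right-sizing recommendation ultimately corresponds to a MIG-sliced resource request on the workload, along the lines of the sketch below. The pod and image names are illustrative, and the exact MIG resource names exposed depend on the card and the NVIDIA GPU operator configuration.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: inference-server                  # illustrative workload
spec:
  containers:
    - name: server
      image: example.com/inference:latest # illustrative image
      resources:
        limits:
          nvidia.com/mig-2g.10gb: 1       # a 2-compute-slice, 10 GB MIG partition
```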
**API Extensions**

**Topology Considerations**

**Implementation Details/Notes/Constraints**

**Risks and Mitigations**

**Drawbacks**

**Open Questions [optional]**

**Test Plan**

**Graduation Criteria**

**Dev Preview -> Tech Preview**

**Tech Preview -> GA**

**Upgrade / Downgrade Strategy**

**Version Skew Strategy**

**Operational Aspects of API Extensions**

**Support Procedures**

**Alternatives**

**Infrastructure Needed [Optional]**
(nit) The app's metrics are pulled by Prometheus rather than pushed. Similarly I suppose that Kruize queries metrics from Prometheus.
Integrated...how? (At least, conceptually?) 😄
Since this is an OpenShift enhancement, I assume there is some "OpenShift specific stuff" that you want to happen? Like, OpenShift is expected to change in some way as a result of this proposal?
What's in here so far just seems to be "yes, we've connected Kruize to the VPA" -- what's being described here so far could be done right now presumably in any Kubernetes cluster, it's not clear what OpenShift needs to do differently.
The VPA operator currently ships as an OLM operator installable from the marketplace/OperatorHub. Is the intent something like:
Or are you proposing some sort of OpenShift platform component that stitches together Kruize + VPA + the configuration into some kind of cohesive thing?
@jkyros Thanks for the review. I agree the request is looking deeper into the Kruize VPA integration whereas maybe it needs to first address the OpenShift aspect.
We don't yet have Kruize as an OLM operator; it is in the works. However, the goal is for Kruize to be made available as an OLM operator and for it to be installed from there. Do you think this is a prerequisite for this enhancement request, or is it OK for us to have the installation done through scripts until that happens?
Also, do you think the Kruize-VPA integration aspect needs to be explained here, or is that not relevant?
@dinogun No, not a prereq by any means for this request. I actually think it would be great if we captured the intended progression of this here in the enhancement. Something like:
<blessed location>
And maybe Phase 3 is someday something that "makes configuring it way more pleasant" like:
I don't know. I just think capturing how we expect it to evolve would be great and make it clear where you want/expect this to go?
I think if we want/expect to build something into OpenShift to deal with VPA integration, or to document it somewhere as part of this process, it's relevant.
Like maybe at least state what the installation scripts are intended to do for configuration between the VPA and Kruize? Like, do your Kruize install scripts currently install the VPA OLM operator automatically as a dependency, or is that something the user is expected to do?
Or if "how users will configure it" is "read the OpenShift Kruize doc / OpenShift VPA doc which will be produced as part of this enhancement work and see the section on VPA integration", we could at least state that in the enhancement?
Thanks for all the clarifications, it really helps!!
I think we are fairly close to Phase 2, so maybe we can start from there.
Kruize has a dependency on VPA for compute resources and on Instaslice for GPU resources. Given this, it would be nice to have the Kruize install (through the operator) ideally take care of both of these under the covers if they are not already present.
As an aside, does Instaslice installation take care of Nvidia GPU operator as well?
At the moment, VPA is expected to be present, we don't take care of that
Listing all of the dependencies in the doc definitely makes sense, especially as it would mean setting aside resources for each of these entities