-
Notifications
You must be signed in to change notification settings - Fork 1.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
FEATURE: add cli argument to modify controller workqueue ratelimiter #2186
FEATURE: add cli argument to modify controller workqueue ratelimiter #2186
Conversation
08200aa
to
251376f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the contribution! The PR looks good to me, except pending resolution of the comment.
We’re planning to run some load tests in v2 with this change to evaluate if it improves performance and whether we actually need a queue-per-app solution.
Thanks, @ImpSy!
Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>
…o helm chart Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>
251376f
to
39d4a03
Compare
@vara-bonthu @ChenYi015 Added the documentation to the charts like you wanted |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/approve
wait for another approval
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: ChenYi015, vara-bonthu The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
…ubeflow#2186) * add cli argument to modify controller workqueue ratelimiter Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com> * add cli argument to modify controller workqueue ratelimiter support to helm chart Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com> --------- Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com>
…ubeflow#2186) * add cli argument to modify controller workqueue ratelimiter Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com> * add cli argument to modify controller workqueue ratelimiter support to helm chart Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com> --------- Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com> (cherry picked from commit d37a0e9) Signed-off-by: Yi Chen <github@chenyicn.net>
* FEATURE: add cli argument to modify controller workqueue ratelimiter (#2186) * add cli argument to modify controller workqueue ratelimiter Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com> * add cli argument to modify controller workqueue ratelimiter support to helm chart Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com> --------- Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com> (cherry picked from commit d37a0e9) Signed-off-by: Yi Chen <github@chenyicn.net> * Fix ingress capability discovery (#2201) Signed-off-by: Jacob Salway <jacob.salway@gmail.com> (cherry picked from commit 56b4974) Signed-off-by: Yi Chen <github@chenyicn.net> * Bump github.com/aws/aws-sdk-go-v2 from 1.30.5 to 1.31.0 (#2207) Bumps [github.com/aws/aws-sdk-go-v2](https://github.com/aws/aws-sdk-go-v2) from 1.30.5 to 1.31.0. - [Release notes](https://github.com/aws/aws-sdk-go-v2/releases) - [Commits](aws/aws-sdk-go-v2@v1.30.5...v1.31.0) --- updated-dependencies: - dependency-name: github.com/aws/aws-sdk-go-v2 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> (cherry picked from commit faa0822) Signed-off-by: Yi Chen <github@chenyicn.net> * Bump golang.org/x/net from 0.28.0 to 0.29.0 (#2205) Bumps [golang.org/x/net](https://github.com/golang/net) from 0.28.0 to 0.29.0. - [Commits](golang/net@v0.28.0...v0.29.0) --- updated-dependencies: - dependency-name: golang.org/x/net dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> (cherry picked from commit 6106178) Signed-off-by: Yi Chen <github@chenyicn.net> * Bump github.com/docker/docker from 27.0.3+incompatible to 27.1.1+incompatible (#2125) Bumps [github.com/docker/docker](https://github.com/docker/docker) from 27.0.3+incompatible to 27.1.1+incompatible. - [Release notes](https://github.com/docker/docker/releases) - [Commits](moby/moby@v27.0.3...v27.1.1) --- updated-dependencies: - dependency-name: github.com/docker/docker dependency-type: indirect ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> (cherry picked from commit 316536f) Signed-off-by: Yi Chen <github@chenyicn.net> * Bump github.com/aws/aws-sdk-go-v2/service/s3 from 1.58.3 to 1.63.3 (#2206) Bumps [github.com/aws/aws-sdk-go-v2/service/s3](https://github.com/aws/aws-sdk-go-v2) from 1.58.3 to 1.63.3. - [Release notes](https://github.com/aws/aws-sdk-go-v2/releases) - [Commits](aws/aws-sdk-go-v2@service/s3/v1.58.3...service/s3/v1.63.3) --- updated-dependencies: - dependency-name: github.com/aws/aws-sdk-go-v2/service/s3 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> (cherry picked from commit 1972fb7) Signed-off-by: Yi Chen <github@chenyicn.net> * Update integration test workflow and add golangci lint check (#2197) * Update integration test workflow Signed-off-by: Yi Chen <github@chenyicn.net> * Update golangci lint config Signed-off-by: Yi Chen <github@chenyicn.net> --------- Signed-off-by: Yi Chen <github@chenyicn.net> (cherry picked from commit 143b16e) Signed-off-by: Yi Chen <github@chenyicn.net> * Bump github.com/aws/aws-sdk-go-v2 from 1.31.0 to 1.32.0 (#2229) Bumps [github.com/aws/aws-sdk-go-v2](https://github.com/aws/aws-sdk-go-v2) from 1.31.0 to 1.32.0. - [Release notes](https://github.com/aws/aws-sdk-go-v2/releases) - [Commits](aws/aws-sdk-go-v2@v1.31.0...v1.32.0) --- updated-dependencies: - dependency-name: github.com/aws/aws-sdk-go-v2 dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> (cherry picked from commit a4dcfcb) Signed-off-by: Yi Chen <github@chenyicn.net> * Bump cloud.google.com/go/storage from 1.43.0 to 1.44.0 (#2228) Bumps [cloud.google.com/go/storage](https://github.com/googleapis/google-cloud-go) from 1.43.0 to 1.44.0. - [Release notes](https://github.com/googleapis/google-cloud-go/releases) - [Changelog](https://github.com/googleapis/google-cloud-go/blob/main/CHANGES.md) - [Commits](googleapis/google-cloud-go@pubsub/v1.43.0...spanner/v1.44.0) --- updated-dependencies: - dependency-name: cloud.google.com/go/storage dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> (cherry picked from commit 2542009) Signed-off-by: Yi Chen <github@chenyicn.net> * Bump manusa/actions-setup-minikube from 2.11.0 to 2.12.0 (#2226) Bumps [manusa/actions-setup-minikube](https://github.com/manusa/actions-setup-minikube) from 2.11.0 to 2.12.0. - [Release notes](https://github.com/manusa/actions-setup-minikube/releases) - [Commits](manusa/actions-setup-minikube@v2.11.0...v2.12.0) --- updated-dependencies: - dependency-name: manusa/actions-setup-minikube dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> (cherry picked from commit 4358fd4) Signed-off-by: Yi Chen <github@chenyicn.net> * Bump golang.org/x/time from 0.6.0 to 0.7.0 (#2227) Bumps [golang.org/x/time](https://github.com/golang/time) from 0.6.0 to 0.7.0. - [Commits](golang/time@v0.6.0...v0.7.0) --- updated-dependencies: - dependency-name: golang.org/x/time dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> (cherry picked from commit 29ba4e7) Signed-off-by: Yi Chen <github@chenyicn.net> * fix: imagePullPolicy was ignored (#2222) Signed-off-by: xuqingtan <missedone@gmail.com> (cherry picked from commit 7fb14e6) Signed-off-by: Yi Chen <github@chenyicn.net> * fix: spark-submission failed due to lack of permission by user `spark` (#2223) error: Exception in thread "main" java.io.FileNotFoundException: /home/spark/.ivy2/cache/resolved-org.apache.spark-spark-submit-parent-511288aa-ce7c-4a38-9c8e-4869b71c68fa-1.0.xml (No such file or directory) Signed-off-by: xuqingtan <missedone@gmail.com> (cherry picked from commit d07821b) Signed-off-by: Yi Chen <github@chenyicn.net> * Bump github.com/aws/aws-sdk-go-v2/config from 1.27.33 to 1.27.42 (#2231) Bumps [github.com/aws/aws-sdk-go-v2/config](https://github.com/aws/aws-sdk-go-v2) from 1.27.33 to 1.27.42. - [Release notes](https://github.com/aws/aws-sdk-go-v2/releases) - [Commits](aws/aws-sdk-go-v2@config/v1.27.33...config/v1.27.42) --- updated-dependencies: - dependency-name: github.com/aws/aws-sdk-go-v2/config dependency-type: direct:production update-type: version-update:semver-patch ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> (cherry picked from commit 9be8dce) Signed-off-by: Yi Chen <github@chenyicn.net> * Bump github.com/prometheus/client_golang from 1.19.1 to 1.20.4 (#2204) Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.19.1 to 1.20.4. - [Release notes](https://github.com/prometheus/client_golang/releases) - [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md) - [Commits](prometheus/client_golang@v1.19.1...v1.20.4) --- updated-dependencies: - dependency-name: github.com/prometheus/client_golang dependency-type: direct:production update-type: version-update:semver-minor ... Signed-off-by: dependabot[bot] <support@github.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> (cherry picked from commit fe833fa) Signed-off-by: Yi Chen <github@chenyicn.net> * Remove `cap_net_bind_service` from image (#2216) Signed-off-by: Jacob Salway <jacob.salway@gmail.com> (cherry picked from commit ac761ef) Signed-off-by: Yi Chen <github@chenyicn.net> * fix: webhook panics due to logging (#2232) Signed-off-by: Yi Chen <github@chenyicn.net> (cherry picked from commit 247e834) Signed-off-by: Yi Chen <github@chenyicn.net> * Add check for generating manifests and code (#2234) Signed-off-by: Yi Chen <github@chenyicn.net> (cherry picked from commit c75d99f) Signed-off-by: Yi Chen <github@chenyicn.net> * Spark Operator Official Release v2.0.2 Signed-off-by: Yi Chen <github@chenyicn.net> --------- Signed-off-by: ImpSy <3097030+ImpSy@users.noreply.github.com> Signed-off-by: Yi Chen <github@chenyicn.net> Signed-off-by: Jacob Salway <jacob.salway@gmail.com> Signed-off-by: dependabot[bot] <support@github.com> Signed-off-by: xuqingtan <missedone@gmail.com> Co-authored-by: Sébastien Maintrot <3097030+ImpSy@users.noreply.github.com> Co-authored-by: Jacob Salway <jacob.salway@gmail.com> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Nick Tan <missedone@gmail.com>
Purpose of this PR
Add the capacity to customize the controller workqueue ratelimiter and remove a potential bug on high usage cluster
Proposed changes:
Change Category
Indicate the type of change by marking the applicable boxes:
Rationale
1- controler-runtime uses
workqueue.DefaultControllerRateLimiter
by default which has a rather low BucketRatelimiterWith only 10 QPS and 100 Bucket max size we easily reach those number, before migrating to
controller-runtime
the value was actually 50/500 (0dc641b#diff-915090b1cf1857dc448319dbbbc11699ba40e674744723a5772a4624a1c79282L59)2- BucketRatelimiter does not have a max delay
We encounter a bug on a high usage cluster a few years ago where we discover that
BucketRatelimiter
rely ongolang.org/x/time/rate
Limiter.Reserve()
function that is not upperBound:rate.InfDuration
Being able to set a MaxDelay fix this
Checklist
Before submitting your PR, please review the following:
Additional Notes
This changed has been implemented on our fork for almost 2 years -> spotinst#5