✨ Use a buffer to optimize priority queue AddWithOpts performance #3415

k8s-ci-robot merged 1 commit into kubernetes-sigs:main
Conversation
Hi @zach593. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with `/ok-to-test`. Once the patch is verified, the new status will be reflected by the `ok-to-test` label.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository.
/ok-to-test

Sorry for the other pings, I didn't see this PR before. Executing the provided benchmark on main takes more than 100 seconds for me on my M2 Pro, so I agree there is clearly a problem. It is a bit surprising that it takes almost seven times longer on my device than what you got, but that difference probably doesn't matter much; the bottom line is that it's too slow.

Using the provided benchmark to compare this with #3416, I get these results:

Going further and comparing #3416 with the upstream workqueue, I get these results (worth noticing that the upstream queue has a crazy high variance though, so something is weird there, but I didn't manage to figure out what):

Upstream diff for the benchmark:

```diff
diff --git a/staging/src/k8s.io/client-go/util/workqueue/rate_limiting_queue_test.go b/staging/src/k8s.io/client-go/util/workqueue/rate_limiting_queue_test.go
index 9bf22c4b9ab..0cbc928ad57 100644
--- a/staging/src/k8s.io/client-go/util/workqueue/rate_limiting_queue_test.go
+++ b/staging/src/k8s.io/client-go/util/workqueue/rate_limiting_queue_test.go
@@ -23,6 +23,33 @@ import (
 	testingclock "k8s.io/utils/clock/testing"
 )
 
+func BenchmarkAddLockContended(b *testing.B) {
+	q := NewTypedWithConfig(TypedQueueConfig[int]{
+		MetricsProvider: noopMetricsProvider{},
+	})
+	defer q.ShutDown()
+
+	for i := range 1000000 {
+		q.Add(i)
+	}
+
+	for range 1000 {
+		go func() {
+			for {
+				item, _ := q.Get()
+				time.Sleep(1 * time.Millisecond)
+				q.Done(item)
+			}
+		}()
+	}
+
+	for b.Loop() {
+		for i := range 1000 {
+			q.Add(i)
+		}
+	}
+}
+
 func TestRateLimitingQueue(t *testing.T) {
 	limiter := NewItemExponentialFailureRateLimiter(1*time.Millisecond, 1*time.Second)
 	queue := NewRateLimitingQueue(limiter).(*rateLimitingType[any])
```

So #3416 seems to be slower than this, but still somewhat faster than upstream? I am mostly just bringing that up because I'd prefer that we first get that one in; we could still do a version of this afterwards if you want, since it's effectively just trading a bit of memory for reduced lock contention.

/cc @sbueringer
```go
log logr.Logger

bufferLock sync.Mutex
buffer     map[addOpts][]T
```
Why not a slice that combines addOpts and key? That way we'd preserve order and it would simplify the code quite a bit.
Updated. Using a slice does simplify the code structure, but it comes at the cost of additional overhead, since each slice entry needs to carry its own AddOpts. Benchmarks show that switching to a slice increases the overall cost by about 70%. However, given that the baseline latency is very low, this overhead should still be within an acceptable range.
Still with 1,000,000 preloaded items, 1,000 workers, 1ms reconcile cost.
use map:

```console
$ go test sigs.k8s.io/controller-runtime/pkg/controller/priorityqueue -bench=BenchmarkAddLockContended -run=^$ -v -benchtime=10s -count=10
goos: linux
goarch: amd64
pkg: sigs.k8s.io/controller-runtime/pkg/controller/priorityqueue
cpu: 12th Gen Intel(R) Core(TM) i9-12900K
BenchmarkAddLockContended
BenchmarkAddLockContended-8   159739   72763 ns/op
BenchmarkAddLockContended-8   266252   42454 ns/op
BenchmarkAddLockContended-8   187652   74544 ns/op
BenchmarkAddLockContended-8   278227   43991 ns/op
BenchmarkAddLockContended-8   284910   42860 ns/op
BenchmarkAddLockContended-8   290059   43863 ns/op
BenchmarkAddLockContended-8   280312   42673 ns/op
BenchmarkAddLockContended-8   281852   41656 ns/op
BenchmarkAddLockContended-8   280782   46362 ns/op
BenchmarkAddLockContended-8   266382   47360 ns/op
PASS
ok   sigs.k8s.io/controller-runtime/pkg/controller/priorityqueue   125.617s
```

use slice:

```console
$ go test sigs.k8s.io/controller-runtime/pkg/controller/priorityqueue -bench=BenchmarkAddLockContended -run=^$ -v -benchtime=10s -count=10
goos: linux
goarch: amd64
pkg: sigs.k8s.io/controller-runtime/pkg/controller/priorityqueue
cpu: 12th Gen Intel(R) Core(TM) i9-12900K
BenchmarkAddLockContended
BenchmarkAddLockContended-8   153133    92869 ns/op
BenchmarkAddLockContended-8   163017   100041 ns/op
BenchmarkAddLockContended-8   214800    80015 ns/op
BenchmarkAddLockContended-8   150663    86073 ns/op
BenchmarkAddLockContended-8   213728    65746 ns/op
BenchmarkAddLockContended-8   145710    94048 ns/op
BenchmarkAddLockContended-8   195998    80824 ns/op
BenchmarkAddLockContended-8   132121    87749 ns/op
BenchmarkAddLockContended-8   165680    81779 ns/op
BenchmarkAddLockContended-8   144142    85884 ns/op
PASS
ok   sigs.k8s.io/controller-runtime/pkg/controller/priorityqueue   155.121s
```
@alvaroaleman Indeed, #3416 appears to address the cost of `spin()`.
@zach593 It is merged
Force-pushed from f4f4ee2 to 2e3ff1a.
updated benchmark result, test function:

```go
func BenchmarkAddLockContended(b *testing.B) {
	q := New[int]("")
	defer q.ShutDown()
	for i := range 1000000 {
		q.Add(i)
	}
	for range 1000 {
		go func() {
			for {
				item, _ := q.Get()
				time.Sleep(1 * time.Millisecond)
				q.Done(item)
			}
		}()
	}
	for b.Loop() {
		for i := range 1000 {
			q.Add(i)
		}
	}
}
```

upstream (5de4c4f):

optimized:

With the same benchmark setup, the optimized version brings `BenchmarkAddLockContended()` down from ~1 ms/op to ~80 µs/op, showing a 10–15× improvement under high contention.
```go
// primarily for unit testing purposes.
// Successfully adding a ready item may result in an additional call to handleReadyItems(),
// but the cost is negligible.
w.flushAddBuffer()
```
Should we also do this in handleWaitingItems?
I don't think it's necessary. Items won't stay in the buffer for long, because once an item enters the buffer, handleAddBuffer() will be triggered. There might be a tiny delay, but that should be negligible for items carrying a non-zero `after`.

In fact, in most real usage scenarios, adding a flush in Len() and handleReadyItems() is also unnecessary, since everything runs in an asynchronous environment. The only reason to keep it is to avoid odd behavior in synchronous setups and to make unit tests simpler; otherwise, we would have to write `Eventually(q.addBufferLen).Should(BeZero())` everywhere.
Signed-off-by: zach593 <zach_li@outlook.com>
LGTM label has been added.

Git tree hash: 3d6913c3326dac3a935c8948958cf72f49b549d0
/assign Will review later today |
All good. Thank you very much!

/hold cancel
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alvaroaleman, sbueringer, zach593

The full list of commands accepted by this bot can be found here. The pull request process is described here.
@zach593 Thank you very much for the feedback and the optimizations! |
@sbueringer I just updated the description. |
Before #3416: in the implementation of the priority queue, when there are a large number of items and many workers blocked in `GetWithPriority()`, calls to `spin()` can take a significant amount of time. This method shares the same lock with `AddWithOpts()`, which causes `AddWithOpts()` to spend 10 ms or more contending for the lock during writes.

Although `AddWithOpts()` supports enqueuing multiple items in a single call, the event handler, as the caller, passes only one item at a time. As a result, when the enqueue volume is high, a large number of items end up being blocked in the `processorListener`'s ring buffer.

After merging #3416, the time consumption of `handleReadyItems()` (i.e., `spin()`) has been significantly reduced, but due to the remaining lock contention, there is still room to optimize the time spent in `AddWithOpts()`.

benchmark function:

upstream (5de4c4f) with 1,000,000 preloaded items, 1,000 workers, 1ms reconcile cost:

optimized:

With the same benchmark setup, the optimized version brings `BenchmarkAddLockContended()` down from ~1 ms/op to ~80 µs/op, showing a 10–15× improvement under high contention.