feat: controller sharding #1
Conversation
pkg/manager/manager.go
Outdated
```go
		c()
	}
	for _, s := range starts {
		go m.startForCluster(s.ctx, s.name, s.cl)
```
Should this happen in goroutines?
If there is a possibility of runnables blocking so long that this becomes a problem, they could also cancel the contexts of a newer tick, because they access the .cancel of whatever is currently stored.
Actually, the engaged cluster would still have .started=true, so there shouldn't be a second tick starting it.
Maybe wrap them in a wait group with a timeout, if runnables with long engagement times are a concern?
I think we might be okay because:
- we make the ownership decision under the lock, set started=true, and capture the specific cancel chosen there; that prevents double-starts and ensures any stop only targets the intended context;
- starts run in goroutines by design so a slow cluster doesn’t stall the tick; they’re based on the captured ctx, not whatever is current later;
- on re-engage we swap the engagement token (and cancel the old one), and cleanup only deletes if the token still matches — so a late cancel from the old ctx can’t remove the new engagement;
- actual work is still gated by the fence lease; even if a delayed start tried to run, it wouldn’t pass TryAcquire.
If we ever see edge cases, we can add a lightweight token check in startForCluster (drop stale starts) or an optional engage timeout, but the current shape should be safe and responsive.
Co-authored-by: Nelo-T. Wallus <10514301+ntnn@users.noreply.github.com>
embik
left a comment
I'll probably have more comments later, but these are the first few things I wanted to bring up.
```go
	case err != nil:
		return false
```
Thinking out loud: should we log errors here? I'm a little uncomfortable completely discarding them, though I suppose this would be debug logging at best.
Errors here are expected (conflicts/transient API issues) and this path is hot, so logging in TryAcquire would be noisy. I’d keep it silent and, if we want visibility, add a low-verbosity, rate-limited log (and/or metric) in the caller when TryAcquire returns false. We already throttle retries in the engine, so a V(2)+ log there would be controlled.
```go
	// PerClusterLease controls whether we create one fence per cluster (true) or a
	// single shared fence name (false). Per-cluster is recommended.
	PerClusterLease bool
```
I wonder if this makes sense, isn't a single fence basically the same as running a global leader election?
Not sure I'm following. A single fence (PerClusterLease = false) is one lease for the whole cluster fleet; one peer would manage all downstream clusters.
Yes, exactly. Is there a noticeable difference to using the built-in leader election of the manager if it's a "winner takes it all" kind of situation?
Good question. PerClusterLease=false does behave like "winner takes all", but it's scoped to the multicluster engine and integrates with our peer registry (IDs/weights) and lease naming, so you can later flip to per-cluster fencing without changing your wiring. The built-in manager leader election is the right tool if you truly want a single active process for everything; the single-fence option exists for MC-only exclusivity, peer-aware behavior, and an easy path to sharding later. We default to per-cluster; single-fence remains an advanced, opt-in mode.
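To make the two modes concrete, the lease-name choice amounts to something like this (hypothetical helper, illustrative only): per-cluster yields one fence per cluster, while the single-fence mode collapses the whole fleet onto one name.

```go
package main

import "fmt"

// fenceLeaseName returns the Lease name used to fence a cluster. With
// perCluster=true every cluster gets its own fence; with perCluster=false
// all clusters share one lease, i.e. the "winner takes all" mode.
func fenceLeaseName(prefix, cluster string, perCluster bool) string {
	if perCluster {
		return prefix + "-" + cluster
	}
	return prefix
}

func main() {
	fmt.Println(fenceLeaseName("mc-fence", "us-east-1", true))  // mc-fence-us-east-1
	fmt.Println(fenceLeaseName("mc-fence", "us-east-1", false)) // mc-fence
}
```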
Co-authored-by: Marvin Beckers <mail@embik.me>
```go
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// leaseGuard manages a single Lease as a *fence* for one shard/cluster.
```
@embik I'm wondering how this scales with O(1000) workspaces in the case of kcp: 1000 leases generate a lot of server traffic to renew.
In other words, maybe this should be pluggable? Thinking of load-balancing in buckets, for example.
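The bucketing idea could be sketched like this (hypothetical, not implemented in this PR): hash each cluster into one of N buckets and fence per bucket, so O(1000) clusters only ever renew N leases.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bucketFor deterministically maps a cluster name to one of `buckets` buckets.
func bucketFor(cluster string, buckets uint32) uint32 {
	h := fnv.New32a()
	h.Write([]byte(cluster))
	return h.Sum32() % buckets
}

// bucketLeaseName fences a whole bucket instead of a single cluster,
// bounding lease-renewal traffic at `buckets` leases total.
func bucketLeaseName(prefix, cluster string, buckets uint32) string {
	return fmt.Sprintf("%s-bucket-%d", prefix, bucketFor(cluster, buckets))
}

func main() {
	// Same cluster always lands in the same bucket, so fencing stays stable.
	for _, c := range []string{"ws-1", "ws-2", "ws-3"} {
		fmt.Println(c, "->", bucketLeaseName("mc-fence", c, 8))
	}
	if bucketFor("ws-1", 8) != bucketFor("ws-1", 8) {
		panic("bucket assignment must be deterministic")
	}
}
```

A pluggable lease-naming hook with this as one strategy would keep the default per-cluster behavior while giving kcp-scale deployments a cheaper option.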
This doesn't have to happen in this PR as long as we can come up with some transition path. As it stands, is this lease logic active automatically for every controller?
mjudeikis
left a comment
Just a nit; I'll defer to @sttts for the full review.
What
Why
How to try
Follow the README (examples/sharded-namespace) for build/deploy/observe steps.
Changes