Prioritise leadership changes in cc #3955

jackkleeman · 2025-11-06T13:16:56Z

On cc startup, bulk load pp epoch state up front (was previously a todo)
Make leadership selections based on current membership for all partitions before moving on to making membership reconfigurations. This allows us to fail over leadership without doing any metadata operations

jackkleeman · 2025-11-06T13:17:59Z

crates/admin/src/cluster_controller/service/scheduler.rs

+        .try_filter_map(
+            async |(partition_id, partition_state)| match partition_state {
+                Some(partition_state) => {
+                    Self::note_observed_membership_update(


unsure if this is needed. it would be set when inserting/loading initial config, even if that initial config hasnt changed from what was already in metadata. in this code path we are only loading whats in metadata and not doing any mutation. my thinking is it doesn't hurt to update the observed membership.

jackkleeman · 2025-11-06T13:25:48Z

crates/admin/src/cluster_controller/service/scheduler.rs

+        // instructing a new leader when we already have the metadata requires no new metadata operations and can be done nearly instantly
+        // by comparison, ensure_valid_partition_configuration can take (metadata operation latency * 2 * affected partitions)
+        // which might be several seconds, and leader instruction would only happen at the end.
+        self.ensure_valid_leaders(cluster_state, legacy_cluster_state, partition_table);


this is no longer happening after ensure_valid_partition_configuration, which means that we have a less recent load of the epoch metadata to base our membership decision off of. the benefit of that is that we don't need any metadata reads to react to the leader going down.

previously the leader selection would be using a load from the same on_cluster_state_change iteration, whereas now we are using a load from the previous iteration (unless this is the first iteration). however in either case, we can have outdated metadata if someone other than us is mutating it, and i guess the leadership instruct will just be ignored in that case.

github-actions · 2025-11-06T13:59:37Z

Test Results

7 files ±0 7 suites ±0 2m 29s ⏱️ -6s
47 tests ±0 47 ✅ ±0 0 💤 ±0 0 ❌ ±0
200 runs ±0 200 ✅ ±0 0 💤 ±0 0 ❌ ±0

Results for commit 7d930db. ± Comparison against base commit 2ca7864.

♻️ This comment has been updated with latest results.

jackkleeman commented Nov 6, 2025

View reviewed changes

jackkleeman force-pushed the jk/tmnlkrpnxmnq branch 2 times, most recently from 917ea1c to 42ead35 Compare November 6, 2025 13:32

Prioritise leadership changes in cc

7d930db

jackkleeman force-pushed the jk/tmnlkrpnxmnq branch from 42ead35 to 7d930db Compare November 6, 2025 13:43

jackkleeman marked this pull request as ready for review November 6, 2025 14:45

jackkleeman requested a review from tillrohrmann November 6, 2025 14:45

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Prioritise leadership changes in cc #3955

Prioritise leadership changes in cc #3955

jackkleeman commented Nov 6, 2025

Uh oh!

jackkleeman Nov 6, 2025 •

edited

Loading

Uh oh!

jackkleeman Nov 6, 2025

Uh oh!

github-actions bot commented Nov 6, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Prioritise leadership changes in cc #3955

Are you sure you want to change the base?

Prioritise leadership changes in cc #3955

Conversation

jackkleeman commented Nov 6, 2025

Uh oh!

jackkleeman Nov 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

jackkleeman Nov 6, 2025

Choose a reason for hiding this comment

Uh oh!

github-actions bot commented Nov 6, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Test Results

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

jackkleeman Nov 6, 2025 •

edited

Loading

github-actions bot commented Nov 6, 2025 •

edited

Loading