lazy cluster topology reload #3614

ndyakov · 2025-11-24T15:17:55Z

Introduces reload of topology in the cluster maint notifications

fixes ClusterOptions missing DialerRetries&DialerRetryTimeout #3594

…E-1626-cluster-topology-update

Copilot

Pull request overview

This PR introduces lazy cluster topology reload functionality to handle maintenance notifications, specifically addressing issue #3594. The implementation adds a queuing mechanism to prevent excessive cluster state reloads when multiple SMIGRATED notifications arrive in quick succession.

Added new connection options: DialerRetries, DialerRetryTimeout, MaxConcurrentDials, and PushNotificationProcessor across all client types
Implemented a lazy reload mechanism with cooldown period and pending request collapsing to optimize cluster topology reloads
Integrated SMIGRATED notification handling to trigger cluster state reloads when slot migrations occur

Reviewed changes

Copilot reviewed 5 out of 5 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
universal.go	Added new connection and notification handling options to UniversalOptions, propagated fields to Cluster(), Failover(), and Simple() methods
sentinel.go	Added new connection options to FailoverOptions with field propagation to clientOptions(), sentinelOptions(), and clusterOptions() methods, plus URL parameter parsing
osscluster.go	Added connection options and PushNotificationProcessor to ClusterOptions, implemented lazy reload with queuing/cooldown logic, and integrated SMIGRATED notification callback for cluster state management
osscluster_lazy_reload_test.go	Added comprehensive test coverage for LazyReload behavior including single reload, concurrent deduplication, pending reload queuing, error handling, and cascading notification scenarios
commands_test.go	Improved latency test reliability by increasing threshold and sleep duration to reduce flakiness, changed assertions to be more robust

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

osscluster_lazy_reload_test.go

osscluster.go

Copilot · 2025-12-02T13:59:40Z

osscluster.go

+	if opt.MaintNotificationsConfig != nil {
+		c.nodes.OnNewNode(func(nodeClient *Client) {
+			manager := nodeClient.GetMaintNotificationsManager()
+			if manager != nil {
+				manager.SetClusterStateReloadCallback(func(ctx context.Context, hostPort string, slotRanges []string) {
+					// Log the migration details for now
+					if internal.LogLevel.InfoOrAbove() {
+						internal.Logger.Printf(ctx, "cluster: slots %v migrated to %s, reloading cluster state", slotRanges, hostPort)
+					}
+					// Currently we reload the entire cluster state
+					// In the future, this could be optimized to reload only the specific slots
+					c.state.LazyReload()
+				})
+			}
+		})
+	}


The cluster state reload callback is only registered for new nodes via OnNewNode(). This means that any nodes that already exist when the ClusterClient is created will not have this callback registered, and SMIGRATED notifications from these existing nodes will not trigger cluster state reloads.

Consider also registering the callback for existing nodes, similar to how it's done in maintnotifications/e2e/notiftracker_test.go:

if opt.MaintNotificationsConfig != nil { // Register callback for existing nodes ctx := context.Background() _ = c.ForEachShard(ctx, func(ctx context.Context, nodeClient *Client) error { manager := nodeClient.GetMaintNotificationsManager() if manager != nil { manager.SetClusterStateReloadCallback(func(ctx context.Context, hostPort string, slotRanges []string) { // ... callback implementation }) } return nil }) // Register callback for new nodes c.nodes.OnNewNode(func(nodeClient *Client) { // ... existing implementation }) }

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

ndyakov and others added 4 commits November 24, 2025 17:17

lazy cluster topology reload

f3e9126

Merge branch 'ndyakov/feature/CAE-1313-maint-cluster' into ndyakov/CA…

6ed2349

…E-1626-cluster-topology-update

fix discrepancies between options structs

e9da546

Merge branch 'ndyakov/feature/CAE-1313-maint-cluster' into ndyakov/CA…

644a1aa

…E-1626-cluster-topology-update

ndyakov marked this pull request as ready for review December 2, 2025 13:56

ndyakov requested review from Copilot and ofekshenawa December 2, 2025 13:56

Copilot started reviewing on behalf of ndyakov December 2, 2025 13:56 View session

Copilot finished reviewing on behalf of ndyakov December 2, 2025 13:58

Copilot AI reviewed Dec 2, 2025

View reviewed changes

ndyakov and others added 2 commits December 2, 2025 16:00

Update osscluster_lazy_reload_test.go

fc05874

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update osscluster.go

f3bab0d

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

lazy cluster topology reload #3614

lazy cluster topology reload #3614

ndyakov commented Nov 24, 2025 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

Uh oh!

Uh oh!

Copilot AI Dec 2, 2025

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

lazy cluster topology reload #3614

Are you sure you want to change the base?

lazy cluster topology reload #3614

Conversation

ndyakov commented Nov 24, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Uh oh!

Copilot AI Dec 2, 2025

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

ndyakov commented Nov 24, 2025 •

edited

Loading