Fix flaky pubsub tests with retry mechanism for PUBLISH/SPUBLISH commands #4384
Problem
The tests test_async_cluster_restore_resp3_pubsub_state_after_scale_out and test_async_cluster_restore_resp3_pubsub_state_passive_disconnect were failing intermittently due to a race condition during cluster topology changes. The issue surfaced as Ok(Value::Int(1)) being returned instead of the expected Ok(Value::Int(2)), indicating that only 1 of 2 expected subscribers received the message.
Solution
Added robust retry logic with exponential backoff for PUBLISH and SPUBLISH operations during cluster topology changes, using the helpers retry_publish_until_expected_subscribers and retry_spublish_until_expected_subscribers.
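The retry-with-backoff idea can be sketched roughly as follows. This is a minimal, synchronous illustration and not the code from this PR: the function name retry_until_expected, its parameters, and the closure standing in for a PUBLISH/SPUBLISH call are all hypothetical, and the real helpers are presumably async and operate on a cluster connection.

```rust
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical sketch: call `publish` until it reports `expected` receivers
/// or `max_attempts` is exhausted, doubling the delay after each miss.
/// Returns Ok(count) on success, or Err(last_count) if the expected count
/// was never observed.
fn retry_until_expected<F>(
    mut publish: F,
    expected: i64,
    max_attempts: u32,
    initial_delay: Duration,
) -> Result<i64, i64>
where
    F: FnMut() -> i64, // stands in for a PUBLISH/SPUBLISH call (assumption)
{
    let mut delay = initial_delay;
    let mut last = 0;
    for attempt in 1..=max_attempts {
        last = publish();
        if last == expected {
            return Ok(last);
        }
        // Back off only if another attempt will follow.
        if attempt < max_attempts {
            sleep(delay);
            delay = delay.saturating_mul(2);
        }
    }
    Err(last)
}
```

Under these assumptions, a closure that reports 1 receiver on its first two calls and 2 on the third would succeed on the third attempt, while a closure that always reports 1 would exhaust its attempts and return the last observed count as an error.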
Changes
test_cluster_async.rs
ClusterConnection instead of MultiplexedConnection
Testing
Tests in test_cluster_async.rs continue to pass.
The fix ensures tests are resilient to timing variations during cluster topology changes while maintaining the same test coverage and expectations.
Fixes #3827.
Warning
Firewall rules blocked me from connecting to one or more addresses
I tried to connect to the following addresses, but was blocked by firewall rules:
ask_with_extra_nodes
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
esm.ubuntu.com
/usr/lib/apt/methods/https (dns block)
foo
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
node
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
rebuild_with_extra_nodes
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
refresh_topology_client_init
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
refresh_topology_moved
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_can_be_created_with_partial_slot_coverage
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_do_not_retry_when_receiver_was_dropped
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_dont_route_to_a_random_on_non_key_based_cmd
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_fan_out_and_aggregate_logical_array_response
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_fan_out_and_aggregate_numeric_response
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_fan_out_and_return_all_succeeded_response
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_fan_out_and_return_one_succeeded_response
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_non_retryable_io_error_should_not_retry
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_pass_errors_from_split_multi_shard_command
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_read_from_primary_when_primary_loading
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_reconnect_even_with_zero_retries
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_reroute_from_replica_if_in_loading_state
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_reset_routing_if_redirect_fails
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_retry_safe_io_error_should_be_retried
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_route_according_to_passed_argument
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_route_to_random_on_key_based_cmd
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_saves_reconnected_connection
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
test_async_cluster_update_slots_based_on_moved_error_no_change
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
tryagain
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
tryagain_exhaust_retries
/home/REDACTED/work/valkey-glide/valkey-glide/glide-core/redis-rs/target/debug/deps/test_cluster_async-5f1c548054fb4f2b --nocapture (dns block)
If you need me to access, download, or install something from one of these locations, you can either: