-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Redis Enterprise Maintenance Events: Comprehensive Functional Testing #3461
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Redis Enterprise Maintenance Events: Comprehensive Functional Testing #3461
Conversation
…rise maintenance events - Add ConnectionTesting class with 9 test scenarios for maintenance handoff behavior - Test old connection graceful shutdown during MOVING operations - Validate traffic resumption with autoconnect after handoff - Verify maintenance notifications only work with RESP3 protocol - Test new connection establishment during migration and bind phases - Add memory leak validation for multiple concurrent connections - Include TLS support testing for maintenance events - Replace .supportMaintenanceEvents(true) with MaintenanceEventsOptions.enabled() - Add comprehensive monitoring and validation of connection lifecycle Tests cover CAE-1130 requirements for Redis Enterprise maintenance event handling including connection draining, autoconnect behavior, and notification delivery.
…IONS - connectionHandshakeIncludesEnablingNotificationsTest: Verifies all 5 notification types (MOVING, MIGRATING, MIGRATED, FAILING_OVER, FAILED_OVER) are received when maintenance events are enabled - disabledDontReceiveNotificationsTest: Verifies no notifications received when maintenance events are disabled - clientHandshakeWithEndpointTypeTest: Tests CLIENT MAINT_NOTIFICATIONS with 'none' endpoint type (nil IP scenario) - clientMaintenanceNotificationInfoTest: Verifies CLIENT MAINT_NOTIFICATIONS configuration with moving-endpoint-type Based on CLIENT MAINT_NOTIFICATIONS implementation from commit bd408cf
- Update push notification patterns to include sequence numbers (4-element format) - Fix MOVING notification parsing to handle new address format with sequence and time - Update MIGRATING, MIGRATED, FAILING_OVER, and FAILED_OVER patterns with sequence numbers - Improve FaultInjectionClient status handling: change from 'pending' to 'running' checks - Enhance JSON response parsing with better output field handling and debugging - Remove deprecated maintenance sequence functionality and associated unit test - Add test phase isolation to prevent cleanup notification interference - Extend monitoring timeout from 2 to 5 minutes for longer maintenance operations - Add @AfterEach cleanup to restore cluster state between tests - Remove hardcoded optimal node selection logic in RedisEnterpriseConfig This aligns with the updated Redis Enterprise maintenance events specification and improves test reliability by handling the new notification protocol format.
…raised during review
…and-connection-testing-of-maint-events-2 Ci fix functional handoff and connection testing of maint events 2
…ow, also implement more fixes and improvements
…fixes from review, started removin comments
…nctionally correct, enable logging for testing
|
Hi, I’m Jit, a friendly security platform designed to help developers build secure applications from day zero with an MVS (Minimal viable security) mindset. In case there are security findings, they will be communicated to you as a comment inside the PR. Hope you’ll enjoy using Jit. Questions? Comments? Want to learn more? Get in touch with us. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR adds comprehensive functional testing for Redis Enterprise maintenance event handling in Lettuce, with a focus on connection lifecycle management, notification validation, and timeout relaxation. The test suite validates proper client behavior during database migrations and failover scenarios.
Key changes:
- New comprehensive test class
MaintenanceNotificationConnectionTestfor testing all 5 maintenance notification types with connection management - Enhanced timeout configuration tests with 4-phase validation and optimized execution
- Complete removal of legacy test classes and their consolidation into improved integration tests
Reviewed Changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| src/test/resources/log4j2-test.xml | Enhanced logging configuration for maintenance-aware components |
| src/test/java/io/lettuce/test/ConnectionTestUtil.java | Added MaintenanceAwareExpiryWriter support and connection verification utilities |
| src/test/java/io/lettuce/scenario/RelaxedTimeoutConfigurationTest.java | Comprehensive 4-phase timeout testing with optimized execution and connection handoff |
| src/test/java/io/lettuce/scenario/RedisEnterpriseConfig.java | Dynamic cluster configuration with 6-node support and endpoint-aware operations |
| src/test/java/io/lettuce/scenario/MaintenancePushNotificationMonitor.java | Simplified monitoring without periodic pings and correct notification format parsing |
| src/test/java/io/lettuce/scenario/MaintenanceNotificationTest.java | Removed - functionality migrated to MaintenanceNotificationConnectionTest |
| src/test/java/io/lettuce/scenario/MaintenanceNotificationConnectionTest.java | New comprehensive test suite for all maintenance scenarios with connection lifecycle validation |
| src/test/java/io/lettuce/scenario/FaultInjectionClientUnitTest.java | Removed - consolidated into integration tests |
| src/test/java/io/lettuce/scenario/FaultInjectionClient.java | Enhanced with endpoint-aware operations and improved status checking |
| src/test/java/io/lettuce/scenario/ConnectionEventBusMonitoringUtil.java | New utility for monitoring connection state transitions via EventBus |
Comments suppressed due to low confidence (1)
src/test/java/io/lettuce/scenario/MaintenanceNotificationConnectionTest.java:1
- Using fixed
Thread.sleep()delays in tests can lead to flaky behavior and unnecessarily slow test execution. Consider using condition-based waiting or shorter, more appropriate delays for these specific scenarios.
package io.lettuce.scenario;
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
| log.info("Testing unrelaxed timeouts after MIGRATED notification..."); | ||
| log.info("Waiting for grace period ({}ms) to allow timeout un-relaxation to take effect...", | ||
| 2 * RELAXED_TIMEOUT_ADDITION.toMillis()); | ||
| Thread.sleep(2 * RELAXED_TIMEOUT_ADDITION.toMillis()); // Wait for grace period + 100ms buffer |
Copilot
AI
Oct 1, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using Thread.sleep() for timing in tests can lead to flaky behavior. Consider using Awaitility or similar polling mechanisms to wait for the actual condition (timeout un-relaxation) rather than relying on fixed sleep durations.
| log.info("Waiting for grace period ({}ms) to allow timeout un-relaxation to take effect...", | ||
| 2 * RELAXED_TIMEOUT_ADDITION.toMillis()); | ||
| Thread.sleep(2 * RELAXED_TIMEOUT_ADDITION.toMillis()); // Wait for grace period + 100ms buffer | ||
| log.info("Grace period expired, starting traffic to verify unrelaxed timeouts..."); |
Copilot
AI
Oct 1, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another instance of Thread.sleep() that could make tests flaky. This should be replaced with a condition-based wait mechanism to ensure the timeout un-relaxation has actually taken effect before proceeding with verification.
| log.info("Waiting for grace period ({}ms) to allow timeout un-relaxation to take effect...", | |
| 2 * RELAXED_TIMEOUT_ADDITION.toMillis()); | |
| Thread.sleep(2 * RELAXED_TIMEOUT_ADDITION.toMillis()); // Wait for grace period + 100ms buffer | |
| log.info("Grace period expired, starting traffic to verify unrelaxed timeouts..."); | |
| log.info("Waiting for timeout un-relaxation to take effect (up to {}ms)...", | |
| 2 * RELAXED_TIMEOUT_ADDITION.toMillis()); | |
| waitForTimeoutUnrelaxed(context, 2 * RELAXED_TIMEOUT_ADDITION.toMillis()); | |
| log.info("Timeout un-relaxation detected, starting traffic to verify unrelaxed timeouts..."); |
| } | ||
|
|
||
| // Throttle traffic to ~1000 ops/sec to avoid memory pressure | ||
| Thread.sleep(1); |
Copilot
AI
Oct 1, 2025
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using Thread.sleep(1) in a tight loop for traffic throttling is inefficient and imprecise. Consider using a proper rate limiter or a more accurate timing mechanism for controlling traffic generation at ~1000 ops/sec.
tishun
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, small notes only
| import io.netty.util.internal.logging.InternalLogger; | ||
| import io.netty.util.internal.logging.InternalLoggerFactory; | ||
|
|
||
| public class ConnectionEventBusMonitoringUtil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's avoid the "Util" pattern. Utility classes quickly become swiss-army-knives and violate the single responsibility principle.
Could we just name it ConnectionEventBusMonitor or something similar?
| } catch (Exception e) { | ||
| if (event instanceof ConnectedEvent) { | ||
| return "connected-" + ((ConnectedEvent) event).remoteAddress().toString(); | ||
| } else if (event instanceof DisconnectedEvent) { | ||
| return "disconnected-" + ((DisconnectedEvent) event).remoteAddress().toString(); | ||
| } else { | ||
| return event.getClass().getSimpleName() + "-" + System.currentTimeMillis(); | ||
| } | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of if-else and instanceof can we use multiple catch statements?
| expectedEndpoint, cleanCurrentEndpoint).isTrue(); | ||
| } | ||
| } else { | ||
| log.warn("⚠ Could not verify endpoint - currentRemoteAddress: {}, expectedEndpoint: {}", currentRemoteAddress, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this fail the test?
| log.info("Second connection created and monitoring setup completed"); | ||
|
|
||
| } catch (Exception e) { | ||
| log.error("Failed to create second connection: {}", e.getMessage(), e); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same question here - should the test fail?
| /** | ||
| * Configuration holder for dynamically discovered Redis Enterprise cluster information. | ||
| */ | ||
| public class RedisEnterpriseConfig { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RedisEnterpriseConfig - a "Config" object is typically a POJO that has some configuration details in it. In contrast this class seems to be configuring the RS
So I'd put some other end to the name like "Configurer" or idk
|
I will take care of Tisho's comments on the topic in next PR. |
Summary
This PR adds comprehensive functional testing for Redis Enterprise maintenance event handling in Lettuce, including connection handoff, timeout relaxation, and notification protocol validation. The test suite validates proper client behavior during database migrations and failover scenarios.
Key Changes
New Test Infrastructure
MaintenanceNotificationConnectionTestConnectionEventBusMonitoringUtilEnhanced Tests
RelaxedTimeoutConfigurationTestConnectionTestUtilRedisEnterpriseConfigRemoved
MaintenanceNotificationTest- All functionality migrated toMaintenanceNotificationConnectionTestwith improved test structureFaultInjectionClientUnitTest- Consolidated into integration testsTest Coverage
The test suite validates:
Performance Improvements
Testing
All tests pass successfully with proper logging enabled for debugging purposes.