HBASE-29724: Backport "HBASE-25334: TestRSGroupsFallback.testFallback is flaky" into branch-2 and branch-2.6 #7472
Pull Request Overview
This PR backports HBASE-25334 from the master branch to fix a flaky test TestRSGroupsFallback.testFallback in branch-2 and branch-2.6. The fix addresses race conditions and improves test reliability by refining procedure state checking and removing unnecessary explicit waits.
- Enhances `ServerManager.areDeadServersInProgress()` to only count unfinished ServerCrashProcedures
- Improves `testCrashProcedureReplay` to properly test procedure replay after a simulated master crash
- Simplifies `TestRSGroupsFallback.testFallback` by removing flaky wait logic for region server registration
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| hbase-server/src/main/java/org/apache/hadoop/hbase/master/ServerManager.java | Adds !p.isFinished() check to areDeadServersInProgress() to filter out completed procedures |
| hbase-server/src/test/java/org/apache/hadoop/hbase/master/TestDeadServer.java | Enhances procedure replay test with proper stop/restart sequence and completion verification |
| hbase-rsgroup/src/test/java/org/apache/hadoop/hbase/rsgroup/TestRSGroupsFallback.java | Removes explicit wait for default group server recognition and improves code clarity |
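The `ServerManager.java` change in the table can be illustrated with a minimal, self-contained sketch. It is not the HBase code: `Proc`, `CrashProc`, and `OtherProc` are hypothetical stand-ins for the master's procedure types, and the method takes a plain list rather than querying the master's procedure executor. It only shows the idea of treating a dead server as "in progress" when an unfinished crash procedure exists.

```java
import java.util.List;

public class DeadServersInProgressSketch {
    /** Hypothetical stand-in for a master procedure; not the HBase Procedure class. */
    public interface Proc {
        boolean isFinished();
    }

    /** Hypothetical stand-in for a ServerCrashProcedure (SCP). */
    public static final class CrashProc implements Proc {
        private final boolean finished;

        public CrashProc(boolean finished) {
            this.finished = finished;
        }

        @Override
        public boolean isFinished() {
            return finished;
        }
    }

    /** Hypothetical unrelated procedure type; the check ignores it. */
    public static final class OtherProc implements Proc {
        @Override
        public boolean isFinished() {
            return false;
        }
    }

    /**
     * Returns true only if at least one crash procedure has not finished.
     * Without the !p.isFinished() condition, a completed crash procedure
     * that is still listed would keep this returning true.
     */
    public static boolean areDeadServersInProgress(List<Proc> procedures) {
        for (Proc p : procedures) {
            if (p instanceof CrashProc && !p.isFinished()) {
                return true;
            }
        }
        return false;
    }
}
```

With this shape, a list holding only finished crash procedures (or only unrelated procedures) reports no dead servers in progress, which is the behavior a test would wait on before proceeding.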
taklwu left a comment
+1. HBASE-25334 was previously ported only partially; this backport, together with #7463, completes the patch by using SCP as the source of the dead-servers-in-progress list.
TestReplicationSource.testAbortTrueOnError should not be related to this change; let me retrigger another run to see whether it still fails.
🎊 +1 overall
This message was automatically generated.
… is flaky" into branch-2 and branch-2.6 (apache#7472) Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
Adding one more note, as requested in #7476: the change introduced in HBASE-25334 (commit 32c4432 on GitHub) has diverged between the master and branch-2 branches. This is a follow-up change after HBASE-29720 and contains the complete functional changes of HBASE-25334, which uses SCP as the source of dead servers in progress.
…allback.testFallback is flaky" into branch-2 and branch-2.6 (apache#7472) apache#7476 The change in HBASE-25334 (introduced by commit 32c4432) has diverged between the master and branch-2 branches. This is a following up changes after HBASE-29720 and have the complete functional changes of HBASE-25334 that uses SCP as the source of dead servers in progress. Signed-off-by: Tak Lon (Stephen) Wu <taklwu@apache.org>
…allback.testFallback is flaky" into branch-2 and branch-2.6 (#7472) #7476 (#7476) The change in HBASE-25334 (introduced by commit 32c4432 in branch-2 and branch-2.6) has diverged between the master (#2728) and branch-2 branches. This is a following up changes after HBASE-29720 and have the complete functional changes of HBASE-25334 that uses SCP as the source of dead servers in progress. Signed-off-by: Istvan Toth <stoty@apache.org> Co-authored-by: Kevin Geiszler <kevin.j.geiszler@gmail.com>
Original Jira: https://issues.apache.org/jira/browse/HBASE-25334
Backport Jira task: https://issues.apache.org/jira/browse/HBASE-29724
This pull request is a cherry-pick from PR #2728.
Note
In this branch, the `TestRSGroupsFallback.java` file is currently part of the `hbase-rsgroup` module. However, in the master branch, this file is part of `hbase-server`.