-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-22964 Fix flaky TestClusterRestartFailover and TestClusterResta… #574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
💔 -1 overall
This message was automatically generated. |
We need to know what is the purpose of the test. The current fix looks like we just give up testing if some conditions are not match... |
It was added by HBASE-21565: Delete dead server from dead server list too early leads to concurrent Server Crash Procedures(SCP) for a same server. So the test purpose is assertFalse(UTIL.getHBaseCluster().getMaster().getServerManager().expireServer(testServer)). It should not submit SCP again if there arleady had a SCP for this server. |
Updated the patch. Will test the submit SCP again when serverNode is null. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still not fully understand this UT here. I guess we need to find a way to make sure that the serverNode is not null?
LOG.info("start to find the procedure of SCP for the severName we choose"); | ||
Procedure<?> procedure = UTIL.getHBaseCluster().getMaster().getProcedures().stream().filter( | ||
p -> (p instanceof ServerCrashProcedure) && | ||
((ServerCrashProcedure) p).getServerName().equals(testServer)).findAny().get(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Optional.get will never return null, it will throw a NoSuchElementException...
Checked HBASE-21565. I thought this ut is added to test should submit SCP for a same server twice. Need to mock SCP to stuck it if make sure that serverNode is not null. |
Yes, this is possible. You can see TestSCPGetRegionsRace, where we override AssignmentManager and hang the SCP when it calls getRegionsOnServer. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1.
💔 -1 overall
This message was automatically generated. |
…rtFailoverSplitWithoutZk
…rtFailoverSplitWithoutZk (#574) Signed-off-by: Duo Zhang <zhangduo@apache.org>
…rtFailoverSplitWithoutZk (#574) Signed-off-by: Duo Zhang <zhangduo@apache.org>
…rtFailoverSplitWithoutZk (apache#574) Signed-off-by: Duo Zhang <zhangduo@apache.org>
…rtFailoverSplitWithoutZk (apache#574) Signed-off-by: Duo Zhang <zhangduo@apache.org> (cherry picked from commit 5e98129) Change-Id: Ib1f0549c24d2b81559e33bc22d4096b246a1e37a
…rtFailoverSplitWithoutZk