-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HDFS-16369. RBF: Fix the retry logic of RouterRpcServer#invokeAtAvailableNs. #3745
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
💔 -1 overall
This message was automatically generated. |
@@ -668,14 +674,16 @@ public void testInvokeAtAvailableNs() throws IOException { | |||
// Make one subcluster unavailable. | |||
MiniDFSCluster dfsCluster = cluster.getCluster(); | |||
dfsCluster.shutdownNameNode(0); | |||
dfsCluster.shutdownNameNode(1); | |||
try { | |||
// Verify that #invokeAtAvailableNs works by calling #getServerDefaults. | |||
RemoteMethod method = new RemoteMethod("getServerDefaults"); | |||
FsServerDefaults serverDefaults = | |||
rpcServer.invokeAtAvailableNs(method, FsServerDefaults.class); | |||
assertNotNull(serverDefaults); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there something else we can assert? Any metrics?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought of using the NamenodeMetrics, but it doesn't track getServerDefaults. Second is RBFClientMetrics. It tracks all the invokes for getServerDefault, irrespective of success and it isn't per namespace as well. So, in case the invocation order by any chance changes, from ns0->ns1->n2 to ns0->ns2->ns1 in that case the test will become flaky.
In general without the fix, the getServerDefault call will fail only if 2 NS is down, so I thought if the call is successful we can conclude things work. Let me know if you have any ideas around what we can assert
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the flakiness is not ideal.
Let's go with this.
for (FederationNamespaceInfo ns : nss) { | ||
if (!nsId.equals(ns.getNameserviceId())) { | ||
namespaceInfos.add(ns); | ||
for (Iterator<FederationNamespaceInfo> it = nss.iterator(); it |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we are going to loop over, can we just do:
for (String nsId : nss)
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
🎊 +1 overall
This message was automatically generated. |
…ableNs. (apache#3745). Contributed by Ayush Saxena. Reviewed-by: litao <tomleescut@gmail.com> Reviewed-by: Inigo Goiri <inigoiri@apache.org>
Description of PR
Fixes the retry the logic of invokeAtAvailableNs (RBF
How was this patch tested?
Through UT
For code changes: