HDFS-15417. RBF: Get the datanode report from cache for federation WebHDFS operations #2080
Conversation
Thanks @NickyYe for the very useful fix.
      LocatedBlock location0 = locations.get(0);
      return bestNode(location0.getLocations(), excludes);
    }
  }
}

if (dns == null) {
nit: this is always true.
Thanks. I now follow the previous logic with the cached DN report.
if (dns == null) {
  dns = getDatanodeReport(router);
}

return getRandomDatanode(dns, excludes);
In this case, excludes is always empty. We need to compute it using the dns obtained.
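A sketch of what computing the exclude set from the cached report could look like, loosely modeled on the equivalent logic in NamenodeWebHdfsMethods; the excludeDatanodes parameter name and the transfer-address matching are assumptions here, not the committed code:

```java
import java.util.Collection;
import java.util.HashSet;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
import org.apache.hadoop.net.Node;
import org.apache.hadoop.util.StringUtils;

// Sketch: derive excludes from the cached DN report instead of passing
// an always-empty set; entries are matched against the transfer address.
private static HashSet<Node> extractExcludes(
    DatanodeInfo[] dns, String excludeDatanodes) {
  HashSet<Node> excludes = new HashSet<>();
  if (excludeDatanodes != null) {
    Collection<String> toExclude =
        StringUtils.getTrimmedStringCollection(excludeDatanodes);
    for (DatanodeInfo dn : dns) {
      if (toExclude.contains(dn.getXferAddr())) {
        excludes.add(dn);
      }
    }
  }
  return excludes;
}
```

The caller would then pass extractExcludes(dns, excludeDatanodes) into getRandomDatanode instead of the empty set.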
/**
 * Get the datanode report from all namespaces that are registered
 * and active in the federation.
 * @param router Router from which to get the report.
nit: leave a blank line before the @param line.
private static DatanodeInfo[] getDatanodeReport(
    final Router router) throws IOException {
  // We need to get the DNs as a privileged user
  final RouterRpcServer rpcServer = getRPCServer(router);
Should we wrap these in a try..catch block? And what is the return value if rpcServer.getDatanodeReport fails? null?
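For illustration, one hedged way to address both questions — run the fetch as the privileged login user and surface failure as an IOException rather than returning null — using the standard Hadoop UserGroupInformation.doAs; a sketch under those assumptions, not necessarily the committed code:

```java
import java.io.IOException;
import java.security.PrivilegedExceptionAction;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
import org.apache.hadoop.hdfs.protocol.HdfsConstants.DatanodeReportType;
import org.apache.hadoop.security.UserGroupInformation;

private static DatanodeInfo[] getDatanodeReport(
    final Router router) throws IOException {
  final RouterRpcServer rpcServer = getRPCServer(router);
  // Fetch the report as the (privileged) login user, not the remote
  // caller; propagate failures as IOException instead of returning null.
  final UserGroupInformation loginUser = UserGroupInformation.getLoginUser();
  try {
    return loginUser.doAs(
        (PrivilegedExceptionAction<DatanodeInfo[]>) () ->
            rpcServer.getCachedDatanodeReport(DatanodeReportType.LIVE));
  } catch (InterruptedException e) {
    Thread.currentThread().interrupt();
    throw new IOException("Interrupted getting datanode report", e);
  }
}
```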
@@ -18,6 +18,8 @@
 package org.apache.hadoop.hdfs.server.federation.router;

+import static org.apache.hadoop.fs.CommonConfigurationKeysPublic.HADOOP_SECURITY_AUTHORIZATION;
+import static org.apache.hadoop.hdfs.server.federation.metrics.NamenodeBeanMetrics.DN_REPORT_CACHE_EXPIRE;
This will cause a compilation error. Also, I'm wondering whether it makes sense to move the DN cache logic from NamenodeBeanMetrics to here and have the former depend on this. That way we don't have to keep two copies of the cache.
Thanks for catching this. I was working on an old branch; it is fixed now. I prefer to move the NamenodeBeanMetrics logic in a separate change, to keep this change cohesive. The NamenodeBeanMetrics cached report needs a different API, or at least some extra work, since it needs a String.
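For context, the pattern under discussion is a Guava LoadingCache keyed by report type that refreshes itself. A minimal sketch of what a single shared cache in RouterRpcServer could look like; dnReportCacheExpireMs (presumably sourced from DN_REPORT_CACHE_EXPIRE) and getDatanodeReportImpl are illustrative names:

```java
import java.util.concurrent.TimeUnit;
import com.google.common.cache.CacheBuilder;
import com.google.common.cache.CacheLoader;
import com.google.common.cache.LoadingCache;
import org.apache.hadoop.hdfs.protocol.DatanodeInfo;
import org.apache.hadoop.hdfs.protocol.HdfsConstants.DatanodeReportType;

// Illustrative: one report cache in RouterRpcServer that both WebHDFS
// and NamenodeBeanMetrics could read, instead of two copies.
private final LoadingCache<DatanodeReportType, DatanodeInfo[]> dnCache =
    CacheBuilder.newBuilder()
        .refreshAfterWrite(dnReportCacheExpireMs, TimeUnit.MILLISECONDS)
        .build(new CacheLoader<DatanodeReportType, DatanodeInfo[]>() {
          @Override
          public DatanodeInfo[] load(DatanodeReportType type)
              throws Exception {
            // Synchronous fetch on a cache miss; an async reload()
            // override (discussed further down the thread) keeps
            // refreshes off the request path.
            return getDatanodeReportImpl(type);
          }
        });
```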
...-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterWebHdfsMethods.java
Thanks @NickyYe for the update. Yes we can separate the report cache consolidation as a follow-up. Could you create a JIRA for that? Thanks.
...-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
 * @return List of datanodes.
 * @throws IOException If it cannot get the report.
 */
public DatanodeInfo[] getCachedDatanodeReport(DatanodeReportType type)
nit: this can be package-private?
}

private DatanodeInfo[] getCachedDatanodeReportImpl(
    final DatanodeReportType type) throws IOException{
nit: missing space before the {.
...-hdfs-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterRpcServer.java
    final DatanodeReportType type, DatanodeInfo[] oldValue)
    throws Exception {
  ListenableFuture<DatanodeInfo[]> listenableFuture =
nit: the variable listenableFuture is redundant - you can just return from the submit call.
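A sketch of the suggested simplification, assuming the loader holds a Guava ListeningExecutorService field (here called executorService, an assumed name):

```java
@Override
public ListenableFuture<DatanodeInfo[]> reload(
    final DatanodeReportType type, DatanodeInfo[] oldValue)
    throws Exception {
  // Return the future from the submit call directly instead of
  // binding it to a local variable first.
  return executorService.submit(() -> load(type));
}
```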
Thank you for the comments. I've addressed all of them.
LGTM. @goiri @Hexiaoqiao do you also want to check this before I commit it?
Thanks @sunchao for involving me here. In my internal version I tried to turn off ...
Thanks @Hexiaoqiao. Similarly, in our case we also disabled this for router metrics but need WebHDFS on the router, relying on ...

@NickyYe thanks for addressing the checkstyle issue. Could you also add a unit test as well? I think we need one for ...
Any update @NickyYe? We are pretty close to getting this in with the UT.
Thanks @NickyYe for updating the PR.
assertArrayEquals(datanodeReport1, datanodeReport);

// Add one datanode
getCluster().getCluster().startDataNodes(getCluster().getCluster().getConfiguration(0),
I think we need to clear state after the test case so that it won't affect others like testNamenodeMetrics. Also, long line.
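A sketch of the kind of cleanup being asked for, assuming the extra datanode is the last one in the MiniDFSCluster; the method name, the @After hook, and the NUM_DNS constant are hypothetical:

```java
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.junit.After;

// Hypothetical cleanup: stop the datanode added by the test so later
// tests (e.g. testNamenodeMetrics) see the original cluster size again.
@After
public void restoreClusterSize() throws Exception {
  MiniDFSCluster dfsCluster = getCluster().getCluster();
  // NUM_DNS: the datanode count the suite started with (assumed name).
  while (dfsCluster.getDataNodes().size() > NUM_DNS) {
    dfsCluster.stopDataNode(dfsCluster.getDataNodes().size() - 1);
  }
  dfsCluster.waitActive();
}
```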
Done. Thanks. @sunchao
Merged. Thanks @NickyYe for the contribution and @Hexiaoqiao for helping with the review!
https://issues.apache.org/jira/browse/HDFS-15417
NOTICE
Please create an issue in ASF JIRA before opening a pull request, and set the pull request title to start with the corresponding JIRA issue number (e.g. HADOOP-XXXXX. Fix a typo in YYY.).
For more details, please see https://cwiki.apache.org/confluence/display/HADOOP/How+To+Contribute