-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HDFS-15052. WebHDFS getTrashRoot leads to OOM due to FileSystem objec… #1758
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
The findbugs warning was addressed by HDFS-15048. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall looks good to me except for a very tiny comment.
trashRoot = new org.apache.hadoop.fs.Path( | ||
new org.apache.hadoop.fs.Path(USER_HOME_PREFIX, user), TRASH_PREFIX); | ||
} | ||
return trashRoot.toUri().getPath(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change assumes namenode's default fs is a DistributedFileSystem, which makes sense.
The change basically copies the implementation inside DistributedFileSystem#getTrashRoot(). If that method evolves in the future, the same changes should be applied in here too.
Can we add a comment inside DistributedFileSystem#getTrashRoot(), that any change should be added in NamenodeWebHdfsMethods#getTrashRoot() too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the comment. I moved the common logic into DFSUtilClient in the last commit. While added TestWebHDFS#testGetEZTrashRoot
could detect the future mismatch, it is not in the same subproject of DistributedFileSystem (hadoop-hdfs-client).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Discussed with @daryn-sharp on this. New features in the future might require additional logic in determining trash root. It is better to be done at the server-side. So, the Ideal solution would be to add a getTrashRoot()
namenode RPC method. DistributedFileSystem
would then have a fallback logic for compatibility with older servers. NamenodeWebHdfsMethods
would simply call this rpc method.
Also, use of Path
on NN is highly discouraged. Path is super-expensive and performs normalization that no other operation performs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From the the current implementation, we need to fix the memory leak and the fact it does not work with security enabled, and probably backport that across the active branches. The memory leak especially is a big problem.
Would it make sense to adopt this patch as it stands and push it to all the branches and then open another Jira to implement the namenode RPC suggestion on trunk as a new feature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am fine with an interim solution to fix the immediate issues. But use of Path
in NN is far from ideal. It will be best if this is addressed now than later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@kihwal Thanks for the comment. I updated the code based on your suggestion.
💔 -1 overall
This message was automatically generated. |
Failed unit tests are not related to trash logic and passed on my local. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm. EZ root of "/" is the only special case we need to handle, as the path will be absolute path. So the logic seems fine.
💔 -1 overall
This message was automatically generated. |
The unit test failure is unrelated and caused by "BindException: Address already in use" |
Thanks. I merged this. trying to backport to relevant branches. |
…t creation. (#1758) (cherry picked from commit 2338d25) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java (cherry picked from commit 610805e) Conflicts: hadoop-hdfs-project/hadoop-hdfs-client/src/main/java/org/apache/hadoop/hdfs/DFSUtilClient.java hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/web/resources/NamenodeWebHdfsMethods.java hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java
…t creation. (apache#1758) (cherry picked from commit 2338d25) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java (cherry picked from commit 610805e)
…t creation. (apache#1758) (cherry picked from commit 2338d25) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java (cherry picked from commit 610805e)
…t creation. (apache#1758) (cherry picked from commit 2338d25) Conflicts: hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/web/TestWebHDFS.java (cherry picked from commit 610805e) Change-Id: I1ae0e46a18f4d367ee409db63ee98f0c9baf7720
No description provided.