HADOOP-16769. LocalDirAllocator to provide diagnostics when file creation fails #1768
base: trunk
Conversation
…n file creation is not successful
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/LocalDirAllocator.java
💔 -1 overall
This message was automatically generated.
Now, there's another failure mode: no write access to any of the specified dirs.
If that's the situation, we should be able to identify and report it as well. Indeed, now that we are being helpful about disk capacity, we probably need to differentiate the other failure mode to avoid confusion. Otherwise people who encounter permissions problems will be misled into thinking it's a disk capacity issue.
Not sure of the best approach here. We would probably need to keep the DiskErrorException from createPath() and use that as the inner cause of the new failure. That is: move the catch() clause up to the caller, where the exception can be cached for later use.
This is probably broadly useful, as there are other failure modes which are also worth reporting.
Goal: there's enough information in the strings and stack traces to identify the root cause without having to chase into the logs of individual machines.
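The nesting suggested above can be sketched as follows. This is only an illustration of the review suggestion, not the actual LocalDirAllocator code: the class and method names below are hypothetical stand-ins, and Hadoop's real DiskErrorException lives in org.apache.hadoop.util.DiskChecker.

```java
import java.io.IOException;

public class NestedDiagnosticsSketch {

    /** Hypothetical stand-in for Hadoop's DiskErrorException. */
    static class DiskErrorException extends IOException {
        DiskErrorException(String msg) { super(msg); }
        DiskErrorException(String msg, Throwable cause) { super(msg, cause); }
    }

    /** Simulates createPath() failing for one directory. */
    static void createPath(String dir) throws DiskErrorException {
        throw new DiskErrorException("Cannot create directory: " + dir);
    }

    /**
     * Tries each configured directory; caches the last per-directory
     * failure and nests it as the cause of the summary exception, so the
     * stack trace shows both the allocation failure and the underlying
     * permission/creation error.
     */
    static String getLocalPathForWrite(String[] dirs) throws DiskErrorException {
        DiskErrorException cached = null;
        for (String dir : dirs) {
            try {
                createPath(dir);
                return dir;
            } catch (DiskErrorException e) {
                cached = e;   // cache for later use, per the review comment
            }
        }
        throw new DiskErrorException(
            "Could not find any valid local directory out of "
                + dirs.length + " configured", cached);
    }
}
```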
String dir0 = buildBufferDir(ROOT, 0);
String dir1 = buildBufferDir(ROOT, 1);
conf.set(CONTEXT, dir0 + "," + dir1);
try {
use LambdaTestUtils.intercept, which will do the catch, the reporting on failure, and the error message checks
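Hadoop's org.apache.hadoop.test.LambdaTestUtils.intercept wraps the try/fail/catch/check-message boilerplate into one call. As a rough, self-contained sketch of what such a helper does (this is a simplified reimplementation for illustration, not Hadoop's actual code):

```java
import java.util.concurrent.Callable;

public class InterceptSketch {

    /**
     * Minimal stand-in for LambdaTestUtils.intercept(): runs the callable,
     * fails if no exception is thrown, fails if the exception is of the
     * wrong class, and checks the message contains the expected text.
     */
    static <E extends Throwable> E intercept(
            Class<E> clazz, String contained, Callable<?> eval) {
        Throwable caught = null;
        try {
            eval.call();
        } catch (Throwable t) {
            caught = t;
        }
        if (caught == null) {
            throw new AssertionError(
                "Expected " + clazz.getName() + " but nothing was thrown");
        }
        if (!clazz.isInstance(caught)) {
            throw new AssertionError("Wrong exception type: " + caught, caught);
        }
        if (contained != null && (caught.getMessage() == null
                || !caught.getMessage().contains(contained))) {
            throw new AssertionError(
                "Message lacks '" + contained + "': " + caught, caught);
        }
        return clazz.cast(caught);
    }
}
```

In the test under review, the try/fail/catch around getLocalPathForWrite() would collapse into a single intercept(DiskErrorException.class, expectedText, () -> ...) call.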
@@ -532,4 +532,23 @@ public void testGetLocalPathForWriteForInvalidPaths() throws Exception {
  }
}

/**
 * Test to check the LocalDirAllocation for the less space HADOOP-16769
add trailing "." for javadocs
String dir1 = buildBufferDir(ROOT, 1);
conf.set(CONTEXT, dir0 + "," + dir1);
try {
  dirAllocator.getLocalPathForWrite("p1/x", 3_000_000_000_000L, conf);
what's going to happen on a disk with >3TB of capacity? Should we go for a bigger number?
Sorry, I missed updating the pull request. I have a new code change that uses Long.MAX_VALUE instead of this hardcoded number, and then uses a regex to match the error message. I will create a new pull request.
I will also address the change above, catching and throwing the two errors so they are nested into one another, as part of the new pull request. Thanks!
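Requesting Long.MAX_VALUE guarantees the allocation fails regardless of disk size, and a regex keeps the assertion independent of the actual capacity reported. A minimal sketch of such a regex check; the diagnostic wording below is hypothetical, and the exact message in the final patch may differ:

```java
import java.util.regex.Pattern;

public class RegexMatchSketch {

    // Hypothetical diagnostic text of the kind the patch adds; the exact
    // wording in the final LocalDirAllocator change may differ. \d+ stands
    // in for the requested size and capacity, so the test does not depend
    // on the machine's actual disk size.
    static final Pattern OUT_OF_SPACE = Pattern.compile(
        "Could not find any valid local directory for .+ with requested size "
            + "\\d+ as the max capacity in any directory is \\d+");

    /** Returns true when a diagnostic message matches the expected shape. */
    static boolean matchesDiagnostics(String message) {
        return OUT_OF_SPACE.matcher(message).find();
    }
}
```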
💔 -1 overall
This message was automatically generated.