-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-28065 Corrupt HFile data is mishandled in several cases (branch-2.5) #5384
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HBASE-28065 Corrupt HFile data is mishandled in several cases (branch-2.5) #5384
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
Show resolved
Hide resolved
4e60a26
to
0457e83
Compare
Added a new unit test that relies on a LocalFileSystem to mess with the hfile under the covers. |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
Show resolved
Hide resolved
HFile.Reader reader = null; | ||
HFileBlock.BlockIterator iter = null; | ||
try { | ||
reader = HFile.createReader(hfs, hfsPath, CacheConfig.DISABLED, true, conf); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like there's a couple ways to read through an HFile -- this is roughly how the HFilePrettyPrinter
does it. Is this how we want to do it in this test?
} | ||
|
||
/** | ||
* Enables writing and rewriting portions of the file backing an {@link HFileBlock}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This implementation assumes a LocalFileSystem
. I haven't yet looked at how to corrupt a header underneath a data node.
} | ||
|
||
@Test | ||
public void testBlockMagicCorruptionFirstBlock() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First and second block access patterns are slightly different because the caller has slightly more block location data due to the next-header over-read. Thus they have different failure modes ; thus they're tested in two different test methods.
} | ||
|
||
@Test | ||
public void testUncompressedSizeWithoutHeaderCorruption() throws Exception { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I stubbed out test methods for all the Header fields, but, if we're careful, we don't need to test all of them because only a couple fields are read before the header checksum has been validated. Once the dust settles, I'll comment the test class with notes about which headers are tested and why, dropping the rest.
byte[] family = Bytes.toBytes("f"); | ||
try (HFile.Writer writer = factory.create()) { | ||
for (int i = 0; i < 40; i++) { | ||
byte[] row = RandomKeyValueUtil.randomOrderedFixedLengthKey(rand, i, 100); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All this could be encapsulated in a RandomCellGenerator kind of class.
0457e83
to
e68f2da
Compare
Looks like |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To the extent that I understand the issues here, this seems good to me.
202af84
to
d0b6988
Compare
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great! I have a few small comments/questions just to dot all of our i's
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
Outdated
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
Show resolved
Hide resolved
hbase-server/src/main/java/org/apache/hadoop/hbase/io/hfile/HFileBlock.java
Outdated
Show resolved
Hide resolved
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Checkstyle warning is,
It's counting the entire body of the method, which is comprised more of line comments than code, in that tally. |
The test failed due to an IllegalArgumentException that had no message? I wish for an ExceptionMatcher class that prints the exception's stacktrace in the failure description.
|
This one is identical, I just missed converting the asserts to use matchers -- a null for the IllegalArgumentException exception message.
|
Hmm the test run on #5398 failed identically. |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
9254e9d
to
2985bfa
Compare
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
I think that the test failure is from a difference in JDK implementation -- Jenkins is running Eclipse Adoptium-11.0.17+8 while I'm testing on Eclipse Adoptium-11.0.18+10. Both versions throw an |
2985bfa
to
cb061b0
Compare
For reference, the two stack traces look like,
vs.
|
Wait, the failing stack trace doesn't include module information in the JDK path entires. Was that faliing test run on JDK8? |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good find on the jvm test issue. LGTM
Also of note, we've deployed this on about 90% of our fleet
Failures on java8 look unrelated. The new tests were run and passed on both JVM environments. |
* when no block size is provided and there's not a preread headerBuf, treat the value with caution. * verify HBase checksums before making use of the block header. * inline verifyOnDiskSizeMatchesHeader to keep throw/return logic in the method body. * separate validation of onDiskSizeWithHeader as input parameter from as read from block header * simplify branching around fetching and populating onDiskSizeWithHeader. * inline retrieving nextOnDiskBlockSize ; add basic validation. * whenever a read is determined to be corrupt and fallback to HDFS checksum is necessary, also invalidate the cached value of headerBuf. * build out a test suite covering various forms of block header corruption, for blocks in first and second positions. Signed-off-by: Bryan Beaudreault <bbeaudreault@apache.org>
cb061b0
to
aace0a4
Compare
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
Uh oh!
There was an error while loading. Please reload this page.