-
Notifications
You must be signed in to change notification settings - Fork 8.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDFS-16479. EC: NameNode should not send a reconstruction work when the source datanodes are insufficient #4138
Conversation
…he source datanodes are insufficient
💔 -1 overall
This message was automatically generated. |
// skip if source datanodes for reconstructing ec block are not enough | ||
if (block.isStriped()) { | ||
BlockInfoStriped stripedBlock = (BlockInfoStriped) block; | ||
if (stripedBlock.getDataBlockNum() > srcNodes.length) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Had a very quick look.
Just thinking about a scenario with say RS-6-3-1024k, and we just write 1 mb, in that case the total number of blocks available will be 1 Datablock + 3 Parity. In that case BG itself will have total 4 Blocks. Will this code start returning null? Not sure if getRealDataBlockNum
helps here or not. If it is actually a problem
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ayushtkn Thanks for your review. You're right, it's a problem.
I updated the PR to calculate the real data block number. It is the same logic used in StripedReader. I also added one more unit test to cover the case.
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanx @tasanuma for the fix, the changes makes sense to me, dropped some minor comments.
int cellsNum = (int) ((stripedBlock.getNumBytes() - 1) / stripedBlock.getCellSize() + 1); | ||
int minRequiredSources = Math.min(cellsNum, stripedBlock.getDataBlockNum()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this logic same as BlockInfoStriped.getRealDataBlockNum() can we use or extract the logic from there? or do some refactoring there, just trying if we can keep the logic at one place, in case there is some issue in the logic changing at one places fixes all the places..
int minRequiredSources = Math.min(cellsNum, stripedBlock.getDataBlockNum()); | ||
if (minRequiredSources > srcNodes.length) { | ||
LOG.debug("Block {} cannot be reconstructed due to shortage of source datanodes ", block); | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we increment the metrics before returning null
NameNode.getNameNodeMetrics().incNumTimesReReplicationNotScheduled();
|
||
// striped blockInfo: 2 data blocks + 2 paritys | ||
Block aBlock = new Block(blockId, ecPolicy.getCellSize() * (ecPolicy.getNumDataUnits() - 1), 0); | ||
BlockInfoStriped aBlockInfoStriped = new BlockInfoStriped(aBlock, ecPolicy); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Can you use a better variable name, couldn't decode what does a
stands for, or drop a comment above.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I updated the variable name. I want to keep the comment to clarify the difference between testSkipReconstructionWithManyBusyNodes
and testSkipReconstructionWithManyBusyNodes2
.
ErasureCodingPolicy ecPolicy = | ||
SystemErasureCodingPolicies.getPolicies().get(1); | ||
|
||
// striped blockInfo: 2 data blocks + 2 paritys |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typo paritys
@ayushtkn Thanks for your reviews. I update the PR addressing your comments. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
@ayushtkn Thanks for your review! I'll merge it. |
…he source datanodes are insufficient (apache#4138)
Upstream JIRA ID: HDFS-16479. EC: NameNode should not send a reconstruction work when the source datanodes are insufficient (apache#4138) (cherry picked from commit 2efab92) Change-Id: Icadc749f06d350255ed9297eb7183db2dcfa08a5
@zhengchenyu Could you elaborate on the situation a bit more? In your example, do you mean you have only 9 storage units? |
@tasanuma Sorry for miss your comment. In the case of a 6+3 ec policy, if 4 blocks are unavailable due to busy, the size of srcNodes is 5. If one of these 5 blocks is in the decommissioning state, I think block copy for the decommissioning block should be triggered. However, this simple block copy cannot be triggered now. I create PR HDFS-17542. The test Case 1.7 show the problem I described. For current trunk, scheduleReconstruction will return null. But after this PR HDFS-17542, will return a work for copy. HDFS-17542 reorganized the code structure. Would you be interested in taking a look at HDFS-17542? |
Description of PR
NameNode should not send a reconstruction work when the source datanodes are insufficient.
Otherwise, DataNodes receive the order and throw the following exception.
How was this patch tested?
unit test
For code changes:
LICENSE
,LICENSE-binary
,NOTICE-binary
files?