HDFS-15904 : De-flake TestBalancer#testBalancerWithSortTopNodes() #2785
Conversation
Force-pushed from efb276c to 1cb6258
💔 -1 overall
This message was automatically generated.
Please update the PR title from testBalancerServiceOnError() to testBalancerWithSortTopNodes(). I think the JIRA has the right one.
@@ -2297,7 +2297,9 @@ public void testBalancerWithSortTopNodes() throws Exception {
      maxUsage = Math.max(maxUsage, datanodeReport[i].getDfsUsed());
    }

    assertEquals(200, balancerResult.bytesAlreadyMoved);
    // Either 2 blocks of 100+100 bytes or 3 blocks of 100+100+50 bytes
Could you add some explanation of why this happens?
The 95%-usage DN will have 9 blocks of 100 bytes and 1 block of 50 bytes, all belonging to the same file. The HDFS balancer chooses a block to move from this node randomly; more likely it will be a 100B block. Since 100B is greater than DFS_BALANCER_MAX_SIZE_TO_MOVE_KEY, which is 99L (see the settings above), the balancer stops there. The total moved from this 95% DN will be 1 block, hence 100B.
However, there is a chance that the first block chosen from this 95% DN is the 50B block. After that block is moved, the total moved size (50B) is still smaller than DFS_BALANCER_MAX_SIZE_TO_MOVE_KEY, so the balancer tries to move another block. The second block will always be 100 bytes, so the total moved from this 95% DN will be 2 blocks, hence 150B (100B + 50B).
Please reword or rephrase this as comment before this assertion so readers can have more context without thinking too much again.
Thanks,
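The two cases described in the comment above can be sketched with a small, hypothetical simulation (this is not the actual Balancer code; the class name BalancerMoveSketch and the method bytesMoved are made up for illustration). It assumes the source DN's blocks are picked in some order and that moving stops once the total scheduled bytes reach the 99L max-size-to-move threshold:

```java
// Hypothetical sketch of why the 95%-usage DN contributes either 100B or 150B,
// assuming it holds 9 x 100B blocks plus one 50B block and the configured
// DFS_BALANCER_MAX_SIZE_TO_MOVE_KEY value is 99L (as in the test settings).
public class BalancerMoveSketch {
    static final long MAX_SIZE_TO_MOVE = 99L;

    // Moves blocks in the given pick order until the total moved so far
    // has reached the max-size-to-move threshold, then stops.
    static long bytesMoved(long... blockSizesInPickOrder) {
        long moved = 0;
        for (long size : blockSizesInPickOrder) {
            if (moved >= MAX_SIZE_TO_MOVE) {
                break; // threshold already reached, stop scheduling moves
            }
            moved += size;
        }
        return moved;
    }

    public static void main(String[] args) {
        // Case 1: a 100B block is picked first; 100 >= 99, so only one block moves.
        System.out.println(bytesMoved(100, 100));
        // Case 2: the 50B block is picked first; 50 < 99, so a second (100B) block moves.
        System.out.println(bytesMoved(50, 100));
    }
}
```

This is why the assertion has to accept more than one total: the outcome depends on which block the balancer randomly picks first.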
Sure thing, let me provide a detailed comment as per your suggestion.
Thanks
Force-pushed from 1cb6258 to 5dc3756
Will commit after a good QA. Thanks
💔 -1 overall
This message was automatically generated.
The majority of the test failures are due to timeouts. Shall I trigger the build one more time?
Thanx @virajjasani for the fix, changes LGTM.
Have triggered the build again.
https://ci-hadoop.apache.org/job/hadoop-multibranch/job/PR-2785/5/
💔 -1 overall
This message was automatically generated.
The failing tests were either unrelated or caused by an out-of-memory error on the testing machine. We have so many intermittent failures in HDFS tests. Glad to see bug fixes like this. Thank you!
…ache#2785) Contributed by Viraj Jasani. Signed-off-by: Mingliang Liu <liuml07@apache.org> Signed-off-by: Ayush Saxena <ayushsaxena@apache.org>