-
Notifications
You must be signed in to change notification settings - Fork 3.4k
HBASE-25635 CandidateGenerator may miss some region balance actions #3024
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
virajjasani
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can avoid introducing this config, it will be great
| if (getConf() == null || getConf().getBoolean( | ||
| HBASE_MASTER_BALANCER_RANDOM_PICK_REGIONS_WITH_CONSIDER_COUNT_BALANCE, true)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to avoid this logic to resolve test failures? Maybe we can make changes in test cases to accommodate this logic rather and avoid using this config
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @virajjasani . Yes, I can change the test to make the case pass.
I added this config because I think the random pick in default candidate with just consider region count may be unexpected. There are cost functions in balancer, which means we can change the cost of the functions to get a balance of region read/write loads, such as ReadRequestCostFunction. When these costs are considered, the region count between regionservers may be not fair like the test case, but we need the count skew to make read/write loads fairly.
| */ | ||
| @InterfaceAudience.Private | ||
| abstract class CandidateGenerator { | ||
| abstract class CandidateGenerator extends BaseConfigurable { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we directly make changes in test, we can avoid this
31d8a86 to
61b0d7c
Compare
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
virajjasani
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left few comments, almost there
hbase-server/src/main/java/org/apache/hadoop/hbase/master/balancer/StochasticLoadBalancer.java
Show resolved
Hide resolved
|
|
||
| private static final Logger LOG = LoggerFactory.getLogger(SecureTestUtil.class); | ||
| private static final int WAIT_TIME = 10000; | ||
| private static final int WAIT_TIME = 30000; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this required?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I enlarge the timeout because TestRSGroupsWithACL failed when grant permission timeout, https://ci-hadoop.apache.org/job/HBase/job/HBase-PreCommit-GitHub-PR/job/PR-3024/1/testReport/
It is a small problem, I'll remove this change.
|
@enis if you can also take a look, that would be really great. |
fbe3446 to
e036d92
Compare
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
💔 -1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
|
🎊 +1 overall
This message was automatically generated. |
…3024) Signed-off-by: Viraj Jasani <vjasani@apache.org> Signed-off-by: Duo Zhang <zhangduo@apache.org>
…3024) Signed-off-by: Viraj Jasani <vjasani@apache.org> Signed-off-by: Duo Zhang <zhangduo@apache.org>
…3024) Signed-off-by: Viraj Jasani <vjasani@apache.org> Signed-off-by: Duo Zhang <zhangduo@apache.org>
No description provided.