YARN-11082 AbstractCSQueue#canAssignToThisQueue DRF should use node partition resource as denominator #4043
+5
−2
Description of PR
AbstractCSQueue#canAssignToThisQueue checks the current queue's usage against its limit. DRF uses the cluster resource as the denominator to decide which resource is dominant and then compares the ratios. However, if the resources of the cluster's nodes are not balanced, for example when some nodes have a larger proportion of memory or vcores, DRF will choose the wrong dominant resource.
For example, our cluster's total resource is <memory:175117312, vCores:40222>, a ratio of 1 vcore : 4.25 GB, while the ratio changes to 1 : 4.8 under node label x.
clusterResource = <memory:175117312, vCores:40222>
usedExceptKillable = <memory:3381248, vCores:687>
currentLimitResource = <memory:3420315, vCores:687>
currentLimitResource:
memory : 3381248/175117312 = 0.01930847362
vCores : 687/40222 = 0.01708020486
usedExceptKillable:
memory : 3384320/175117312 = 0.01932601615
vCores : 688/40222 = 0.01710506687
DRF will treat memory as the dominant resource and compare the memory ratios in this scenario.
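The share arithmetic above can be reproduced with a small standalone sketch. It is not the Hadoop `DominantResourceCalculator` code: the class `DominantShareSketch` and its methods are hypothetical names used only for illustration, and memory is assumed to be expressed in MB. The denominator is passed in explicitly, so the same helper could be called with a node partition's total instead of the cluster total.

```java
// Hypothetical illustration only; not part of the Hadoop code base.
final class DominantShareSketch {

    /** Returns {memoryShare, vCoresShare} relative to the given denominator. */
    static double[] shares(long memory, long vCores, long totalMemory, long totalVCores) {
        return new double[] {
            (double) memory / totalMemory,
            (double) vCores / totalVCores
        };
    }

    /** Name of the resource with the larger share (ties go to memory). */
    static String dominant(double[] shares) {
        return shares[0] >= shares[1] ? "memory" : "vCores";
    }

    /** GB of memory per vcore, assuming memory is given in MB. */
    static double gbPerVCore(long memoryMb, long vCores) {
        return (double) memoryMb / vCores / 1024.0;
    }

    public static void main(String[] args) {
        // Cluster total from the description above.
        long clusterMemory = 175117312L;
        long clusterVCores = 40222L;

        // The two numerator pairs used in the division lines above.
        double[] first = shares(3381248L, 687L, clusterMemory, clusterVCores);
        double[] second = shares(3384320L, 688L, clusterMemory, clusterVCores);

        System.out.println("GB per vcore (cluster): " + gbPerVCore(clusterMemory, clusterVCores));
        System.out.println("first:  memory=" + first[0] + " vCores=" + first[1]
            + " dominant=" + dominant(first));
        System.out.println("second: memory=" + second[0] + " vCores=" + second[1]
            + " dominant=" + dominant(second));
    }
}
```

With the cluster total as denominator, memory has the larger share in both cases, which is why the memory ratios end up being compared.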