-
Notifications
You must be signed in to change notification settings - Fork 9k
[YARN-6483] Add nodes transitioning to DECOMMISSIONING state to the list of updated nodes returned by the Resource Manager as a response to the Application Master heartbeat #289
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
adapt test to decommission timeout checks being independent of received heartbeats
in this version that is the only way to specify a timeout in the excludes file
replace dynamic conf by using the configuration passed by AdminService
} | ||
|
||
@Private | ||
@Unstable | ||
public static NodeReport newInstance(NodeId nodeId, NodeState nodeState, | ||
String httpAddress, String rackName, Resource used, Resource capability, | ||
int numContainers, String healthReport, long lastHealthReportTime, | ||
Set<String> nodeLabels) { | ||
Set<String> nodeLabels, Integer decommissioningTimeout) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use int ? Am not confortable using Integer - given the sometimes arbitrary boxing/unboxing rules.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This Integer
comes from RMNode.getDecommissioningTimeout()
that was already returning an Integer before this patch, because only nodes in DECOMMISSIONING state have an associated decommission timeout, so null
is used to express absent timeout. In this patch RMNode.getDecommissioningTimeout()
is used in DefaultAMSProcessor.handleNodeUpdates
to get the argument decommissioningTimeout
for BuilderUtils.newNodeReport
. If we use a int
for decommissioningTimeout
in NodeReport.newInstance
I think we should also use an int
for the same argument in BuilderUtils.newNodeReport
for uniformity, which implies a conversion from null
to -1
in DefaultAMSProcessor.handleNodeUpdates
.
So I think we should either keep using Integer decommissioningTimeout
everywhere, enconding absent timeout with null
, or use int decommissioningTimeout
everywhere, enconding absent timeout with a negative timeout, which is coherent with message NodeReportProto
using an unsigned int for decommissioning_timeout
. What do you think about these 2 alternatives?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah - sorry, just noticed that. maybe stick with Integer (although I dont like it much). Maybe we can raise another JIRA to fix it properly via your second suggestion.
// could take any required actions. | ||
rmNode.context.getDispatcher().getEventHandler().handle( | ||
new NodesListManagerEvent( | ||
NodesListManagerEventType.NODE_USABLE, rmNode)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmmm.. don't think we should be sending NODE_USABLE event here.. since technically, it is not usable.
Maybe consider creating a new NODE_DECOMMISSIONING event ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wasn't very sure about using NODE_USABLE
, but while I was making the changes to follow your suggestion, I have found that in the current code TestRMNodeTransitions.testResourceUpdateOnDecommissioningNode
is asserting that NodesListManagerEventType.NODE_USABLE
is expected for a node that transitions to DECOMMISSIONING
. Also, NodesListManagerEventType is transformed into the corresponding RMAppNodeUpdateType
in NodesListManager.handle
to build a RMAppNodeUpdateEvent
that is processed in RMAppImpl.processNodeUpdate
which just uses the RMAppNodeUpdateType
for logging.
So it looks like it is ok to use NodesListManagerEventType.NODE_USABLE
for nodes in decommissioning state. Do you still think it's worth adding some additional value for NodesListManagerEventType
and RMAppNodeUpdateType
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel we should make intentions explicit - having a separate event type might make the code cleaner and easier to follow - rather than overloading. It could be that the assumption in the testcase is wrong (will have to double check though), in which case - it is perfectly alright to modify the testcasse with the new event.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, I have added a new commit for adding a new value NODE_DECOMMISSIONING
for NodesListManagerEventType
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks - looking at the patch.. can you also attach a consolidated patch on the JIRA ? So as to kick Jenkins.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have just attached the patch. Thanks a lot for taking a look!
use it for notifying nodes transitions to DECOMMISSIONG state
so it works ok for older versions
Pushed in b46ca7e |
This is an alternative approach to https://issues.apache.org/jira/browse/YARN-3224 for notifying all affected application masters when a node transitions into the DECOMMISSIONING state. This change modifies the AllocateResponse that the YARN Resource Manager uses to respond to heartbeat request from application masters, to add any node that has transitioned to DECOMMISSIONING state since the last heartbeat to the list of NodeReport objects that is part of the AllocateResponse object. We also add a new field to each NodeReport to add the decommission timeout for DECOMMISSIONING nodes, thus covering the same functionality of the original proposal in YARN-3224.