-
Notifications
You must be signed in to change notification settings - Fork 186
Don't reactivate tasks on agent reregistration after lb removal #2129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
| taskManager.createTaskCleanup( | ||
| new SingularityTaskCleanup( | ||
| null, | ||
| TaskCleanupType.PRIORITY_KILL, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if this is the type we actually want here. Nothing really fit perfectly, so this seemed reasonable. Or we could add a type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
priority kill is an inta-kill, probably too harsh. We can have this shut down gracefully similar to when we replace it with a bounce. Could easily fall under DECOMISSIONING ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
forget this actually, saw your follow up question below. if we return a StatusUpdateResult.KILL_TASK in the end, that will take care of it all for us
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Though not sure if it will take care of the cleaning + pending request. Would have to double check that. If not we'll need to stick to the cleanup being created + returning DONE
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
returning the KILL_TASK will put a timeout of 1 minute for graceful termination on the task, so I think we want to go with the task cleanup + DONE here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
| Optional.empty() | ||
| ) | ||
| ); | ||
| return StatusUpdateResult.IGNORED; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also not sure of this, but since we want to clean gracefully I figure we don't want mesos to touch it.
| return StatusUpdateResult.KILL_TASK; | ||
| } else if ( | ||
| maybeTaskHistory.isPresent() && | ||
| status.getReason() == Reason.REASON_AGENT_REREGISTERED |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is already covered by the enclosing isRecoveryStatusUpdate if statement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see anything specific to re-registration in that method - will leave as is for now
| .anyMatch( | ||
| lbUpdate -> | ||
| lbUpdate.getMethod() == LoadBalancerMethod.CANCEL || | ||
| lbUpdate.getMethod() == LoadBalancerMethod.DELETE |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are the methods we cal in baragon, not a representation of an add or remove. What we want to be checking is for a LoadBalancerRequestType of REMOVE. taskManager has a getLoadBalancerState method for this
|
🚢 |
If an agent re-registers with mesos, but the task has already started being removed from the load balancer, don't reactivate it. Instead, clean it gracefully.