Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Replacing Coordinator Queue With Deque & Fixing Usage Of toMap Util #950

Conversation

shrinandthakkar
Copy link
Collaborator

@shrinandthakkar shrinandthakkar commented Jul 24, 2023

Summary

Deque Coordinator Queue

  • When a newly elected leader receives an instance-task assignment from ZK, it is expected that the leader will perform a cleanup routine on the assignment. Failure to do so may lead to inconsistencies in task assignment across instances.
  • In the event that the cleanup routine encounters an exception, the "LEADER_DO_ASSIGNMENT" event with the "isNewlyElectedLeader" flag enabled must be reinserted at the front of the coordinator queue to ensure that no other events are processed before re-handling the event.

Bug Fixing toMap() Usage

  • Replace the use of Collectors.toMap(keyMapper, valueMapper) in cases where duplicate values may be encountered, with Collectors.toMap(keyMapper, valueMapper, mergeFunction).

Important: DO NOT REPORT SECURITY ISSUES DIRECTLY ON GITHUB.
For reporting security issues and contributing security fixes,
please, email security@linkedin.com instead, as described in
the contribution guidelines.

Please, take a minute to review the contribution guidelines at:
https://github.com/linkedin/Brooklin/blob/master/CONTRIBUTING.md

@shrinandthakkar shrinandthakkar marked this pull request as ready for review July 24, 2023 22:06
_leaderDoAssignmentScheduledFuture = _scheduledExecutor.schedule(() -> {
_eventQueue.put(CoordinatorEvent.createLeaderDoAssignmentEvent(isNewlyElectedLeader));
_eventQueue.put(CoordinatorEvent.createLeaderDoAssignmentEvent(isNewlyElectedLeader), false);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the main change in this PR. When a newly elected leader does assignment, it schedules the assignment event with task cleanup in front of the queue, to make sure it gets executed before anything else. Just to confirm my understanding.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct! We just need to make sure that nothing else gets handled apart from a successful handling of "LEADER_DO_ASSIGNMENT" (with newly elected leader flag enabled) for a newly elected leader.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isNewlyElectedLeader this will always be true, right? or when do we have that false?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isNewlyElectedLeader is only true when a new leader is elected. In many other cases we have to ask the leader to do assignments, and at those calls, isNewlyElectedLeader will be false.

/**
* Callable Coordinator is used for overriding coordinator behaviors for tests
*/
public interface CallableCoordinatorForTest {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need this interface? Seems like TestCoordinator.java has all you need.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didnt wanted to add another constructor in the TestCoordinator.java to override another method of coordinator. With this interface, we could minimize code duplication and pass the overrides of coordinator as an argument.

For the test "testLeaderDoAssignmentForNewlyElectedLeaderFailurePath", I overrode performPreAssignmentCleanup method to test a failure path, where I am using this.

jzakaryan
jzakaryan previously approved these changes Aug 1, 2023
Comment on lines 82 to 88

/**
* Add a single event to the queue, de-duping events with the same name and same metadata.
* @param event CoordinatorEvent event to add to the queue
* @param insertInTheEnd if true, indicates to add the event to the end of the queue and front, otherwise.
*/
public synchronized void put(CoordinatorEvent event, boolean insertInTheEnd) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the public APIs be similar to a Deque? I think that put(<event>) and putFirst(<event>) are clearer than using a boolean flag. The boolean option is fine for the private internal implementation.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated.

* @param event CoordinatorEvent event to add to the queue
*/
public synchronized void put(CoordinatorEvent event) {
LOG.info("Queuing event {} to event queue", event.getType());
put(event, true);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When do we need to support for inserting at rear?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the events to this coordinator queue are inserted in the rear. The only case in which we have to insert in the front is what the PR proposes.

* @param event CoordinatorEvent event to add to the queue
* @param insertInTheEnd if true, indicates to add the event to the end of the queue and front, otherwise.
*/
public synchronized void put(CoordinatorEvent event, boolean insertInTheEnd) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for understanding: LinkedBlockingDeque is already a thread safe and we are using only offer or offerFirst, what is the rational on having this method synchronized?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[Minor]: another thing is, we want to have add from both ends but remove should be only from front. However, by using deque we will allow add/remove from both ends, Is there a way we can restrict remove from rear?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The synchronized methods were added long before but I am guessing that the shared set variable might create race and would let in duplicate events in some cases if those methods are not synchronized.

That is why the wrapper on top of the LinkedBlockingDeque is implemented which only supports taking out events from the front.

@jzakaryan jzakaryan self-requested a review August 7, 2023 23:51
jzakaryan
jzakaryan previously approved these changes Aug 7, 2023
Copy link
Collaborator

@jzakaryan jzakaryan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

ehoner
ehoner previously approved these changes Aug 8, 2023
hshukla
hshukla previously approved these changes Aug 8, 2023
@shrinandthakkar shrinandthakkar dismissed stale reviews from hshukla and ehoner via b8d7192 August 8, 2023 20:45
@shrinandthakkar shrinandthakkar merged commit d19a2c2 into linkedin:master Aug 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants