Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apply lock at Coordinator class level #979

Merged
merged 4 commits into from
Jan 24, 2024
Merged

Apply lock at Coordinator class level #979

merged 4 commits into from
Jan 24, 2024

Conversation

hshukla
Copy link
Collaborator

@hshukla hshukla commented Jan 24, 2024

As part of #975 where we used

synchronized (this) {...}

it was giving lock on the CoordinatorEventProcessor instance. However, we need a lock on the Coordinator class, hence the notifyAll was not working as expected.

This PR fixes the above regression by introducing

synchronized (Coordinator.class) {

atoomula
atoomula previously approved these changes Jan 24, 2024
atoomula
atoomula previously approved these changes Jan 24, 2024
Copy link
Collaborator

@shrinandthakkar shrinandthakkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great find!! 🙌

@hshukla hshukla dismissed stale reviews from shrinandthakkar and atoomula via f92c00a January 24, 2024 20:33
@@ -3230,6 +3230,7 @@ void testOnSessionExpired(boolean handleNewSession) throws DatastreamException,
Assert.assertTrue(t != null && t.isAlive());
Assert.assertTrue(PollUtils.poll(() -> instance1.getIsLeader().getAsBoolean(), 100, 30000));
}
instance1._coordinatorEventThreadExiting = false;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

interesting that we have to set this explicitly 🤔

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test do zkExpiry and then handleNewSession and then call stop. So during zkExpiry, we go from false -> true, and during stop if _coordinatorEventThreadExiting stays false, then wait does not occur and creates deadlock.

@hshukla hshukla merged commit 07d26fc into master Jan 24, 2024
1 check passed
@hshukla hshukla deleted the lock-fix branch January 24, 2024 21:01
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants