Skip to content

Workaround a synchronization bug. #772

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Mar 9, 2023
Merged

Conversation

wks
Copy link
Collaborator

@wks wks commented Mar 8, 2023

If a worker thread spuriously wakes up, it may observe the pending coordinator work counter going from 0 to 1 when the EndOfGC work packet starts to be executed. That'll trigger an assertion failure.

It is a workaround because a full solution should explicitly address the need to prevent opening more work packets when the coordinator (or any worker) is adding work packets to related buckets.

Closes: #770

If a worker thread spuriously wakes up, it may observe the pending
coordinator work counter going from 0 to 1 when the EndOfGC work packet
starts to be executed.  That'll trigger an assertion failure.

It is a workaround because a full solution should explicitly address the
need to prevent opening more work packets when the coordinator (or any
worker) is adding work packets to related buckets.
@wks wks requested a review from qinsoon March 8, 2023 08:27
@k-sareen
Copy link
Collaborator

k-sareen commented Mar 8, 2023

I would say it's better that we don't close the issue if the fix is just a workaround.

@wks
Copy link
Collaborator Author

wks commented Mar 8, 2023

I would say it's better that we don't close the issue if the fix is just a workaround.

I created an issue to discuss the full fix in detail. #774 I think it is OK to close #770 while we can continue discussion in #774

Copy link
Member

@qinsoon qinsoon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a pretty hacky workaround. But if this is necessary to make sure we do not break bindings, it is okay with me.

// packets for the workers. The only two other coordinator, i.e. `ScheduleCollection` and
// `StopMutators`, belong to that category. `EndOfGC` doesn't add new work packets, so it
// doesn't need to prevent the workers from opening buckets (as there are none to be
// opened.)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually we still want to use initiate_coordinator_work here, right? Can you put the link to both issues in the comments, and add a FIXME?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably not. The initiate_coordinator_work function is intended for preventing workers from opening more packets. Maybe run_coordinator_work_and_prevent_opening_buckets is a more meaningful (but verbose) function name. From the diagram I drew in #774, ScheduleCollection and StopMutators are preconditions of some buckets, but EndOfGC is not a precondition of anything. So it may not make sense to wrap all CoordinatorWork with the function currently named initiate_coordinator_work.

Anyway, I'll add links to both issues so that we can fix it more properly later.

@wks wks merged commit 7750fc0 into mmtk:master Mar 9, 2023
JunmingZhao42 pushed a commit to JunmingZhao42/mmtk-core that referenced this pull request Mar 19, 2023
If a worker thread spuriously wakes up, it may observe the pending
coordinator work counter going from 0 to 1 when the EndOfGC work packet
starts to be executed. That'll trigger an assertion failure.
wenyuzhao pushed a commit to wenyuzhao/mmtk-core that referenced this pull request Mar 20, 2023
If a worker thread spuriously wakes up, it may observe the pending
coordinator work counter going from 0 to 1 when the EndOfGC work packet
starts to be executed. That'll trigger an assertion failure.
wenyuzhao pushed a commit to wenyuzhao/mmtk-core that referenced this pull request Mar 22, 2023
If a worker thread spuriously wakes up, it may observe the pending
coordinator work counter going from 0 to 1 when the EndOfGC work packet
starts to be executed. That'll trigger an assertion failure.

(cherry picked from commit 7750fc0)
wenyuzhao pushed a commit to wenyuzhao/mmtk-core that referenced this pull request Mar 27, 2023
If a worker thread spuriously wakes up, it may observe the pending
coordinator work counter going from 0 to 1 when the EndOfGC work packet
starts to be executed. That'll trigger an assertion failure.

(cherry picked from commit 7750fc0)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Trying to open buckets when there are pending coordinator packets
3 participants