Description
Currently, the structure `GCWorkScheduler` has way too many fine-grained locks, and the synchronisation is complicated (see issue #537). The frequent querying of bucket lengths also incentivised the use of `RwLock`, which has sharing issues (see issue #543). This issue proposes a simplified work scheduler design.
Design
Instead of giving each `WorkBucket` a separate `BinaryHeap<PrioritizedWork>`, the new design has only one global `BinaryHeap<PrioritizedWork>`. That binary heap contains all work packets that are currently executable.

We modify `WorkBucket` so that it becomes merely a staging area where not-yet-executable work packets are temporarily stored. A work bucket has a `Vec<PrioritizedWork>`, and can be in one of two states: closed and open.
- closed: In this state, work packets added to the bucket are appended to the `Vec`.
- switching from closed to open: All work packets in the `Vec` are dumped into the global `BinaryHeap`.
- open: In this state, work packets skip the local `Vec` and are added directly to the global `BinaryHeap`.
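
A minimal sketch of this layout in Rust (the type and field names other than `WorkBucket`, `PrioritizedWork` and `BinaryHeap` are illustrative, not taken from the codebase):

```rust
use std::collections::BinaryHeap;

// Illustrative stand-in for a prioritized work packet.  The real type would
// wrap the actual work; Ord is needed so the BinaryHeap can order packets.
#[derive(PartialEq, Eq, PartialOrd, Ord)]
struct PrioritizedWork {
    priority: usize,
    // work: Box<dyn GCWork>, // elided
}

// A bucket is now only a staging area for not-yet-executable packets.
enum BucketState {
    Closed, // added packets are appended to `staged`
    Open,   // added packets bypass `staged` and go straight to the global heap
}

struct WorkBucket {
    state: BucketState,
    staged: Vec<PrioritizedWork>,
}

// The single global heap of currently executable packets, plus the counter
// described below.  Everything sits behind one Mutex (see "Synchronisation
// mechanism").
struct SchedulerState {
    buckets: Vec<WorkBucket>,
    heap: BinaryHeap<PrioritizedWork>,
    packets_in_action: usize,
}

impl SchedulerState {
    // Adding a packet: a closed bucket stages it; an open bucket forwards it.
    fn add(&mut self, bucket: usize, work: PrioritizedWork) {
        let b = &mut self.buckets[bucket];
        match b.state {
            BucketState::Closed => b.staged.push(work),
            BucketState::Open => self.heap.push(work),
        }
    }

    // Opening a bucket dumps its staged packets into the global heap.
    fn open(&mut self, bucket: usize) {
        let b = &mut self.buckets[bucket];
        b.state = BucketState::Open;
        self.heap.extend(b.staged.drain(..));
    }
}
```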
Querying "are all open buckets drained": Because open buckets no longer hold any work packets (their packets go directly to the global `BinaryHeap`), this question becomes simply whether the `BinaryHeap` is empty.
Are all workers parked?: We don't count how many workers have parked. Instead, we count how many work packets are "in action", i.e. being executed by some worker. We maintain another global counter: `packets_in_action`. It is incremented by n every time a worker polls n work packets from the `BinaryHeap`; it is decremented by m every time a worker finishes executing m work packets. The increment has to be timely, but the decrement can be done relatively lazily to reduce contention, for example after a worker finishes several work packets in a row.
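
Continuing the sketch above, the counter could be maintained like this; the batch size and the `execute` helper are illustrative, not part of the proposal:

```rust
use std::sync::Mutex;

// Stand-in for actually running a work packet.
fn execute(_work: PrioritizedWork) {}

fn poll_and_run(scheduler: &Mutex<SchedulerState>) {
    // Timely increment: bump `packets_in_action` in the same critical section
    // that removes the packets from the heap.
    let batch: Vec<PrioritizedWork> = {
        let mut state = scheduler.lock().unwrap();
        let n = state.heap.len().min(4); // illustrative batch size
        let batch: Vec<_> = (0..n).filter_map(|_| state.heap.pop()).collect();
        state.packets_in_action += batch.len();
        batch
    };

    // Execute the whole batch without holding the lock.
    let finished = batch.len();
    for work in batch {
        execute(work);
    }

    // Lazy decrement: one lock acquisition for the whole batch rather than
    // one per finished packet.
    scheduler.lock().unwrap().packets_in_action -= finished;
}
```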
When should we open more buckets?: Only when all open buckets are drained and there are zero "packets in action", because only in-action packets can add more packets.
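
Expressed as a predicate over the same sketched state, covering both the drained-buckets query above and this condition:

```rust
impl SchedulerState {
    // "Are all open buckets drained?" reduces to checking the global heap,
    // because open buckets never hold packets themselves.
    fn all_open_buckets_drained(&self) -> bool {
        self.heap.is_empty()
    }

    // Open more buckets only when the heap is drained and nothing is in
    // flight: only in-action packets could still add work to open buckets.
    fn should_open_more_buckets(&self) -> bool {
        self.all_open_buckets_drained() && self.packets_in_action == 0
    }
}
```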
Synchronisation mechanism:
- One `Mutex` that protects everything.
- One `Condvar` for the "more work is available" event.
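
A rough sketch of how the single `Mutex` and `Condvar` might fit together with the state sketched above (the helper names and the one-packet-at-a-time poll are assumptions for illustration):

```rust
use std::sync::{Condvar, Mutex};

struct Scheduler {
    state: Mutex<SchedulerState>, // the one lock protecting everything
    work_available: Condvar,      // signalled when more work becomes available
}

impl Scheduler {
    // Adding a packet may make new work executable, so notify waiting workers.
    fn add_work(&self, bucket: usize, work: PrioritizedWork) {
        let mut state = self.state.lock().unwrap();
        state.add(bucket, work);
        self.work_available.notify_all();
    }

    // A worker blocks on the condvar until the heap has something to execute.
    // (This polls one packet at a time for simplicity; the in-action
    // decrement after execution is omitted, see the counter sketch above.)
    fn poll(&self) -> PrioritizedWork {
        let mut state = self.state.lock().unwrap();
        loop {
            if let Some(work) = state.heap.pop() {
                state.packets_in_action += 1;
                return work;
            }
            state = self.work_available.wait(state).unwrap();
        }
    }
}
```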
Alternative design choices:
When should we open more buckets?: Instead of querying both `BinaryHeap::len()` and `packets_in_action`, we can count

- the number of packets that have ever been added to the `BinaryHeap`, and
- the number of packets "retired" (i.e. finished execution on any worker).

Once the latter equals the former, we open more buckets.
Could we use only one counter to track the difference between the two? That is, increment when a packet is added to the `BinaryHeap`, and decrement when it is "retired". Then we open more buckets when the counter drops to zero.
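
A sketch of this single-counter variant, assuming the sketched state carries a field like `unretired_packets: usize` in place of `packets_in_action` (the names here are illustrative):

```rust
impl SchedulerState {
    // Increment whenever a packet enters the global heap (either added
    // directly through an open bucket or dumped when a bucket opens)...
    fn push_executable(&mut self, work: PrioritizedWork) {
        self.heap.push(work);
        self.unretired_packets += 1;
    }

    // ...and decrement when a worker reports finished ("retired") packets.
    // When the counter reaches zero, every packet ever pushed has been
    // retired, so more buckets can be opened.
    fn retire(&mut self, finished: usize) -> bool {
        self.unretired_packets -= finished;
        self.unretired_packets == 0
    }
}
```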
Limitations
- This design is still blocking (lock-based) like before. We need to switch to a non-blocking priority heap if we want to get rid of the lock entirely.
- Still no work-stealing.