## Description
When reading the code of the scheduler, I found some problems with the current synchronisation mechanism. This issue describes the problems I see, but I am not planning an immediate refactoring because I don't have a concrete idea of how to solve them yet.
## Spurious wake-up
From `scheduler.rs`:
```rust
#[cold]
fn poll_slow(&self, worker: &GCWorker<VM>) -> Box<dyn GCWork<VM>> {
    debug_assert!(!worker.shared.is_parked());
    let mut guard = self.worker_monitor.0.lock().unwrap();
    loop {
        debug_assert!(!worker.shared.is_parked());
        if let Some((work, bucket_is_empty)) = self.pop_scheduable_work(worker) {
            if bucket_is_empty {
                worker.sender.send(CoordinatorMessage::BucketDrained).unwrap();
            }
            return work;
        }
        // Park this worker
        worker.shared.parked.store(true, Ordering::SeqCst);
        if self.worker_group().all_parked() {
            worker.sender.send(CoordinatorMessage::AllWorkerParked).unwrap();
        }
        // Wait
        guard = self.worker_monitor.1.wait(guard).unwrap(); // ERROR1: potential spurious wake-up
        // Unpark this worker
        worker.shared.parked.store(false, Ordering::SeqCst);
    }
}
```
The expression `self.worker_monitor.1.wait(guard)` does not follow the standard way of using a condition variable. According to the semantics of condition variables, `wait` can wake up spuriously, and must therefore be used in a loop:
```
mutex.lock();
while (!condition_is_met()) {
    condvar.wait();
}
do_something();
mutex.unlock();
```
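For reference, the same pattern in Rust looks like the sketch below. The `Monitor` type and the `work_available` flag are made up for illustration; the point is only that the predicate is guarded by the mutex and re-checked after every wake-up.

```rust
use std::sync::{Condvar, Mutex};

/// A minimal sketch of the pattern above in Rust. `work_available` is a
/// made-up predicate; the real condition would be "some packet is
/// schedulable", but the shape of the loop is the same.
struct Monitor {
    work_available: Mutex<bool>,
    condvar: Condvar,
}

impl Monitor {
    fn wait_for_work(&self) {
        let mut work_available = self.work_available.lock().unwrap();
        // Re-check the condition after every wake-up: a spurious wake-up
        // simply goes around the loop again.
        while !*work_available {
            work_available = self.condvar.wait(work_available).unwrap();
        }
        // The condition is guaranteed to hold here, while the lock is held.
    }

    fn signal_work(&self) {
        *self.work_available.lock().unwrap() = true;
        self.condvar.notify_all();
    }
}
```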
What's the real condition of the `wait()`?
The `poll_slow()` function attempts to get work from the local or global work buckets. If no work is available, it parks the current GC worker. So the condition it waits for is effectively "some work packets are available in some global bucket". Given the structure of the loop, the current code can still be a valid use of a condvar, because it re-checks whether any work is available in any bucket after waking up; the condition is just implicit.
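If that implicit condition were tracked explicitly under the monitor's mutex, the loop could be written with `std::sync::Condvar::wait_while`, which absorbs spurious wake-ups internally. Below is a minimal sketch with hypothetical fields (`pending_packets`, etc.), not the actual MMTk types:

```rust
use std::sync::{Condvar, Mutex};

/// A sketch of making the wait condition explicit (hypothetical fields,
/// not the actual MMTk types): the monitor's mutex guards a counter of
/// schedulable packets, and a worker only sleeps while it is zero.
struct SchedulerMonitor {
    pending_packets: Mutex<usize>,
    condvar: Condvar,
}

impl SchedulerMonitor {
    /// Block until at least one packet is known to be schedulable.
    fn wait_for_packets(&self) {
        let pending = self.pending_packets.lock().unwrap();
        // `wait_while` loops internally, so spurious wake-ups are absorbed
        // here instead of bouncing the worker between parked and unparked.
        let _pending = self.condvar.wait_while(pending, |n| *n == 0).unwrap();
    }

    /// Called by whoever adds a packet to an open bucket.
    fn packet_added(&self) {
        *self.pending_packets.lock().unwrap() += 1;
        self.condvar.notify_one();
    }
}
```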
There are four places where the `worker_monitor.1` condvar is notified:

- In `GCWorkScheduler::update_buckets`
- In `GCWorkScheduler::notify_mutators_paused`
- In `WorkBucket::notify_all_workers` (called by `bulk_add`)
- In `WorkBucket::notify_one_worker` (called by `add`)
All of them seem to support my assumption that the implicit condition of the condvar is "any work is available in any bucket". The first is supposed to open some buckets and make more work packets available. The second activates the `Prepare` bucket, making the work packets in it available. The third and fourth add new work packets to work buckets.
At the call sites of `WorkBucket::notify_one_worker` and `notify_all_workers`, there are comments noting a performance problem. I think part of the problem is that adding a work packet involves grabbing two locks in order, namely `WorkBucket::queue` and `WorkBucket::monitor`. It may also be calling `notify_{one,all}` more often than necessary, because there is no data structure (such as a semaphore) that records how many workers are blocked waiting for work packets.
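One way to avoid redundant notifications would be to keep a count of blocked workers under the monitor's mutex, roughly in the spirit of a semaphore. The sketch below uses hypothetical names (`WorkMonitor`, `waiters`) and only illustrates the bookkeeping, not the full scheduler logic:

```rust
use std::sync::{Condvar, Mutex};

/// A sketch of the "count the sleepers" idea; the names are hypothetical.
/// The monitor mutex guards the number of blocked workers, so producers
/// can skip the notify call entirely when nobody is waiting.
struct WorkMonitor {
    waiters: Mutex<usize>,
    condvar: Condvar,
}

impl WorkMonitor {
    fn park_until_notified(&self) {
        let mut waiters = self.waiters.lock().unwrap();
        *waiters += 1;
        // NOTE: a real implementation would still loop on an explicit
        // condition here; this sketch only shows the waiter bookkeeping.
        waiters = self.condvar.wait(waiters).unwrap();
        *waiters -= 1;
    }

    /// Called after pushing a packet; only pays for the notification (and
    /// the wake-up it implies) when a worker is actually blocked.
    fn wake_one_if_needed(&self) {
        let waiters = self.waiters.lock().unwrap();
        if *waiters > 0 {
            self.condvar.notify_one();
        }
    }
}
```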
Sloppy "all workers are parked" condition
Even though the current approach looks correct, the condition of "all workers are parked" is a bit sloppy. Because of the spurious wake-up mentioned above, the last worker awake could enter a loop of spurious wake-ups, causing the `CoordinatorMessage::AllWorkerParked` message to be sent to the controller multiple times.
Unroll the loop above, and we get this trace:
```
// start of loop {
EXECUTE: worker.shared.parked.store(true, Ordering::SeqCst);
CHECK:   self.worker_group().all_parked() // true
EXECUTE: worker.sender.send(CoordinatorMessage::AllWorkerParked).unwrap(); // AllWorkerParked sent
EXECUTE: guard = self.worker_monitor.1.wait(guard).unwrap();
SPURIOUSLY WAKE UP
EXECUTE: worker.shared.parked.store(false, Ordering::SeqCst);
// end of loop }
// start of loop {
EXECUTE: worker.shared.parked.store(true, Ordering::SeqCst);
CHECK:   self.worker_group().all_parked() // true
EXECUTE: worker.sender.send(CoordinatorMessage::AllWorkerParked).unwrap(); // AllWorkerParked sent again!
EXECUTE: guard = self.worker_monitor.1.wait(guard).unwrap();
SPURIOUSLY WAKE UP
EXECUTE: worker.shared.parked.store(false, Ordering::SeqCst);
// end of loop }
// start of loop {
EXECUTE: worker.shared.parked.store(true, Ordering::SeqCst);
CHECK:   self.worker_group().all_parked() // true
EXECUTE: worker.sender.send(CoordinatorMessage::AllWorkerParked).unwrap(); // AllWorkerParked sent again!
...
```
This can keep going on and on without any thread ever calling `worker_monitor.1.notify_all()`, and any worker can oscillate between the parked and unparked states. In `GCWorkScheduler::initialize`, the "open condition" of each bucket is defined as:
```rust
let should_open = self.are_buckets_drained(&cur_stages) && self.worker_group().all_parked();
```
With `all_parked()` oscillating between `true` and `false` (note that `WorkerGroup::all_parked()` is called without holding the `worker_monitor.0` mutex), the `should_open` condition is at the very least "unstable". It may not be completely wrong: if any worker wakes up spuriously, it will eventually send another `CoordinatorMessage::AllWorkerParked` message, which lets the controller thread try to open the buckets again. But I still think it is fragile.
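One possible direction (a sketch only, with hypothetical types, and not a concrete proposal for mmtk-core) is to keep the parked-worker count and an "already reported" flag inside the state guarded by `worker_monitor.0`. Then the "all workers parked" transition is detected while holding the lock and reported at most once until a notifier makes new work available, so spurious wake-ups can no longer resend `CoordinatorMessage::AllWorkerParked` or destabilize `should_open`:

```rust
use std::sync::{Condvar, Mutex};

/// Hypothetical sketch: all parking state lives under the same mutex as
/// the condvar, plus a flag remembering that AllWorkerParked has already
/// been reported for the current quiescent period.
struct SyncState {
    parked: usize,
    total: usize,
    group_sleeping: bool,
}

struct WorkerMonitor {
    sync: Mutex<SyncState>,
    condvar: Condvar,
}

impl WorkerMonitor {
    /// Park the current worker until notified. `on_all_parked` runs at
    /// most once per quiescent period, in the worker that completes the
    /// "all workers parked" transition, before it goes to sleep.
    fn park_and_wait(&self, on_all_parked: impl FnOnce()) {
        let mut sync = self.sync.lock().unwrap();
        sync.parked += 1;
        if sync.parked == sync.total && !sync.group_sleeping {
            sync.group_sleeping = true;
            on_all_parked(); // e.g. send CoordinatorMessage::AllWorkerParked
        }
        // A spurious wake-up re-enters this function through the caller's
        // loop, but the transition cannot be reported again while
        // `group_sleeping` is still set.
        sync = self.condvar.wait(sync).unwrap();
        sync.parked -= 1;
    }

    /// Called by whoever makes new work available (e.g. when opening a
    /// bucket); clears the flag so the next quiescence is reported again.
    fn notify_work_available(&self) {
        let mut sync = self.sync.lock().unwrap();
        sync.group_sleeping = false;
        self.condvar.notify_all();
    }
}
```

The key point is that the parked count and the decision to report are both made under the same mutex that the condvar uses, so neither spurious wake-ups nor an unlocked `all_parked()` check can make the controller observe a half-updated state.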