Workers spuriously wake up #537

Closed
@wks

Description


When reading the code of the scheduler, I found some problems with the current synchronisation mechanism. This issue describes the problems I see, but I am not planning an immediate refactoring because I don't yet have a concrete idea of how to solve them.

Spurious wake-up

From scheduler.rs:

    #[cold]
    fn poll_slow(&self, worker: &GCWorker<VM>) -> Box<dyn GCWork<VM>> {
        debug_assert!(!worker.shared.is_parked());
        let mut guard = self.worker_monitor.0.lock().unwrap();
        loop {
            debug_assert!(!worker.shared.is_parked());
            if let Some((work, bucket_is_empty)) = self.pop_scheduable_work(worker) {
                if bucket_is_empty {
                    worker.sender.send(CoordinatorMessage::BucketDrained).unwrap();
                }
                return work;
            }
            // Park this worker
            worker.shared.parked.store(true, Ordering::SeqCst);
            if self.worker_group().all_parked() {
                worker.sender.send(CoordinatorMessage::AllWorkerParked).unwrap();
            }
            // Wait
            guard = self.worker_monitor.1.wait(guard).unwrap();  // ERROR1: potential spurious wake-up
            // Unpark this worker
            worker.shared.parked.store(false, Ordering::SeqCst);
        }
    }

The expression self.worker_monitor.1.wait(guard) does not follow the standard pattern for using a condition variable. According to the semantics of condition variables, the wait function can wake up spuriously, so it must be called in a loop that re-checks the condition:

mutex.lock();
while (!condition_is_met()) {
    condvar.wait();
}
do_something();
mutex.unlock();

What's the real condition of the wait()?

The poll_slow() function attempts to get some work from the local or global work buckets. If no work is available, it parks the current GC worker. So the condition it waits for could be "some work packets are available in some global bucket". Given the structure of the loop, the current code could still be a valid use of a condvar, because after waking up it checks whether any work is available in any bucket, although the condition is only implicit.

There are four places where the worker_monitor.1 condvar is notified:

  1. In GCWorkScheduler::update_buckets
  2. In GCWorkScheduler::notify_mutators_paused
  3. In WorkBucket::notify_all_workers (called by bulk_add)
  4. In WorkBucket::notify_one_worker (called by add)

All four seem to support my assumption that the implicit condition of the condvar is "some work is available in some bucket". The first is supposed to open some buckets, making more work packets available. The second activates the Prepare bucket, making the work packets in it available. The third and fourth add new work packets to work buckets.

At the call sites of WorkBucket::notify_one_worker and notify_all_workers, there are comments noting a performance problem. I think part of the problem is that adding a work packet involves grabbing two locks in order, namely WorkBucket::queue and WorkBucket::monitor. It may also call notify_{one,all} more often than necessary, because there is no data structure (such as a semaphore) that records how many workers are blocked waiting for work packets.

Sloppy "all workers are parked" condition

Although the current approach looks correct, the "all workers are parked" condition is a bit sloppy. Because of the spurious wake-ups mentioned above, the last worker to wait could enter a loop of spurious wake-ups, causing the CoordinatorMessage::AllWorkerParked message to be sent to the controller multiple times.

Unrolling the loop above, we get this trace:

// start of loop {
EXECUTE: worker.shared.parked.store(true, Ordering::SeqCst);
CHECK:   self.worker_group().all_parked() // true
EXECUTE: worker.sender.send(CoordinatorMessage::AllWorkerParked).unwrap();  // AllWorkerParked sent
EXECUTE: guard = self.worker_monitor.1.wait(guard).unwrap();
SPURIOUSLY WAKE UP
EXECUTE: worker.shared.parked.store(false, Ordering::SeqCst);
// end of loop }
// start of loop {
EXECUTE: worker.shared.parked.store(true, Ordering::SeqCst);
CHECK:   self.worker_group().all_parked() // true
EXECUTE: worker.sender.send(CoordinatorMessage::AllWorkerParked).unwrap();  // AllWorkerParked sent again!
EXECUTE: guard = self.worker_monitor.1.wait(guard).unwrap();
SPURIOUSLY WAKE UP
EXECUTE: worker.shared.parked.store(false, Ordering::SeqCst);
// end of loop }
// start of loop {
EXECUTE: worker.shared.parked.store(true, Ordering::SeqCst);
CHECK:   self.worker_group().all_parked() // true
EXECUTE: worker.sender.send(CoordinatorMessage::AllWorkerParked).unwrap();  // AllWorkerParked sent again!
...

This can continue indefinitely without any thread ever calling worker_monitor.1.notify_all(), and any worker can oscillate between the parked and unparked states. In GCWorkScheduler::initialize, the "open condition" of each bucket is defined as

let should_open = self.are_buckets_drained(&cur_stages) && self.worker_group().all_parked();

With all_parked() oscillating between true and false (note that WorkerGroup::all_parked() does not hold the worker_monitor.0 mutex), the should_open condition is at best "unstable". It may not be completely wrong: if any worker wakes up spuriously, it will eventually send another CoordinatorMessage::AllWorkerParked message, which allows the controller thread to try to open the buckets again. But I still think it is fragile.
