I have a long running single-threaded computation, of which I'd like to run many concurrent instances:
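    use rayon::prelude::*;
    use std::{iter::repeat, thread::sleep, time::Duration};

    fn main() {
        let n = 320;
        let iter = repeat(()).take(n).par_bridge();
        let r: u32 = iter.map(|_| {
            // Some long running computation
            sleep(Duration::from_secs(10));
            1
        }).sum();
        println!("{r}");
    }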
When I run this on many cores (96), often only a seemingly random subset of the cores performs the computation, and the rest sit idle.
The number of active cores does not change over time.
However, it does not appear to be a deadlock: rayon works through the iterator in chunks of some random size, and it terminates correctly.
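One way to check how many workers actually pick up items is to record the id of the thread that runs each closure call. The following is only a sketch: `record_worker` and `count_workers` are hypothetical helpers, not part of rayon, and the demo uses `std::thread::scope` in place of the rayon pool so that it stays self-contained. In the snippet above, `record_worker` would instead be called at the start of the `map` closure.

```rust
use std::collections::HashSet;
use std::sync::Mutex;
use std::thread::{self, ThreadId};
use std::time::Duration;

/// Hypothetical helper (not a rayon API): remembers the id of the calling thread.
fn record_worker(seen: &Mutex<HashSet<ThreadId>>) {
    seen.lock().unwrap().insert(thread::current().id());
}

/// Spawns `threads` plain std threads, lets each one record itself,
/// and returns how many distinct threads were observed.
fn count_workers(threads: usize) -> usize {
    let seen = Mutex::new(HashSet::new());
    thread::scope(|s| {
        for _ in 0..threads {
            s.spawn(|| {
                record_worker(&seen);
                // Stand-in for the long running computation.
                thread::sleep(Duration::from_millis(10));
            });
        }
    });
    seen.into_inner().unwrap().len()
}

fn main() {
    println!("distinct workers: {}", count_workers(4));
}
```

With rayon, comparing the size of the set against `RAYON_NUM_THREADS` after the `sum()` returns would show how many workers took part in the run.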
I have written an MWE here.
The MWE is best run with RAYON_NUM_THREADS=96 cargo run --release, which reproduces the problem about 75% of the time on my system with 8 logical cores.
When I increase the item count n in the iterator to threads * threads * 2, the bug no longer occurs.
However, it still occurs with n = threads * threads * 2 - 1.
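To put the numbers in perspective, the arithmetic can be sketched as below. The formula `threads * threads * 2` is the threshold observed in this report; the function name `observed_threshold` is made up for illustration, and nothing here claims to mirror rayon's internals.

```rust
/// Item count at which the report stopped observing idle cores.
/// The formula comes from the report; the name is hypothetical.
fn observed_threshold(threads: usize) -> usize {
    threads * threads * 2
}

fn main() {
    let threads = 96; // RAYON_NUM_THREADS used for the MWE
    let n = 320; // item count from the snippet

    let threshold = observed_threshold(threads);
    println!("threshold for {threads} threads: {threshold}");
    println!("n = {n} is below the threshold: {}", n < threshold);
    println!("largest n that still reproduced: {}", threshold - 1);
}
```

For 96 threads the threshold is 18432 items, so the 320 items in the snippet are far below it.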
Therefore, I think some logic related to this line may be responsible.
Is this expected behavior? Is it a bug?
I will look into it further; I just wanted to get a report out before I forget about it.
Kind regards,
ambiso