Description
This is intended to be a tracking issue to releasing nightly compilers with the ability to internally parallelize themselves but they are still defaulted to single threaded mode. This is part of the larger parallel compiler tracking issue, and is intended to be an incremental step towards fully closing that out.
A recent attempt to build binaires of the parallel compiler led to the thought of whether we could just enable a parallel compiler by default. Note that there are two axes we can change here over time:
- Whether or nor the compiler can be parallelized at all, aka whether it's built with the
--cfg parallel_compiler
flag. - Whether or not the compiler by default is parallelized, aka the default value of
-Z threads
The proposal in this issue is to default to -Z threads=1
(or the moral equivalent) but build nightly compilers with --cfg parallel_compiler
(or the equivalent thereof). The intention is to get us closer to shipping a parallel compiler while buying us time to continue to fix any issues that arise. This would allow, for example, for users to very easily test out parallel compilation locally by using RUSTFLAGS=-Zthreads=16
.
The main blocker for doing this is performance. Requested in a recent thread we realized it's imported to not watch the comparison of instruction counts but rather instead watch the wall time numbers. The instruction count numbers regress 2-3% which looks deceptively good, but the wall-time numbers regress 10-20% (ish) which is much more serious.
Some further investigation shows that most of the slowdown is likely coming from the use of mutexes (as opposed to other avenues like removing parallel code, the overhead of using rayon, or using Arc
instead of Rc
).
The next steps here would be to investigate whether we can recover the performance lost from using mutexes (probably if we can remove the mutexes one way or another).
This issue will likely receive many updates over time!