Opened on Nov 3, 2023
tl;dr: Introduce a simple mechanism for limiting parallelism automatically in Cargo, to avoid consuming all system resources during compilation.
Problem
Cargo by default uses all cores (`std::thread::available_parallelism`) and spawns `rustc` or build-script invocations onto each core. This is not an issue when compiling on a decent machine. When working on low-end machines or a large-scale codebase, developers often encounter issues like extremely high CPU load or out-of-memory errors.
To solve these problems, developers can set `--jobs` on the command line or `build.jobs` in `.cargo/config.toml` to control the maximum parallelism Cargo can use. This is not ideal because:

- `build.jobs` is bound to the number of cores in use. It is not directly correlated with memory usage. Parallel builds might run out of memory before any CPU throttling happens, especially when several linker invocations happen at once.
- `build.jobs` assigns cores fairly to each unit of work (i.e. a crate build). However, some crate builds consume more computing resources than others. If those builds are bottlenecks of the entire compilation, we might want to throw more resources at them to unblock other crate builds.
- Developers need to set `build.jobs` explicitly to control the parallelism, but finding a proper value is often a long trial-and-error process, and the right value varies across environments. Not really user friendly.
- Developers might want full control over every dependency build; `build.jobs` is too coarse-grained.
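For reference, the existing knob looks like this in `.cargo/config.toml` (the value `4` is just an example):

```toml
[build]
jobs = 4  # cap Cargo at 4 parallel jobs instead of all cores
```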
An "ideal" approach (but not now)
There are a couple of existing proposals trying to improve the situation. Some of them want to define a weight for a certain job, or tag jobs into groups. With weights and tags, the job scheduler understands whether it should start a job. This is pretty much the ideal solution, as it gives developers maximum control over parallelism, and the system could be extended toward job-scheduling optimization.
However, such a system requires developers to fully understand the entire compilation of their projects. For now, that data is either missing or hard to get from Cargo. To incrementally build the system, there are prerequisites:
- Cargo can monitor the resource usage of the system and each unit of work during a build.
- Cargo can persist the resource usage of each unit of work for each build.
Start small
We should start small and focus on monitoring resource usage, then limiting parallelism when the usage exceeds a threshold.
Some options we could pursue:
- Assign the maximum amount of resources that Cargo can use. This is how `build.jobs` works now; we might need an equivalent for memory usage. Something like:

  ```toml
  [build.limit]
  local-memory = "3GiB" # or "95%" or "100% - 200MiB"
  ```
- Set a system-wide threshold. Cargo won't allocate any new job until overall system usage goes back down, even when Cargo's own usage is still under its assigned maximum:

  ```toml
  [build.limit]
  system = "3GiB" # or "95%" or "100% - 200MiB"
  cpu = "100%"
  ```
To minimize the impact of bad data points, these metrics would be sampled and averaged over a period of time.
Instead of "usage", we could also leverage the concept of "load average" from Unix-like systems, which might make more sense for managing computing-resource loads.
I honestly don't know which one we want, or both, or neither.
Library to use
- `procfs` — used by the wider Rust web-dev community, via `prometheus` and other metrics crates.
- `sysinfo` — another popular crate for inspecting system info.

Both of them introduce an excessive amount of code Cargo doesn't need at this moment.
Alternatively, we can make the relevant syscalls directly to get this info.
Prior art
- Bazel
  - `--jobs`
  - `--local_{ram,cpu}_resources` to assign resources a build can use
- Buck
  - `--jobs`
  - `link_weight` to configure how many jobs a link job consumes
- Cabal
  - `--jobs`
  - Got the same linker invocation issue: "Add option to limit number of concurrent calls to linker when building with -j" haskell/cabal#1529
- CMake
  - `-j` to set the max number of concurrent processes
- GitHub Actions
  - Has `concurrency.group`
- Go
  - `go build -p` limits the number of programs, such as build commands or test binaries, that can be run in parallel.
  - `GOMAXPROCS` limits the number of OS threads that can execute user-level Go code simultaneously.
- Gradle
  - `--max-workers` — like `--jobs`
  - Has a `SharedResourceLeaseRegistry` for registering a resource with its maximum number of leases. Like a semaphore.
  - Parallelism can be configured per-project on demand
- make
  - `-j` to set the max number of concurrent jobs
  - `--max-load` to avoid starting a new job if the load average goes above the given value
  - Read the "Parallel" section of the manual for more
- Ninja
  - Has a pool concept: users can assign some stages of the build to a pool with more restricted parallelism rules
- Nix
- sbt
  - `tasks` are tagged, and each tag gets a default weight for resource restriction
Related issues
There are more issues regarding scheduling, but I don't want to link them all here. These are issues from people trying to tell Cargo not to be that greedy.
- Support cargo build --load-average / -l like GNU make #7480
- Hint mechanism to require more "slots" to build a crate #8405
- Avoid cargo throttling system with too many tasks on slower CPUs #8556
- Allow restricting the number of parallel linker invocations #9157
- Introduce 'nice' value under cargo.toml -> [build] #9250
- Cargo hits OOM when building many examples #11707
- cargo test renders device unresponsive due to MSVC linker RAM usage #12916
- Memory leak/spike during doc-tests #14190
And sorry for opening a new issue instead. Feel free to close this and move the discussion to any existing one.