Opened on Nov 3, 2023
tl;dr: Introduce a simple mechanism for limiting parallelism automatically in Cargo, to avoid consuming all system resources during compilation.
Problem
Cargo by default uses all cores (`std::thread::available_parallelism`) and spawns `rustc` or build-script invocations onto each core. This is not an issue when compiling on a decent machine. When working on low-end machines or a large-scale codebase, developers often encounter issues like extremely high CPU load or out-of-memory errors.
To solve these problems, developers can set `--jobs` on the command line or `build.jobs` in `.cargo/config.toml` to control the maximum parallelism Cargo can use. This is not ideal because:

- `build.jobs` is bound to the number of cores in use. It is not directly correlated with memory usage. Parallel builds might run out of memory before any CPU throttling happens, especially when several linker invocations happen at once.
- `build.jobs` assigns cores fairly to each unit of work (i.e. a crate build). However, some crate builds consume more computing resources than others. If those builds are bottlenecks of the entire compilation, we might want to throw more resources at them to unblock other crate builds.
- Developers need to set `build.jobs` explicitly to control the parallelism, but finding a proper value is often a long trial-and-error process, and the right value varies across environments. Not really user friendly.
- Developers might want full control over every dependency build; `build.jobs` is too coarse-grained.
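For reference, the existing knob looks like this in `.cargo/config.toml` (the value `4` is just an example):

```toml
[build]
jobs = 4  # cap Cargo at 4 parallel jobs instead of all cores
```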
An "ideal" approach (but not now)
There are a couple of existing proposals trying to improve the situation. Some of them want to define a weight for a certain job, or tag jobs into groups. With weights and tags, the job scheduler understands whether it should start a job. This is pretty much the ideal solution, as it gives developers maximum control over parallelism, and the system could be extended toward job-scheduling optimization.
However, such a system requires developers to fully understand the entire compilation of their projects. For now, that data is either missing or hard to get from Cargo. To incrementally build the system, there are prerequisites:
- Cargo can monitor the resource usage of the system and each unit of work during a build.
- Cargo can persist the resource usage of each unit of work for each build.
Start small
We should start small and focus on monitoring resource usage, then limiting parallelism when the usage exceeds a threshold.
Some options we could pursue:
- Assign the maximum amount of resources that Cargo can use. This is how `build.jobs` works now; we might need an equivalent for memory usage. Something like:

  ```toml
  [build.limit]
  local-memory = "3GiB" # or "95%" or "100% - 200MiB"
  ```
- Set a system-wide threshold. Cargo won't allocate any new job until overall system usage goes back down, even when Cargo's own usage is still under its assigned maximum:

  ```toml
  [build.limit]
  system = "3GiB" # or "95%" or "100% - 200MiB"
  cpu = "100%"
  ```
To minimize the impact of bad data points, these metrics would be sampled and averaged over a period of time.
Instead of "usage", we could also leverage the concept of "load average" from Unix-like systems, which might make more sense for managing computing-resource loads.
I honestly don't know which one we want, or both, or neither.
Library to use
- `procfs` — used by the wider Rust web-dev community, via `prometheus` and other metrics crates.
- `sysinfo` — another popular crate for inspecting system info.

Both of them introduce an excessive amount of code Cargo doesn't need at this moment.
Alternatively, we can make the relevant syscalls directly to get this info.
Prior art
- Bazel
  - `--jobs`
  - `--local_{ram,cpu}_resources` to assign resources a build can use
- Buck
  - `--jobs`
  - `link_weight` to configure how many jobs a link job consumes
- Cabal
  - `--jobs`
  - Got the same linker invocation issue: "Add option to limit number of concurrent calls to linker when building with -j" haskell/cabal#1529
- CMake
  - `-j` to set the max number of concurrent processes
- GitHub Actions
  - Has `concurrency.group`
- Go
  - `go build -p` limits the number of programs, such as build commands or test binaries, that can be run in parallel.
  - `GOMAXPROCS` limits the number of OS threads that can execute user-level Go code simultaneously.
- Gradle
  - `--max-workers` — like `--jobs`
  - Has a `SharedResourceLeaseRegistry` for registering a resource with its maximum number of leases. Like a semaphore.
  - Parallelism can be configured per-project on demand
- make
  - `-j` to set the max number of concurrent jobs
  - `--max-load` to avoid starting a new job if the load average goes above the given value
  - Read the "Parallel" section of the manual for more
- Ninja
  - Has a pool concept: users can assign some stages of the build to a pool with more restricted parallelism rules
- Nix
- sbt
  - `tasks` are tagged, and each tag gets a default weight for resource restriction
Related issues
There are more issues regarding scheduling, but I don't want to link them all here. These are issues from people trying to tell Cargo not to be that greedy.
- Support cargo build --load-average / -l like GNU make #7480
- Hint mechanism to require more "slots" to build a crate #8405
- Avoid cargo throttling system with too many tasks on slower CPUs #8556
- Allow restricting the number of parallel linker invocations #9157
- Introduce 'nice' value under cargo.toml -> [build] #9250
- Cargo hits OOM when building many examples #11707
- cargo test renders device unresponsive due to MSVC linker RAM usage #12916
- Memory leak/spike during doc-tests #14190
And sorry for opening a new issue instead. Feel free to close this and move the discussion to any existing one.