Help users discover overly broad rerun-if-changed directives #14240
Description
Maintainers notes
- Root problem was an overly broad rerun-if-changed directive, see #14240 (comment)
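For context, here is a minimal sketch of the difference between a broad and a narrow directive, assuming a hypothetical build script. Per the Cargo documentation, when a rerun-if-changed path points to a directory, Cargo scans the entire directory for modifications, so a path that resolves to a repository root pulls in everything beneath it. The paths and the helper name rerun_directives are illustrative assumptions, not taken from the affected repo.

```rust
// build.rs (hypothetical): emit one directive per input the script really reads.

/// Format a `rerun-if-changed` directive for each path.
fn rerun_directives(paths: &[&str]) -> Vec<String> {
    paths
        .iter()
        .map(|p| format!("cargo:rerun-if-changed={p}"))
        .collect()
}

fn main() {
    // Overly broad (avoid): a directory path makes Cargo scan everything under it,
    // e.g. `cargo:rerun-if-changed=../..` from a crate nested two levels deep.

    // Narrow: only the files this build script actually depends on.
    // Both paths are illustrative assumptions.
    for line in rerun_directives(&["build.rs", "proto/api.proto"]) {
        println!("{line}");
    }
}
```

Listing individual files (or a small, dedicated directory) keeps the fingerprint check proportional to the crate's real inputs rather than to the size of the surrounding repo.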
Problem
At work we have a rather large monorepo, of which Rust (currently) makes up a very small part.
Some quick numbers from cloc --vcs git: ~305k files (~45M LOC), of which ~400 files (~80k LOC) are Rust.
We've noticed that cargo check --workspace --manifest-path ./Cargo.toml is quite slow: disproportionately slow compared to the summed time of running cargo check on each crate individually.
Here are examples from running the commands locally:
$ time cargo check -p crate1
$ time cargo check -p crate2
...
Sum: 32.377 total
$ time cargo check --workspace --manifest-path ./Cargo.toml
4.26s user 43.64s system 59% cpu 1:21.14 total
$ time cargo check --manifest-path ./Cargo.toml -p crate1 -p crate2 ...
3.89s user 43.02s system 58% cpu 1:20.16 total
After some investigation, it looks like cargo is scanning the entire repo in the last two cases. When inspecting which files cargo reads, I can see it accessing files and folders that are not among those defined in the workspace's members array and that contain no Rust code.
Running sudo fs_usage | grep cargo, I can see it accessing:
- our "frontend" folder, scanning through all of its node_modules, and
- bazel's output folders (which are symlinked by bazel into the repo root), which are all build artifacts.
These two cases were actually what first triggered us to investigate. We had reports from several engineers about slowness, and we noticed two things:
- the more bazel builds they do, the slower cargo gets, and it looks like it's because more builds = more files in the bazel output folders = more crawl time.
- the more npm packages that are installed, the slower cargo gets; similarly, more installs = more node_modules = more crawl time.
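To see which directories dominate the crawl, a rough sketch like the following counts the entries under each directory passed on the command line. It only illustrates the "more files = more crawl time" relationship; it is not how cargo itself walks the tree, and count_entries is a hypothetical helper name.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Recursively count non-directory entries under `root`.
/// Symlinks are counted as entries but not followed, which avoids cycles.
fn count_entries(root: &Path) -> io::Result<u64> {
    let mut total = 0;
    for entry in fs::read_dir(root)? {
        let entry = entry?;
        if entry.file_type()?.is_dir() {
            total += count_entries(&entry.path())?;
        } else {
            total += 1;
        }
    }
    Ok(total)
}

fn main() -> io::Result<()> {
    // Pass the directories to measure as arguments (e.g. a node_modules folder).
    for arg in std::env::args().skip(1) {
        println!("{arg}: {} entries", count_entries(Path::new(&arg))?);
    }
    Ok(())
}
```

Running it against the frontend folder and the resolved bazel output folders before and after a build or install should show the entry count growing along with the cargo check time.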
This is not great for us because it means that certain operations can take 3 times longer than they should. It also impacts IDE startup times, as rust-analyzer runs cargo check --workspace --message-format=json-diagnostic-rendered-ansi --manifest-path /absolute/path/to/Cargo.toml as part of its startup.
Steps
No response
Possible Solution(s)
I'd love a way to tell cargo "please don't scan anything - all you should care about is located right here in the members list" so that we can avoid scanning irrelevant folders.
Alternatively, making cargo respect .gitignore during its scan would also be good, as it would mean that folders like node_modules and the bazel folders would be skipped entirely.
Notes
All of our dependencies are sourced from crates.io
We do not use any git links or patches
We do have workspace interdependencies
Please let me know if there are any log files, etc. that would help you investigate. I'm happy to provide whatever info I can.
Version
cargo 1.78.0 (54d8815d0 2024-03-26)
release: 1.78.0
commit-hash: 54d8815d04fa3816edc207bbc4dd36bf18014dbc
commit-date: 2024-03-26
host: aarch64-apple-darwin
libgit2: 1.7.2 (sys:0.18.2 vendored)
libcurl: 8.6.0 (sys:0.4.72+curl-8.6.0 system ssl:(SecureTransport) LibreSSL/3.3.6)
ssl: OpenSSL 1.1.1w 11 Sep 2023
os: Mac OS 14.5.0 [64-bit]