
Discussion of default settings for build dependencies for fastest compile times #10481

Open
@lqd

Problem

There may be new settings and default values that could improve compile times for build dependencies.

Proposed Solution

In particular: turning off debuginfo, stripping debuginfo (a smaller win), and turning off incremental compilation for local build dependencies.

Notes

As discussed in this zulip thread, we've been looking at compile times over a wider range of the crates.io ecosystem.

Looking at cargo schedules, there seem to be quite a few projects where build scripts, proc-macros, and their dependencies are on the critical path. That is why I'm opening this thread: to discuss the default settings used to build the "for host" nodes.

There were already great insights about these a few years ago: cargo focuses on fast compile times rather than raw throughput of the built artifacts, leading to the current defaults for build dependencies.

I'd like to specifically discuss these in the dev.build-override profile:

  • -Cdebuginfo=2: is this removable? I have never tried running a build script or proc-macro in a debugger myself (and I'd suspect the latter to be hindered by the proc macro bridge + rustc execution model), but removing it would impact backtraces and lineinfo if panics happen (though I'm not sure how compile errors generated by panicking proc-macros look under a different debuginfo level).
  • -Cstrip=debuginfo: on the targets where the linker is aware of this (rather than stripping after the fact), this can be a small win (e.g. on build scripts). This one could arguably be added to the release profile as well if for some reason we wanted to be as aggressive as possible, but in the release builds I've tried it is probably slightly better at most, and within noise.
  • -Cincremental: this looks removable for local build-dependencies. Incremental compilation is a 30%-50% pessimization that is amortized by building things more than once. This is already tracked in this cargo issue about build scripts: removing it would improve them slightly. Build scripts are an interesting node: they are usually small, so absolute improvements would be small. However, they are somewhat slow to build even for simple scripts, and they take up a couple of spots in the cargo schedule: when they are on the critical path they can slow things down by serializing subgraphs and delaying dependent units of work (e.g. removing syn's and proc-macro2's simple build scripts can improve from-scratch compile times by up to 12%, even though their own build time is lower than that). This could also improve local proc-macros a bit, but it's not clear-cut (or noticeable in my tests below, but I don't believe you could have local proc-macros published on crates.io anyway).
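
For reference, the three settings above map onto existing profile keys. The following is only a minimal sketch of how a user could opt in to them by hand in their own Cargo.toml, assuming a Cargo version that supports the `strip` profile option and that honors `incremental` inside an override. It is not the prototype itself, which changes Cargo's built-in defaults instead.

```toml
# Sketch: opting in to the discussed settings manually, rather than via new defaults.
[profile.dev.build-override]
debug = false         # no debuginfo, instead of -Cdebuginfo=2
strip = "debuginfo"   # corresponds to -Cstrip=debuginfo
incremental = false   # no -Cincremental for build scripts, proc-macros, and their dependencies
```

Since these are ordinary profile settings, a project that needs different trade-offs can override any of them individually.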

I've tried to validate these in the real world:

  • since this is quite simple to try, I've made a prototype of this
  • did a perf.rlo run to "ensure" there are no surprising regressions (but perf.rlo itself only tracks leaf crates' build times in most situations)
  • and benchmarked it locally over our extended set of crates (780 or so of the most popular crates on crates.io), with check and debug builds. There's a readme in there with more info, simple summaries and hyperfine results for each crate. (Note that these benchmarks were done with the slightly more aggressive "symbols" stripping, but it seems more sensible to only strip "debuginfo" in the odd case a panic happens. That's a 1% improvement at best anyways. I have not measured stripping in release builds either)

These wins do seem to translate to real-world use. In the cases we're interested in, heavy build dependencies on the critical path, the effect is quite noticeable: a lot of nice improvements with a tight confidence interval, both for check builds and debug builds. The regressions are smaller in magnitude and have a wide confidence interval: there are a few super short benchmarks (20-30% of all the crates), where variance in timing has more impact on the results (both improvements and regressions).

(Note that hyperfine's summaries are speedup ratios, not percentage changes. For example, snafu's results are "1.24 times faster", which is a "1 / 1.24" change: around -20% and not -24%. Likewise, hyperfine will report "2 times faster" to mean -50%.)

These overall look like an improvement to me. For check builds at j12, around 300 of the 780 crates show a >= 3% mean improvement, 175 show >= 5%, 115 >= 10%, 50 >= 13%, etc. For debug builds at j8, 215 show a >= 5% mean improvement.
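
To make the hyperfine conversion above explicit, here is the arithmetic written out. The symbols are my own shorthand, not hyperfine's: s is the reported "times faster" ratio and Δ is the relative change in build time.

```latex
\[
\Delta = \left(\frac{1}{s} - 1\right) \times 100\%
\]
\[
s = 1.24:\quad \Delta = \left(\frac{1}{1.24} - 1\right) \times 100\% \approx -19.4\%
\qquad
s = 2:\quad \Delta = \left(\frac{1}{2} - 1\right) \times 100\% = -50\%
\]
```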

Possible problems:

  • lack of debuggability? Documentation would be needed to show which settings to choose for maximum debuggability of build dependencies when needed, rather than fastest compile times
  • backtraces and panics?
  • stripping may be an overhead on platforms where it's done after the fact (macOS?)
  • less reuse when a dependency is shared between a build dependency and a regular dependency: the different defaults will require another compilation. (That already happens today, but only when users opt in by changing the debuginfo level, panic strategy, etc.)

Alternatives:

  • having debuginfo at level 1 instead of 2? Most of the wins come from the lack of debuginfo, so level 1 could be a compromise between speed and debuggability.
  • since all these are just default values and easily overridable by users, we could also just document these values without changing the defaults. (I'd probably prefer these defaults as the "pit of success" for the common cases and faster compiles, but could have easily missed important use-cases making this change unacceptable)
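
If the defaults did change, users who need to debug build dependencies could set the values back themselves. A minimal sketch of such an override, assuming the same profile keys as today's Cargo (`debug`, `strip`), and using the level 1 compromise from the first alternative:

```toml
# Sketch: trading some compile time back for debuggability of build dependencies.
[profile.dev.build-override]
debug = 1        # level 1 debuginfo; use `debug = true` (level 2) for full debuginfo
strip = "none"   # keep debuginfo and symbols so backtraces from panics stay useful
```

This is the kind of snippet the documentation mentioned under "Possible problems" could include.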

I have been working on a PR, and am currently still updating cargo unit tests (especially the ones related to the expectation of reuse, and some debugging is needed in the examples scraping tests).

That means I won't be done before today's t-cargo meeting, so I'm opening this issue as @joshtriplett suggested, in case this could be a topic at the meeting (not that it's particularly urgent or anything).


Labels: C-feature-request (Category: proposal for a feature. Before PR, ping rust-lang/cargo if this is not `Feature accepted`), Performance (Gotta go fast!), S-triage (Status: This issue is waiting on initial triage.)
