Do not store LLVM bitcode in RLIBs by default #66961

Closed
@michaelwoerister

@nnethercote asked me to write up a more detailed plan for getting rid of LLVM bitcode in RLIBs, so here goes:

Current situation

  • We store LLVM bitcode in every RLIB and Rust dylib so that rustc can
    perform cross-crate LTO.
  • This has quite a bit of a cost in terms of compile times and file sizes (see here).
  • Cross-crate LTO is a niche use case, and the current approach makes it costlier
    than necessary too (because the machine code is then duplicated).
  • When doing xLTO (cross-language LTO), the shortcomings of the current
    implementation become especially obvious because we then have two basically
    identical LLVM bitcode files for each module (modulo compression).
  • Embedding bitcode also causes the incremental compilation cache to be much
    bigger because we have to keep both object files and bitcode files in the cache.

Proposed Solution

The proposed solution is to follow Clang's model: when compiling for cross-crate
LTO, no machine code is generated and the .o file is actually LLVM bitcode.
In addition, there are special "fat" object files that contain regular machine
code and, in a special section, uncompressed LLVM bitcode. These fat objects
would mainly be used for the standard library.
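
To make the "the .o file is actually LLVM bitcode" point concrete, here is a minimal sketch of how a tool could tell the two kinds of file apart from their leading bytes. The names `ObjectKind` and `classify_object` are made up for illustration and are not rustc APIs. The sketch only recognizes raw bitcode and ELF, and it does not look inside an ELF file for an embedded bitcode section.

```rust
/// What a `.o` file holds, judged only by its first four bytes.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq, Eq)]
pub enum ObjectKind {
    /// Raw LLVM bitcode: magic bytes 'B' 'C' 0xC0 0xDE.
    Bitcode,
    /// An ELF object file: magic bytes 0x7F 'E' 'L' 'F'.
    /// It may or may not be "fat"; this sketch does not check sections.
    Elf,
    /// Anything else (Mach-O, COFF, WASM, ...), not inspected here.
    Other,
}

/// Classify the contents of an object file by its magic number.
pub fn classify_object(bytes: &[u8]) -> ObjectKind {
    const BITCODE_MAGIC: [u8; 4] = [b'B', b'C', 0xC0, 0xDE];
    const ELF_MAGIC: [u8; 4] = [0x7F, b'E', b'L', b'F'];

    if bytes.starts_with(&BITCODE_MAGIC) {
        ObjectKind::Bitcode
    } else if bytes.starts_with(&ELF_MAGIC) {
        ObjectKind::Elf
    } else {
        ObjectKind::Other
    }
}
```

Telling a "fat" object from a plain one would additionally require parsing the section table, which is left out here.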

The consequences of this approach are:

  • Code compiled by the user (i.e. everything except libstd) would only be either
    machine code or LLVM bitcode.
  • When compiling an RLIB rustc needs to know if it is intended to be used for
    LTO or not. Thus Cargo needs to invoke rustc differently.
  • We save quite a bit of space and time in the common case.
  • Because the "fat" object files are a standard LLVM thing, libstd can partake
    in LTO steps performed by the LLVM linker plugin.
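
The consequences above amount to a three-way choice per crate. The following is a rough model of that choice, assuming the build tool knows up front whether a crate is the standard library and whether it is being built for LTO. `ObjectFormat`, `CrateBuild`, and `object_format` are hypothetical names, not part of rustc or Cargo.

```rust
/// The three kinds of object file described in the proposal.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ObjectFormat {
    /// Regular machine code only: the common, non-LTO case.
    MachineCode,
    /// LLVM bitcode only: user code compiled for cross-crate LTO.
    Bitcode,
    /// Machine code plus bitcode in a special section.
    Fat,
}

/// What the build tool knows when it invokes the compiler (assumed inputs).
#[derive(Debug, Clone, Copy)]
pub struct CrateBuild {
    pub is_std: bool,
    pub for_lto: bool,
}

/// Pick the object format for one crate under the proposed model.
pub fn object_format(build: CrateBuild) -> ObjectFormat {
    if build.is_std {
        // libstd ships once and must serve both LTO and non-LTO users.
        ObjectFormat::Fat
    } else if build.for_lto {
        ObjectFormat::Bitcode
    } else {
        ObjectFormat::MachineCode
    }
}

/// Whether an object of this format can take part in LTO.
pub fn has_bitcode(format: ObjectFormat) -> bool {
    matches!(format, ObjectFormat::Bitcode | ObjectFormat::Fat)
}
```

In this model, user code never pays for both representations. Only the precompiled standard library does.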

Open questions

  • Do we need to keep things backwards compatible by defaulting to "fat" object files? If the new default is to not store LLVM bitcode in RLIBs, then non-Cargo build systems would start to silently not do LTO.
    • My preference would be to make no-bitcode the new default, but make the compiler emit a warning if it wants to do LTO but encounters rlibs without bitcode.
  • Should rustc be able to transparently handle bitcode-only RLIBs in the non-LTO case?
    • My preference would be yes. Possibly also add a warning that this is unexpected?
  • LLVM's WASM implementation doesn't seem to support "fat" object files. I don't know if this is a fundamental restriction. I guess WASM projects are actually more likely to use LTO than regular projects. If WASM doesn't support custom sections, then we could also keep the current setup of having LLVM bitcode in a separate file in the RLIB archive. Maybe @alexcrichton or @fitzgen know more about WASM object files?
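
If the first two questions were resolved according to the preferences stated above, the compiler's behavior when it encounters an rlib could look roughly like this. This is a sketch of the decision table only: `RlibContents`, `Outcome`, `check_rlib`, and the warning texts are all invented for illustration.

```rust
/// What an rlib was found to contain. (Hypothetical type.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RlibContents {
    MachineCodeOnly,
    BitcodeOnly,
    Fat,
}

/// Result of checking one rlib against the current compilation mode.
#[derive(Debug, PartialEq, Eq)]
pub enum Outcome {
    Proceed,
    /// Keep going, but tell the user something looks off.
    ProceedWithWarning(&'static str),
}

/// Decide how to react to an rlib, given whether LTO was requested.
pub fn check_rlib(doing_lto: bool, contents: RlibContents) -> Outcome {
    match (doing_lto, contents) {
        // LTO was requested but this rlib cannot take part in it.
        (true, RlibContents::MachineCodeOnly) => Outcome::ProceedWithWarning(
            "LTO requested, but this rlib contains no LLVM bitcode",
        ),
        // Handled transparently, yet probably not what the user intended.
        (false, RlibContents::BitcodeOnly) => Outcome::ProceedWithWarning(
            "bitcode-only rlib used in a non-LTO build",
        ),
        // Every other combination is the expected case.
        _ => Outcome::Proceed,
    }
}
```

Both unusual cases continue rather than fail, which matches the preference for a warning over a hard error. Whether the second case should warn at all is exactly what the second question leaves open.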

Steps to get there

That actually depends on how we resolve the open questions above.

cc #66598

Metadata

    Labels

      • C-enhancement: Category: An issue proposing an enhancement or a PR with one.
      • I-compiletime: Issue: Problems and improvements with respect to compile times.
      • I-heavy: Issue: Problems and improvements with respect to binary size of generated code.
      • T-compiler: Relevant to the compiler team, which will review and decide on the PR/issue.
