@nnethercote asked me to write up a more detailed plan for getting rid of LLVM bitcode in RLIBs, so here goes:
Current situation
- We store LLVM bitcode in every RLIB and Rust dylib so that rustc can perform cross-crate LTO.
- This has quite a bit of a cost in terms of compile times and file sizes (see here).
- Cross-crate LTO is a niche use case and the current approach makes that costlier than necessary too (because then machine code is duplicated).
- When doing xLTO the shortcomings of the current implementation become especially obvious because then we have two basically identical LLVM bitcode files for each module (modulo compression).
- Embedding bitcode also causes the incremental compilation cache to be much bigger because we have to keep object files and bitcode files in the cache.
Proposed Solution
The proposed solution is to follow Clang's model: When compiling for cross-crate
LTO, no machine code is generated and the .o file is actually LLVM bitcode.
There are special "fat" object files that contain regular machine code and
additionally, in a special section, uncompressed LLVM bitcode. These fat objects
would mainly be used for the standard library.
The consequences of this approach are:
- Code compiled by the user (i.e. everything except libstd) would only be either machine code or LLVM bitcode.
- When compiling an RLIB, rustc needs to know if it is intended to be used for LTO or not. Thus Cargo needs to invoke rustc differently.
- We save quite a bit of space and time in the common case.
- Because the "fat" object files are a standard LLVM thing, libstd can partake in LTO steps performed by the LLVM linker plugin.
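
To make the three kinds of output concrete, here is a minimal sketch of the proposed model in Rust. It is not rustc code: `ObjectContents`, `CrateBuild`, and `object_contents` are hypothetical names invented for illustration, and the sketch assumes that only the standard library gets "fat" objects, where the proposal only says they would mainly be used there.

```rust
/// What a compiled object file contains under the proposed model.
/// (Hypothetical type, for illustration only.)
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum ObjectContents {
    /// Regular machine code only (the common, non-LTO case).
    MachineCode,
    /// LLVM bitcode only; the `.o` file is really a bitcode file.
    Bitcode,
    /// Machine code plus uncompressed LLVM bitcode in a special section.
    Fat,
}

/// Hypothetical description of a single crate compilation.
#[derive(Debug, Clone, Copy)]
pub struct CrateBuild {
    /// True for the standard library, which is shipped precompiled.
    pub is_std: bool,
    /// True if the build system asked for an RLIB intended for LTO.
    pub intended_for_lto: bool,
}

/// Decide what the object files of a crate should contain.
/// Assumption: only the standard library is built as "fat" objects.
pub fn object_contents(build: CrateBuild) -> ObjectContents {
    if build.is_std {
        ObjectContents::Fat
    } else if build.intended_for_lto {
        ObjectContents::Bitcode
    } else {
        ObjectContents::MachineCode
    }
}
```

The point of the sketch is that user crates never carry both representations at once, which is where the space and time savings in the common case would come from.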
Open questions
- Do we need to keep things backwards compatible by defaulting to "fat" object files? If the new default is to not store LLVM bitcode in RLIBs, then non-cargo build systems would start to silently not do LTO.
  - My preference would be to make no-bitcode the new default, but make the compiler emit a warning if it wants to do LTO but encounters rlibs without bitcode.
- Should rustc be able to transparently handle bitcode-only RLIBs in the non-LTO case?
  - My preference would be yes. Possibly also add a warning that that is unexpected?
- LLVM's WASM implementation doesn't seem to support "fat" object files. I don't know if this is a fundamental restriction. I guess WASM projects are actually more likely to use LTO than regular projects. If WASM doesn't support custom sections, then we could also keep the current setup of having LLVM bitcode in a separate file in the RLIB archive. Maybe @alexcrichton or @fitzgen know more about WASM object files?
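
The preferences stated for the first two questions can be read as a small decision table. The following Rust sketch spells that table out under the assumption that both preferences are adopted. All names (`RlibContents`, `LinkAction`, `link_action`) are hypothetical and do not correspond to anything in rustc. Treating "fat" objects as plain machine code in the non-LTO case is also an assumption of the sketch.

```rust
/// What an upstream RLIB contains. (Hypothetical type, for illustration.)
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum RlibContents {
    MachineCode,
    Bitcode,
    Fat,
}

/// What the compiler would do with an upstream RLIB at link time.
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum LinkAction {
    /// Link the machine code as-is.
    UseMachineCode,
    /// Feed the bitcode into the LTO pipeline.
    UseBitcodeForLto,
    /// LTO was requested but there is no bitcode: link machine code and warn.
    UseMachineCodeWithWarning,
    /// Non-LTO build met a bitcode-only RLIB: generate machine code and warn.
    CodegenFromBitcodeWithWarning,
}

/// Map (LTO requested?, RLIB contents) to an action, following the
/// preferences stated in the open questions above.
pub fn link_action(lto_requested: bool, contents: RlibContents) -> LinkAction {
    match (lto_requested, contents) {
        (true, RlibContents::Bitcode) | (true, RlibContents::Fat) => {
            LinkAction::UseBitcodeForLto
        }
        (true, RlibContents::MachineCode) => LinkAction::UseMachineCodeWithWarning,
        (false, RlibContents::MachineCode) | (false, RlibContents::Fat) => {
            LinkAction::UseMachineCode
        }
        (false, RlibContents::Bitcode) => LinkAction::CodegenFromBitcodeWithWarning,
    }
}
```

Only the two mismatched combinations produce a warning, so a non-cargo build system that keeps requesting LTO would no longer lose it silently.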
Steps to get there
That actually depends on how we resolve the open questions above.
cc #66598