Closed
Description
Couldn't find a previous issue on this, so I'd like to open a tracking issue for this. We've known this for a long time, but the metadata format for the compiler is far too large and there are surely methods to shrink its size and impact. Today when I compile librustc
, I get the following numbers:
librustc.rlib
- 64MBrustc.o
- 12MBrust.metadata.bin
- 32MBrustc.0.bytecode.deflate
- 21MB
This means that the metadata is three times as large as the code we're generating. Another statistic is that 36% of the binary data of the nightly is metadata.
There are, however, a number of competing concerns around metadata:
- Reading metadata needs to be fast. Rustc reads a lot of metadata for upstream crates, and it needs to quickly be able to read the minimum set of metadata for crates. This is currently achieved by storing metadata uncompressed in rlibs to allow LLVM to mmap it directly into the address space and page it in for reading.
- Metadata needs to be fairly free-form to allow encoding various types of data into it. Ideally it's also extensible into the future so at some point we can use newer compilers against older libraries (this currently not possible for other reasons).
- All libraries need metadata available in them (currently). This means that if a library produces a dylib/rlib pair (the stdlib is one of the few that does this) then the metadata is duplicated among artifacts. It also means that metadata must be suitable to place inside of a dynamic library.
There are a few open issues on this already, but none of them are necessarily a silver bullet. Here's a smattering of wishlist ideas or various strategies.
- Changing metadata formats may gain us a win. Currently we're using a "variant" of EBML. Convert extra::ebml to extra::atom_trees #9303
- Move metadata to a separate crate for faster iteration, allowing external tools to inspect, or in general improving the overall quality of the code. Move metadata handling into its own crate #2213
- Being able to actually inspect metadata would probably give a helpful hand into what can be eliminated. Add a raw pretty printer for the metadata format #2326
More will likely be added to this over time as it's a metabug.