ThinLTO bloats size of bare metal programs by up to 1200% #47770

Closed
@japaric

Original report: japaric/stm32f103xx-hal#44

STR (steps to reproduce)

$ git clone https://github.com/japaric/stm32f103xx-hal

$ cd stm32f103xx-hal

$ git checkout 9f80811a6026f0849ce2afee5baf0cf086563b65

$ xargo build --example reactive-serial-circ --release
(..)
    Finished release [optimized + debuginfo] target(s) in 13.52 secs
(..)
    Finished release [optimized + debuginfo] target(s) in 2.24 secs
(..)
    Finished release [optimized + debuginfo] target(s) in 38.37 secs

$ arm-none-eabi-size target/thumbv7m-none-eabi/release/examples/reactive-serial-circ
   text    data     bss     dec     hex filename
  12154       8      20   12182    2f96 (..)

$ xargo bloat --example reactive-serial-circ --release
File  .text   Size Name
0.0%   2.3%   170B [15 Others]
0.4%  18.9% 1.4KiB core::fmt::Formatter::pad_integral
0.2%  12.9%   962B core::fmt::Formatter::pad
0.2%   9.7%   720B stm32f103xx_hal::rcc::CFGR::freeze
0.2%   9.0%   672B core::str::slice_error_fail
0.1%   7.3%   544B core::fmt::write
0.1%   7.2%   538B <char as core::fmt::Debug>::fmt
0.1%   5.9%   440B _ZN20reactive_serial_circ4main17h416bbf939f973349E.llvm.E85F65E9
0.1%   4.6%   340B cortex_m::peripheral::nvic::<impl cortex_m::peripheral::NVIC>::enable
0.1%   3.6%   266B _ZN4core3fmt10ArgumentV110show_usize17h636fea4f06967745E.llvm.EA6B2883
0.1%   3.6%   266B core::fmt::num::<impl core::fmt::Display for usize>::fmt
0.1%   3.6%   266B core::fmt::num::<impl core::fmt::Debug for usize>::fmt
0.0%   2.4%   182B _ZN4core12char_private5check17h7f91014bbb6a3313E.llvm.9871CEB7
0.0%   1.6%   118B DMA1_CHANNEL5
0.0%   1.4%   106B cortex_m_rt::reset_handler
0.0%   1.2%    92B core::result::unwrap_failed
0.0%   1.0%    78B core::slice::slice_index_order_fail
0.0%   1.0%    78B core::slice::slice_index_len_fail
0.0%   1.0%    74B core::panicking::panic_bounds_check
0.0%   0.9%    70B <core::ops::range::Range<Idx> as core::fmt::Debug>::fmt
0.0%   0.8%    60B core::panicking::panic
1.9% 100.0% 7.3KiB .text section size, the file size is 376.2KiB

Changing profile.release.codegen-units to 1 (as suggested in other issues) does not help. Fully
disabling ThinLTO by passing -Z thinlto=no to rustc (is there a Cargo.toml setting for that?) fixes
the binary size problem without meaningfully affecting compilation speed (if anything, it's
slightly faster):

$ cat .cargo/config
[target.thumbv7m-none-eabi]
runner = 'arm-none-eabi-gdb'
rustflags = [
  "-C", "link-arg=-Tlink.x",
  "-C", "linker=arm-none-eabi-ld",
  "-Z", "linker-flavor=ld",
  "-Z", "thinlto=no", # <-
]

[build]
target = "thumbv7m-none-eabi"

$ xargo build --example reactive-serial-circ --release
(..)
    Finished release [optimized + debuginfo] target(s) in 12.69 secs
(..)
    Finished release [optimized + debuginfo] target(s) in 1.98 secs
(..)
    Finished release [optimized + debuginfo] target(s) in 38.54 secs

$ arm-none-eabi-size target/thumbv7m-none-eabi/release/examples/reactive-serial-circ
   text    data     bss     dec     hex filename
    890       8      20     918     396 (..)

$ xargo bloat --example reactive-serial-circ --release
File  .text Size Name
0.0%   0.0%   0B [0 Others]
0.1%  37.2% 250B reactive_serial_circ::init
0.1%  29.2% 196B cortex_m_rt::reset_handler
0.0%  16.1% 108B DMA1_CHANNEL5
0.0%   1.5%  10B core::result::unwrap_failed
0.0%   1.5%  10B SVCALL
0.0%   1.5%  10B MEM_MANAGE
0.0%   1.5%  10B DEBUG_MONITOR
0.0%   1.5%  10B PENDSV
0.0%   1.5%  10B USAGE_FAULT
0.0%   1.5%  10B SYS_TICK
0.0%   1.5%  10B NMI
0.0%   1.5%  10B DEFAULT_HANDLER
0.0%   1.5%  10B BUS_FAULT
0.0%   1.5%  10B HARD_FAULT
0.0%   0.6%   4B core::panicking::panic_fmt
0.0%   0.6%   4B cortex_m_rt::default_handler
0.3% 100.0% 672B .text section size, the file size is 261.6KiB
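
For reference, the size difference between the two builds can be worked out from the arm-none-eabi-size reports above. The following is only a minimal sketch: the helper name growth_percent is hypothetical (it is not part of any tool used in this report), and the numbers are copied from the text and dec columns of the two reports.

```rust
/// Percentage growth of `bloated` relative to `baseline`.
/// (Hypothetical helper, introduced only for this illustration.)
fn growth_percent(baseline: u32, bloated: u32) -> f64 {
    (f64::from(bloated) - f64::from(baseline)) / f64::from(baseline) * 100.0
}

fn main() {
    // `text` and `dec` columns copied from the two arm-none-eabi-size reports.
    let (text_thinlto, dec_thinlto) = (12154u32, 12182u32);
    let (text_no_thinlto, dec_no_thinlto) = (890u32, 918u32);

    println!(
        "text: {:.1}x larger (+{:.0}%)",
        f64::from(text_thinlto) / f64::from(text_no_thinlto),
        growth_percent(text_no_thinlto, text_thinlto)
    );
    println!(
        "dec:  {:.1}x larger (+{:.0}%)",
        f64::from(dec_thinlto) / f64::from(dec_no_thinlto),
        growth_percent(dec_no_thinlto, dec_thinlto)
    );
}
```

Both columns grow by more than a factor of 13 with ThinLTO enabled, which is consistent with the roughly 1200% figure in the title.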

Meta

$ rustc -Vv
rustc 1.25.0-nightly (ae920dcc9 2018-01-22)

Given the severity of the issue, I'm going to recommend that my users disable ThinLTO, as I already
do with parallel codegen and incremental compilation.

cc @alexcrichton @aturon
cc #47745

Labels

I-slow (Issue: Problems and improvements with respect to performance of generated code.)
