Overhead of core::sync::atomic::* primitives in dev builds #68208
Description
The primitives provided by core::sync::atomic come with a binary-size overhead in dev builds, which also partly contradicts the documentation. For example, the compiler_fence documentation states:
compiler_fence does not emit any machine code
However, when we compile a trivial program for an embedded target:
#![no_main]
#![no_std]

use panic_halt as _;
use cortex_m_rt::entry;

#[entry]
fn main() -> ! {
    loop {
        continue;
    }
}
e.g. for thumbv6m-none-eabi, we can observe that we get not just one but two compiler_fence instances:
Analyzing target/thumbv6m-none-eabi/debug/examples/empty
File  .text   Size  Crate        Name
0.1%  11.3%   74B   cortex_m_rt  r0::init_data
0.0%  9.8%    64B   cortex_m_rt  core::sync::atomic::compiler_fence
0.0%  9.8%    64B   panic_halt   core::sync::atomic::compiler_fence
0.0%  8.6%    56B   cortex_m_rt  r0::zero_bss
0.0%  7.3%    48B   cortex_m_rt  Reset
0.0%  6.4%    42B   cortex_m_rt  core::ptr::read
0.0%  5.5%    36B   std          core::panicking::panic
0.0%  4.3%    28B   [Unknown]    __aeabi_memcpy4
0.0%  4.3%    28B   std          core::panicking::panic_fmt
0.0%  4.0%    26B   cortex_m_rt  core::intrinsics::copy_nonoverlapping
0.0%  3.4%    22B   cortex_m_rt  core::ptr::<impl *const T>::offset
0.0%  3.4%    22B   cortex_m_rt  core::ptr::<impl *mut T>::offset
0.0%  3.1%    20B   cortex_m_rt  HardFault_
0.0%  3.1%    20B   panic_halt   rust_begin_unwind
0.0%  2.8%    18B   cortex_m_rt  DefaultHandler_
0.0%  2.8%    18B   [Unknown]    __aeabi_memcpy
0.0%  2.4%    16B   cortex_m_rt  core::ptr::write
0.0%  2.4%    16B   cortex_m_rt  core::ptr::write_volatile
0.0%  2.4%    16B   std          <T as core::any::Any>::type_id
0.0%  1.8%    12B   cortex_m_rt  core::mem::zeroed
0.0%  0.6%    4B    [Unknown]    main
0.0%  0.3%    2B    cortex_m_rt  DefaultPreInit
0.0%  0.3%    2B    std          core::ptr::real_drop_in_place
0.5%  100.0%  654B  .text section size, the file size is 127.6KiB
Together they amount to nearly 20% of the code size (128 B of 654 B), plus two instances of the panic message and the formatting machinery, none of which would be needed otherwise.
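To illustrate where the panic message and the formatting machinery come from, here is a simplified model of the ordering dispatch. It is a sketch, not the actual core source: model_compiler_fence is a hypothetical name, and the panic text is quoted from memory of the standard library's message. The point is only that the ordering arrives as a runtime argument, so an unoptimized build has to keep the match and the panic path.

```rust
use core::sync::atomic::{compiler_fence, Ordering};

/// Simplified model of the dispatch inside `compiler_fence` (hypothetical
/// helper, not the real `core` implementation). The ordering is matched at
/// runtime and `Relaxed` is rejected with a panic, which is what drags the
/// panic message and formatting code into an unoptimized binary.
pub fn model_compiler_fence(order: Ordering) {
    match order {
        Ordering::Relaxed => panic!("there is no such thing as a relaxed compiler fence"),
        // Every other ordering is forwarded to the real compiler fence.
        other => compiler_fence(other),
    }
}
```

With optimizations enabled and a constant argument, the match is expected to fold away; in a dev build nothing forces that to happen.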
In #68155 I tried to coerce the compiler into fully inlining the code, especially since the argument is (as I would imagine) pretty much always a constant. But as suspected by @rkruppe and @jonas-schievink, this does not have the intended effect.
As noted in the PR, replacing the panic! with a unit type helps somewhat (and one could argue that a "lazy" compiler fence is pretty much equal to doing nothing and should not panic at runtime), but it would be better if we could somehow ensure that the compiler_fence really only turns into a compiler hint and not a function call.
If we cannot ensure that a compiler_fence actually turns into nothing, would it be acceptable to also expose the various fence types directly via functions?
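A minimal sketch of what such per-ordering functions could look like. All four function names are hypothetical and do not exist in core. Outside of core, the wrappers can only delegate to compiler_fence, so this shows the shape of the API rather than the codegen fix; an implementation inside core could presumably call the matching intrinsic directly and skip both the match and the panic path.

```rust
use core::sync::atomic::{compiler_fence, Ordering};

// Hypothetical per-ordering entry points (names are assumptions, not part of
// `core`). Each one fixes the ordering, so there is no `Relaxed` case to
// reject and no ordering argument to dispatch on at the call site.

#[inline(always)]
pub fn compiler_fence_acquire() {
    compiler_fence(Ordering::Acquire)
}

#[inline(always)]
pub fn compiler_fence_release() {
    compiler_fence(Ordering::Release)
}

#[inline(always)]
pub fn compiler_fence_acq_rel() {
    compiler_fence(Ordering::AcqRel)
}

#[inline(always)]
pub fn compiler_fence_seq_cst() {
    compiler_fence(Ordering::SeqCst)
}
```

A caller would then write compiler_fence_seq_cst() instead of compiler_fence(Ordering::SeqCst).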