Overhead of core::sync::atomic::* primitives in dev builds #68208

Open
@therealprof

The primitives provided by core::sync::atomic carry a binary-size overhead in dev builds, which also partly contradicts the documentation. For example, the documentation of compiler_fence states:

compiler_fence does not emit any machine code

However, when we compile a trivial program for an embedded target:

#![no_main]
#![no_std]

use panic_halt as _;
use cortex_m_rt::entry;

#[entry]
fn main() -> ! {
    loop {
        continue;
    }
}

e.g. for thumbv6m-none-eabi, we can observe that we get not just one but two instances of compiler_fence:

Analyzing target/thumbv6m-none-eabi/debug/examples/empty

File  .text Size       Crate Name
0.1%  11.3%  74B cortex_m_rt r0::init_data
0.0%   9.8%  64B cortex_m_rt core::sync::atomic::compiler_fence
0.0%   9.8%  64B  panic_halt core::sync::atomic::compiler_fence
0.0%   8.6%  56B cortex_m_rt r0::zero_bss
0.0%   7.3%  48B cortex_m_rt Reset
0.0%   6.4%  42B cortex_m_rt core::ptr::read
0.0%   5.5%  36B         std core::panicking::panic
0.0%   4.3%  28B   [Unknown] __aeabi_memcpy4
0.0%   4.3%  28B         std core::panicking::panic_fmt
0.0%   4.0%  26B cortex_m_rt core::intrinsics::copy_nonoverlapping
0.0%   3.4%  22B cortex_m_rt core::ptr::<impl *const T>::offset
0.0%   3.4%  22B cortex_m_rt core::ptr::<impl *mut T>::offset
0.0%   3.1%  20B cortex_m_rt HardFault_
0.0%   3.1%  20B  panic_halt rust_begin_unwind
0.0%   2.8%  18B cortex_m_rt DefaultHandler_
0.0%   2.8%  18B   [Unknown] __aeabi_memcpy
0.0%   2.4%  16B cortex_m_rt core::ptr::write
0.0%   2.4%  16B cortex_m_rt core::ptr::write_volatile
0.0%   2.4%  16B         std <T as core::any::Any>::type_id
0.0%   1.8%  12B cortex_m_rt core::mem::zeroed
0.0%   0.6%   4B   [Unknown] main
0.0%   0.3%   2B cortex_m_rt DefaultPreInit
0.0%   0.3%   2B         std core::ptr::real_drop_in_place
0.5% 100.0% 654B             .text section size, the file size is 127.6KiB

which amounts to nearly 20% of the .text size (2 × 64 B out of 654 B), plus two instances of the panic message and the formatting machinery, none of which would otherwise be needed.

In #68155 I tried to coerce the compiler into fully inlining the code, especially since the argument is (as I would imagine is pretty much always the case) a constant. But as suspected by @rkruppe and @jonas-schievink, this does not have the intended effect.

As noted in the PR, replacing the panic! with a unit type helps somewhat (and one could argue that a "lazy" compiler fence is pretty much equal to doing nothing and should not panic at runtime), but it would be better if we could somehow ensure that compiler_fence really only turns into a compiler hint and not a function call.
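
To make the two behaviours above concrete, here is a simplified model. It is a sketch only, not the actual core source: model_fence_panicking and model_fence_lazy are hypothetical names, the panic message is illustrative, and the non-relaxed arm forwards to the real compiler_fence where the library would presumably use an intrinsic.

```rust
use core::sync::atomic::{compiler_fence, Ordering};

/// Model of the panicking behaviour: a relaxed fence is rejected at
/// runtime, so the function carries a panic path (message plus formatting
/// machinery) unless the optimizer can prove that arm unreachable.
pub fn model_fence_panicking(order: Ordering) {
    match order {
        // Illustrative message, not necessarily the one used by core.
        Ordering::Relaxed => panic!("relaxed compiler fence requested"),
        // Every other ordering is forwarded to the real fence.
        other => compiler_fence(other),
    }
}

/// Model of the "lazy" alternative: a relaxed fence is treated as a no-op,
/// so no panic path exists in this function at all.
pub fn model_fence_lazy(order: Ordering) {
    match order {
        Ordering::Relaxed => (),
        other => compiler_fence(other),
    }
}
```

With a constant argument such as Ordering::SeqCst an optimized build can discard the relaxed arm entirely; in a dev build the match and, in the first variant, the panic path stay in the binary.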

If we cannot ensure that a compiler_fence actually turns into nothing, would it be acceptable to also expose the various fence types directly via functions?
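
As a rough illustration of what "exposing the fence types directly" could look like, the sketch below defines one function per ordering. All of these names are hypothetical and nothing like them exists in core today. They also only show the API shape: here each one forwards to compiler_fence, so this user-level version does not by itself remove the dev-build overhead, whereas an implementation inside core could presumably call the matching intrinsic directly.

```rust
use core::sync::atomic::{compiler_fence, AtomicBool, Ordering};

// Hypothetical per-ordering entry points. There is deliberately no relaxed
// variant, so no runtime check and no panic path is needed.
#[inline(always)]
pub fn compiler_fence_acquire() {
    compiler_fence(Ordering::Acquire)
}

#[inline(always)]
pub fn compiler_fence_release() {
    compiler_fence(Ordering::Release)
}

#[inline(always)]
pub fn compiler_fence_acq_rel() {
    compiler_fence(Ordering::AcqRel)
}

#[inline(always)]
pub fn compiler_fence_seq_cst() {
    compiler_fence(Ordering::SeqCst)
}

/// Example use (also hypothetical): write a value, then raise a flag,
/// asking the compiler not to move the write below the fence.
pub fn write_then_flag(data: &mut u32, ready: &AtomicBool) {
    *data = 42;
    compiler_fence_release();
    ready.store(true, Ordering::Relaxed);
}
```

Since the ordering is encoded in the function name rather than passed as an argument, there is nothing left to match on at runtime.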

Labels: C-enhancement, C-optimization, I-heavy, T-compiler, WG-embedded
