Overhead of core::sync::atomic::* primitives in dev builds #68208

Open
@therealprof

Description

The primitives provided by core::sync::atomic carry a binary-size overhead in dev builds, which also partly contradicts the documentation. For example, the compiler_fence documentation states:

compiler_fence does not emit any machine code

However, when we compile a trivial program for an embedded target:

#![no_main]
#![no_std]

use panic_halt as _;
use cortex_m_rt::entry;

#[entry]
fn main() -> ! {
    loop {
        continue;
    }
}

e.g. for thumbv6m-none-eabi, we can observe that we get not one but two compiler_fence instances:

Analyzing target/thumbv6m-none-eabi/debug/examples/empty

File  .text Size       Crate Name
0.1%  11.3%  74B cortex_m_rt r0::init_data
0.0%   9.8%  64B cortex_m_rt core::sync::atomic::compiler_fence
0.0%   9.8%  64B  panic_halt core::sync::atomic::compiler_fence
0.0%   8.6%  56B cortex_m_rt r0::zero_bss
0.0%   7.3%  48B cortex_m_rt Reset
0.0%   6.4%  42B cortex_m_rt core::ptr::read
0.0%   5.5%  36B         std core::panicking::panic
0.0%   4.3%  28B   [Unknown] __aeabi_memcpy4
0.0%   4.3%  28B         std core::panicking::panic_fmt
0.0%   4.0%  26B cortex_m_rt core::intrinsics::copy_nonoverlapping
0.0%   3.4%  22B cortex_m_rt core::ptr::<impl *const T>::offset
0.0%   3.4%  22B cortex_m_rt core::ptr::<impl *mut T>::offset
0.0%   3.1%  20B cortex_m_rt HardFault_
0.0%   3.1%  20B  panic_halt rust_begin_unwind
0.0%   2.8%  18B cortex_m_rt DefaultHandler_
0.0%   2.8%  18B   [Unknown] __aeabi_memcpy
0.0%   2.4%  16B cortex_m_rt core::ptr::write
0.0%   2.4%  16B cortex_m_rt core::ptr::write_volatile
0.0%   2.4%  16B         std <T as core::any::Any>::type_id
0.0%   1.8%  12B cortex_m_rt core::mem::zeroed
0.0%   0.6%   4B   [Unknown] main
0.0%   0.3%   2B cortex_m_rt DefaultPreInit
0.0%   0.3%   2B         std core::ptr::real_drop_in_place
0.5% 100.0% 654B             .text section size, the file size is 127.6KiB

The two 64-byte copies together amount to nearly 20% of the .text size, on top of two instances of the panic message and the formatting machinery, none of which would be needed otherwise.

In #68155 I tried to coerce the compiler into fully inlining the code, especially since the argument is (I would imagine) pretty much always a constant. But as suspected by @rkruppe and @jonas-schievink, this does not have the intended effect.

As noted in the PR, replacing the panic! with a unit value helps somewhat (and one could argue that a "lazy" compiler fence is pretty much equal to doing nothing and should not panic at runtime), but it would be better if we could somehow ensure that compiler_fence really only turns into a compiler hint and not a function call.
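
To make the difference concrete, here is a minimal sketch of the two shapes being compared. Both functions are hypothetical stand-ins written for illustration (fence_with_panic and fence_lenient are not names from core), and they forward to the public compiler_fence rather than to the intrinsics the real implementation uses; the panic message is my recollection of the one in core and should be treated as an assumption. The point is only that a runtime dispatch on the ordering with a panicking arm drags in the panic path, while a unit arm does not.

```rust
use core::sync::atomic::{compiler_fence, Ordering};

/// Hypothetical stand-in for the current shape: a runtime dispatch on the
/// ordering whose `Relaxed` arm panics. In an unoptimized build the match
/// and the panic path both remain in the binary.
pub fn fence_with_panic(order: Ordering) {
    match order {
        // Message text is an assumption, not copied from the source.
        Ordering::Relaxed => panic!("there is no such thing as a relaxed compiler fence"),
        // Every other ordering is forwarded unchanged.
        other => compiler_fence(other),
    }
}

/// Hypothetical stand-in for the variant discussed in the PR: the `Relaxed`
/// arm evaluates to the unit value, so no panic machinery is referenced.
pub fn fence_lenient(order: Ordering) {
    match order {
        Ordering::Relaxed => (),
        other => compiler_fence(other),
    }
}
```

Note that the lenient variant still leaves a function call and a match in a dev build; it only removes the panic message and formatting code from the picture.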

If we cannot ensure that a compiler_fence actually turns into nothing, would it be acceptable to also expose the various fence types directly via functions?
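
As a rough sketch of what such an API could look like: the function names below are hypothetical and do not exist in core. Inside core each one would presumably call the matching intrinsic directly, so there would be no ordering argument to dispatch on and no Relaxed case to reject. Here they simply forward to compiler_fence with a constant so that the sketch compiles on stable Rust, which means this version does not by itself remove the overhead.

```rust
use core::sync::atomic::{compiler_fence, Ordering};

// Hypothetical per-ordering entry points; names are illustrative only.
pub fn compiler_fence_acquire() {
    compiler_fence(Ordering::Acquire)
}

pub fn compiler_fence_release() {
    compiler_fence(Ordering::Release)
}

pub fn compiler_fence_acq_rel() {
    compiler_fence(Ordering::AcqRel)
}

pub fn compiler_fence_seq_cst() {
    compiler_fence(Ordering::SeqCst)
}

/// Small usage example: write a value, then stop the compiler from moving
/// the write past this point, and read the value back.
pub fn store_then_fence(slot: &mut u32, value: u32) -> u32 {
    *slot = value;
    compiler_fence_release();
    *slot
}
```

With this shape a relaxed fence is simply not expressible, so the question of whether it should panic never arises.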

Labels: C-enhancement, C-optimization, I-heavy, T-compiler, WG-embedded