Inlined function duplication across complex branches when `extern "Rust"` is used with LTO and `opt-level="s"`

## Context

**The example code I linked/described here is an MCVE. See the Background For "Real" Applications section for details.**

* Consider a Rust binary which calls a function `free(f)` within its `main()`. `free()` takes a closure `f` with a branch (`?`) as input, and in turn calls `f` and then a function called `release()`.
* The Rust binary has a feature called `use-extern-cs`. When disabled, the bodies of both `free()` and `release()` are provided by an external crate called `critical`. When enabled, the `free()` function is provided by the main binary instead of `critical`, and _the `release()` function is marked as `extern "Rust"` in the main binary's source file._
* Within the `critical` crate, the `release()` function may or may not be marked as `#[inline]`. This is controlled by the `critical/inline` feature.
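
The two configurations described above can be sketched in a single file. This is a minimal illustration, not the code from the repository: the `critical` module stands in for the external crate, the symbol name `mcve_release` and the function name `free_extern` are hypothetical, and a counter replaces the `nop` marker so the behavior can be checked on a host.

```rust
/// Stand-in for the external `critical` crate (an assumption for this sketch).
mod critical {
    use std::sync::atomic::{AtomicUsize, Ordering};

    /// Counts `release` calls; replaces the MCVE's ten-`nop` marker.
    static RELEASES: AtomicUsize = AtomicUsize::new(0);

    pub fn release_count() -> usize {
        RELEASES.load(Ordering::SeqCst)
    }

    // Exported under a fixed, hypothetical symbol name so that an
    // `extern "Rust"` declaration elsewhere can resolve to it at link time.
    // (Toolchains older than Rust 1.82 spell this `#[no_mangle]`.)
    #[unsafe(no_mangle)]
    pub fn mcve_release() {
        RELEASES.fetch_add(1, Ordering::SeqCst);
    }

    /// `free()` as `critical` would provide it (`use-extern-cs` disabled).
    pub fn free<T>(f: impl FnOnce() -> T) -> T {
        let result = f();
        mcve_release();
        result
    }
}

// `use-extern-cs` enabled: the binary only declares `release`...
// (Toolchains older than Rust 1.82 omit the leading `unsafe`.)
unsafe extern "Rust" {
    fn mcve_release();
}

/// ...and provides `free()` itself.
pub fn free_extern<T>(f: impl FnOnce() -> T) -> T {
    let result = f();
    // SAFETY: the symbol is defined in `critical` with a matching signature.
    unsafe { mcve_release() };
    result
}
```

Both variants behave identically at the source level; the difference under investigation is only in how the optimizer treats the call to `release` once it is reached through the `extern "Rust"` declaration.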

## Instructions

1. If testing `msp430`, make sure the `msp430-elf-gcc` [toolchain](https://www.ti.com/tool/MSP430-GCC-OPENSOURCE) is installed. Optionally install [`just`](https://github.com/casey/just) for convenience.
2. `git clone https://github.com/cr1901/msp430-size`. Use commit [b8ef905](https://github.com/cr1901/msp430-size/commit/b8ef9058596bf7cb2b1ea021aa2df1b76f44cc09) **specifically.**

   Despite the name of the repo, this code works for `thumbv6m-none-eabi` as well; the behavior appears to be arch-agnostic.
3. Make sure a nightly Rust toolchain is installed (for `-Zbuild-std=core`).
4. Run the following command:
    ```sh
    cargo +nightly rustc --manifest-path=./test-cases/Cargo.toml --target=$TARGET --release -Zbuild-std=core --example=critical --features=$FEATURES -- --emit=obj=target/$TARGET/release/examples/critical.o,llvm-ir=target/$TARGET/release/examples/critical.ll,asm=target/$TARGET/release/examples/critical.s
    ```

    where:
    * `$TARGET`: either `msp430-none-elf` or `thumbv6m-none-eabi`.
    * `$FEATURES`: empty, `use-extern-cs`, `critical/inline`, or `use-extern-cs,critical/inline`
5. Examine the emitted LLVM IR, assembly, and object/ELF files (the latter with `objdump`) and look for a series of ten `nop`s appearing once or multiple times. Each `nop` sled represents a call to `release`.

## Expected Behavior

The body of `release` appears once for the single call to `free()`, regardless of which combination of features is enabled (including none).

## Actual Behavior

The body of `release` appears twice in the single call to `free()` for all combinations of features, except for `--features=critical/inline`.

## Other Hints
* Sometimes I don't need the `#[inline]` attribute to prevent `release`'s body from being duplicated. However, I could not translate this behavior well from my real application to the MCVE. One way that I found works is to remove the `extern "Rust" fn release()` declaration, and paste the `critical::internal::release()` impl directly in the main source file.
* The `extern "Rust"` declaration seems to prevent `#[inline]` hints from working at all.
* If `rustc` decides to duplicate `release`, sometimes `rustc` will inline one call of `release` into `free`, but not the other.
* `release` duplication appears in the LLVM files emitted by `rustc`.

## Background For "Real" Applications

The embedded Rust community has started to standardize around a pluggable `critical-section` [crate](https://github.com/rust-embedded/critical-section). The `critical-section` crate by necessity marks some functions as `extern "Rust"` and defers to other crates to define them. Specifically, the `critical_section::free(f)` function takes a closure `f()` and calls in order (args omitted):

1. `extern "Rust" acquire()`
2. `f()`
3. `extern "Rust" release()`
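
The call order above can be sketched on a host as follows. This is an illustration only: `acquire` and `release` are local stand-ins that record themselves in a hypothetical thread-local `LOG`, rather than the `extern "Rust"` functions a real implementation crate would supply, and arguments are omitted as in the list.

```rust
use std::cell::RefCell;

thread_local! {
    /// Hypothetical event log used only to make the call order observable.
    static LOG: RefCell<Vec<&'static str>> = RefCell::new(Vec::new());
}

/// Stand-in for entering the critical section.
fn acquire() {
    LOG.with(|log| log.borrow_mut().push("acquire"));
}

/// Stand-in for leaving the critical section.
fn release() {
    LOG.with(|log| log.borrow_mut().push("release"));
}

/// Runs `f` between `acquire()` and `release()` and returns its result.
fn free<R>(f: impl FnOnce() -> R) -> R {
    acquire();
    let result = f();
    release();
    result
}

/// Returns the recorded events and clears the log.
fn take_log() -> Vec<&'static str> {
    LOG.with(|log| std::mem::take(&mut *log.borrow_mut()))
}
```

Because `release()` runs unconditionally after `f()` returns, any branch inside `f` rejoins before a single `release()` call at the source level.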

The crate doesn't define any new functionality for embedded Rust applications; it rather changes how existing functionality ([critical sections](https://en.wikipedia.org/wiki/Critical_section)) is implemented. _In principle, the crate should be drop-in to existing embedded Rust applications._

When I transitioned [some](https://github.com/cr1901/AT2XT) embedded Rust code to use the `critical-section` crate, I noticed marked size increases in the `.text` section (1992 bytes => 2048+ bytes, which no longer fits) due to new overhead from how `critical_section::free(f)` is inlined in my main application's functions. Specifically, _if the closure `f` to `critical_section::free(f)` has a sufficiently complex branch, `rustc` will duplicate the body of `release` across both sides of the branch, even when `lto="fat"` and `opt-level="s"`._
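
To make the duplication concrete, here is a hand-written, source-level analogy. It is not output from `rustc` or LLVM, and all names are hypothetical: `shared_tail` has the shape one would hope for (the branch rejoins before a single `release`), while `duplicated_tail` spells out the shape described above, with one copy of `release` per side of the branch.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts `release` calls so both shapes can be compared on a host.
static RELEASES: AtomicUsize = AtomicUsize::new(0);

pub fn release_count() -> usize {
    RELEASES.load(Ordering::SeqCst)
}

/// Stand-in for the real `release`; imagine a larger body here.
fn release() {
    RELEASES.fetch_add(1, Ordering::SeqCst);
}

/// Calls `f`, then `release`, as described for `free(f)`.
fn free<T>(f: impl FnOnce() -> T) -> T {
    let result = f();
    release();
    result
}

/// Hoped-for shape: the `?` branch rejoins, then one `release`.
pub fn shared_tail(input: Option<u8>) -> Option<u8> {
    free(|| -> Option<u8> {
        let v = input?;
        Some(v.wrapping_mul(3))
    })
}

/// Duplicated shape, written out by hand: each arm carries its own `release`.
pub fn duplicated_tail(input: Option<u8>) -> Option<u8> {
    match input {
        Some(v) => {
            let result = Some(v.wrapping_mul(3));
            release();
            result
        }
        None => {
            release();
            None
        }
    }
}
```

The two functions are observably equivalent, and each calls `release` exactly once per invocation. The cost of the second shape is purely code size: once `release` is inlined, its body is emitted once per arm.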

Calling `critical_section::free()` is essential for sharing non-atomic data between interrupts/threads in a bare-metal application. To minimize interrupt latency/maximize the amount of work that can be done, **the size/speed overhead of these calls should be kept as small as possible**. I don't understand why Rust is unable to inline calls to `critical_section::free(f)` without duplicating the body of `release` (_when `lto="fat"` and `codegen-units=1` are enabled_), regardless of which of the following scenarios applies:
 
 1. `acquire()`, `release()`, and `free()` are all provided inline by the main binary.
 2. `acquire()`, `release()`, and `free()` are all provided by the same crate (via `use` statements, no `extern "Rust"`).
 3. `free()` is provided by one crate (via `use`), `acquire()` and `release()` are provided by another (via `use`).
 4. `free()` is provided by one crate (via `use`), `extern "Rust" acquire()` and `extern "Rust" release()` are provided by another crate.

For the MCVE, the body of `release` is exaggerated; the actual size difference will vary depending on the application. From my own testing, real `thumbv6m-none-eabi` applications have the duplication, but on average are affected less than `msp430-none-elf`.

