Atomic `compare_exchange(_weak)` functions produce overly complicated asm code on thumbv7(e)m-none-eabi(hf) targets #79418

This code performs an atomic increment:
```rust
#![no_std]

use core::sync::atomic::*;

pub extern "C" fn atomic_usize_inc(ptr: &AtomicUsize) -> usize {
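    // Initial snapshot; refreshed from the failed CAS result on each retry.
    let mut old = ptr.load(Ordering::Relaxed);

    loop {
        // Compute the new value before attempting the compare-exchange.
        let new = old + 1;

        match ptr.compare_exchange_weak(old, new, Ordering::Relaxed, Ordering::Relaxed) {
            // Success: return the previous value.
            Ok(_old) => break _old,
            // Failure (possibly spurious): retry with the value actually observed.
            Err(_old) => old = _old,
        }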
    }
}
```

Expected asm:
```asm
example::atomic_usize_inc:
1:
        ldr     r1, [r0]
        add     r3, r1, #1

        ldrex   r2, [r0]
        cmp     r2, r1
        itt     ne
        clrexne
        bne     1b
        strex   r1, r3, [r0]
        cbz     r1, 2f
        b       1b
2:
        movs    r0, r2
        bx      lr
```

Produced asm [godbolt](https://godbolt.org/z/n3M4az):
> ```asm
> example::atomic_usize_inc:
>         ldr     r2, [r0]
> .LBB0_1:
>         ldrex   r1, [r0]
>         cmp     r1, r2
>         bne     .LBB0_4
>         adds    r2, #1
>         strex   r3, r2, [r0]
>         cbnz    r3, .LBB0_5
>         movs    r2, #1
>         b       .LBB0_6
> .LBB0_4:
>         clrex
> .LBB0_5:
>         movs    r2, #0
> .LBB0_6:
>         cbnz    r2, .LBB0_25
>         ldrex   r2, [r0]
>         cmp     r2, r1
>         bne     .LBB0_10
>         adds    r1, #1
>         strex   r3, r1, [r0]
>         cbnz    r3, .LBB0_11
>         movs    r1, #1
>         b       .LBB0_12
> .LBB0_10:
>         clrex
> .LBB0_11:
>         movs    r1, #0
> .LBB0_12:
>         cbnz    r1, .LBB0_24
>         ldrex   r1, [r0]
>         cmp     r1, r2
>         bne     .LBB0_16
>         adds    r2, #1
>         strex   r3, r2, [r0]
>         cbnz    r3, .LBB0_17
>         movs    r2, #1
>         b       .LBB0_18
> .LBB0_16:
>         clrex
> .LBB0_17:
>         movs    r2, #0
> .LBB0_18:
>         cbnz    r2, .LBB0_25
>         ldrex   r2, [r0]
>         cmp     r2, r1
>         bne     .LBB0_22
>         adds    r1, #1
>         strex   r3, r1, [r0]
>         cbnz    r3, .LBB0_23
>         movs    r1, #1
>         cmp     r1, #0
>         beq     .LBB0_1
>         b       .LBB0_24
> .LBB0_22:
>         clrex
> .LBB0_23:
>         movs    r1, #0
>         cmp     r1, #0
>         beq     .LBB0_1
> .LBB0_24:
>         mov     r1, r2
> .LBB0_25:
>         mov     r0, r1
>         bx      lr
> ```

Code size is very important on Cortex-M targets (thumbv*) because some controllers have only 20 KB of flash!

Besides, the increment instruction (`adds r2, #1`) was moved inside the `ldrex`/`strex` section. This loses the advantage of `compare_exchange`: the new value can be computed before `ldrex`, which reduces the number of cycles spent while the Exclusive Monitor is set.

This particular example is contrived, because we can simply call `fetch_add`. But other tasks do require `compare_exchange_weak`, for example an atomic increment with a maximum (pseudocode):
```rust
pub extern "C" fn atomic_usize_inc_with_max(ptr: &AtomicUsize) -> (bool, usize) {
    atomic {
        let old = ptr.load(Ordering::Relaxed);
        if old == max {
            return (false, old);
        }

        ptr.store(old + 1, Ordering::Relaxed);
        return (true, old);
    }
}
```
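
As a rough sketch, the pseudocode above could be written as a real `compare_exchange_weak` loop along the following lines. Passing `max` as a function argument is an assumption (the pseudocode does not say where it comes from), and `extern "C"` is dropped here because a tuple return type is not FFI-safe. The second function, whose name is made up for illustration, shows the same semantics through the standard `fetch_update` helper.

```rust
use core::sync::atomic::{AtomicUsize, Ordering};

/// Increments `*ptr` unless it already equals `max`.
/// Returns `(true, old)` if the increment happened, `(false, old)` otherwise.
/// NOTE: `max` as a parameter is an assumption, not part of the original pseudocode.
pub fn atomic_usize_inc_with_max(ptr: &AtomicUsize, max: usize) -> (bool, usize) {
    let mut old = ptr.load(Ordering::Relaxed);

    loop {
        // The limit check and the new value are evaluated before the
        // compare-exchange, outside of any exclusive section.
        if old == max {
            return (false, old);
        }
        let new = old + 1;

        match ptr.compare_exchange_weak(old, new, Ordering::Relaxed, Ordering::Relaxed) {
            // Success: the previous value is returned.
            Ok(prev) => return (true, prev),
            // Failure (possibly spurious): retry with the value actually observed.
            Err(current) => old = current,
        }
    }
}

/// Same semantics expressed with `AtomicUsize::fetch_update`
/// (hypothetical function name, shown for comparison only).
pub fn atomic_usize_inc_with_max_fetch_update(ptr: &AtomicUsize, max: usize) -> (bool, usize) {
    let result = ptr.fetch_update(Ordering::Relaxed, Ordering::Relaxed, |old| {
        if old == max {
            None // abort the update and report the current value
        } else {
            Some(old + 1)
        }
    });

    match result {
        Ok(prev) => (true, prev),
        Err(prev) => (false, prev),
    }
}
```

Both variants rely on a compare-exchange retry loop, so on thumbv7(e)m targets they are likely to be affected by the code generation described in this issue.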

### Meta
`rustc --version --verbose`:
```
rustc 1.50.0-nightly (1c389ffef 2020-11-24)
```
