Missed optimization for unaligned store via shifts #139441

Open
@0f-0b

Description

This Rust code, when compiled for aarch64-unknown-linux-gnu, generates a single store instruction (plus the return).

#[inline(never)]
pub fn write(out: &mut [u32; 2], a: u64) {
  out[0] = a as u32;
  out[1] = (a >> 32) as u32;
}

example::write::h4c19b1f2c54c5627:
        str     x1, [x0]
        ret
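
For reference, here is a minimal sketch of why the single `str` is a valid lowering: the two 32-bit writes are byte-for-byte equivalent to one unaligned 64-bit store. The function `write_single_store` is a hypothetical name introduced for illustration and is not part of the report. The endianness handling is an assumption added so the comparison also holds on big-endian targets.

```rust
/// Shift-based version, as in the report (without `#[inline(never)]`).
pub fn write(out: &mut [u32; 2], a: u64) {
    out[0] = a as u32;
    out[1] = (a >> 32) as u32;
}

/// Hypothetical equivalent: one unaligned 64-bit store.
pub fn write_single_store(out: &mut [u32; 2], a: u64) {
    // On little-endian targets the low half of `a` already lands at the
    // lower address. On big-endian targets the halves must be swapped.
    let v = if cfg!(target_endian = "little") {
        a
    } else {
        a.rotate_left(32)
    };
    // SAFETY: `out` points to 8 writable bytes, and `write_unaligned`
    // has no alignment requirement (a `[u32; 2]` is only 4-byte aligned).
    unsafe { out.as_mut_ptr().cast::<u64>().write_unaligned(v) }
}
```

Calling both functions with the same `a` should leave identical contents in `out`, which is the equivalence the compiler relies on when it merges the two stores.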

However, when there are two or more u64s to store, the stores are no longer merged: each u64 is split with a shift and written as a pair of 32-bit values.

#[inline(never)]
pub fn write2(out: &mut [u32; 4], a: u64, b: u64) {
  out[0] = a as u32;
  out[1] = (a >> 32) as u32;
  out[2] = b as u32;
  out[3] = (b >> 32) as u32;
}

example::write2::h650f933056ff8897:
        lsr     x8, x1, #32
        lsr     x9, x2, #32
        stp     w1, w8, [x0]
        stp     w2, w9, [x0, #8]
        ret
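
As a possible source-level workaround, the same result can be expressed as one unaligned write of two 64-bit values, which avoids the shifts at the source level. This is a sketch only: `write2_wide` and the `fix` closure are hypothetical names, and whether this form produces better code on any particular compiler version has not been verified here.

```rust
/// Shift-based version, as in the report (without `#[inline(never)]`).
pub fn write2(out: &mut [u32; 4], a: u64, b: u64) {
    out[0] = a as u32;
    out[1] = (a >> 32) as u32;
    out[2] = b as u32;
    out[3] = (b >> 32) as u32;
}

/// Hypothetical workaround: store both u64s with one unaligned write.
pub fn write2_wide(out: &mut [u32; 4], a: u64, b: u64) {
    // Keep the low half at the lower address regardless of endianness.
    let fix = |v: u64| {
        if cfg!(target_endian = "little") {
            v
        } else {
            v.rotate_left(32)
        }
    };
    // SAFETY: `out` points to 16 writable bytes, and `write_unaligned`
    // has no alignment requirement.
    unsafe {
        out.as_mut_ptr()
            .cast::<[u64; 2]>()
            .write_unaligned([fix(a), fix(b)])
    }
}
```

Both functions should leave the same four words in `out`, so the wide version can stand in for the shift-based one wherever the `unsafe` block is acceptable.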

Compiler Explorer.
