Closed
Description
I tried this code (godbolt):
pub type T = [u64; 4];

#[no_mangle]
pub fn swap_32(a: &mut T, b: &mut T) {
    std::mem::swap(a, b);
    unsafe {
        // std::ptr::swap(a, b);
        // std::ptr::swap_nonoverlapping(a, b, 1);
    }
}
I expected to see this happen: code optimizes to a few <_ x i64> (target-cpu-dependent) loads and stores on -C opt-level=3.
Instead, this happened: the code is not vectorized when -C linker-plugin-lto is also passed to the compiler (this flag is passed together with -Clto=..., but that has no effect here).
Note that this only happens with mem::swap and ptr::swap_nonoverlapping, not with ptr::swap.
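To compare the three variants side by side instead of toggling comments, a minimal sketch could give each one its own function. The function names below are hypothetical and not part of the original report; the #[no_mangle] attribute from the snippet above is left out here and can be re-added to get stable symbol names in the assembly output.

```rust
use std::{mem, ptr};

pub type T = [u64; 4];

// Variant reported as NOT vectorized under -C linker-plugin-lto.
pub fn swap_mem(a: &mut T, b: &mut T) {
    mem::swap(a, b);
}

// Variant reported as still vectorized.
pub fn swap_ptr(a: &mut T, b: &mut T) {
    // SAFETY: both pointers come from valid, aligned, exclusive references.
    unsafe { ptr::swap(a as *mut T, b as *mut T) }
}

// Variant reported as NOT vectorized under -C linker-plugin-lto.
pub fn swap_ptr_nonoverlapping(a: &mut T, b: &mut T) {
    // SAFETY: two live `&mut T` cannot alias, so the regions do not overlap;
    // the count of 1 means one whole `T`, i.e. 32 bytes.
    unsafe { ptr::swap_nonoverlapping(a as *mut T, b as *mut T, 1) }
}
```

All three functions have the same observable behavior, so any difference in the generated code comes from how each is lowered, not from what it computes.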
Meta
rustc --version --verbose:
rustc 1.79.0-nightly (f9b161492 2024-04-19)
binary: rustc
commit-hash: f9b16149208c8a8a349c32813312716f6603eb6f
commit-date: 2024-04-19
host: x86_64-unknown-linux-gnu
release: 1.79.0-nightly
LLVM version: 18.1.4
This has happened since at least 1.61, but I could not find a way to bisect it, so I opted not to mark it as a regression.
@rustbot modify labels: +A-LLVM +A-LTO +I-slow