Labels: I-slow
Maybe this is an LLVM issue rather than a rustc issue, but std::ptr::write_volatile, when used with arrays, results in highly non-optimal assembly on x86_64.
For example:
#[repr(align(8))]
pub struct B8([u8; 8]);

impl B8 {
    const ZERO: Self = Self([0; 8]);
}

pub fn zeroize_b8(bytes: &mut B8) {
    let ptr: *mut B8 = bytes as *mut B8;
    unsafe {
        std::ptr::write_volatile(ptr, B8::ZERO);
    }
}
results in:
playground::zeroize_b8:
    pushq   %rax
    movq    $0, (%rsp)      # write 0u64 to the stack
    movq    (%rsp), %rax    # read this 0u64 from the stack to rax
    movq    %rax, (%rdi)    # perform the actual write operation
    popq    %rax
    retq
It first writes the value to the stack, then reads it from the stack into a register, and finally writes that register to the memory pointed at by ptr.
This takes 3 move instructions while we only need 1!
playground::zeroize_b8_handwritten_asm:
    movq    $0, (%rdi)      # perform the actual write operation
    retq
Using a u64 instead of an eight-byte array works fine and issues only a single move. (In fact, it emits asm identical to the handwritten asm above.)
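For comparison, here is a minimal sketch of what the u64 variant might look like, plus a possible workaround that reuses it for B8 by casting the pointer. The names zeroize_u64 and zeroize_b8_via_u64 are made up for illustration, and the field of B8 is made pub only so the example is easy to exercise. The cast relies on B8 being 8 bytes in size and 8-byte aligned, which the const assertions check. Whether the cast version actually avoids the stack round trip has not been verified here; it is an assumption based on the u64 behaviour described above.

```rust
use std::ptr;

#[repr(align(8))]
pub struct B8(pub [u8; 8]);

// Compile-time checks that B8 can be written through a *mut u64:
// same size, and at least as strictly aligned as u64.
const _: () = assert!(std::mem::size_of::<B8>() == std::mem::size_of::<u64>());
const _: () = assert!(std::mem::align_of::<B8>() >= std::mem::align_of::<u64>());

/// Plain u64 variant (hypothetical name): a volatile store of a scalar zero.
pub fn zeroize_u64(word: &mut u64) {
    unsafe { ptr::write_volatile(word as *mut u64, 0u64) }
}

/// Assumed workaround (hypothetical name): view the B8 as a u64 and
/// issue the volatile write through that pointer instead of writing
/// the array-typed value.
pub fn zeroize_b8_via_u64(bytes: &mut B8) {
    let ptr = bytes as *mut B8 as *mut u64;
    // Sound because of the size/alignment assertions above, and because
    // all-zero bits are a valid value for [u8; 8].
    unsafe { ptr::write_volatile(ptr, 0u64) }
}
```

Both functions leave the target fully zeroed; only the type the volatile write is expressed in differs.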
Try it on playground
Labels
Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
Area: Code generation
Area: Intrinsics
Category: An issue highlighting optimization opportunities or PRs implementing such
Issue: Problems and improvements with respect to performance of generated code.
Relevant to the compiler team, which will review and decide on the PR/issue.