(Inspired by @erikdesjardins's example in #115212 (comment))
When pushing a [u8; 16] (which is passed indirectly):
    pub fn push(vec: &mut Vec<[u8; 16]>, data: [u8; 16]) {
        vec.push(data);
    }
On nightly today, we load data into an xmm register, spill it to the stack in order to call reserve_for_push, and then load it again: https://godbolt.org/z/4jTK4r3Ko
    example::push:
            push rbx
            sub rsp, 16
            mov rbx, rdi
            movups xmm0, xmmword ptr [rsi]
            mov rsi, qword ptr [rdi + 16]
            cmp rsi, qword ptr [rdi + 8]
            jne .LBB2_2
            mov rdi, rbx
            movaps xmmword ptr [rsp], xmm0 ; <-- LOOK
            call alloc::raw_vec::RawVec<T,A>::reserve_for_push
            movaps xmm0, xmmword ptr [rsp] ; <-- LOOK
            mov rsi, qword ptr [rbx + 16]
    .LBB2_2:
            mov rax, qword ptr [rbx]
            mov rcx, rsi
            shl rcx, 4
            movups xmmword ptr [rax + rcx], xmm0
            inc rsi
            mov qword ptr [rbx + 16], rsi
            add rsp, 16
            pop rbx
            ret
We should make the stack adjustment unnecessary here, ideally by loading data after the call to reserve_for_push, but other things like tweaking calling conventions could help too.
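For illustration only, here is a source-level sketch of the "reserve first, load afterwards" ordering. It assumes the caller can pass the array by reference. The name push_reserve_first is hypothetical and not part of std. Whether the optimizer actually keeps the load after the call is not guaranteed; this only shows the intended ordering.

```rust
/// Hypothetical helper (not a std API): make room first, then read `data`.
/// Because `data` is only dereferenced after `reserve`, the source places no
/// 16-byte value live across the potentially-reallocating call.
pub fn push_reserve_first(vec: &mut Vec<[u8; 16]>, data: &[u8; 16]) {
    // Ensure capacity for at least one more element; this is a no-op
    // when spare capacity already exists.
    vec.reserve(1);
    // Copy the array out of the reference only now; this push cannot
    // reallocate because capacity was reserved above.
    vec.push(*data);
}
```

The observable behaviour matches the original push; only the point at which data is read differs.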