Vec::push should load a passed-by-pointer argument *after* reserve_for_push #115242

Open
@scottmcm

(Inspired by @erikdesjardins's example in #115212 (comment))

When pushing a [u8; 16] (which is passed indirectly, i.e. by pointer):

pub fn push(vec: &mut Vec<[u8; 16]>, data: [u8; 16]) {
    vec.push(data);
}

On nightly today, we load data into an xmm register, spill it to the stack across the call to reserve_for_push, and then reload it: https://godbolt.org/z/4jTK4r3Ko

example::push:
        push    rbx
        sub     rsp, 16
        mov     rbx, rdi
        movups  xmm0, xmmword ptr [rsi]
        mov     rsi, qword ptr [rdi + 16]
        cmp     rsi, qword ptr [rdi + 8]
        jne     .LBB2_2
        mov     rdi, rbx
        movaps  xmmword ptr [rsp], xmm0    ;    <-- LOOK
        call    alloc::raw_vec::RawVec<T,A>::reserve_for_push
        movaps  xmm0, xmmword ptr [rsp]    ;    <-- LOOK
        mov     rsi, qword ptr [rbx + 16]
.LBB2_2:
        mov     rax, qword ptr [rbx]
        mov     rcx, rsi
        shl     rcx, 4
        movups  xmmword ptr [rax + rcx], xmm0
        inc     rsi
        mov     qword ptr [rbx + 16], rsi
        add     rsp, 16
        pop     rbx
        ret

We should make the stack adjustment (and the spill/reload around the call) unnecessary here -- ideally by loading data only after the reserve_for_push call, but other things like tweaking calling conventions could help too.
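
For illustration only, here is a source-level sketch of the "load after reserve" ordering. It is not how Vec::push is implemented, and push_late_load is a hypothetical name. The value is taken by reference so that the 16-byte read is written after the capacity check and the possible reallocation. Whether the optimizer keeps the load after the call, and so avoids the spill, is an assumption that would need to be checked in the generated assembly.

```rust
use std::ptr;

/// Hypothetical helper (not part of std): appends `*data` to `vec`, doing the
/// capacity check and any reallocation *before* reading through `data`.
pub fn push_late_load(vec: &mut Vec<[u8; 16]>, data: &[u8; 16]) {
    // Grow first, while nothing from `data` is live in a register.
    if vec.len() == vec.capacity() {
        vec.reserve(1);
    }
    let len = vec.len();
    // SAFETY: after the check above, capacity > len, so the slot at index
    // `len` lies inside the allocation. `data` cannot overlap the buffer
    // because `vec` is exclusively borrowed. `[u8; 16]` is `Copy`, so copying
    // the bytes does not duplicate ownership of anything.
    unsafe {
        ptr::copy_nonoverlapping(data as *const [u8; 16], vec.as_mut_ptr().add(len), 1);
        vec.set_len(len + 1);
    }
}
```

Taking `&[u8; 16]` changes the signature compared with the `push` example above, so this only demonstrates the desired ordering. It does not fix by-value callers.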

    Labels

    C-optimization (Category: An issue highlighting optimization opportunities or PRs implementing such)
    I-slow (Issue: Problems and improvements with respect to performance of generated code.)
    T-compiler (Relevant to the compiler team, which will review and decide on the PR/issue.)
    T-libs (Relevant to the library team, which will review and decide on the PR/issue.)