Poor codegen for derived == on simple 2-field struct #117800

Open
@scottmcm

Given a basic struct like this,

#[derive(Copy, Clone, PartialEq, Eq)]
pub struct Entity {
    g: u32,
    i: u32
}

The generated == is suboptimal:

#[no_mangle]
pub fn derived_eq(x: &Entity, y: &Entity) -> bool {
    x == y
}

derived_eq:
        movq    xmm0, qword ptr [rdi]
        movq    xmm1, qword ptr [rsi]
        pcmpeqd xmm1, xmm0
        pshufd  xmm0, xmm1, 80
        movmskpd        eax, xmm0
        cmp     eax, 3
        sete    al
        ret

https://rust.godbolt.org/z/1b1xsnzx6
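
For context, the derived `PartialEq` on a struct like this behaves like a field-by-field `&&` chain. A minimal hand-written sketch of that shape follows; the type name `EntityManual` is hypothetical and used only for illustration, and the exact compiler expansion may differ in detail (on nightly it can be inspected with `-Zunpretty=expanded`).

```rust
// Hypothetical stand-in for `Entity`, with PartialEq written by hand
// in the same short-circuiting shape that the derive produces.
#[derive(Copy, Clone)]
pub struct EntityManual {
    g: u32,
    i: u32,
}

impl PartialEq for EntityManual {
    #[inline]
    fn eq(&self, other: &Self) -> bool {
        // `&&` short-circuits: `i` is only compared when `g` already matches.
        self.g == other.g && self.i == other.i
    }
}

impl Eq for EntityManual {}
```

The short-circuit is what matters here: at the source level, the second field is only read on the path where the first comparison succeeded.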

For comparison, writing it without short-circuiting

#[no_mangle]
pub fn good_eq(x: &Entity, y: &Entity) -> bool {
    (x.g == y.g) & (x.i == y.i)
}

gives much simpler codegen:

good_eq:
        mov     rax, qword ptr [rsi]
        cmp     qword ptr [rdi], rax
        sete    al
        ret

This appears to be related to LLVM not knowing whether the second field is poison, as Alive2 confirms that LLVM isn't allowed to convert the former into the latter (at least for the optimized forms): https://alive2.llvm.org/ce/z/bAsJGN
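
Since LLVM can't make that transformation on its own, one user-side option is to skip the derive and write `PartialEq` by hand using the non-short-circuiting `&`, mirroring `good_eq`. This is only a sketch: the type name `EntityEager` is hypothetical, and whether `==` on it compiles down to the single 64-bit compare shown for `good_eq` has not been checked here and may depend on the compiler version.

```rust
// Hypothetical variant of `Entity` with a hand-written, non-short-circuiting
// PartialEq. Both field comparisons are always evaluated.
#[derive(Copy, Clone)]
pub struct EntityEager {
    g: u32,
    i: u32,
}

impl PartialEq for EntityEager {
    #[inline]
    fn eq(&self, other: &Self) -> bool {
        // `&` on bools evaluates both operands, so both fields are read
        // unconditionally, as in `good_eq`.
        (self.g == other.g) & (self.i == other.i)
    }
}

impl Eq for EntityEager {}

#[no_mangle]
pub fn eager_eq(x: &EntityEager, y: &EntityEager) -> bool {
    x == y
}
```

The observable result is the same as the derived `==`; only the evaluation order differs, which is fine for plain integer fields but would change behavior for field types whose comparison has side effects or is expensive.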

Is there maybe some metadata we could put on the parameter attributes to tell LLVM that reading them isn't poison? It appears that just reading them first, as in this version (same godbolt link as above),

#[no_mangle]
pub fn failed_workaround(x: &Entity, y: &Entity) -> bool {
    let Entity { g: g1, i: i1 } = *x;
    let Entity { g: g2, i: i2 } = *y;
    g1 == g2 && i1 == i2
}

still isn't enough for it to remove the short-circuiting: even though that emits the !noundef loads first, it seems that LLVM's SROAPass moves them behind the branch from &&.


FWIW, clang (trunk) has the same codegen difference: https://cpp.godbolt.org/z/bbaz196GP

It might not have a choice, though, since C++ references are mostly just pointers.

Labels

A-LLVM: Area: Code generation parts specific to LLVM. Both correctness bugs and optimization-related issues.
A-codegen: Area: Code generation
C-optimization: Category: An issue highlighting optimization opportunities or PRs implementing such
I-slow: Issue: Problems and improvements with respect to performance of generated code.
T-lang: Relevant to the language team
T-opsem: Relevant to the opsem team