More folding to vpternlogd? #107619

Open
@stephentoub

Description

Based on the description in #91227, I thought each of the following might compile down to a single vpternlogd:

static Vector512<int> Exp1(Vector512<int> a, Vector512<int> b, Vector512<int> c) =>
    Vector512.ConditionalSelect(a, b & c, b | c);

static Vector512<int> Exp2(Vector512<int> a, Vector512<int> b, Vector512<int> c) =>
    (a & (b & c)) | (~a & (b | c));

but they don't today. The first results in a vpternlogd, but it's the standard one for ConditionalSelect, used to choose between the two results, so the AND and the OR are still computed separately:

vmovups zmm0, zmmword ptr [r8]
vmovups zmm1, zmmword ptr [r9]
vpandd zmm2, zmm1, zmm0
vpord zmm0, zmm1, zmm0
vpternlogd zmm0, zmm2, zmmword ptr [rdx], -40

The second results in two vpternlogds that are then or'd together:

vmovups zmm0, zmmword ptr [rdx]
vmovups zmm1, zmmword ptr [r8]
vmovups zmm2, zmmword ptr [r9]
vmovaps zmm3, zmm0
vpternlogd zmm3, zmm2, zmm1, -128
vpternlogd zmm2, zmm1, zmm0, 84
vpord zmm0, zmm2, zmm3

rather than a single vpternlogd that handles the whole bitwise operation.
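
For reference, here is a rough sketch of how the control byte for such a single vpternlogd could be derived, and how it relates to the immediates in the listings above. It relies on the usual truth-table trick: evaluating the expression on the constants 0xF0, 0xCC and 0xAA yields the 8-bit table directly. The class and method names (TernaryLogicControl, Control, Apply) are made up for illustration and are not JIT or BCL APIs, and the operand order (a, b, c) is an assumption; a different register assignment would give a different immediate.

```csharp
using System;

static class TernaryLogicControl
{
    // Truth-table masks: for index i = (x << 2) | (y << 1) | z, bit i of
    // X/Y/Z is the value of the first/second/third operand bit.
    const int X = 0xF0, Y = 0xCC, Z = 0xAA;

    // Evaluating a three-input bitwise function on the masks gives its imm8.
    public static byte Control(Func<int, int, int, int> f) => (byte)(f(X, Y, Z) & 0xFF);

    // Scalar model of the per-bit table lookup, applied to 32-bit lanes.
    public static int Apply(byte control, int x, int y, int z)
    {
        int result = 0;
        for (int bit = 0; bit < 32; bit++)
        {
            int index = (((x >> bit) & 1) << 2) | (((y >> bit) & 1) << 1) | ((z >> bit) & 1);
            result |= ((control >> index) & 1) << bit;
        }
        return result;
    }

    static void Main()
    {
        // Exp2 with operands assumed to be in the order (a, b, c).
        byte exp2 = Control((a, b, c) => (a & (b & c)) | (~a & (b | c)));
        Console.WriteLine($"Exp2: 0x{exp2:X2} ({unchecked((sbyte)exp2)})"); // 0x8E (-114)

        // The ConditionalSelect form in the first listing:
        // first operand = b | c, second = b & c, third = a (the mask).
        byte select = Control((r, l, m) => (l & m) | (r & ~m));
        Console.WriteLine($"Select: 0x{select:X2} ({unchecked((sbyte)select)})"); // 0xD8 (-40)

        // Spot-check the scalar model against the original expression.
        int va = 0x0F0F1234, vb = unchecked((int)0xDEADBEEF), vc = 0x13579BDF;
        int expected = (va & (vb & vc)) | (~va & (vb | vc));
        Console.WriteLine(Apply(exp2, va, vb, vc) == expected);
    }
}
```

Under those assumptions the whole expression corresponds to a control byte of 0x8E, which a disassembler would print as -114 in the same signed style as the -40, -128 and 84 above.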

Is this just further opportunity? Or is there something preventing such optimization?

cc: @tannergooding, @EgorBo

Labels: area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI), help wanted ([up-for-grabs] Good issue for external contributors)