Zero register optimization for AVX-512-VBMI #14241

Whatcookie · 2023-07-22T01:15:23Z

I had an idea for an optimization and decided to write it before I forgot about it.

This takes advantage of the fact that VEX-encoded 128-bit AVX instructions zero the upper 128 bits of the destination register, which allows a nice optimization when one input vector is zeroed, using the same 256-bit-wide vpermb trick. A 256-bit vpermb indexes 32 bytes, so index bit 0x10, which would otherwise select the second (zeroed) vector, conveniently selects the already-zeroed upper half of the register instead.
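To make the index arithmetic concrete, here is a minimal scalar sketch of the idea in C++. It is not the code from this PR: it only models the byte-selection semantics of a two-source 128-bit byte permute and a single-source 256-bit vpermb, and all function names (`permute_two_sources`, `permute_wide_low`, `widen_zero_upper`, `check_equivalence`) are hypothetical names chosen for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Bytes16 = std::array<std::uint8_t, 16>;
using Bytes32 = std::array<std::uint8_t, 32>;

// Scalar model of a 128-bit two-source byte permute:
// index bits 3:0 pick the byte, index bit 0x10 picks the source table.
Bytes16 permute_two_sources(const Bytes16& idx, const Bytes16& a, const Bytes16& b)
{
    Bytes16 out{};
    for (std::size_t i = 0; i < 16; ++i)
    {
        const std::uint8_t sel = idx[i] & 0x1F;
        out[i] = (sel & 0x10) ? b[sel & 0x0F] : a[sel & 0x0F];
    }
    return out;
}

// Scalar model of a 256-bit single-source vpermb, keeping only the low
// 16 result bytes: index bits 4:0 pick one of the 32 table bytes.
Bytes16 permute_wide_low(const Bytes16& idx, const Bytes32& table)
{
    Bytes16 out{};
    for (std::size_t i = 0; i < 16; ++i)
    {
        out[i] = table[idx[i] & 0x1F];
    }
    return out;
}

// Models a 256-bit register whose low half holds `a` and whose upper
// 128 bits are zero, as left behind by a 128-bit VEX-encoded write.
Bytes32 widen_zero_upper(const Bytes16& a)
{
    Bytes32 out{}; // value-initialized, so bytes 16..31 are zero
    for (std::size_t i = 0; i < 16; ++i)
    {
        out[i] = a[i];
    }
    return out;
}

// With a zeroed second source, both models must agree for any index vector.
void check_equivalence(const Bytes16& idx, const Bytes16& a)
{
    const Bytes16 zero{};
    assert(permute_two_sources(idx, a, zero) == permute_wide_low(idx, widen_zero_upper(a)));
}
```

In this model, any index with bit 0x10 set lands in bytes 16..31 of the widened table, which are already zero, so no separate zero vector is needed as a second source.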

- Take advantage of the fact that AVX instructions zero the upper 128 bits for a nice optimization when one input vector is zeroed

Zero register optimization for AVX-512-VBMI

eabb2e7

- Take advantage of the fact that AVX instructions zero the upper 128 bits for a nice optimization when one input vector is zeroed

Whatcookie force-pushed the shufb_utopia branch from 81aa8ac to eabb2e7 on July 22, 2023 01:27

elad335 added the LLVM label (Related to LLVM instruction decoders) Jul 22, 2023

Merge branch 'master' into shufb_utopia

4796d54

elad335 requested a review from Nekotekina August 6, 2023 08:43

Megamouse assigned Nekotekina Aug 15, 2023

Nekotekina merged commit 290ff5b into RPCS3:master Aug 28, 2023
