This repository was archived by the owner on Dec 22, 2021. It is now read-only.
Inefficient x64 codegen for all_true/any_true #189
Open
Description
`all_true` checks if all lanes are (unsigned) greater than 0. This requires 4 instructions in cranelift and 6 in v8. Perhaps there is a more granular way to reduce lanes (see movemask) and avoid this inefficiency?
Along these lines, `any_true` is 4 instructions in v8 and could be 2, as in cranelift, with the use of `SETcc`.
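To make the movemask idea concrete, here is a rough sketch of how both reductions could be expressed for 16 byte-sized lanes using only SSE2 intrinsics from Rust's `std::arch`. This is an illustration, not the sequence cranelift or v8 actually emits: the helper `zero_lane_mask`, the `[u8; 16]` argument type, and the choice of 8-bit lanes are all assumptions made for the example, and wider lanes would need a different compare. A scalar fallback is included so the snippet also builds on non-x64 targets.

```rust
#[cfg(target_arch = "x86_64")]
mod lanes {
    use std::arch::x86_64::*;

    /// Hypothetical helper: bit i of the result is set when byte lane i is zero.
    fn zero_lane_mask(v: [u8; 16]) -> i32 {
        // SAFETY: SSE2 is part of the x86-64 baseline, and the unaligned
        // load has no alignment requirement on the source pointer.
        unsafe {
            let x = _mm_loadu_si128(v.as_ptr() as *const __m128i);
            // PCMPEQB against zero: a lane becomes 0xFF where the input lane is 0.
            let is_zero = _mm_cmpeq_epi8(x, _mm_setzero_si128());
            // PMOVMSKB: gather the top bit of each lane into a 16-bit mask.
            _mm_movemask_epi8(is_zero)
        }
    }

    /// True when every lane is non-zero (no lane compared equal to zero).
    pub fn all_true(v: [u8; 16]) -> bool {
        zero_lane_mask(v) == 0
    }

    /// True when at least one lane is non-zero (not all 16 lanes were zero).
    pub fn any_true(v: [u8; 16]) -> bool {
        zero_lane_mask(v) != 0xFFFF
    }
}

#[cfg(not(target_arch = "x86_64"))]
mod lanes {
    // Scalar fallback with the same semantics, for non-x64 targets.
    pub fn all_true(v: [u8; 16]) -> bool {
        v.iter().all(|&b| b != 0)
    }

    pub fn any_true(v: [u8; 16]) -> bool {
        v.iter().any(|&b| b != 0)
    }
}

use lanes::{all_true, any_true};
```

In this sketch the final comparison of the mask is the part a backend would be expected to lower to a flag-setting compare followed by `SETcc`; whether the whole sequence beats the current lowerings would have to be checked against the generated code.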