Description
Older x86 hardware only had the legacy SSE encoding, which encodes two parameters: dst/op1 and op2. These instructions are considered RMW (read-modify-write) since dst and op1 are encoded in the same parameter. To handle this, you generally need to insert an additional move instruction if the register allocator did not assign dst and op1 to the same register.
Newer x86 hardware (anything with AVX support) has the VEX encoding, which takes three parameters: dst, op1, and op2. This encoding is not RMW and does not require an additional move instruction. It is also more efficient: it takes the same number of bytes to encode (for the same allocated registers), or fewer bytes when dst != op1, since no additional move instruction needs to be encoded.
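To make the difference concrete, here is a minimal sketch of the two emission strategies for a packed single-precision add. It is not the JIT's actual code: `xmm` and `emitAddps` are hypothetical names invented for illustration, and registers are modeled as plain integers. It only shows when the legacy form needs the extra move and when it does not.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: formats register number n as "xmmN".
static std::string xmm(int n) {
    assert(n >= 0 && n < 16);
    return "xmm" + std::to_string(n);
}

// Hypothetical model of emitting dst = op1 + op2 (packed single-precision).
// Returns the instruction sequence as text rather than encoded bytes.
std::vector<std::string> emitAddps(bool useVex, int dst, int op1, int op2) {
    if (useVex) {
        // VEX form: three operands, so dst is independent of op1.
        return {"vaddps " + xmm(dst) + ", " + xmm(op1) + ", " + xmm(op2)};
    }
    if (dst == op1) {
        // Legacy form, dst already holds op1: a single RMW instruction.
        return {"addps " + xmm(dst) + ", " + xmm(op2)};
    }
    if (dst == op2) {
        // Addition is commutative, so the operands can be swapped instead
        // of moving op1 into dst (which would overwrite op2).
        return {"addps " + xmm(dst) + ", " + xmm(op1)};
    }
    // Legacy form, dst differs from both sources: copy op1 into dst first.
    return {"movaps " + xmm(dst) + ", " + xmm(op1),
            "addps " + xmm(dst) + ", " + xmm(op2)};
}
```

For example, `emitAddps(false, 0, 1, 2)` yields two instructions (a `movaps` followed by `addps`), while `emitAddps(true, 0, 1, 2)` yields the single instruction `vaddps xmm0, xmm1, xmm2`. A non-commutative operation such as subtraction could not use the operand swap in the `dst == op2` case and would need a different strategy there.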
We already emit the VEX encoding by default for floating-point instructions; however, codegen for non-intrinsic codepaths is not VEX aware. It still treats floating-point operations as RMW, as if the encoding only supported dst/op1 and op2.
It would be beneficial if the codegen and register allocator were updated to be VEX aware and to call the appropriate emit_SIMD codepath (which handles the differences between the VEX and legacy encodings) where possible.
category:cq
theme:floating-point
skill-level:expert
cost:large