Description
We currently always prefer (V)BLENDPS/D over (V)MOVSS/D on SSE41+ targets, unless we're optimizing for size, because of its better throughput on SandyBridge.
However, some AMD targets have better throughput with (V)MOVSS/D, so it would be better to remove the preference code from the DAG and ISEL patterns and let X86FixupInstTunings handle it.
This might also make it easier to improve merging of some SSE scalar ops (see #140693).
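
For context, the two instructions are interchangeable for the register-register case because BLENDPS with an immediate of 1 selects only lane 0 from the second operand, which is what MOVSS does. The following is a minimal sketch of that equivalence as a plain C++ lane model (no intrinsics), plus a hypothetical per-target choice. The names `V4F`, `movss`, `blendps`, `ScalarMoveOp` and `pickScalarMove` are illustrative assumptions and are not taken from the LLVM sources; the real decision would be driven by the scheduling model.

```cpp
#include <array>
#include <cstdint>

// Four float lanes standing in for an XMM register (illustrative model only).
using V4F = std::array<float, 4>;

// Model of register-register (V)MOVSS: lane 0 comes from src,
// lanes 1..3 are passed through from dst.
V4F movss(const V4F &dst, const V4F &src) {
  return {src[0], dst[1], dst[2], dst[3]};
}

// Model of (V)BLENDPS: bit i of imm selects src for lane i, otherwise dst.
V4F blendps(const V4F &dst, const V4F &src, std::uint8_t imm) {
  V4F out{};
  for (int i = 0; i < 4; ++i)
    out[i] = ((imm >> i) & 1) ? src[i] : dst[i];
  return out;
}

// Hypothetical tuning decision: pick MOVSS when optimizing for size or when
// the target reports better MOVSS throughput, otherwise keep BLENDPS.
enum class ScalarMoveOp { MOVSS, BLENDPS };

ScalarMoveOp pickScalarMove(bool movssHasBetterThroughput, bool optForSize) {
  return (optForSize || movssHasBetterThroughput) ? ScalarMoveOp::MOVSS
                                                  : ScalarMoveOp::BLENDPS;
}
```

In this model `blendps(a, b, 0x1)` and `movss(a, b)` produce the same lanes for any inputs, so a late fixup pass could in principle swap one for the other based on target tuning alone.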