Closed
Clang 7 added a codegen option, `-mprefer-vector-width=`, which sets the preferred vector width for autovectorization. This makes it possible to keep the compiler from emitting wide vector operations. Why? Because eagerly using wide vector operations can hurt performance on Intel CPUs in at least two ways:
- Wide vector operations can knock the CPU out of the highest turbo states, reducing the frequency for the next few million cycles;
- Some CPUs implement wide operations by fusing execution ports, so interspersing wider operations with narrower ones can reduce overall throughput.
On the other hand, AVX512 doubles the number of vector registers, so it's useful for performance to enable `-C target-cpu=skylake-avx512`, even without using 512-bit operations.
Unfortunately, there doesn't seem to be a way to use this option with rustc: it's not one of the `-C` codegen options, and passing `-C llvm-args=-prefer-vector-width=256` gives
rustc: Unknown command line argument '-prefer-vector-width=256'. Try: 'rustc -help'
rustc: Did you mean '-force-vector-width=256'?
It would be great if rustc had a way to set the preferred vector width.
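For anyone who wants to check what the autovectorizer picks today, here is a minimal sketch of the kind of loop this affects. The function name `saxpy` is only illustrative and not taken from any crate, and whether LLVM actually uses 256-bit or 512-bit registers for it depends on the LLVM version and target settings.

```rust
// Illustrative example (hypothetical name `saxpy`): a simple element-wise
// loop that LLVM's autovectorizer can typically turn into SIMD code.
// The preferred vector width decides how wide those SIMD operations are.
pub fn saxpy(a: f32, x: &[f32], y: &mut [f32]) {
    // Zipping the iterators avoids bounds checks inside the loop body,
    // which makes the loop easier for the optimizer to vectorize.
    for (yi, xi) in y.iter_mut().zip(x.iter()) {
        *yi += a * *xi;
    }
}
```

Building this with something like `rustc -O --crate-type=lib -C target-cpu=skylake-avx512 --emit asm` and looking for `zmm` (512-bit) versus `ymm` (256-bit) registers in the output should show which width was chosen.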