Description
I presume support for AVX-512 intrinsics is planned somewhere, but I couldn't find an existing issue tracking their addition. There seem to be two parts to this:
- Support for EVEX encoding and use of zmm registers. I'm not entirely clear on the compiler versus JIT distinction, but perhaps this would allow the JIT to emit EVEX encodings for existing 128- and 256-bit-wide code that uses the Sse*, Avx*, or other System.Runtime.Intrinsics.X86 classes.
- Addition of Avx512 classes exposing the new instructions at 128-, 256-, and 512-bit widths.
There is some interface complexity with the (as of this writing) 17 AVX-512 subsets, since Knights Landing/Mill, Skylake, Cannon Lake, Cascade Lake, Cooper Lake, and Ice/Tiger Lake all support different combinations. To me, it seems most natural to deprioritize support for the Knights parts (they're no longer in production, so presumably nearly all code targeting them has already been written) and implement something in the direction of
class Avx512FCD : Avx2 // minimum common set across all Intel CPUs with AVX-512
class Avx512VLDQBW : Avx512FCD // common set for enabled Skylake μarch cores and Sunny Cove
plus non-inheriting classes for BITALG, IFMA52, VBMI, VBMI2, VNNI, BF16, and VP2INTERSECT (the remaining four subsets, 4FMAPS, 4VNNIW, ER, and PF, are specific to Knights). This is similar to the existing model for Bmi1, Bmi2, and Lzcnt, and it aligns with current hardware in a way that composes with the existing inheritance and IsSupported properties. It also helps with incremental rollout.
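To make the composition concrete, here is a rough sketch of how the inheriting and non-inheriting classes might chain their IsSupported checks. Everything in it is hypothetical: the namespace `IntrinsicsSketch`, the `FakeCpu` flags standing in for what the JIT would detect, the `Avx512Vbmi` class name, and the `Dispatch.PickPath` helper are all made up for illustration, and the classes carry no actual intrinsics.

```csharp
namespace IntrinsicsSketch
{
    // Hypothetical stand-in for hardware detection so the sketch runs anywhere.
    public static class FakeCpu
    {
        public static bool HasAvx2 = true;
        public static bool HasFCD = true;     // AVX-512 F + CD
        public static bool HasVLDQBW = true;  // AVX-512 VL + DQ + BW
        public static bool HasVbmi = false;   // AVX-512 VBMI
    }

    // Mock of the existing Avx2 class (the real one lives in System.Runtime.Intrinsics.X86).
    public abstract class Avx2
    {
        public static bool IsSupported => FakeCpu.HasAvx2;
    }

    // Proposed: minimum common set across Intel CPUs with AVX-512.
    public abstract class Avx512FCD : Avx2
    {
        public new static bool IsSupported => Avx2.IsSupported && FakeCpu.HasFCD;
    }

    // Proposed: adds VL, DQ, and BW on top of F and CD.
    public abstract class Avx512VLDQBW : Avx512FCD
    {
        public new static bool IsSupported => Avx512FCD.IsSupported && FakeCpu.HasVLDQBW;
    }

    // Proposed non-inheriting class: its check is independent of the chain above.
    public abstract class Avx512Vbmi
    {
        public static bool IsSupported => FakeCpu.HasVbmi;
    }

    public static class Dispatch
    {
        // Picks the widest available code path, falling back step by step.
        public static string PickPath()
        {
            if (Avx512VLDQBW.IsSupported)
                return Avx512Vbmi.IsSupported ? "avx512vldqbw+vbmi" : "avx512vldqbw";
            if (Avx512FCD.IsSupported)
                return "avx512fcd";
            return Avx2.IsSupported ? "avx2" : "scalar";
        }
    }
}
```

The point of the sketch is only that a caller tests one IsSupported per inheritance level plus one per non-inheriting subset it needs, which is the same pattern existing Sse/Avx code already follows.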
Finding names that keep code readable while still making clear which instructions are available where seems somewhat tricky. Personally, I'd be content with idioms like
using Avx512 = System.Runtime.Intrinsics.X86.Avx512VLDQBW; // loose terminology
but hopefully others will have better ideas.
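For reference, a minimal sketch of how that alias idiom would read at a call site. The `AliasSketch.X86` namespace, the stub `Avx512VLDQBW` class with a hard-coded IsSupported, and the `AliasDemo.Describe` helper are assumptions standing in for the proposed System.Runtime.Intrinsics.X86 type, which doesn't exist under that name.

```csharp
// File-scope alias: call sites say "Avx512" while the declaration records the exact subset.
using Avx512 = AliasSketch.X86.Avx512VLDQBW;

namespace AliasSketch.X86
{
    // Hypothetical stub for the proposed class; a real one would expose intrinsics.
    public abstract class Avx512VLDQBW
    {
        public static bool IsSupported => false; // placeholder, not real detection
    }
}

namespace AliasSketch
{
    public static class AliasDemo
    {
        // Reads like generic "AVX-512" code, but binds to the VL+DQ+BW class.
        public static string Describe() =>
            Avx512.IsSupported ? "512-bit path" : "fallback path";
    }
}
```

The trade-off is that the precise subset is only visible at the top of the file, so a reader of the call site has to look there to know which instructions are actually in scope.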