Proposal Details
Motivation
AVX-IFMA, AVX-VNNI, and AVX-VNNI-INT8 are instruction set extensions in modern x86 CPUs that speed up tensor computations and, more specifically, neural network / AI workloads.
While Go itself does not use them directly, some projects written in Go use the x/sys/cpu package to change their behavior based on the available CPU features (for example, Ollama selects the model runner that best matches the CPU).
Adding the ability to detect these features would let such projects take advantage of them. For example, Ollama would be able to select a runner built with VNNI support, which it currently cannot do without parsing /proc/cpuinfo or resorting to similarly ugly solutions.
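To illustrate the kind of consumer code this would enable, here is a minimal sketch of feature-based runner selection. The `cpuFeatures` struct is a hypothetical stand-in for the relevant `cpu.X86` fields (with `HasAVXVNNI` being one of the fields proposed here), and the runner names are made up; this is not how Ollama actually does it.

```go
package main

import "fmt"

// cpuFeatures is a hypothetical stand-in for the subset of cpu.X86 fields
// a project would consult. HasAVXVNNI is one of the fields proposed here.
type cpuFeatures struct {
	HasAVX2    bool
	HasAVXVNNI bool
}

// pickRunner returns the name of the most specialized runner the CPU can
// execute. The runner names are invented for illustration only.
func pickRunner(f cpuFeatures) string {
	switch {
	case f.HasAVX2 && f.HasAVXVNNI:
		return "cpu_avx2_vnni"
	case f.HasAVX2:
		return "cpu_avx2"
	default:
		return "cpu"
	}
}

func main() {
	fmt.Println(pickRunner(cpuFeatures{HasAVX2: true, HasAVXVNNI: true}))
	fmt.Println(pickRunner(cpuFeatures{HasAVX2: true}))
	fmt.Println(pickRunner(cpuFeatures{}))
}
```

With the proposed fields in place, a real project would presumably fill such a struct from `cpu.X86` instead of parsing /proc/cpuinfo.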
Proposal
Add the following members to the cpu.X86 struct in cpu.go:
HasAVXIFMA bool // Advanced vector extension Integer Fused Multiply Add
HasAVXVNNI bool // Advanced vector extension Vector Neural Network Instructions
HasAVXVNNIInt8 bool // Advanced vector extension Vector Neural Network Int8 instructions
and populate them in cpu_x86.go as follows:
eax7, ebx7, ecx7, edx7 := cpuid(7, 0) // also save eax (to check if we can call with ecx=1)
if eax7 > 0 {
	eax71, _, _, edx71 := cpuid(7, 1) // move it out of the X86.HasAVX512 block and save edx
	X86.HasAVX512BF16 = isSet(5, eax71) // move this out too, since we check eax7 before calling CPUID with ecx=1
	X86.HasAVXIFMA = isSet(23, eax71)
	X86.HasAVXVNNI = isSet(4, eax71)
	X86.HasAVXVNNIInt8 = isSet(4, edx71)
}
Please note that I used Wikipedia for reference here, so the CPUID bits might be off.
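To make the bit layout easier to check in isolation, the decoding step can be sketched as a pure function over register values, using the bit positions proposed above. The `leaf7Sub1` type and `decodeLeaf7Sub1` function are hypothetical names introduced for this example; `isSet` is assumed to behave like the helper of the same name in x/sys/cpu. The actual CPUID call needs assembly, so synthetic register values are used here.

```go
package main

import "fmt"

// leaf7Sub1 is a hypothetical container for the flags read from
// CPUID leaf 7, subleaf 1.
type leaf7Sub1 struct {
	HasAVX512BF16  bool
	HasAVXIFMA     bool
	HasAVXVNNI     bool
	HasAVXVNNIInt8 bool
}

// isSet reports whether bit bitpos of value is set.
func isSet(bitpos uint, value uint32) bool {
	return value&(1<<bitpos) != 0
}

// decodeLeaf7Sub1 decodes the feature bits from EAX and EDX of
// CPUID(7, 1). maxSubleaf is EAX from CPUID(7, 0); when it is zero,
// subleaf 1 is not available and no feature is reported.
func decodeLeaf7Sub1(maxSubleaf, eax71, edx71 uint32) leaf7Sub1 {
	if maxSubleaf == 0 {
		return leaf7Sub1{}
	}
	return leaf7Sub1{
		HasAVX512BF16:  isSet(5, eax71),
		HasAVXIFMA:     isSet(23, eax71),
		HasAVXVNNI:     isSet(4, eax71),
		HasAVXVNNIInt8: isSet(4, edx71),
	}
}

func main() {
	// Synthetic registers: EAX bits 4 and 23 set, EDX bit 4 set.
	f := decodeLeaf7Sub1(1, 1<<4|1<<23, 1<<4)
	fmt.Printf("%+v\n", f)
}
```

Keeping the decoding separate from the CPUID call would also allow the bit positions to be covered by a table-driven test once they are confirmed against the vendor documentation.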