
all: add GOAMD64 environment variable #45453

Closed
@mdempsky


This proposal is to add a GOAMD64 environment variable, with the initial options of "baseline" (default), "v2", and "v3".

Most Go architectures support a GO[arch] environment variable to control architecture-specific options: GO386, GOARM, GOMIPS, GOMIPS64, GOPPC64, GOWASM. However, the AMD64 port (presumably the most common architecture Go is deployed on) still limits itself to the original, now-20-year-old instruction set, with occasional runtime CPUID detection when the savings are significant enough to merit it. (For comparison, GOPPC64 supports optimizing for power9, which only became available in 2017.)

This is further complicated by x86-64 having accumulated many, many instruction set extensions, with each processor revision having a different set of supported extensions. Making users responsible for deciding what set of extensions to enable doesn't feel very Go-like.

However, in 2020, the x86-64 psABI added four named microarchitecture levels to help group the extensions: "x86-64 (baseline)", "x86-64-v2", "x86-64-v3", and "x86-64-v4". See https://en.wikipedia.org/wiki/X86-64#Microarchitecture_levels or https://developers.redhat.com/blog/2021/01/05/building-red-hat-enterprise-linux-9-for-the-x86-64-v2-microarchitecture-level/ for further details.

The "baseline" corresponds to what Go already supports, while "v2" and "v3" each add some new instructions that could be useful for Go programs (e.g., POPCNT in v2, BMI1/BMI2 in v3).
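To make the grouping concrete, here is a minimal sketch of how the levels stack. The feature lists follow the psABI level definitions for v2 and v3; the names `levelFeatures` and `highestLevel` and the feature strings are illustrative assumptions, not part of any Go API, and "baseline" is assumed to be present on every AMD64 CPU.

```go
package main

// levelFeatures lists the extensions each level adds on top of the previous
// one. The names and spellings here are illustrative only.
var levelFeatures = []struct {
	name     string
	features []string
}{
	{"v2", []string{"CMPXCHG16B", "LAHF-SAHF", "POPCNT", "SSE3", "SSE4_1", "SSE4_2", "SSSE3"}},
	{"v3", []string{"AVX", "AVX2", "BMI1", "BMI2", "F16C", "FMA", "LZCNT", "MOVBE", "OSXSAVE"}},
}

// highestLevel returns the highest level whose requirements, and those of
// every lower level, are all present in have. Levels are cumulative, so the
// scan stops at the first level with a missing feature.
func highestLevel(have map[string]bool) string {
	best := "baseline" // assumed: every AMD64 CPU meets the baseline
	for _, lvl := range levelFeatures {
		for _, f := range lvl.features {
			if !have[f] {
				return best
			}
		}
		best = lvl.name
	}
	return best
}
```

Because the levels are cumulative, a CPU that reports AVX2 and BMI2 but lacks POPCNT would still be classified as "baseline" by this sketch.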

v2 CPUs appear commonplace today. E.g., RHEL 9 will require at least v2, per the above blog post; all GCE CPUs support v2, and I believe all AWS and Azure CPUs do too.

v3 CPUs are also increasingly common. E.g., only GCE's Ivy Bridge and Sandy Bridge CPUs are limited to v2; Haswell (launched 2013) and newer support v3.

On issue #25489, I reported results from two optimization attempts at using Haswell's BMI instructions (PEXT for varint decoding, LZCNT and a couple others for scanobject). These are optimizations that could benefit from targeting v3 CPUs specifically, but probably wouldn't be worthwhile if they needed to rely on runtime CPUID detection.
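As a rough illustration of why runtime detection can eat the savings, the sketch below models a dispatched population count. The flag `hasPOPCNT` is a hypothetical stand-in for a value the runtime would fill in from CPUID at startup, and the function names are made up for this example; only `math/bits.OnesCount64` is a real library call.

```go
package main

import "math/bits"

// hasPOPCNT is a hypothetical stand-in for a flag set from CPUID at startup.
var hasPOPCNT = false

// popcountGeneric is a portable fallback that clears one set bit per loop.
func popcountGeneric(x uint64) int {
	n := 0
	for x != 0 {
		x &= x - 1 // clear the lowest set bit
		n++
	}
	return n
}

// popcountDispatch models runtime detection: every call pays for a flag load
// and a branch before reaching the fast path.
func popcountDispatch(x uint64) int {
	if hasPOPCNT {
		return bits.OnesCount64(x)
	}
	return popcountGeneric(x)
}
```

If the target level were known at build time (say, v2 or higher), the check in `popcountDispatch` could in principle be dropped and the fast path used unconditionally. For a single-instruction operation the branch can cost as much as the operation itself, which is the concern with CPUID-guarded fine-grained optimizations.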

It's also been suggested that at process startup, the Go runtime should throw if it's been compiled to assume instruction set extensions that aren't available on the CPU. I think that's a good idea.
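The startup check could look roughly like the following sketch. It only models the comparison logic: the level names are the ones proposed here, `checkStartup` and `levelRank` are hypothetical names, and a real runtime would derive the CPU's level from CPUID and throw rather than return an error.

```go
package main

import "fmt"

// levelRank orders the level names proposed in this issue.
var levelRank = map[string]int{"baseline": 0, "v2": 1, "v3": 2}

// checkStartup returns an error when the binary was built for a higher level
// than the CPU supports. Both arguments are level names; detecting the CPU's
// level is out of scope for this sketch.
func checkStartup(built, cpu string) error {
	b, ok := levelRank[built]
	if !ok {
		return fmt.Errorf("unknown GOAMD64 value %q", built)
	}
	c, ok := levelRank[cpu]
	if !ok {
		return fmt.Errorf("unknown CPU level %q", cpu)
	}
	if b > c {
		return fmt.Errorf("binary built with GOAMD64=%s, but this CPU only supports %s", built, cpu)
	}
	return nil
}
```

Failing early with a clear message is preferable to the alternative, where the program runs until it hits an unsupported instruction and dies with an illegal-instruction fault.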

Questions:

  • Are "baseline", "v2", and "v3" the best names? "v1" would perhaps be better than "baseline", but the psABI doesn't formally name it that. We could suggest that though?

  • Should we add "v4" too? It only adds AVX-512 instructions, which the Go compiler and runtime have no immediate use for, and whether those instructions are worth using on current processors is somewhat contentious anyway.
