Description
Similar to: #45453.
Each ARM sub-architecture iteration adds some new instructions, some of which may be useful for Go. For example, ARMv8.1 introduced:
- v8.1: FEAT_LSE, Large System Extensions (though it looks like there have been changes to this extension in later iterations)
  - LDADD (atomic add), useful for atomic.Add*.
  - LDSETAL (atomic or), see sync/atomic: add OR/AND operators for unsigned types #61395.
  - CAS (compare and swap), useful for atomic.CompareAndSwap*.
  - SWP (swap), useful for atomic.Swap*.
  - ...more similar atomic instructions.
These instructions are already used after a capability check (thanks @prattmic for pointing this out). The capability-check jumps predict perfectly, but the blocks aren't tiny either (see godbolt).
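To make the shape of that check concrete, here is a rough source-level model of it. This is a minimal sketch, not the code the compiler or runtime actually emits: hasLSE, addSingleInstruction, and addRetryLoop are hypothetical names, and both paths are written with sync/atomic so the example runs on any platform.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// hasLSE is a hypothetical stand-in for a CPU feature flag that would be
// filled in once at startup. It is an assumption made for illustration.
var hasLSE = true

// addSingleInstruction models the path that would use one LDADD-style
// instruction on hardware that has the extension.
func addSingleInstruction(p *int64, delta int64) int64 {
	return atomic.AddInt64(p, delta)
}

// addRetryLoop models the less capable path: load, compute, and retry
// until the compare-and-swap succeeds.
func addRetryLoop(p *int64, delta int64) int64 {
	for {
		old := atomic.LoadInt64(p)
		if atomic.CompareAndSwapInt64(p, old, old+delta) {
			return old + delta
		}
	}
}

// Add tests the feature flag on every call. Both paths have to be present
// in the binary, which is where the extra code size comes from.
func Add(p *int64, delta int64) int64 {
	if hasLSE {
		return addSingleInstruction(p, delta)
	}
	return addRetryLoop(p, delta)
}

func main() {
	var counter int64
	Add(&counter, 2) // takes the single-instruction path
	hasLSE = false
	Add(&counter, 3) // takes the retry-loop path
	fmt.Println(counter) // prints 5
}
```

The branch itself is cheap because it always goes the same way on a given machine. What it costs is the fallback block that sits next to every use.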
It might be useful to define some new possible GOARM values akin to GOAMD64 to elide these checks, potentially increasing performance and reducing binary sizes by some tiny amount. I have not done any performance measurements, but I heard secondhand from colleagues that using these instructions was a significant win for them (I did not ask whether they used a capability check, or whether they were comparing against always using the less capable version).
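As an illustration of what eliding the check would mean, the sketch below replaces the runtime flag with a build-time constant. It is only a model under stated assumptions: assumeLSE and detectedLSE are hypothetical names, and the constant is hard-coded here, whereas the proposal would have the toolchain derive it from the target setting.

```go
package main

import "fmt"

// assumeLSE is a hypothetical build-time constant standing in for "the
// target is guaranteed to have the extension". Hard-coded for illustration.
const assumeLSE = true

// detectedLSE is a hypothetical runtime flag, as in a build that cannot
// assume anything about the CPU.
var detectedLSE = false

// path reports which implementation would be chosen. Because assumeLSE is
// a constant, the compiler can fold the condition and may drop the
// fallback branch, so no runtime check needs to remain.
func path() string {
	if assumeLSE || detectedLSE {
		return "single-instruction"
	}
	return "fallback"
}

func main() {
	fmt.Println(path()) // prints "single-instruction"
}
```

Flipping assumeLSE to false gives back the runtime-checked behaviour, which is roughly the difference between building for a baseline target and building for a newer one.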
Other potential candidates:
- v8.1: FEAT_CRC32, Changes to CRC32 instructions (made mandatory)
- v8.8: FEAT_MOPS, Standardization of memory operations (slides). runtime.memmove is >1% of cycles when looking at data available here. Using FEAT_MOPS unconditionally could also have good icache effects.
- v8.8: FEAT_HBC, Hinted conditional branches (perhaps indented branches or anything that can be proven to contain a non-nil error via dataflow could be hinted "unlikely").
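For context on why runtime.memmove shows up in profiles: ordinary Go code usually reaches it through the built-in copy, which has memmove semantics (overlapping source and destination are handled correctly). A minimal sketch, where shiftRight is a hypothetical helper written only for this example:

```go
package main

import "fmt"

// shiftRight moves the contents of buf n positions to the right in place,
// leaving the first n bytes unchanged. Source and destination overlap, so
// this relies on copy behaving like memmove rather than a naive forward loop.
func shiftRight(buf []byte, n int) {
	if n <= 0 || n >= len(buf) {
		return // nothing to move
	}
	copy(buf[n:], buf[:len(buf)-n])
}

func main() {
	buf := []byte("abcdefgh")
	shiftRight(buf, 2)
	fmt.Println(string(buf)) // prints "ababcdef"
}
```

Any speedup or code-size reduction in the memmove implementation would apply to call sites like this without source changes.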