crypto/sha256: amd64 block assembly smashes frame pointer register #63508
Labels: NeedsInvestigation, OS-Linux, Performance
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
linux-amd64
What did you do?
Profile the crypto/sha256 benchmarks, using both the built-in pprof profiling and Linux perf. Then open the profiles with pprof and compare the flame graphs.
What did you expect to see?
Approximately the same profile.
What did you see instead?
The pprof profile looks normal, as expected.
The perf profile has broken call stacks, displaying sha256.block by itself, rather than as a callee of Write.
pprof profiling uses the runtime's internal knowledge of frame sizes to find the caller, so it works fine. The Linux kernel doesn't know about Go's metadata and instead typically uses frame pointers (BP) to find callers.
For some reason, the amd64 implementation of block overwrites BP, which I presume is what is confusing Linux.
I admittedly don't understand everything going on in that function (wow, it is complicated!), but writing to BP doesn't seem to be necessary? The function doesn't seem to actually adjust the stack or create new frames? It looks like BP might just be getting used as a scratch register?
It looks like this has existed since the original https://go.dev/cl/28460043.
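To make the suspected failure mode concrete, here is a toy model of frame-pointer unwinding. It is only a sketch, not the kernel's or perf's actual code: the addresses, the memory map, the shortened function names, and the helpers buildStack and unwind are all hypothetical. It assumes the usual amd64 layout, where the word at [BP] holds the caller's saved BP and the word at [BP+8] holds the return address.

```go
package main

import "fmt"

const wordSize = 8 // bytes per stack slot on amd64

// buildStack returns a made-up memory image for the call chain
// runtime.main -> main.main -> sha256.Write -> sha256.block, plus a table
// mapping (hypothetical) return addresses to function names.
func buildStack() (map[uint64]uint64, map[uint64]string) {
	mem := map[uint64]uint64{
		// Frame set up on entry to block.
		0x5000: 0x6000, // saved BP of Write
		0x5008: 0x300,  // return address inside Write
		// Frame set up on entry to Write.
		0x6000: 0x7000, // saved BP of main.main
		0x6008: 0x200,  // return address inside main.main
		// Frame set up on entry to main.main.
		0x7000: 0x0,   // zero saved BP ends the chain
		0x7008: 0x100, // return address inside runtime.main
	}
	names := map[uint64]string{
		0x300: "sha256.Write",
		0x200: "main.main",
		0x100: "runtime.main",
	}
	return mem, names
}

// unwind follows the saved-BP chain starting from the BP register value
// captured at sample time. The leaf comes from the sampled instruction
// pointer, so it is always known; every caller depends on BP being intact.
func unwind(mem map[uint64]uint64, names map[uint64]string, leaf string, bp uint64) []string {
	stack := []string{leaf}
	for depth := 0; bp != 0 && depth < 64; depth++ {
		ret, ok := mem[bp+wordSize]
		if !ok {
			break // BP does not point at a frame: the chain is broken
		}
		name, ok := names[ret]
		if !ok {
			break // return address is not in any known function
		}
		stack = append(stack, name)
		bp = mem[bp]
	}
	return stack
}

func main() {
	mem, names := buildStack()

	// BP still points at block's frame: all callers are recovered.
	fmt.Println(unwind(mem, names, "sha256.block", 0x5000))

	// BP was reused as a scratch register and now holds arbitrary data:
	// only the leaf survives.
	fmt.Println(unwind(mem, names, "sha256.block", 0xdeadbeef))
}
```

In this model the second call returns only sha256.block, which matches the shape of the broken perf stacks described above. An unwinder that relies on frame-size metadata instead of BP, as pprof does, would not be affected.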
cc @FiloSottile @rolandshoemaker @felixge @4a6f656c