Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/binary: add var NativeEndian; also x/sys/cpu.IsBigEndian #57237

Closed
bradfitz opened this issue Dec 11, 2022 · 49 comments
Closed

encoding/binary: add var NativeEndian; also x/sys/cpu.IsBigEndian #57237

bradfitz opened this issue Dec 11, 2022 · 49 comments

Comments

@bradfitz
Copy link
Contributor

bradfitz commented Dec 11, 2022

I'd like to revisit #35398 (proposal: encoding/binary: add NativeEndian). Sorry. I know the history & people's opinions already. But I come with a story! 😄

Go 1.19 added support forGOARCH=loong64 (https://go.dev/doc/go1.19#loong64)

So naturally somebody wanted to compile tailscale.com/cmd/tailscaled with GOOS=linux GOARCH=loong64. Compilation failed.

It turned out we have four different native endian packages in our dependency tree:

So we had to update all four, along with the various requisite go.mod bumps.

Some observations:

  • they're all very similar
  • people don't like taking dependencies on other big packages. @josharian's github.com/josharian/native that he mentioned in proposal: encoding/binary: add NativeEndian #35398 (comment) is closest, but lacks the constant that we ended up needing in Tailscale. So everybody makes their own local copies instead. That works until a new GOARCH comes along. Maybe that's rare enough? But I'm sure more riscv* variants will come along at some point.

x/sys/cpu already has this code:

https://cs.opensource.google/go/x/sys/+/refs/tags/v0.3.0:cpu/byteorder.go;l=44

And it has even more GOARCH values (for gccgo) than any other package has!

So everybody has a different subset of GOARCH values it seems.

I know people don't want to encourage thinking about or abusing endianness, but it's a reality when talking to kernel APIs. And this is kinda ridiculous, having this duplicated incompletely everywhere.

It would've been neat if Go could've added loong64 and had a bunch of code in the Go ecosystem just work right away and not require adjusting build tags.

Alternatively, if std and x/sys/cpu are too objectionable: what about new build tags?

/cc @josharian @mdlayher @hugelgupf @zx2c4 @yetist @jwhited @raggi

@gopherbot gopherbot added this to the Proposal milestone Dec 11, 2022
@mdlayher
Copy link
Member

Yes please. Another option: unsafe.NativeEndian to imply that you shouldn't use this unless you are aware of the implications?

@ianlancetaylor
Copy link
Contributor

CC @robpike

@josharian
Copy link
Contributor

@josharian's github.com/josharian/native that he mentioned in #35398 (comment) is closest, but lacks the constant that we ended up needing in Tailscale.

FWIW, I know the author of that package, and he’s amenable to adding a constant.

But an x/sys/cpu or build tag option would be better. :)

bradfitz added a commit to bradfitz/native that referenced this issue Dec 13, 2022
josharian pushed a commit to josharian/native that referenced this issue Dec 13, 2022
bradfitz added a commit to tailscale/tailscale that referenced this issue Dec 13, 2022
See josharian/native#3

Updates golang/go#57237

Change-Id: I238c04c6654e5b9e7d9cfb81a7bbc5e1043a84a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
bradfitz added a commit to bradfitz/uio that referenced this issue Dec 13, 2022
bradfitz added a commit to bradfitz/uio that referenced this issue Dec 13, 2022
Fixes u-root#7
Updates golang/go#57237

Signed-off-by: Brad Fitzpatrick <brad@danga.com>
bradfitz added a commit to tailscale/tailscale that referenced this issue Dec 13, 2022
See josharian/native#3

Updates golang/go#57237

Change-Id: I238c04c6654e5b9e7d9cfb81a7bbc5e1043a84a2
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
hugelgupf pushed a commit to u-root/uio that referenced this issue Dec 13, 2022
Fixes #7
Updates golang/go#57237

Signed-off-by: Brad Fitzpatrick <brad@danga.com>
@robpike
Copy link
Contributor

robpike commented Dec 13, 2022

You know my position about byte order. Yes, it's sometimes necessary but it's used far more than it should be. I'd prefer to take up @josharian's first suggestion and not have the standard library promote the concept.

bradfitz added a commit to bradfitz/dhcp that referenced this issue Dec 13, 2022
bradfitz added a commit to bradfitz/dhcp that referenced this issue Dec 13, 2022
To pick up this change:
u-root/uio@c353755

Updates golang/go#57237

Signed-off-by: Brad Fitzpatrick <brad@danga.com>
@bradfitz
Copy link
Contributor Author

I'd prefer to take up @josharian's first suggestion and not have the standard library promote the concept.

What about const IsBigEndian = false in x/sys/cpu? That's not the standard library. And that cpu package is very much about telling you what the CPU does.

@robpike
Copy link
Contributor

robpike commented Dec 13, 2022

I'm not going to die on this hill, but know too that big/little endian is not the full story for some architectures. Maybe Go will never support a nutty layout, but who knows?

In other words, is a boolean sufficient? It might be nowadays, but I honestly don't know. It's probably sufficient to support the architectures encoding/binary does, so maybe it's enough.

At least your suggestion doesn't promote the idea of "native" byte order explicitly.

But: A single third-party package that just told you the byte order by computing it (this can only be done unsafely, which is a BIG part of why I dislike the concept) seems like the real right answer to me.

@raggi
Copy link
Contributor

raggi commented Dec 13, 2022

Is part of the problem here more one of naming, e.g. "NativeEndian", perhaps something closer to what this is used for in most good cases is the systems ABI endianness. I'm not sure of a good short name to describe that, but perhaps if we had one it would be less objectionable?

@robpike
Copy link
Contributor

robpike commented Dec 14, 2022

@raggi Not especially. It's not what it's called, it's what it represents, an internal detail that is almost always (not always, but almost always) irrelevant. A long history of bad C code has taught people that it matters more than it does.

Only unsafe code can care.

@DeedleFake
Copy link

DeedleFake commented Dec 14, 2022

I made this exact mistake recently. I'm working on a pure Go Wayland protocol implementation. Wayland, despite being essentially a network protocol, is constrained to Unix domain sockets for various practical reasons. Since everything's on the same machine anyways, I assume, the developers decided to just use native endianness for data being sent. Early on in my project, I created a global var byteOrder binary.ByteOrder that I then set in an init() by detecting the native endianness via unsafe. It was only later that I realized that this was 100% pointless, serving literally no purpose other than hurting the performance ever so slightly. It's pointless because all it's actually doing is getting the raw bytes of various types. This is just as easily, and far more efficiently, accomplished by just, for example, doing something like *(*[4]byte)(unsafe.Pointer(&v)) to get the raw bytes of either a int32 or uint32. I wrote a couple of functions, such as func Bytes[T ~int32 | ~uint32](v T) [4]byte, to simplify it a bit further and that was the end of it.

I have to agree with @robpike. I used to wonder why this was missing but now I just don't see the point. All it would do would be to obscure stuff that probably should be unsafe. I took a look through several packages that use the above referenced native endianness packages and none of them require a native endianness. For example, Tailscale doesn't use it much, and every usage could easily be replaced with an unsafe conversion, the same goes for Cloudflare's usage of it, and I'm not even sure what's going on in this one. On the other hand, wireguard-go has a discussion that talks about unsafe casting, but only seems to consider casting an entire struct as an alternative instead of casting just the slices that they're already handling individually anyways, which seems a bit odd to me if the goal is remove the dependency.

Unless I'm missing something, none of these seem like they actually require any kind of NativeEndian and their insistence on using a library to detect how their system reads and writes memory instead of just reading and writing that memory and letting the computer do what it's designed to do is causing issues that they just simply wouldn't otherwise have. It's also notable that every usage that I saw directly calls the methods on the ByteOrder implementation, rather than passing it to binary.Read() or binary.Write(). Calling the methods directly is even more pointless, since you have to know the size in advance anyways.

Endianness is generally not something that should affect anything outside of network protocols and file formats. In other words, it only really matters for stuff where data could be read by two processes that might be running on different computers. In those cases, the endianness needs to be explicitly declared as something standardized between the two, hence binary.LittleEndian and binary.BigEndian. I don't know what the point of a native endian implementation of binary.ByteOrder would be other than to confuse people into thinking that there's a similarity in the usage between native endianness and predefined endianness.

@zx2c4
Copy link
Contributor

zx2c4 commented Dec 14, 2022

You have to be careful with alignment on some archs with that sort of unsafe cast.

Typically what I've done in C is memcpy to the destination and let gcc figure out whether that has to be byte by byte or can be word-wise, with load store folding removing the insufficiency.

@DeedleFake
Copy link

DeedleFake commented Dec 14, 2022

You have to be careful with alignment on some archs with that sort of unsafe cast.

How so, and how does a NativeEndian implementation help with that? If you can reference the memory, the computer shouldn't care what type you're treating the bytes as I would think. For example, what's the difference between

v.hdrLen = endian.Native.Uint16(b[2:])

and

v.hdrLen = *(*uint16)(unsafe.Pointer(&b[2]))

Genuinely curious. It's reading the same bytes in the same order. If there are issues with reading those bytes that seems like it should be something the compiler should handle transparently.

@josharian
Copy link
Contributor

@DeedleFake the compiler only coalesces byte-by-byte memory reads/writes if it knows that it is safe to do so, based on the architecture, alignment, whatnot. The unsafe version says to the compiler: just do it, whether or not it is safe.

@DeedleFake
Copy link

DeedleFake commented Dec 14, 2022

What actually happens if you attempt to do

v.hdrLen = *(*uint16)(unsafe.Pointer(&b[2]))

on something like MIPS and the alignment isn't correct? Do you just get junk data? Does it panic? That's not something I've run into before, despite working with some relatively low level stuff on MIPS a bit a fair while back.

Edit: Never mind. I think I answered my own question.

coadler pushed a commit to coder/tailscale that referenced this issue Feb 2, 2023
See golang/go#57237

Change-Id: If47ab6de7c1610998a5808e945c4177c561eab45
Signed-off-by: Brad Fitzpatrick <bradfitz@tailscale.com>
matzf added a commit to matzf/scion that referenced this issue Feb 3, 2023
No longer needed.
Native byte order is not often needed, but will eventually show up in
standard library anyway (golang/go#57237).
matzf added a commit to scionproto/scion that referenced this issue Feb 9, 2023
The pkg/private/common, util and xtest packages have rather fuzzy scope,
and have accumulated a bit of cruft and unused or outdated
functionality. Clean this up a bit:

* pkg/private/common: 
    * remove unused constants
    * remove outdated error handling helpers and replace remaining use
    * remove NativeOrder and IsBigEndian: No longer needed.
      Native byte order is not often needed, but will eventually show up
      in standard library anyway (golang/go#57237).
* pkg/private/util:
    * remove unused helper functionality
    * remove Checksum: only used to compute reference value in slayers
      test cases. Use a simpler, non-optimized implementation for this.
      Closes #4262.
    * move RunsInDocker to private/env
    * move ASList to tools/integration
* pkg/private/xtest: 
    * remove unused helpers
    * remove unused Callback and MockCallback
    * replace FailOnErr with require.NoError
    * replace AssertErrorsIs with assert.ErrorIs


There are still more things to clean up in `pkg/private`, in future PRs,
in particular: 
* `common.ErrMsg` should be integrated in `serrors`
* `common.IFIDType` should be removed or renamed and moved somewhere
  more appropriate
* Merge the remainder of `util` and `common` 
* Clean up  `LinkType` and `RevInfo` from `pkg/private/ctrl`
johanbrandhorst pushed a commit to Pryz/go that referenced this issue Feb 12, 2023
Updates golang#57237

Change-Id: Ib626610130cae9c1d1aff5dd2a5035ffde0e127f
Reviewed-on: https://go-review.googlesource.com/c/go/+/463985
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: xie cui <523516579@qq.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Run-TryBot: Ian Lance Taylor <iant@google.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
@tklauser
Copy link
Member

tklauser commented May 13, 2023

In the current implementation added by CL 463218 we can neither do:

binary.NativeEndian == binary.LittleEndian

or

binary.NativeEndian == binary.Little

because the underlying types nativeEndian and {little,big}Endian don't match and the code won't compile:

./endian_test.go:18:28: invalid operation: binary.NativeEndian == binary.LittleEndian (mismatched types binary.nativeEndian and binary.littleEndian)
./endian_test.go:21:28: invalid operation: binary.NativeEndian == binary.BigEndian (mismatched types binary.nativeEndian and binary.bigEndian)

Is this intended?

Context is that I wanted to add the following test to x/sys/cpu which currently fails on both little and big endian machines:

func TestIsBigEndian(t *testing.T) {
        var want binary.ByteOrder = binary.LittleEndian
        if cpu.IsBigEndian {
                want = binary.BigEndian
        }
        if binary.NativeEndian != want {
                t.Errorf("IsBigEndian = %v not consistent with binary.NativeEndian: got %T, want %T",
                        cpu.IsBigEndian, binary.NativeEndian, want)
        }
}

In Russ' comment #57237 (comment) I noticed:

We can't use a type alias for nativeEndian, nor can do...

var NativeEndian bigEndian // or littleEndian

with build tags because then

x := binary.NativeEndian
x == binary.LittleEndian

or even

binary.NativeEndian == binary.LittleEndian

would only compile on certain machines.

I'm not sure I understand this part of the comment, i.e. why the code above would only compile on certain machines in case nativeEndian is a type alias. Both littleEndian and bigEndian are defined regardless of GOARCH.

I think changing to a type alias would solve the problem state above and make TestIsBigEndian pass on both little and big endian machines. Or am I missing something?

@randall77
Copy link
Contributor

On a little-endian machine, then binary.NativeEndian == binary.LittleEndian would work, but binary.NativeEndian == binary.BigEndian doesn't. It gives the error

invalid operation: binary.NativeEndian == binary.BigEndian (mismatched types binary.littleEndian and binary.bigEndian)

Something like binary.NativeEndian == any(binary.BigEndian) would work though.

And you can always do binary.NativeEndian.Uint16([]byte{0, 1}) == 1.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/494856 mentions this issue: cpu: define IsBigEndian on wasm

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/494858 mentions this issue: cpu: add test for IsBigEndian

gopherbot pushed a commit to golang/sys that referenced this issue May 16, 2023
For golang/go#57237

Change-Id: Ia2aa943e93c8df51cd08003535fa026d2bb8003e
Reviewed-on: https://go-review.googlesource.com/c/sys/+/494856
TryBot-Result: Gopher Robot <gobot@golang.org>
Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
gopherbot pushed a commit to golang/sys that referenced this issue May 16, 2023
For golang/go#57237

Change-Id: I11d1c954f942ebd836c4deb9c710c40778dc5599
Reviewed-on: https://go-review.googlesource.com/c/sys/+/494858
Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
Reviewed-by: Heschi Kreinick <heschi@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/501615 mentions this issue: all: update x/sys to HEAD

gopherbot pushed a commit to golang/net that referenced this issue Jun 7, 2023
An update is needed to pull in CL 494856, to allow the bpf test to
build on js and wasip1 after the test changes in CL 501155.

Updates golang/go#55235.
Updates golang/go#57237.

Change-Id: Iff48bad97453932065c27b0c8b4a3706ddcf659a
Reviewed-on: https://go-review.googlesource.com/c/net/+/501615
Reviewed-by: Tobias Klauser <tobias.klauser@gmail.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
Auto-Submit: Bryan Mills <bcmills@google.com>
Reviewed-by: Matt Layher <mdlayher@gmail.com>
Reviewed-by: Ian Lance Taylor <iant@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
@gopherbot gopherbot modified the milestones: Go1.21, Go1.22 Aug 8, 2023
@ianlancetaylor
Copy link
Contributor

As far as I can tell this is completed. Please comment if I'm missing something.

@gopherbot
Copy link
Contributor

Change https://go.dev/cl/524675 mentions this issue: _content/doc/go1.21: mention encoding/binary.NativeEndian

@dmitshur
Copy link
Contributor

Also moving back to Go1.21 milestone, since that's when this was completed.
(Gopherbot automatically moved it to the next milestone because it was still open.)

@dmitshur dmitshur modified the milestones: Go1.22, Go1.21 Aug 31, 2023
gopherbot pushed a commit to golang/website that referenced this issue Aug 31, 2023
For golang/go#57237
Fixes golang/go#62349

Change-Id: I2407281b1635e3689f6e225da2ea7f562f321203
Reviewed-on: https://go-review.googlesource.com/c/website/+/524675
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
willpoint pushed a commit to orijtech/website that referenced this issue Oct 17, 2023
For golang/go#57237
Fixes golang/go#62349

Change-Id: I2407281b1635e3689f6e225da2ea7f562f321203
Reviewed-on: https://go-review.googlesource.com/c/website/+/524675
Reviewed-by: Ian Lance Taylor <iant@google.com>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Auto-Submit: Ian Lance Taylor <iant@google.com>
Reviewed-by: Dmitri Shuralyov <dmitshur@golang.org>
Auto-Submit: Ian Lance Taylor <iant@golang.org>
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
@dolmen
Copy link
Contributor

dolmen commented Oct 18, 2023

See also github.com/dolmen-go/endian.Native which exists since 2016 and uses code generation to keep up-to-date with GOARCH.

Also, contrary to the current flawed implementation of binary.NativeEndian my package supported and still supports comparing with == to binary.LittleEndian and binary.BigEndian (which was expected by Rob in #57237 (comment)).

lmb added a commit to lmb/ebpf that referenced this issue Mar 21, 2024
Remove the internal implementation of NativeEndian in favour of
binary.NativeEndian. There is one hiccup: it's not possible to
use a simple comparison to figure out whether NativeEndian is
the same as for example LittleEndian. The IsNativeEndian helper
takes care of this.

See golang/go#57237 (comment)

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
lmb added a commit to lmb/ebpf that referenced this issue Mar 22, 2024
Remove the internal implementation of NativeEndian in favour of
binary.NativeEndian. There is one hiccup: it's not possible to
use a simple comparison to figure out whether NativeEndian is
the same as for example LittleEndian. The IsNativeEndian helper
takes care of this.

See golang/go#57237 (comment)

Signed-off-by: Lorenz Bauer <lmb@isovalent.com>
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/581655 mentions this issue: encoding/binary: fix NativeEndian to support == with Little/BigEndian

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests