-
Notifications
You must be signed in to change notification settings - Fork 1.5k
cranelift: Fix big-endian regression in data_value.rs #3329
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
PR bytecodealliance#3187 introduced a change to the write_to_slice and read_from_slice routines in data_value.rs that switched byte order on big-endian systems: the code used to use native byte order, and now hard-codes little-endian byte order. Fix by using native byte order again.
|
This makes these functions behave differently depending on the endianness. Should they instead get an endianness arg? I presume this will break interpreting clif for a little endian target on an s390x system. |
Interesting. I had only looked at the use from the filetest function runner via write_value_to/read_value_from. Those clearly need to use the current host's native byte order. For the interpreter, given that this is emulating the cranelift IR load/store operations, it should really respect the explicit byte order encoded in the memflags of those operations, if any, and fall back to the default byte order of the target ISA otherwise. This would indeed argue for an endianness arg. On the other hand, I'm not sure whether the interpreter will work correctly in cross-endian mode even then, I suspect there may be other problems hidden. Is the interpreter currently being tested by any CI test cases? I haven't noticed any failures ... |
|
Looking at the interpreter code a bit more, it currently doesn't even seem to have any notion of what the target ISA is? This may require more significant changes to get correct. I'd prefer to get this patch in for now to get back the old behavior of write_value_to/read_value_from and fix the currently existing regression in filetest. Getting the interpreter to work correctly in cross-endian mode can then be done separately. |
Yes, but I don't think there are any byte order tests for the interpreter. |
|
No, the interpreter currently only does native endianness, and I wouldn't be surprised if we assume little endian somewhere in there. We've only recently added support for memory loads/stores in #3187 and #3302 but they don't deal with endianness yet. Since the interpreter is being brought up as a fuzzing oracle, adding a cross-endian mode would be nice, but I agree with @uweigand that we should do that as a separate PR. I've added this to our tracking list for fuzzing in #3050. |
|
@afonso360 here's a slightly wacky idea that might help us ensure interpreter correctness: could we augment the interpreter's heap state to track a "native endian" sort of taint on certain address ranges, and only allow those addresses to be loaded with the equivalent native-endian load? My thought that started this is: we want the interpreter to be completely deterministic, and generally we want CLIF to be completely deterministic. We allowed for "native endian" loads and stores in order to allow frontends to generate efficient CLIF without knowing the endianness of the target. However, it is undefined behavior (at the CLIF level) -- or at least it should be -- to store a native-endian value then actually observe its endianness, i.e., use an explicit-endianness load or load individual bytes or partial words. So the thought is: let's enforce that in the interpreter. This would provide additional fuzzing coverage for any endianness bugs as well, which seems important to me as long as we care about big-endian platforms (and we do!). Thoughts? |
I don't agree with that. IMO using the default endianness should mean the native endian of the target, not an undefined endianness. This is how I interpreted it in cg_clif. I don't ever use explicit endianness markers as I always need the native endianness of the target. |
|
To clarify, the above did not mean that the store would use "an undefined endianness". Rather, I was proposing that we declare that, as a principle, CLIF whose result is endian-dependent should be disallowed (or more precisely, specific-endian loads/stores should not interact with native-endian loads/stores). One could see this as following from the principle that CLIF is platform-independent and its meaning is fixed and deterministic, regardless of the underlying platform; if we don't have that property, then building an interpreter suddenly becomes more difficult. (The open question is what happens when native-endian loads/stores of different widths interact, in the interpreter; @afonso360 thoughts?) If cg_clif only uses native-endian loads/stores, then I think it satisfies the above constraint trivially (native-endian loads/stores will never interact with specific-endian loads/stores if the latter are never used). |
I'm not sure we can get away that easily, even with native only loads/stores we can easily determine the target endianness with loads and stores of different sizes (i.e. This means that pretty much every non trivial CLIF file out there is undefined now that we added the big endian possibility.
My inital plan was to add endianness to the interpreter. When loading or storing to an address, if the endianness of the interpreter is different from the host we I don't know if this fully solves the issue (this was what I first though about when thinking about adding this), but I like this over the alternative of just doing whatever the native arch does, since it means we can emulate BE targets. It shouldn't be too hard to implement if this is all we have to do, all memory accesses go through the same interface, so we only really need to update one place. |
|
Hmm, this is a bit of a tricky design problem... At a high level, I think the "CLIF is platform independent" property is really valuable. It just seems like a continuing source of endianness bugs and confusion if it's possible to write a CLIF file whose results according to the interpreter are different when running on two different systems. That's the sort of base invariant that we want to think really hard about before compromising, because otherwise the implications percolate upward to other issues. The visibility of endianness can arise in two ways:
To define a single expected interpreter result, we have to either disallow both of the above (native loads/stores exist but only ever interact with other native loads/stores of the same width), or remove native loads/stores. I think we've sort of backed ourselves into a corner here. By including native-endian loads/stores, essentially we're making CLIF parametric on endianness: its behavior can depend on the endianness of the platform it's compiled on. The goal of this is to enable a sort of "late binding", where CLIF producers can generate code without needing to know the target's endianness. The alternatives are this either: (i) disallow any CLIF that observes endianness, which I don't think is possible as long as we want to be a general target, or (ii) bind the endianness earlier, i.e. as part of the builder, with some target knowledge. Then a "native" load/store takes its endianness at build time, and all CLIF has explicit endianness. Possibly relevant is that early binding of all sorts of platform properties is the norm in other compilers, AFAIK; e.g. LLVM has the data layout that specifies, among other things, endianness, and is part of the IR. I know that that's a reversal from the endianness discussion we had last year but we haven't really been forced to define the single, platform-independent expected result of CLIF until we had a platform-independent interpreter. Now that we do, the convenience we gained with the "native" ops as shorthand for the platform endianness may be worth less than having well-defined behavior across platforms. Thoughts? |
|
But also, I think we should be mindful of the fact that s390x has a regression right now that this PR fixes, and all of my thoughts above are out-of-scope for this PR, so I think I'm inclined to merge this now and fix the regression, and then open a new issue to discuss further! |
PR #3187 introduced a
change to the write_to_slice and read_from_slice routines in
data_value.rs that switched byte order on big-endian systems:
the code used to use native byte order, and now hard-codes
little-endian byte order.
Fix by using native byte order again.