Pacify Valgrind (by fixing a somewhat theoretical bug). #45
I'm deliberately filing this here rather than in the upstream repo, to hopefully get it some eyeballs and possibly an explanation of why I'm wrong.
I think Valgrind just found a bug in JSC. JSC contains/generates this code:
The intention of the first five instructions is to change the stack pointer to a known value. However, it does so by first moving the base pointer to the stack pointer, then subtracting the new reservation from the stack pointer:
If a signal is delivered between those two instructions, the signal frame is built below the current stack pointer, which at that moment equals the base pointer. The handler therefore treats the stack space below the base pointer as unused and overwrites it. This is a highly theoretical possibility, since the window is a single instruction boundary, but Valgrind caught it nonetheless and assumed, correctly, that the values in the intersection of the old and the new stack reservation could no longer be relied upon.
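To make the window concrete, here is a toy Python model of the hazard. It is only a sketch, not JSC output: the slot numbers, the 8-slot handler frame, and the `run` helper are all made up for illustration, and the model deliberately ignores the red zone.

```python
# Toy model (hypothetical numbers): a downward-growing stack where a signal
# handler may overwrite anything directly below the current stack pointer.
STACK_SIZE = 64
HANDLER_FRAME = 8  # assumed size of the signal frame, in slots


def run(signal_between_mov_and_sub):
    stack = ["live"] * STACK_SIZE  # every slot holds data we still need
    rbp = 48                       # base pointer
    rsp = 16                       # old stack pointer: old frame is [16, 48)
    reservation = 24               # new frame is [24, 48)

    def deliver_signal(sp):
        # The signal frame is placed directly below sp (no red zone modeled).
        for i in range(max(0, sp - HANDLER_FRAME), sp):
            stack[i] = "clobbered"

    # mov rsp, rbp
    rsp = rbp
    if signal_between_mov_and_sub:
        deliver_signal(rsp)
    # sub rsp, reservation
    rsp -= reservation

    # Return the final stack pointer and the contents of the new frame.
    return rsp, stack[rsp:rbp]
```

Calling `run(False)` leaves the frame intact, while `run(True)` returns a frame whose top slots were overwritten even though they lie inside both the old and the new reservation.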
So how could this happen, given that for an assembly programmer, this is a beginner's mistake? The problem here is that JSC uses its own language which is meant to be something like machine-independent assembler code. The source code in this case is:
(t0 is a temporary register, cfr is the base register, sp is the stack pointer, and the "p" prefix marks this as a pointer instruction)
That looks okay, but pre-APX x86 doesn't have a ternary subtraction instruction! It's translated into the unsafe mov + sub sequence above by Ruby code, which assumes it's performing arithmetic on GP registers which can safely hold temporary intermediate values.
I've fixed this locally by rewriting the "subp" instruction into a safe mov-sub sequence, which in turn gets translated into:
Crisis averted, and after a few Bun fixes, Valgrind no longer complains about uninitialized variables.
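The difference between the two lowerings can be sketched in Python as well. This is an illustrative model under stated assumptions, not the actual offlineasm Ruby code: `lower_sub`, `sp_trace`, and the scratch register `t1` are hypothetical names, and the real fix may pick its registers differently.

```python
def lower_sub(dst, lhs, rhs, scratch=None):
    """Lower three-operand `dst = lhs - rhs` into two-operand (op, dst, src) steps."""
    if scratch is None:
        # Naive lowering: dst briefly holds lhs before the subtraction.
        return [("mov", dst, lhs), ("sub", dst, rhs)]
    # Safe lowering: compute in a scratch register, then write dst once.
    return [("mov", scratch, lhs), ("sub", scratch, rhs), ("mov", dst, scratch)]


def sp_trace(instrs, regs):
    """Run the steps on a copy of regs and record sp after each instruction."""
    regs = dict(regs)
    trace = []
    for op, dst, src in instrs:
        if op == "mov":
            regs[dst] = regs[src]
        elif op == "sub":
            regs[dst] -= regs[src]
        trace.append(regs["sp"])
    return trace


# Hypothetical register contents: cfr is the base pointer, t0 the frame size.
REGS = {"cfr": 48, "sp": 16, "t0": 24, "t1": 0}
naive = sp_trace(lower_sub("sp", "cfr", "t0"), REGS)
safe = sp_trace(lower_sub("sp", "cfr", "t0", scratch="t1"), REGS)
```

The property to check is that `sp` never rises above its final value at any instruction boundary. The naive trace violates it for one step; the scratch-register trace does not.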
Again, this is a highly unlikely race condition to hit: it only affects x86_64 for us, it requires a signal to be delivered at precisely the wrong moment, and on Linux there's still the red zone, which might save us. It might be possible to hit in qemu, though.
It's worth pointing out that hardware interrupts usually do not cause userspace signals to be delivered, and Bun receives relatively few signals in any case, so the focus here should be on fixing JSC so we can get clean Valgrind runs.