Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pacify Valgrind (by fixing a somewhat theoretical bug). #45

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

argosphil
Copy link

I'm deliberately filing this here rather than in the upstream repo, to hopefully get it some eyeballs and possibly an explanation of why I'm wrong.

I think Valgrind just found a bug in JSC. JSC contains/generates this code:

Dump of assembler code for function op_call_slow_return_location:
   0x0000000008b2c2ad <+0>:	mov    0x10(%rbp),%rdx
   0x0000000008b2c2b1 <+4>:	mov    0x14(%rdx),%edx
   0x0000000008b2c2b4 <+7>:	shl    $0x3,%rdx
   0x0000000008b2c2b8 <+11>:	mov    %rbp,%rsp
   0x0000000008b2c2bb <+14>:	sub    %rdx,%rsp
   0x0000000008b2c2be <+17>:	mov    0x24(%rbp),%r8d
   0x0000000008b2c2c2 <+21>:	movsbq 0x1(%r13,%r8,1),%rsi
   0x0000000008b2c2c8 <+27>:	mov    %rax,0x0(%rbp,%rsi,8)
   0x0000000008b2c2cd <+32>:	movzbl 0x5(%r13,%r8,1),%edx
   0x0000000008b2c2d3 <+38>:	imul   $0xfffffffffffffff0,%rdx,%rdx
   0x0000000008b2c2d7 <+42>:	mov    %rax,-0x10(%r12,%rdx,1)
   0x0000000008b2c2dc <+47>:	add    $0x7,%r8
   0x0000000008b2c2e0 <+51>:	movzbl 0x0(%r13,%r8,1),%eax
   0x0000000008b2c2e6 <+57>:	lea    0x2396ab3(%rip),%rsi        # 0xaec2da0 <g_opcodeMap>
   0x0000000008b2c2ed <+64>:	jmp    *(%rsi,%rax,8)

The intention of the first five instructions is to change the stack pointer to a known value. However, it does so by first moving the base pointer to the stack pointer, then subtracting the new reservation from the stack pointer:

   0x0000000008b2c2b8 <+11>:	mov    %rbp,%rsp
   0x0000000008b2c2bb <+14>:	sub    %rdx,%rsp

If a signal happens between those two instructions, the signal handler will wrongly assume that the stack space below the base pointer is no longer used, and overwrite the stack. This is a highly theoretical possibility, since arithmetic operations are unlikely to be interrupted, but Valgrind caught it nonetheless and assumed, correctly, that the values in the intersection of the old and the new stack reservation could no longer be relied upon.

So how could this happen, given that for an assembly programmer, this is a beginner's mistake? The problem here is that JSC uses its own language which is meant to be something like machine-independent assembler code. The source code in this case is:

    subp cfr, t0, sp

(t0 is a temporary register, cfr is the base register, sp is the stack pointer, and the "p" prefix marks this as a pointer instruction)

That looks okay, but pre-APX x86 doesn't have a ternary subtraction instruction! It's translated into the unsafe mov + sub sequence above by Ruby code, which assumes it's performing arithmetic on GP registers which can safely hold temporary intermediate values.

I've fixed this locally by rewriting the "subp" instruction into a safe mov-sub sequence, which in turn gets translated into:

   0x0000000008b2c318 <+11>:	neg    %rdx
   0x0000000008b2c31b <+14>:	add    %rbp,%rdx
   0x0000000008b2c31e <+17>:	mov    %rdx,%rsp

Crisis averted, and after a few Bun fixes, Valgrind no longer complains about uninitialized variables.

Again, this is a highly unlikely race condition to hit, it only affects x86_64 for us, requires a signal to be delivered at precisely the wrong moment, and on Linux there's still the Red Zone which might save us. It might be possible to hit in qemu, though.

It's worth pointing out that hardware interrupts usually do not cause userspace signals to be delivered, and bun receives relatively few signals in any case, so the focus here should be on fixing JSC so we can get clean Valgrind runs.

@argosphil
Copy link
Author

Here's the WebKit issue, which I decided to file after all:

https://bugs.webkit.org/show_bug.cgi?id=270415

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant