Fuzzer: Fix up null outputs in wasm2js optimized builds #6374

kripken · 2024-03-04T20:32:21Z

This is fallout from #6310 where we moved to use fuzz_shell.js for all fuzzing
purposes. That script doesn't know wasm types, all it has on the JS side is the
number of arguments to a function, and it passes in null for them all
regardless of their type. That normally works fine - null is cast to the right
type upon use - but in wasm2js optimized builds we can remove casts,
which can make that noticeable. Luckily the fixup is very simple.

aheejin · 2024-03-06T02:06:28Z

scripts/fuzz_opt.py

+            # replace null with 0. the fuzzing harness passes in nulls instead
+            # the specific type of a parameter (since null can be cast to
+            # anything without issue, and all fuzz_shell.js knows on the JS side
+            # is the number of parameters), which can be noticeable in a
+            # situation where we optimize and remove casts, like here:
+            #
+            # function foo(x) { return x | 0; }
+            #
+            # When optimizing we can remove that | 0, which is valid if the
+            # input is valid, but as we said, the fuzz harness passes in a value
+            # of the wrong type - which would be cast on use, but if we remove
+            # the casts, we end up returning null here and not 0, which the
+            # fuzzer can notice.
+            x = re.sub(r' null', ' 0', x)


In #6310 (comment), you said null can be cast to 0 but not vice versa so we couldn't use null. Has that changed? Or am I confused and talking about two different things?

The situation there was that null can be cast to 0 but not vice versa, as you said, but the result is that we must use null. That is, if we call

(func $func (export "func") (param $anyref anyref) (param $i32 i32)

then calling it from JS with

exports.func(null, null);

works because the JS VM will convert the second param from null to 0. I think that works because the wasm JS API spec does a "to number" conversion for i32, and in JS, Number(null) is 0 (for whatever historical reason). But

exports.func(0, 0);

would trap because 0 is a number and (for whatever reason, I'm not sure) the JS spec doesn't allow that to convert to an anyref.

So passing in null for all params works out.

Then why does passing null here not work out? Does JS don't do the historic transformation here? What's the difference between this situation and that one?

Passing null works out if there are casts, since the cast is what fixes it up to 0 as needed. But wasm2js can remove such casts in optimized builds.

That never happens in non-wasm2js builds as VMs never remove such casts, so in a sense wasm2js has more flexibility than VMs there. Perhaps another solution here is for wasm2js to also not remove such casts, but it is useful for code size in release builds... that seems like the right tradeoff to me (we don't do it in debug builds of course).

Passing null works out if there are casts, since the cast is what fixes it up to 0 as needed. But wasm2js can remove such casts in optimized builds.

I thought JS somehow converted it from null to 0 internally or something after reading your comment, but was that actually done by us explicitly casting with | 0?

That never happens in non-wasm2js builds as VMs never remove such casts, so in a sense wasm2js has more flexibility than VMs there. Perhaps another solution here is for wasm2js to also not remove such casts, but it is useful for code size in release builds... that seems like the right tradeoff to me (we don't do it in debug builds of course).

If we replace all null with 0, is that OK to replace a real null with 0? You said passing 0 unconditionally doesn't work because there are places we have to supply nulls like references.

I thought JS somehow converted it from null to 0 internally or something after reading your comment, but was that actually done by us explicitly casting with | 0?

The VM does it automatically and we do it manually to simulate that, yes, with | 0. To make sure we're on the same page, in more detail, when we test in a VM, something like this happens:

m = new WebAssembly.Module(...); i = new WebAssembly.Instance(m, ..); i.exports.foo(null);

When we call foo, the VM sees what types are expected. For an i32, the spec says the VM must do "to integer", so it does the JS spec operation "to integer" on null, which turns it into a 0.

On the other hand when we test in wasm2js, all we have is this:

m = (function() { function foo(x) { return x | 0; } return { exports: { "foo": foo } } }); i = m(); i.exports.foo(null);

The wasm module is implemented in JS in a way similar to that. So when we call foo we are really just calling that JS function. JS doesn't do any casting or anything else by itself, so wasm2js emitted | 0 to convert the input to a number manually. That cast mimics what the VM would do for an actual wasm module.

If we replace all null with 0, is that OK to replace a real null with 0? You said passing 0 unconditionally doesn't work because there are places we have to supply nulls like references.

Correct - it would be bad to replace a null with a 0. This PR doesn't do that. Maybe I misunderstood your question though?

I thought JS somehow converted it from null to 0 internally or something after reading your comment, but was that actually done by us explicitly casting with | 0?

The VM does it automatically and we do it manually to simulate that, yes, with | 0. To make sure we're on the same page, in more detail, when we test in a VM, something like this happens:

Wasm2js also runs on the same VM. Then why doesn't the VM do the same thing for wasm2js and we have to simulate it for the VM?

m = new WebAssembly.Module(...); i = new WebAssembly.Instance(m, ..); i.exports.foo(null);

When we call foo, the VM sees what types are expected. For an i32, the spec says the VM must do "to integer", so it does the JS spec operation "to integer" on null, which turns it into a 0.

On the other hand when we test in wasm2js, all we have is this:

m = (function() { function foo(x) { return x | 0; } return { exports: { "foo": foo } } }); i = m(); i.exports.foo(null);

The wasm module is implemented in JS in a way similar to that. So when we call foo we are really just calling that JS function. JS doesn't do any casting or anything else by itself, so wasm2js emitted | 0 to convert the input to a number manually. That cast mimics what the VM would do for an actual wasm module.

If we replace all null with 0, is that OK to replace a real null with 0? You said passing 0 unconditionally doesn't work because there are places we have to supply nulls like references.

Correct - it would be bad to replace a null with a 0. This PR doesn't do that. Maybe I misunderstood your question though?

The first line in the comment says

# replace null with 0.

Also the code is

x = re.sub(r' null', ' 0', x)

Isn't this replacing a null with a 0?

I am very much sure I'm confused about something; I am mostly just trying to understand. Sorry for making you writing a ton for a very small change.. 🥲

Wasm2js also runs on the same VM. Then why doesn't the VM do the same thing for wasm2js and we have to simulate it for the VM?

One is running as wasm, and going through the JS/wasm APIs. The other is pure JavaScript. Specifically, in the first case we have

m = new WebAssembly.Module(...); i = new WebAssembly.Instance(m, ..); i.exports.foo(null);

and when we call i.exports.foo then we are calling a thing of type WebAssembly.Function. That is an object that the spec defines as applying conversions on each parameter, according to the type of the wasm being called. In comparison, in the JS case we have (effectively)

m.exports.foo = function foo(x) { return x; }; i.exports.foo(null);

There is no WebAssembly.Function. Instead, when we call i.exports.foo we are calling a plain JavaScript function foo. The JavaScript spec does not do any conversions here - in fact it can't, as it doesn't know what to convert to - there is no type declared for the parameter x of JS function foo there. x will be of exactly the value that is provided, with no casting, the way JS normally works.

So overall this is a weird corner case of wasm2js implementing wasm in JS: we need to copy the casts that the VM does.

x = re.sub(r' null', ' 0', x)
Isn't this replacing a null with a 0?

Ah, yes, it is - sorry, this is indeed confusing, and I probably mixed them up before. This replaces it in the output, so they end up compared equally (when we compare outputs). But for the input, we must pass in null because of the JS casting issue (the VM will cast null to 0 for an i32 param without complaint, but not vice versa, sadly).

Ah OK.. What I was confused about was the substitution is done in the input vs. output of VM run. What I was talking about was the input and what you substituted was an output. It says fix_output, but I thought this was not an output of the VM run but the output of wasm2js itself, which emits a JS file. So I was assuming we were modifying an input to the VM.

Anyway thanks for the long conversation and sorry for the confusion.

No worries, yeah, looking back the confusion was about the input vs the output, I should have made that clearer in the PR explanation...

aheejin · 2024-03-08T22:33:14Z

scripts/fuzz_opt.py

+            # replace null with 0. the fuzzing harness passes in nulls instead
+            # the specific type of a parameter (since null can be cast to
+            # anything without issue, and all fuzz_shell.js knows on the JS side
+            # is the number of parameters), which can be noticeable in a
+            # situation where we optimize and remove casts, like here:
+            #
+            # function foo(x) { return x | 0; }
+            #
+            # When optimizing we can remove that | 0, which is valid if the
+            # input is valid, but as we said, the fuzz harness passes in a value
+            # of the wrong type - which would be cast on use, but if we remove
+            # the casts, we end up returning null here and not 0, which the
+            # fuzzer can notice.
+            x = re.sub(r' null', ' 0', x)


Ah OK.. What I was confused about was the substitution is done in the input vs. output of VM run. What I was talking about was the input and what you substituted was an output. It says fix_output, but I thought this was not an output of the VM run but the output of wasm2js itself, which emits a JS file. So I was assuming we were modifying an input to the VM.

Anyway thanks for the long conversation and sorry for the confusion.

fix

8511a71

kripken requested a review from tlively March 4, 2024 20:32

aheejin reviewed Mar 6, 2024

View reviewed changes

aheejin approved these changes Mar 8, 2024

View reviewed changes

kripken merged commit b671b6c into WebAssembly:main Mar 8, 2024

kripken deleted the wasm2js.fuzz.null branch March 8, 2024 23:19

gkdn mentioned this pull request Aug 31, 2024

stringconsts gkdn/binaryen#1

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Fuzzer: Fix up null outputs in wasm2js optimized builds #6374

Fuzzer: Fix up null outputs in wasm2js optimized builds #6374

Uh oh!

kripken commented Mar 4, 2024

Uh oh!

aheejin Mar 6, 2024

Uh oh!

kripken Mar 6, 2024

Uh oh!

aheejin Mar 7, 2024

Uh oh!

kripken Mar 7, 2024

Uh oh!

aheejin Mar 7, 2024

Uh oh!

kripken Mar 7, 2024

Uh oh!

aheejin Mar 8, 2024 •

edited

Loading

Uh oh!

kripken Mar 8, 2024

Uh oh!

aheejin Mar 8, 2024 •

edited

Loading

Uh oh!

kripken Mar 8, 2024

Uh oh!

aheejin Mar 8, 2024 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Fuzzer: Fix up null outputs in wasm2js optimized builds #6374

Fuzzer: Fix up null outputs in wasm2js optimized builds #6374

Uh oh!

Conversation

kripken commented Mar 4, 2024

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

aheejin Mar 8, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

aheejin Mar 8, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

aheejin Mar 8, 2024 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

aheejin Mar 8, 2024 •

edited

Loading

aheejin Mar 8, 2024 •

edited

Loading

aheejin Mar 8, 2024 •

edited

Loading