Skip to content

Conversation

@kripken
Copy link
Member

@kripken kripken commented Mar 4, 2024

This is fallout from #6310 where we moved to use fuzz_shell.js for all fuzzing
purposes. That script doesn't know wasm types, all it has on the JS side is the
number of arguments to a function, and it passes in null for them all
regardless of their type. That normally works fine - null is cast to the right
type upon use - but in wasm2js optimized builds we can remove casts,
which can make that noticeable. Luckily the fixup is very simple.

@kripken kripken requested a review from tlively March 4, 2024 20:32
Comment on lines +1046 to +1059
# replace null with 0. the fuzzing harness passes in nulls instead
# the specific type of a parameter (since null can be cast to
# anything without issue, and all fuzz_shell.js knows on the JS side
# is the number of parameters), which can be noticeable in a
# situation where we optimize and remove casts, like here:
#
# function foo(x) { return x | 0; }
#
# When optimizing we can remove that | 0, which is valid if the
# input is valid, but as we said, the fuzz harness passes in a value
# of the wrong type - which would be cast on use, but if we remove
# the casts, we end up returning null here and not 0, which the
# fuzzer can notice.
x = re.sub(r' null', ' 0', x)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In #6310 (comment), you said null can be cast to 0 but not vice versa so we couldn't use null. Has that changed? Or am I confused and talking about two different things?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The situation there was that null can be cast to 0 but not vice versa, as you said, but the result is that we must use null. That is, if we call

(func $func (export "func")  (param $anyref anyref) (param $i32 i32)

then calling it from JS with

exports.func(null, null);

works because the JS VM will convert the second param from null to 0. I think that works because the wasm JS API spec does a "to number" conversion for i32, and in JS, Number(null) is 0 (for whatever historical reason). But

exports.func(0, 0);

would trap because 0 is a number and (for whatever reason, I'm not sure) the JS spec doesn't allow that to convert to an anyref.

So passing in null for all params works out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then why does passing null here not work out? Does JS don't do the historic transformation here? What's the difference between this situation and that one?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing null works out if there are casts, since the cast is what fixes it up to 0 as needed. But wasm2js can remove such casts in optimized builds.

That never happens in non-wasm2js builds as VMs never remove such casts, so in a sense wasm2js has more flexibility than VMs there. Perhaps another solution here is for wasm2js to also not remove such casts, but it is useful for code size in release builds... that seems like the right tradeoff to me (we don't do it in debug builds of course).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Passing null works out if there are casts, since the cast is what fixes it up to 0 as needed. But wasm2js can remove such casts in optimized builds.

I thought JS somehow converted it from null to 0 internally or something after reading your comment, but was that actually done by us explicitly casting with | 0?

That never happens in non-wasm2js builds as VMs never remove such casts, so in a sense wasm2js has more flexibility than VMs there. Perhaps another solution here is for wasm2js to also not remove such casts, but it is useful for code size in release builds... that seems like the right tradeoff to me (we don't do it in debug builds of course).

If we replace all null with 0, is that OK to replace a real null with 0? You said passing 0 unconditionally doesn't work because there are places we have to supply nulls like references.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought JS somehow converted it from null to 0 internally or something after reading your comment, but was that actually done by us explicitly casting with | 0?

The VM does it automatically and we do it manually to simulate that, yes, with | 0. To make sure we're on the same page, in more detail, when we test in a VM, something like this happens:

m = new WebAssembly.Module(...);
i = new WebAssembly.Instance(m, ..);
i.exports.foo(null);

When we call foo, the VM sees what types are expected. For an i32, the spec says the VM must do "to integer", so it does the JS spec operation "to integer" on null, which turns it into a 0.

On the other hand when we test in wasm2js, all we have is this:

m = (function() {
  function foo(x) {
    return x | 0;
  }
  return {
    exports: {
      "foo": foo
    }
  }
});
i = m();
i.exports.foo(null);

The wasm module is implemented in JS in a way similar to that. So when we call foo we are really just calling that JS function. JS doesn't do any casting or anything else by itself, so wasm2js emitted | 0 to convert the input to a number manually. That cast mimics what the VM would do for an actual wasm module.

If we replace all null with 0, is that OK to replace a real null with 0? You said passing 0 unconditionally doesn't work because there are places we have to supply nulls like references.

Correct - it would be bad to replace a null with a 0. This PR doesn't do that. Maybe I misunderstood your question though?

Copy link
Member

@aheejin aheejin Mar 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought JS somehow converted it from null to 0 internally or something after reading your comment, but was that actually done by us explicitly casting with | 0?

The VM does it automatically and we do it manually to simulate that, yes, with | 0. To make sure we're on the same page, in more detail, when we test in a VM, something like this happens:

Wasm2js also runs on the same VM. Then why doesn't the VM do the same thing for wasm2js and we have to simulate it for the VM?

m = new WebAssembly.Module(...);
i = new WebAssembly.Instance(m, ..);
i.exports.foo(null);

When we call foo, the VM sees what types are expected. For an i32, the spec says the VM must do "to integer", so it does the JS spec operation "to integer" on null, which turns it into a 0.

On the other hand when we test in wasm2js, all we have is this:

m = (function() {
  function foo(x) {
    return x | 0;
  }
  return {
    exports: {
      "foo": foo
    }
  }
});
i = m();
i.exports.foo(null);

The wasm module is implemented in JS in a way similar to that. So when we call foo we are really just calling that JS function. JS doesn't do any casting or anything else by itself, so wasm2js emitted | 0 to convert the input to a number manually. That cast mimics what the VM would do for an actual wasm module.

If we replace all null with 0, is that OK to replace a real null with 0? You said passing 0 unconditionally doesn't work because there are places we have to supply nulls like references.

Correct - it would be bad to replace a null with a 0. This PR doesn't do that. Maybe I misunderstood your question though?

The first line in the comment says

            # replace null with 0.

Also the code is

            x = re.sub(r' null', ' 0', x)

Isn't this replacing a null with a 0?

I am very much sure I'm confused about something; I am mostly just trying to understand. Sorry for making you writing a ton for a very small change.. 🥲

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasm2js also runs on the same VM. Then why doesn't the VM do the same thing for wasm2js and we have to simulate it for the VM?

One is running as wasm, and going through the JS/wasm APIs. The other is pure JavaScript. Specifically, in the first case we have

m = new WebAssembly.Module(...);
i = new WebAssembly.Instance(m, ..);
i.exports.foo(null);

and when we call i.exports.foo then we are calling a thing of type WebAssembly.Function. That is an object that the spec defines as applying conversions on each parameter, according to the type of the wasm being called. In comparison, in the JS case we have (effectively)

m.exports.foo = function foo(x) {
  return x;
};
i.exports.foo(null);

There is no WebAssembly.Function. Instead, when we call i.exports.foo we are calling a plain JavaScript function foo. The JavaScript spec does not do any conversions here - in fact it can't, as it doesn't know what to convert to - there is no type declared for the parameter x of JS function foo there. x will be of exactly the value that is provided, with no casting, the way JS normally works.

So overall this is a weird corner case of wasm2js implementing wasm in JS: we need to copy the casts that the VM does.

x = re.sub(r' null', ' 0', x)
Isn't this replacing a null with a 0?

Ah, yes, it is - sorry, this is indeed confusing, and I probably mixed them up before. This replaces it in the output, so they end up compared equally (when we compare outputs). But for the input, we must pass in null because of the JS casting issue (the VM will cast null to 0 for an i32 param without complaint, but not vice versa, sadly).

Copy link
Member

@aheejin aheejin Mar 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK.. What I was confused about was the substitution is done in the input vs. output of VM run. What I was talking about was the input and what you substituted was an output. It says fix_output, but I thought this was not an output of the VM run but the output of wasm2js itself, which emits a JS file. So I was assuming we were modifying an input to the VM.

Anyway thanks for the long conversation and sorry for the confusion.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries, yeah, looking back the confusion was about the input vs the output, I should have made that clearer in the PR explanation...

Comment on lines +1046 to +1059
# replace null with 0. the fuzzing harness passes in nulls instead
# the specific type of a parameter (since null can be cast to
# anything without issue, and all fuzz_shell.js knows on the JS side
# is the number of parameters), which can be noticeable in a
# situation where we optimize and remove casts, like here:
#
# function foo(x) { return x | 0; }
#
# When optimizing we can remove that | 0, which is valid if the
# input is valid, but as we said, the fuzz harness passes in a value
# of the wrong type - which would be cast on use, but if we remove
# the casts, we end up returning null here and not 0, which the
# fuzzer can notice.
x = re.sub(r' null', ' 0', x)
Copy link
Member

@aheejin aheejin Mar 8, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah OK.. What I was confused about was the substitution is done in the input vs. output of VM run. What I was talking about was the input and what you substituted was an output. It says fix_output, but I thought this was not an output of the VM run but the output of wasm2js itself, which emits a JS file. So I was assuming we were modifying an input to the VM.

Anyway thanks for the long conversation and sorry for the confusion.

@kripken kripken merged commit b671b6c into WebAssembly:main Mar 8, 2024
@kripken kripken deleted the wasm2js.fuzz.null branch March 8, 2024 23:19
@gkdn gkdn mentioned this pull request Aug 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants