
consider new "coercion" field of export/import declarations #657

Closed
@lukewagner


Today in asm.js, if you compile
uint32_t f() { return UINT32_MAX; }
in the obvious way, you'll get
function asmModule() { 'use asm'; function f() { return -1 } return f }
which, when called, will return -1 to JS (which is != UINT32_MAX) due to the fact that asm.js hard-codes the conversion from i32 to JS number to interpret the i32 as signed.

This trips up asm.js users in practice (who would naturally expect, if they wrote UINT32_MAX in their C, that they should test against UINT32_MAX in their JS) and it will surely trip up wasm users as well. The same problem applies to arguments passed to FFI/import calls from wasm.

In asm.js, this could have been fixed by allowing functions to return either signed or unsigned integers (although there are tradeoffs), but in wasm, there is only i32. Another fix is to require the toolchain, when compiling the above C code, to provide a light JS shim which explicitly coerces to unsigned (>>>0) when the declared C return type is unsigned.
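For illustration, the shim approach could look roughly like the sketch below. The names rawF and wrapUnsignedReturn are hypothetical, and rawF is a plain JS stand-in for the compiled export rather than real asm.js or wasm output.

```javascript
// Stand-in for the compiled export: it returns the i32 bit pattern 0xFFFFFFFF,
// which JS sees as the signed number -1.
function rawF() {
  return 0xFFFFFFFF | 0;
}

const UINT32_MAX = 0xFFFFFFFF; // 4294967295

// Hypothetical toolchain-generated shim: reinterpret the signed i32 result
// as unsigned with >>> 0, for functions whose C return type is unsigned.
function wrapUnsignedReturn(fn) {
  return function (...args) {
    return fn(...args) >>> 0;
  };
}

const f = wrapUnsignedReturn(rawF);

console.log(rawF() === UINT32_MAX); // false: the raw export yields -1
console.log(f() === UINT32_MAX);    // true: the shim yields 4294967295
```

The cost of this approach is that every unsigned export (and every import taking unsigned arguments) needs its own wrapper in the JS glue code.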

I'd like to consider in this issue a better and more general fix: What if export and import statements provided a new "coercion" field which was a sequence of bytes whose interpretation was, like the other byte-sequence fields of imports/exports, left up to the host environment? When binding to JS, this is where we'd explain how to interpret integer arguments/returns. However, I think there are many other potential uses, so I think we'd want to keep the contents of this field extensible, perhaps starting with a JSON blob like {ret:'u'}. Of course, each wasm type would have a "default" coercion (what we're doing today) so this field could always be left empty.
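As a rough sketch of how a JS host might act on such a field: nothing below is a proposed format or API. The function applyCoercion, the 's' value, and the choice of strict JSON ({"ret":"u"} rather than the {ret:'u'} shorthand above) are all assumptions made for illustration.

```javascript
// Hypothetical host-side binding step: given a raw export and the bytes of its
// "coercion" field, return the function that JS callers would actually see.
function applyCoercion(rawExport, coercionBytes) {
  // An empty field means "use the default coercion", i.e. today's behavior.
  if (coercionBytes.length === 0) return rawExport;

  // Assumption: the field holds UTF-8 encoded JSON such as {"ret":"u"}.
  const desc = JSON.parse(new TextDecoder().decode(coercionBytes));

  // Pick the return-value conversion once, at bind time.
  let convert;
  if (desc.ret === undefined || desc.ret === 's') {
    convert = (x) => x | 0;   // signed i32 (the default)
  } else if (desc.ret === 'u') {
    convert = (x) => x >>> 0; // unsigned i32
  } else {
    throw new TypeError('unknown return coercion: ' + desc.ret);
  }

  return function (...args) {
    return convert(rawExport(...args));
  };
}

// Usage: the raw export returns -1; the coercion field asks for unsigned.
const coercion = new TextEncoder().encode('{"ret":"u"}');
const exported = applyCoercion(() => -1, coercion);
console.log(exported()); // 4294967295
```

Rejecting unknown values at bind time, rather than at call time, keeps a malformed field from going unnoticed until the export is first called.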

Some other potential uses of the coercion field:

  • We could allow i64 to be passed/returned from JS today using {low, high} objects and later using int64 value types, using the coercion field to specify which one.
  • By default, I think the highly-coercive ToInt32/ToNumber coercions would be applied when converting JS arguments in calls to wasm exports, but it might be nice to have a non-coercive option that threw if, e.g., the given JS value wasn't already a number. This could make more sense when we start talking about GC/reference types.
  • It's really common to want to pass/return strings from wasm linear memory to JS (that produce real JS strings). There's actually not a super-efficient way to do this in JS atm (I see Emscripten's current UTF8ArrayToString does one fromCharCode and concat per character!). A string coercion would thus be both a major usability and performance improvement.
  • If Typed Objects get standardized, it'd be useful to have an option that allows a wasm pointer (i32) return value to be coerced into a transparent Typed Object view that aliases linear memory at that offset. This could provide a much more pleasant way to poke at linear memory from JS w/o requiring a JS shim layer.
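To make the first three bullets concrete, here is a sketch of the conversions a host could perform for them. All helper names are hypothetical; BigInt stands in for the "int64 value types" mentioned above, and the string helper assumes NUL-terminated UTF-8 in linear memory, neither of which is specified in this issue.

```javascript
// i64 passed as a {low, high} pair of 32-bit halves, combined into one signed
// 64-bit value. The high half carries the sign; the low half is unsigned.
function i64FromParts({ low, high }) {
  return (BigInt(high | 0) << 32n) | BigInt(low >>> 0);
}

// Non-coercive argument conversion: throw unless the value is already a
// number, instead of silently applying ToInt32 to strings, objects, etc.
function strictI32Arg(value) {
  if (typeof value !== 'number') {
    throw new TypeError('expected a number, got ' + typeof value);
  }
  return value | 0;
}

// String coercion: treat an i32 result as a pointer into linear memory and
// decode the bytes up to the first NUL in a single call, rather than one
// fromCharCode and concat per character.
function readUtf8String(heap, ptr) {
  let end = ptr;
  while (end < heap.length && heap[end] !== 0) end++;
  return new TextDecoder('utf-8').decode(heap.subarray(ptr, end));
}

// Usage with a small stand-in for linear memory.
const heap = new Uint8Array(16);
heap.set(new TextEncoder().encode('héllo'), 4); // the following byte stays 0

console.log(i64FromParts({ low: 1, high: 1 }));                   // 4294967297n
console.log(i64FromParts({ low: 0xFFFFFFFF, high: 0xFFFFFFFF })); // -1n
console.log(readUtf8String(heap, 4));                             // "héllo"
```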

Taken together, this set of features could make it a lot more pleasant to use wasm from JS without requiring a JS shim.
