Skip to content

Proposal for new Binaryen IR type system #903

Closed
@dschuff

Description

@dschuff

Background: previous general discussion about Binaryen and wasm stack machine is at #663. It turns out Since Binary 0xc, Binaryen's type system has been subtly wrong with respect to the spec, and both V8 and SpiderMonkey were not spec compliant in that way and so nobody noticed. More concretely, @kripken tried to fix that in #899 and that led to some more general discussion about what to do. Probably that discussion should be in an issue and not in a PR, so here's an issue.

@jgravelle-google and I have been discussing it, and we have an idea which essentially carries the discussion in #899 to an end which we think we like, but maybe you can poke holes in it.

Also to recap, the goals are:

  1. Valid wasm is accepted by and convertable to Binaryen IR (but not necessarily losslessly)
  2. Valid Binaryen is convertible to valid wasm
  3. Transforms on Binaryen IR should be context-free rather that context-sensitive (and as a special case, a valid node should be replaceable by another node of the same type and the IR should still be type-valid).

In short, we think we can allow blocks (and everything else in Binaryen IR) to have unreachable type, and basically freeze those types into block signatures (i.e. reify them) when converting to wasm proper. Here are the Binaryen IR properties:

  1. Validation/type check (of nodes other than br) is context-free; in other words a node is valid based on its type and it contents only (the consequence of this is that transforms are also context-free).
  2. Any IR node can have type unreachable.
  3. A node (any kind of node with any type) is valid if:
    a. all operand(s) (including those that follow unreachable nodes in blocks) are recursively valid, and
    b. all operands type-check according to opcode-specific rules (unreachable always type-checks)
    c. A node with unreachable type must have at least one unreachable operand
  4. unreachable nodes may be reified based on their context (see paragraph on conversion to wasm below), and the result remains valid.
  5. br nodes are valid if their operand type matches the type of the target (brs targeting unreachable-typed nodes are invalid)
  6. (special rules for other node kinds?)

On conversion to wasm, types of all block/loop/if must be inferred/reified to concrete types. Inference always happens from context (i.e. outside in) and not from e.g. inside-out. In other words (and in contrast to current Binaryen), block types are not inferred based on their last element, but based on their context; e.g. if they are the operand of a return or are implicitly fall-through-returned, then they are converted to the return type of the function. This happens for every block we emit; we check the context and if it requires a particular type, we emit that type; otherwise it can be void. When this happens, we do not need to reify anything inside (rule 3).

There are a few non-intuitive consequences of these rules.
For example
(i32.add (f32.const 0) (br 0))) is valid (and can have type unreachable) because of rule 3b.

(block f32
 (i32.add 
   (i32.const 1)
   (br 0 (f32.const 0))
 )
)

is valid and the i32.add must have type unreachable. It does not get reified.

(block ;; [unreachable]
 (i32.add 
   (i32.const 1)
   (br 1 (f32.const 0)) ;; targets outer block
 )
)

If the block is reified to i32 then the add can stay unreachable, or have type i32. If the block is reified to something else then it's still valid (it's the first example again). In this nesting, it never needs to take a type (and it's beneficial).

I think this also means that it's a valid transform to take a node with an unreachable child and make it unreachable.
I think this also has the optimization properties that we want, and that conversion from wasm is workable but we haven't thought about all the permutations yet, so... yeah, comments and we can see if it all holds up.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions