Proposal on the spec changes

# Proposal on the Spec Changes

I would like to propose some changes to the current proposal.


## Propsed Changes

### Try with Relative Depth Argument

`try` now can have a relative depth argument as in the case of branches. The
'normal' try - a `try` in which calls unwind to a `catch` next to the `try` -
has a depth of 0.

Here are examples. For brevity, only one `catch` instruction is shown for each
`try` instruction.

```
# Example 1
try 0
  throw
catch i         # Control goes here
  ...
end

# Example 2
try 0
  try 1
    throw
  catch i
    ...
  end
catch i         # Control goes here
  ...
end

# Example 3
try 0
  try 0
    try 2
      throw
    catch i
      ...
    end
  catch i
    ...
  end
catch i         # Control goes here
  ...
end
```


### Catchless Try Block

When an argument (relative depth) of a `try` instruction is greater than 0, its
matching `catch` block does not have any uses. For example,

```
try 0
  try 1
    throw
  catch i       # Not used!
    ...
  end
catch i         # Control goes here
  ...
end
```
In this case, when an exception occurs within `try 1` block, the program control
is transferred to the outer `catch` block. So in this case the inner `catch`
block is not used, so if we do not generate this kind of `catch` blocks, it will
help reduce the code size. Effectively, a catchless `try` block is the same as a
`catch` with an immediate `rethrow`. So this code
```
try 0
  try 1
    throw
  end
catch i         # Control goes here
  ...
end
```
has the same effect as
```
try 0
  try 1
    throw
  catch i
    rethrow 0
  end
catch i         # Control goes here
  ...
end
```

Actually, `try 1` would not have a real use, because code inside `try 1` would
go to the one-level outer catch, in which case we can just omit `try 1` and
place the call inside `try 0` outside.

The relative depth argument of `try` instruction only counts the number of `try`
nests: it does not count `block` or `loop` nests. For example,
```
try 0
  block
    try 1
      block
        throw
      end
    end
  end
catch i         # Control goes here
  ...
end
```
In this case, when the `throw` instruction throws, the control is still
transferred to the outer `catch i` block, even though now there are two `block`
nests in the code.


## Motivation

### Background

In LLVM IR, when a function call can throw, it is represented as an
[`invoke`](http://llvm.org/docs/LangRef.html#invoke-instruction) instruction
which has two successors in CFG: its 'normal' destination BB and 'unwind'
destination BB. When an exception does not occur, the control goes to the
'normal' BB, and when it does, the control goes to the 'unwind' BB. Here is a
couple LLVM-IR level CFG examples:

C++ code:
```
try {
  foo();
} catch (...) {
}
```
LLVM IR-like pseudocode:
```
entry:
  invoke @foo to label %try.cont unwind label %lpad

lpad:
  %0 = landingpad ...
  ...

try.cont:
  ...
```

C++ code:
```
try {
  foo();
  foo();
} catch (int n) {
}
```
LLVM IR-like pseudocode:
```
entry:
  invoke @foo to label %invoke.cont unwind label %lpad

invoke.cont:
  invoke @foo to label %try.cont unwind label %lpad

lpad:
  %0 = landingpad ...
  ...
  if caught type is int br label %catch else br label %eh.resume

catch:
  ...
  br label %try.cont

try.cont:
  ...

eh.resume:
  resume ...
```

`invoke` instructions are lowered to `call`s in the backend, but they still have
a landing pad BB as their successor. `landingpad` instructions disappear in the
lowering phase, and the compiler inserts a `catch` instruction in the beginning
of each landing pad BB.

**In terms of control flow, an `invoke`, or a `call` lowered from it, is similar
to that of a conditional branch `br_if`.** When a branch is taken, `br_if` jumps
out of the current enclosing block(s) by the number of relative depth specified
as an argmuent. When an exception is thrown within a function call, the control
flow jumps out of the current enclosing `try` block. **But the difference, in
the current EH proposal, is it can only break out of a single depth, because
`call` does not take a relative depth as an argument and the VM transfers the
control flow to the nearest matching `catch` instruction.**


### Structured Control Flow

To make a control flow _structured_, there should not be an incoming edge from
outside of a block-like context (`block`, `loop`, or `try`), to the middle of
it. So it is required that the first BB of a block-like context should dominate
the rest of the BBs within it (otherwise there can be an incoming edge to the
middle of the context).

In the [`CFGStackify`
pass](https://github.com/llvm-mirror/llvm/blob/master/lib/Target/WebAssembly/WebAssemblyCFGStackify.cpp),
here is how roughly `block` markers are placed:
1. For each BB that has a non-fallthrough branch to it (this BB will be where
`end` marker will be)
  1. Compute the nearest common dominator of all forward non-fallthrough
     predecessors.
  2. If the nearest common dominator computed is inside a more deeply nested
     context, walk out to the nearest scope which isn't more deeply nested. For
     example,
     ```
     A
     block
       B    <- nearest common dom. is inside this block!
     end
     BB     <- we are processing this BB. end marker will be here
     ```
     In this case, we can't place a `block` marker in `B`. So we walk out of the
     scope to reach `A`.
  3. Place a `block` marker in the discovered block (the nearest common
     dominator of branches or some block found by the process in 2) and place a
     `end` marker in BB.

For loops, a loop header is by definition dominates all the BBs within the loop,
so we just place a `loop` marker there and `end` marker in the latch.


### Problems with the Current Proposal

A try/catch block is divided into two parts: a `try` part and a `catch` part.
What we should do for grouping a `try` part is similar to grouping a `block`,
because we also want `try` to be _structured_.

1. For each landing pad, where `catch` instruction is
  1. Compute the nearest common dominator of all call instructions that has this
     landing pad as its successor
  2. If the nearest common dominator is inside a more deeply nested context,
     walk out to the nearest scope that more isn't nested.
  3. Place a `try` marker in the discovered block.
(Grouping `catch` part is not covered here because it is not relevant)

The problem is, unlike branches, `call` instructions do not have a relative
depth argument so cannot break out of multiple contexts. But from the nearest
common dominator to the landing pad it is possible some call instructions that
might throw unwind to outer landing pads (landing pads ouside of the nearest
common dominator of throwing calls ~ current landingpad scope) or do not unwind
to any landing pad, which means when they throw, the exception should be
propagated out to the caller. For example,
```
try
  try
    call @foo()    # if it throws, unwinds to landing pad 1
    ...
    call @bar()    # if it throws, unwinds to landing pad 2
    ...
    call @baz()    # if it throws, propagates to the caller
  catch i          # landing pad 1
    ...
  ...
catch i            # landing pad 2
  ...
end
```

Because it is not possible for a call instruction that might throw to specify a
relative depth, or in other words, it cannot specify which landing pads to go,
in the current EH proposal, this does not work.


### Why the New Scheme is Better

The only way that can make the current scheme work is to split landing pads
until all the possibly-throwing calls within a `try` block unwind to the a
single landing pad or landing pads that's in the nested context of the `try`
block. Minimizing the number of split landing pads will require nontrivial CFG
analysis, but still, it is expected to increase code size compared to when we
use the new proposed scheme above.

#### Code Size

For a simple example, suppose we have a call that unwinds to an outer landing
pad in case it throws.
```
try
  call @foo    # unwinds to the current landing pad
  call @bar    # unwinds to outer landing pad
  call @baz    # unwinds to the current landing pad
catch i        # current landing pad
  some code
end
```

If we split this landing pad, the code will look like the below. Here we assumed
that we factored out the `some code` part in the original `catch` part to reduce
code size.
```
block
  try
    call @foo
  catch i
    br 1
  end
  call @bar
  try
    call @baz
  catch i
    br 1
  end
end
some code
```
So roughly, when we split a landing pad into n landing pads, there will be n
`try`s + n `catch`s + n `br`s + n `end`s that have to be added.

If we use our new scheme:
```
try 0
  call @foo    # unwinds to the current landing pad
  try 2
    call @bar  # unwinds to outer landing pad
  end
  call @baz    # unwinds to the current landing pad
catch i        # current landing pad
  some code
end
```

In the same case that we should split a landing pad into n, if we use the new
scheme, roughtly we will need to add (n-1) `try`s and (n-1) `end`s. (`try`s now
take an argument, so it may take a bit more space though.)


#### Easier Code Generation

Generating Wasm code is considerably easier for the new scheme. For our current
scheme, the code generation wouldn't be very hard if we attach a `catch`
instruction to every call that might throw, which boils down to a `try/catch`
block for every call. But it is clear that we shouldn't do this and if we want
to optimize the number of split landing pads, we would need a nontrivial CFG
analysis to begin with.

And there are many cases that need separate ad-hoc handlings. For example,
there can be a loop that has two calls that unwind to different landing pads
outside of the loop:
```
loop
  call @foo   # unwinds to landing pad 1
  call @bar   # unwinds to landing pad 2
end
landing pad 1
...
landing pad 2
...
```
It is not clear how to solve this case, because, already a part of a `try` is
inside an existing loop but `catch` part is outside of the loop, and there are
even another call that jumps to a different landing pad that's also outside of
the loop.

There can be ways to solve this, but there are many more tricky cases. Here, the
point is, the code generation algorithm for the new scheme will be a lot easier
and very straightforward. Code generation for the new scheme can be very similar
to that of `block` marker placement in `CFGStackify`. We place `try` markers in
a similar way to placing `block` markers, and if there is a need to break out of
multiple contexts at once, we can wrap those calls in a nested `try N` context
with an appropriate depth `N`. Optimizing the number of newly added `try N`
markers will be also straightforward, because we can linearly scan the code to
check if any adjacent `try` blocks can be merged together.


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Proposal on the spec changes #29

Proposal on the Spec Changes

Propsed Changes

Try with Relative Depth Argument

Catchless Try Block

Motivation

Background

Structured Control Flow

Problems with the Current Proposal

Why the New Scheme is Better

Code Size

Easier Code Generation

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Proposal on the spec changes #29

Description

Proposal on the Spec Changes

Propsed Changes

Try with Relative Depth Argument

Catchless Try Block

Motivation

Background

Structured Control Flow

Problems with the Current Proposal

Why the New Scheme is Better

Code Size

Easier Code Generation

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions