Extend number of flat parameters in async lower from 1 to 4 #520

Open
wants to merge 3 commits into
base: main
104 changes: 57 additions & 47 deletions design/mvp/Async.md
@@ -620,58 +620,68 @@ which to use by default.

### Async Import ABI

Given an imported WIT function:
Given these imported WIT functions (using the fixed-length-list feature 🔧):
```wit
world w {
import foo: func(s: string) -> string;
import foo: func(s: string) -> u32;
import bar: func(s: string) -> string;
import baz: func(t: list<u64; 5>) -> string;
import quux: func(t: list<u32; 17>) -> string;
}
```
the default sync import function signature is:
the default/synchronous lowered import function signatures are:
```wat
;; sync
(func (param $s-ptr i32) (param $s-len i32) (param $out i32))
(func $foo (param $s-ptr i32) (param $s-len i32) (result i32))
(func $bar (param $s-ptr i32) (param $s-len i32) (param $out-ptr i32))
(func $baz (param i64 i64 i64 i64 i64) (param $out-ptr i32))
(func $quux (param $in-ptr i32) (param $out-ptr i32))
```
where `$out` must be a 4-byte-aligned pointer into linear memory into which the
8-byte (pointer, length) of the returned string will be stored.

The new async import function signature is:
Here, `foo`, `bar` and `baz` pass their parameters as "flattened" core value
types while `quux` passes its parameters via the `$in-ptr` linear memory
pointer (due to the Canonical ABI limitation of 16 maximum flattened
parameters). Similarly, `foo` returns its result as a single core value while
`bar`, `baz` and `quux` return their results via the `$out-ptr` linear memory
pointer (due to the current Canonical ABI limitation of 1 maximum flattened
result).

The corresponding asynchronous lowered import function signatures are:
```wat
;; async
(func (param $in i32) (param $out i32) (result i32))
(func $foo (param $s-ptr i32) (param $s-len i32) (param $out-ptr i32) (result i32))
(func $bar (param $s-ptr i32) (param $s-len i32) (param $out-ptr i32) (result i32))
(func $baz (param $in-ptr i32) (param $out-ptr i32) (result i32))
(func $quux (param $in-ptr i32) (param $out-ptr i32) (result i32))
```
where `$in` must be a 4-byte-aligned pointer into linear memory from which the
8-byte (pointer, length) of the string argument will be loaded and `$out` works
the same as in the synchronous case. What's different, however, is *when* `$in`
and `$out` are read or written. In a synchronous call, they are always read or
written before the call returns. In an asynchronous call, there is a set of
possibilities indicated by the `(result i32)` value:
* If the returned `i32` is `2`, then the call returned eagerly without
blocking and so `$in` has been read and `$out` has been written.
* Otherwise, the high 28 bits of the `i32` are the index of a new `Subtask`
in the current component instance's table. The low 4 bits indicate how far
the callee made it before blocking:
* If `1`, the callee didn't even start (due to backpressure), and thus
neither `$in` nor `$out` have been accessed yet.
* If `2`, the callee started by reading `$in`, but blocked before writing
`$out`.

The async signature `(func (param i32 i32) (result i32))` is the same for
almost all WIT function types since the ABI stores everything in linear memory.
However, there are three special cases:
* If the WIT parameter list is empty, `$in` is removed.
* If the WIT parameter list flattens to exactly 1 core value type (`i32` or
otherwise), `$in` uses that core value type and the argument is passed
by value.
* If the WIT result is empty, `$out` is removed.

For example:
Comparing signatures, the differences are:
* Async-lowered functions have a maximum of 4 flat parameters (not 16).
* Async-lowered functions always return a non-empty result via a linear memory
  pointer.
* Async-lowered functions always return a single `i32` "status" code.
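
The differences listed above can be sketched in a few lines of Python. This is only an illustration, not the spec's `flatten_functype`: the helper name `lowered_signature` and its list-of-strings inputs are assumptions made for this example, and the synchronous branch is inferred from the synchronous signatures shown earlier (at most 16 flat parameters, at most 1 flat result).

```python
MAX_FLAT_PARAMS = 16
MAX_FLAT_ASYNC_PARAMS = 4
MAX_FLAT_RESULTS = 1

def lowered_signature(flat_params, flat_results, is_async):
    """Hypothetical helper: core (params, results) of a lowered import.

    flat_params / flat_results are the already-flattened core value types,
    e.g. a `string` flattens to ['i32', 'i32'] (pointer, length).
    """
    params = list(flat_params)
    results = list(flat_results)
    if is_async:
        # Too many flat params: pass a single pointer to them instead.
        if len(params) > MAX_FLAT_ASYNC_PARAMS:
            params = ['i32']
        # Any non-empty result is written through an out-pointer.
        if len(results) > 0:
            params += ['i32']
        # The only core result is the i32 status code.
        results = ['i32']
    else:
        if len(params) > MAX_FLAT_PARAMS:
            params = ['i32']
        if len(results) > MAX_FLAT_RESULTS:
            params += ['i32']
            results = []
    return params, results

STRING = ['i32', 'i32']  # (pointer, length)

foo_async = lowered_signature(STRING, ['i32'], is_async=True)
baz_sync = lowered_signature(['i64'] * 5, STRING, is_async=False)
baz_async = lowered_signature(['i64'] * 5, STRING, is_async=True)
quux_sync = lowered_signature(['i32'] * 17, STRING, is_async=False)
```

Under these assumptions, `baz` keeps its five `i64` parameters flat when lowered synchronously but switches to an `$in-ptr` when lowered asynchronously, because 5 exceeds the async limit of 4.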

Additionally, *when* the parameter and result pointers are read/written depends
on the status code:
* If the low 4 bits of the status are `0`, the call didn't even start:
  `$in-ptr` hasn't been read, `$out-ptr` hasn't been written, and the high
  28 bits are the index of a new async subtask to wait on.
* If the low 4 bits of the status are `1`, the call started: `$in-ptr` was
  read, `$out-ptr` hasn't been written, and the high 28 bits are the index
  of a new async subtask to wait on.
* If the low 4 bits of the status are `2`, the call returned: `$in-ptr` has
  been read, `$out-ptr` has been written, and the high 28 bits are `0`
  because there is no async subtask to wait on.

When a parameter/result pointer hasn't yet been read/written, the async caller
must take care to keep the region of memory allocated to the call until
receiving an event indicating that the async subtask has started/returned.
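
As a rough illustration of the status encoding described above, a caller-side decoder might look like the following sketch. The names `CallState`, `unpack_status`, `args_consumed` and `results_ready` are hypothetical and chosen for this example only; the bit layout (low 4 bits for the state, high 28 bits for the subtask index) is taken from the text.

```python
from enum import IntEnum

class CallState(IntEnum):
    # Hypothetical names for the three status values described in the text.
    STARTING = 0  # call has not started; $in-ptr unread, $out-ptr unwritten
    STARTED = 1   # $in-ptr has been read; $out-ptr not yet written
    RETURNED = 2  # $in-ptr read and $out-ptr written

def unpack_status(status):
    """Split an i32 status into (state, subtask_index or None)."""
    status &= 0xFFFFFFFF                # treat the i32 as unsigned
    state = CallState(status & 0xF)     # low 4 bits
    subtask_index = status >> 4         # high 28 bits
    if state == CallState.RETURNED:
        # No subtask is created when the call returns eagerly.
        assert subtask_index == 0
        return state, None
    return state, subtask_index

def args_consumed(state):
    """True once the memory behind $in-ptr is no longer needed by the call."""
    return state != CallState.STARTING

def results_ready(state):
    """True once the memory behind $out-ptr holds the results."""
    return state == CallState.RETURNED
```

A caller would keep the argument buffer alive until `args_consumed` holds and the result buffer alive until `results_ready` holds, waiting on the returned subtask index in between.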

Other example asynchronous lowered signatures:

| WIT function type | Async ABI |
| ----------------------------------------- | --------------------- |
| `func()` | `(func (result i32))` |
| `func() -> string` | `(func (param $out i32) (result i32))` |
| `func(s: string)` | `(func (param $in i32) (result i32))` |
| `func(x: f32) -> f32` | `(func (param $in f32) (param $out i32) (result i32))` |
| `func(x: list<list<u8>>) -> list<string>` | `(func (param $in i32) (param $out i32) (result i32))` |
| `func() -> string` | `(func (param $out-ptr i32) (result i32))` |
| `func(x: f32) -> f32` | `(func (param $x f32) (param $out-ptr i32) (result i32))` |
| `func(s: string, t: string)` | `(func (param $s-ptr i32) (param $s-len i32) (param $t-ptr i32) (param $t-len i32) (result i32))` |

`future` and `stream` can appear anywhere in the parameter or result types. For example:
```wit
@@ -689,11 +699,11 @@ the synchronous ABI has signature:
```
and the asynchronous ABI has the signature:
```wat
(func (param $in i32) (param $out i32) (result i32))
(func (param $f i32) (param $out-ptr i32) (result i32))
```
where, according to the above rules, `$in` is the index of a future in the
current component instance's table (not a pointer to one) while `$out` is a
pointer to a linear memory location that will receive an `i32` index.
where `$f` is the index of a future (not a pointer to one) while
`$out-ptr` is a pointer to a linear memory location that will receive an `i32`
index.

For the runtime semantics of this `i32` index, see `lift_stream`,
`lift_future`, `lower_stream` and `lower_future` in the [Canonical ABI
@@ -786,7 +796,7 @@ replaced with `...` to focus on the overall flow of function calls.
(core module $Main
(import "libc" "mem" (memory 1))
(import "libc" "realloc" (func (param i32 i32 i32 i32) (result i32)))
(import "" "fetch" (func $fetch (param i32 i32) (result i32)))
(import "" "fetch" (func $fetch (param i32 i32 i32) (result i32)))
(import "" "waitable-set.new" (func $new_waitable_set (result i32)))
(import "" "waitable-set.wait" (func $wait (param i32 i32) (result i32)))
(import "" "waitable.join" (func $join (param i32 i32)))
@@ -800,7 +810,7 @@ replaced with `...` to focus on the overall flow of function calls.
...
loop
...
call $fetch ;; pass a pointer-to-string and pointer-to-list-of-bytes outparam
call $fetch ;; pass a string pointer, string length and pointer-to-list-of-bytes outparam
... ;; ... and receive the index of a new async subtask
global.get $wsi
call $join ;; ... and add it to the waitable set
@@ -878,7 +888,7 @@ not externally-visible behavior.
(core module $Main
(import "libc" "mem" (memory 1))
(import "libc" "realloc" (func (param i32 i32 i32 i32) (result i32)))
(import "" "fetch" (func $fetch (param i32 i32) (result i32)))
(import "" "fetch" (func $fetch (param i32 i32 i32) (result i32)))
(import "" "waitable-set.new" (func $new_waitable_set (result i32)))
(import "" "waitable.join" (func $join (param i32 i32)))
(import "" "task.return" (func $task_return (param i32 i32)))
@@ -891,7 +901,7 @@ not externally-visible behavior.
...
loop
...
call $fetch ;; pass a pointer-to-string and pointer-to-list-of-bytes outparam
call $fetch ;; pass a string pointer, string length and pointer-to-list-of-bytes outparam
... ;; ... and receive the index of a new async subtask
global.get $wsi
call $join ;; ... and add it to the waitable set
7 changes: 5 additions & 2 deletions design/mvp/CanonicalABI.md
@@ -2446,6 +2446,7 @@ stack), passing in an `i32` pointer as an parameter instead of returning an
Given all this, the top-level definition of `flatten_functype` is:
```python
MAX_FLAT_PARAMS = 16
MAX_FLAT_ASYNC_PARAMS = 4
MAX_FLAT_RESULTS = 1

def flatten_functype(opts, ft, context):
@@ -2465,12 +2466,14 @@ def flatten_functype(opts, ft, context):
else:
match context:
case 'lift':
if len(flat_params) > MAX_FLAT_PARAMS:
Collaborator

To confirm, should this be MAX_FLAT_ASYNC_PARAMS?

Member Author

Since we're talking about the incoming direction, the motivation for limiting outgoing to 4 doesn't apply, so I hadn't thought to change it, but does it simplify adapter fusion in cases where the ABI options cause the core lifted/lowered signatures to line up?

Collaborator

It looks like wasip3-prototyping already supports the limit here as 16, and already has adapters for memory->flat, so I think it's reasonable to leave this as-is (and I agree the motivation for 4 is not relevant here).

I'll assume this'll stick to 16 and see how far I get

flat_params = ['i32']
if opts.callback:
flat_results = ['i32']
else:
flat_results = []
case 'lower':
if len(flat_params) > 1:
if len(flat_params) > MAX_FLAT_ASYNC_PARAMS:
flat_params = ['i32']
if len(flat_results) > 0:
flat_params += ['i32']
@@ -3124,7 +3127,7 @@ always returns control flow back to the caller without blocking:
```python
def on_start():
on_progress()
return lift_flat_values(cx, 1, flat_args, ft.param_types())
return lift_flat_values(cx, MAX_FLAT_ASYNC_PARAMS, flat_args, ft.param_types())

def on_resolve(results):
on_progress()
7 changes: 5 additions & 2 deletions design/mvp/canonical-abi/definitions.py
@@ -1521,6 +1521,7 @@ def lower_async_value(ReadableEndT, cx, v, t):
### Flattening

MAX_FLAT_PARAMS = 16
MAX_FLAT_ASYNC_PARAMS = 4
MAX_FLAT_RESULTS = 1

def flatten_functype(opts, ft, context):
@@ -1540,12 +1541,14 @@ def flatten_functype(opts, ft, context):
else:
match context:
case 'lift':
if len(flat_params) > MAX_FLAT_PARAMS:
flat_params = ['i32']
if opts.callback:
flat_results = ['i32']
else:
flat_results = []
case 'lower':
if len(flat_params) > 1:
if len(flat_params) > MAX_FLAT_ASYNC_PARAMS:
flat_params = ['i32']
if len(flat_results) > 0:
flat_params += ['i32']
@@ -1932,7 +1935,7 @@ def on_resolve(results):

def on_start():
on_progress()
return lift_flat_values(cx, 1, flat_args, ft.param_types())
return lift_flat_values(cx, MAX_FLAT_ASYNC_PARAMS, flat_args, ft.param_types())

def on_resolve(results):
on_progress()
39 changes: 39 additions & 0 deletions design/mvp/canonical-abi/run_tests.py
@@ -2231,6 +2231,44 @@ async def core_func(task, args):

await canon_lift(sync_opts, inst, ft, core_func, None, lambda:[], lambda _:(), host_on_block)

async def test_async_flat_params():
heap = Heap(1000)
opts = mk_opts(heap.memory, 'utf8', heap.realloc, sync = False)
inst = ComponentInstance()
caller = Task(opts, inst, FuncType([],[]), None, None, None)

ft1 = FuncType([F32Type(), F64Type(), U32Type(), S64Type()],[])
async def f1(task, on_start, on_resolve, on_block):
args = on_start()
assert(len(args) == 4)
assert(args[0] == 1.1)
assert(args[1] == 2.2)
assert(args[2] == 3)
assert(args[3] == 4)
on_resolve([])
[ret] = await canon_lower(opts, ft1, f1, caller, [1.1, 2.2, 3, 4])
assert(ret == Subtask.State.RETURNED)

ft2 = FuncType([U32Type(),U8Type(),U8Type(),U8Type()],[])
async def f2(task, on_start, on_resolve, on_block):
args = on_start()
assert(len(args) == 4)
assert(args == [1,2,3,4])
on_resolve([])
[ret] = await canon_lower(opts, ft2, f2, caller, [1,2,3,4])
assert(ret == Subtask.State.RETURNED)

ft3 = FuncType([U32Type(),U8Type(),U8Type(),U8Type(),U8Type()],[])
async def f3(task, on_start, on_resolve, on_block):
args = on_start()
assert(len(args) == 5)
assert(args == [1,2,3,4,5])
on_resolve([])
heap.memory[12:20] = b'\x01\x00\x00\x00\x02\x03\x04\x05'
[ret] = await canon_lower(opts, ft3, f3, caller, [12])
assert(ret == Subtask.State.RETURNED)


async def run_async_tests():
await test_roundtrips()
await test_handles()
@@ -2250,6 +2288,7 @@ async def run_async_tests():
await test_futures()
await test_cancel_subtask()
await test_self_empty()
await test_async_flat_params()

asyncio.run(run_async_tests())
