[interpreter] Strictify and specify .bin.wast format #1173

Merged
merged 3 commits into from
Apr 4, 2020
57 changes: 54 additions & 3 deletions interpreter/README.md
@@ -164,8 +164,8 @@ The implementation consumes a WebAssembly AST given in S-expression syntax. Here

Note: The grammar is shown here for convenience; the definitive source is the [specification of the text format](https://webassembly.github.io/spec/core/text/).
```
num: <digit> (_? <digit>)*
hexnum: <hexdigit> (_? <hexdigit>)*
num: <digit>(_? <digit>)*
hexnum: <hexdigit>(_? <hexdigit>)*
nat: <num> | 0x<hexnum>
int: <nat> | +<nat> | -<nat>
float: <num>.<num>?(e|E <num>)? | 0x<hexnum>.<hexnum>?(p|P <num>)?
@@ -356,11 +356,62 @@ A module of the form `(module quote <string>*)` is given in textual form and wil
There are also a number of meta commands.
The `script` command is a simple mechanism to name sub-scripts themselves. This is mainly useful for converting scripts with the `output` command. Commands inside a `script` will be executed normally, but nested meta commands are expanded in place (`input`, recursively) or elided (`output`) in the named script.

The `input` and `output` meta commands determine the requested file format from the file name extension. They can handle `.wasm`, `.wat`, and `.wast` files. In the case of input, a `.wast` script will be recursively executed. Output additionally handles `.js` as a target, which will convert the referenced script to an equivalent, self-contained JavaScript runner. It also recognises `.bin.wast` specially, which creates a script where module definitions are in binary.
The `input` and `output` meta commands determine the requested file format from the file name extension. They can handle `.wasm`, `.wat`, and `.wast` files. In the case of input, a `.wast` script will be recursively executed. Output additionally handles `.js` as a target, which will convert the referenced script to an equivalent, self-contained JavaScript runner. It also recognises `.bin.wast` specially, which creates a _binary script_ where module definitions are in binary, as defined below.

The interpreter supports a "dry" mode (flag `-d`), in which modules are only validated. In this mode, all actions and assertions are ignored.
It also supports an "unchecked" mode (flag `-u`), in which module definitions are not validated before use.

### Binary Scripts

The grammar of binary scripts is a subset of the grammar for general scripts:
```
binscript: <cmd>*

cmd:
<module> ;; define, validate, and initialize module
( register <string> <name>? ) ;; register module for imports
module with given failure string

Line looks like junk?

Member Author

Are you referring to the register command being included? That is needed for tests involving multiple modules.

Member

I think he means line 373, "module with given failure string". It seems out of place.

Member Author

Oh, I see, sorry for being blind. Fixed.

<action> ;; perform action and print results
<assertion> ;; assert result of an action

module:
( module <name>? binary <string>* ) ;; module in binary format (may be malformed)

action:
( invoke <name>? <string> <expr>* ) ;; invoke function export
( get <name>? <string> ) ;; get global export

assertion:
( assert_return <action> <result>* ) ;; assert action has expected results
( assert_trap <action> <failure> ) ;; assert action traps with given failure string
( assert_exhaustion <action> <failure> ) ;; assert action exhausts system resources
( assert_malformed <module> <failure> ) ;; assert module cannot be decoded with given failure string
( assert_invalid <module> <failure> ) ;; assert module is invalid with given failure string
( assert_unlinkable <module> <failure> ) ;; assert module fails to link
( assert_trap <module> <failure> ) ;; assert module traps on instantiation
Contributor

I also noticed that assert_trap is overloaded for both modules and actions, as it tripped me up here too. Maybe something to address in the future.

Member Author

You're right.


result:
( <val_type>.const <numpat> )

numpat:
<value> ;; literal result
nan:canonical ;; NaN in canonical form
nan:arithmetic ;; NaN with 1 in MSB of payload

value: <int> | <float>
int: 0x<hexnum>
float: 0x<hexnum>.<hexnum>?(p|P <num>)?
hexnum: <hexdigit>(_? <hexdigit>)*

name: $(<letter> | <digit> | _ | . | + | - | * | / | \ | ^ | ~ | = | < | > | ! | ? | @ | # | $ | % | & | | | : | ' | `)+
string: "(<char> | \n | \t | \\ | \' | \" | \<hex><hex> | \u{<hex>+})*"
```
This grammar removes meta commands as well as textual and quoted modules.
All numbers are in hex notation.

Moreover, float values are required to be precise, that is, they may not contain bits that would lead to rounding.
Contributor

:-)


## Abstract Syntax

The abstract WebAssembly syntax, as described above and in the [design doc](https://github.com/WebAssembly/design/blob/master/Semantics.md), is defined in [ast.ml](syntax/ast.ml).
33 changes: 21 additions & 12 deletions interpreter/exec/float.ml
@@ -36,6 +36,7 @@ sig
val to_float : t -> float
val of_string : string -> t
val to_string : t -> string
val to_hex_string : t -> string
val of_bits : bits -> t
val to_bits : t -> bits
val add : t -> t -> t
@@ -322,35 +323,43 @@ struct
(* String conversion that groups digits for readability *)

let is_digit c = '0' <= c && c <= '9'
let isnt_digit c = not (is_digit c)
let is_hex_digit c = is_digit c || 'a' <= c && c <= 'f'

let rec add_digits buf s i j k =
let rec add_digits buf s i j k n =
if i < j then begin
if k = 0 then Buffer.add_char buf '_';
Buffer.add_char buf s.[i];
add_digits buf s (i + 1) j ((k + 2) mod 3)
add_digits buf s (i + 1) j ((k + n - 1) mod n) n
end

let group_digits s =
let group_digits is_digit n s =
let isnt_digit c = not (is_digit c) in
let len = String.length s in
let mant = Lib.Option.get (Lib.String.find_from_opt is_digit s 0) len in
let x = Lib.Option.get (Lib.String.find_from_opt ((=) 'x') s 0) 0 in
let mant = Lib.Option.get (Lib.String.find_from_opt is_digit s x) len in
let point = Lib.Option.get (Lib.String.find_from_opt isnt_digit s mant) len in
let frac = Lib.Option.get (Lib.String.find_from_opt is_digit s point) len in
let exp = Lib.Option.get (Lib.String.find_from_opt isnt_digit s frac) len in
let buf = Buffer.create (len*4/3) in
let buf = Buffer.create (len*(n+1)/n) in
Buffer.add_substring buf s 0 mant;
add_digits buf s mant point ((point - mant) mod 3 + 3);
add_digits buf s mant point ((point - mant) mod n + n) n;
Buffer.add_substring buf s point (frac - point);
add_digits buf s frac exp 3;
add_digits buf s frac exp n n;
Buffer.add_substring buf s exp (len - exp);
Buffer.contents buf

let to_string x =
let to_string' convert is_digit n x =
(if x < Rep.zero then "-" else "") ^
if is_nan x then
let payload = Rep.logand (abs x) (Rep.lognot bare_nan) in
"nan:0x" ^ Rep.to_hex_string payload
"nan:0x" ^ group_digits is_hex_digit 4 (Rep.to_hex_string payload)
Contributor

Just to double-check my understanding, this grouping of the hex digits is just to be consistent with the output format for hex binary numbers as well, correct?

Member Author

Yes, and it was more or less an oversight that the pretty printer didn't already group them before.

else
let s = Printf.sprintf "%.17g" (to_float (abs x)) in
group_digits (if s.[String.length s - 1] = '.' then s ^ "0" else s)
let s = convert (to_float (abs x)) in
group_digits is_digit n
(if s.[String.length s - 1] = '.' then s ^ "0" else s)

let to_string = to_string' (Printf.sprintf "%.17g") is_digit 3
let to_hex_string x =
if is_inf x then to_string x else
to_string' (Printf.sprintf "%h") is_hex_digit 4 x
end
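
The README change above requires floats in binary scripts to be exact, and `to_hex_string` builds on `Printf.sprintf "%h"`. The following is a minimal sketch, not part of the PR, of why hexadecimal output fits that requirement: a hex literal spells out the mantissa bits directly, so parsing it back should reproduce the same bit pattern. The helper name `roundtrips` is hypothetical, and the sketch only covers OCaml's native 64-bit `float`; how the interpreter treats 32-bit values is not shown here.

```ocaml
(* Illustrative sketch only; [roundtrips] is a hypothetical helper,
   not a function from the interpreter. *)

(* Print a float in hexadecimal notation, parse it back, and compare
   the raw bit patterns rather than using float equality. *)
let roundtrips (x : float) : bool =
  let s = Printf.sprintf "%h" x in
  Int64.bits_of_float (float_of_string s) = Int64.bits_of_float x

let () =
  List.iter
    (fun x -> Printf.printf "%h roundtrips: %b\n" x (roundtrips x))
    [0.1; 1.5; 3.141592653589793]
```

Comparing bit patterns instead of using `=` on floats keeps the check meaningful for values where float equality is not a reliable identity test.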
1 change: 1 addition & 0 deletions interpreter/exec/i32.ml
@@ -4,4 +4,5 @@ include Int.Make
(struct
include Int32
let bitwidth = 32
let to_hex_string = Printf.sprintf "%lx"
end)
1 change: 1 addition & 0 deletions interpreter/exec/i64.ml
@@ -4,4 +4,5 @@ include Int.Make
(struct
include Int64
let bitwidth = 64
let to_hex_string = Printf.sprintf "%Lx"
end)
20 changes: 12 additions & 8 deletions interpreter/exec/int.ml
@@ -26,6 +26,7 @@ sig
val of_int : int -> t
val to_int : t -> int
val to_string : t -> string
val to_hex_string : t -> string

val bitwidth : int
end
@@ -77,6 +78,7 @@ sig
val of_string : string -> t
val to_string_s : t -> string
val to_string_u : t -> string
val to_hex_string : t -> string
end

module Make (Rep : RepType) : S with type bits = Rep.t and type t = Rep.t =
@@ -277,25 +279,27 @@ struct

(* String conversion that groups digits for readability *)

let rec add_digits buf s i j k =
let rec add_digits buf s i j k n =
if i < j then begin
if k = 0 then Buffer.add_char buf '_';
Buffer.add_char buf s.[i];
add_digits buf s (i + 1) j ((k + 2) mod 3)
add_digits buf s (i + 1) j ((k + n - 1) mod n) n
end

let group_digits s =
let group_digits n s =
let len = String.length s in
let num = if s.[0] = '-' then 1 else 0 in
let buf = Buffer.create (len*4/3) in
let buf = Buffer.create (len*(n+1)/n) in
Buffer.add_substring buf s 0 num;
add_digits buf s num len ((len - num) mod 3 + 3);
add_digits buf s num len ((len - num) mod n + n) n;
Buffer.contents buf

let to_string_s i = group_digits (Rep.to_string i)
let to_string_s i = group_digits 3 (Rep.to_string i)
let to_string_u i =
if i >= Rep.zero then
group_digits (Rep.to_string i)
group_digits 3 (Rep.to_string i)
else
group_digits (Rep.to_string (div_u i ten) ^ Rep.to_string (rem_u i ten))
group_digits 3 (Rep.to_string (div_u i ten) ^ Rep.to_string (rem_u i ten))

let to_hex_string i = "0x" ^ group_digits 4 (Rep.to_hex_string i)
end
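
The change to `group_digits` generalises the group size from a fixed 3 to a parameter `n`, so that decimal output groups by thousands and hex output groups by four digits. The following standalone sketch restates that idea with a plain loop instead of the recursive `add_digits`; it assumes only the OCaml standard library, and the two wrappers at the end are hypothetical stand-ins for the `Rep`-based functions in the diff.

```ocaml
(* Standalone sketch of digit grouping; not the interpreter's code. *)

(* Insert '_' between groups of [n] digits, counting from the right.
   A leading '-' is copied through unchanged. *)
let group_digits n s =
  let len = String.length s in
  let start = if len > 0 && s.[0] = '-' then 1 else 0 in
  let buf = Buffer.create (len * (n + 1) / n) in
  Buffer.add_substring buf s 0 start;
  for i = start to len - 1 do
    (* A separator goes before position i when the number of digits
       still to be written is a multiple of n, except at the start. *)
    if i > start && (len - i) mod n = 0 then Buffer.add_char buf '_';
    Buffer.add_char buf s.[i]
  done;
  Buffer.contents buf

(* Hypothetical wrappers for int32, mirroring the decimal and hex printers. *)
let to_string_s (i : int32) = group_digits 3 (Int32.to_string i)
let to_hex_string (i : int32) = "0x" ^ group_digits 4 (Printf.sprintf "%lx" i)
```

With this sketch, `to_string_s (-1234567l)` yields `-1_234_567` and `to_hex_string 0xabcdel` yields `0xa_bcde`.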
53 changes: 32 additions & 21 deletions interpreter/text/arrange.ml
@@ -375,12 +375,20 @@ let module_ = module_with_var_opt None

(* Scripts *)

let literal lit =
let literal mode lit =
match lit.it with
| Values.I32 i -> Node ("i32.const " ^ I32.to_string_s i, [])
| Values.I64 i -> Node ("i64.const " ^ I64.to_string_s i, [])
| Values.F32 z -> Node ("f32.const " ^ F32.to_string z, [])
| Values.F64 z -> Node ("f64.const " ^ F64.to_string z, [])
| Values.I32 i ->
let f = if mode = `Binary then I32.to_hex_string else I32.to_string_s in
Node ("i32.const " ^ f i, [])
| Values.I64 i ->
let f = if mode = `Binary then I64.to_hex_string else I64.to_string_s in
Node ("i64.const " ^ f i, [])
| Values.F32 z ->
let f = if mode = `Binary then F32.to_hex_string else F32.to_string in
Node ("f32.const " ^ f z, [])
| Values.F64 z ->
let f = if mode = `Binary then F64.to_hex_string else F64.to_string in
Node ("f64.const " ^ f z, [])
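
The new `literal` picks a printer per value based on `mode`. As a rough illustration of that dispatch, the sketch below uses a toy `value` type and plain `Printf` formats in place of the interpreter's `Values`, `I32`/`I64`/`F64` modules and `Node` constructor, all of which are assumptions made to keep the example self-contained. It also omits digit grouping.

```ocaml
(* Illustrative only: a toy value type and printer, not the interpreter's AST. *)
type value = I32 of int32 | I64 of int64 | F64 of float

(* In `Binary mode every number is printed in hex; any other mode
   falls back to decimal notation. *)
let literal mode v =
  match v, mode with
  | I32 i, `Binary -> Printf.sprintf "i32.const 0x%lx" i
  | I32 i, _ -> "i32.const " ^ Int32.to_string i
  | I64 i, `Binary -> Printf.sprintf "i64.const 0x%Lx" i
  | I64 i, _ -> "i64.const " ^ Int64.to_string i
  | F64 z, `Binary -> Printf.sprintf "f64.const %h" z
  | F64 z, _ -> Printf.sprintf "f64.const %.17g" z
```

In this sketch the hex formats print the two's-complement bit pattern, so `literal `Binary (I32 (-1l))` gives `i32.const 0xffffffff` with no sign, which matches the unsigned `int: 0x<hexnum>` form in the binary script grammar.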

let definition mode x_opt def =
try
@@ -410,20 +418,20 @@ let definition mode x_opt def =
let access x_opt n =
String.concat " " [var_opt x_opt; name n]

let action act =
let action mode act =
match act.it with
| Invoke (x_opt, name, lits) ->
Node ("invoke" ^ access x_opt name, List.map literal lits)
Node ("invoke" ^ access x_opt name, List.map (literal mode) lits)
| Get (x_opt, name) ->
Node ("get" ^ access x_opt name, [])

let nan = function
| CanonicalNan -> "nan:canonical"
| ArithmeticNan -> "nan:arithmetic"

let result res =
let result mode res =
match res.it with
| LitResult lit -> literal lit
| LitResult lit -> literal mode lit
| NanResult nanop ->
match nanop.it with
| Values.I32 _ | Values.I64 _ -> assert false
@@ -433,27 +441,30 @@ let result res =
let assertion mode ass =
match ass.it with
| AssertMalformed (def, re) ->
Node ("assert_malformed", [definition `Original None def; Atom (string re)])
(match mode, def.it with
| `Binary, Quoted _ -> []
| _ ->
[Node ("assert_malformed", [definition `Original None def; Atom (string re)])]
)
| AssertInvalid (def, re) ->
Node ("assert_invalid", [definition mode None def; Atom (string re)])
[Node ("assert_invalid", [definition mode None def; Atom (string re)])]
| AssertUnlinkable (def, re) ->
Node ("assert_unlinkable", [definition mode None def; Atom (string re)])
[Node ("assert_unlinkable", [definition mode None def; Atom (string re)])]
| AssertUninstantiable (def, re) ->
Node ("assert_trap", [definition mode None def; Atom (string re)])
[Node ("assert_trap", [definition mode None def; Atom (string re)])]
| AssertReturn (act, results) ->
Node ("assert_return", action act :: List.map result results)
[Node ("assert_return", action mode act :: List.map (result mode) results)]
| AssertTrap (act, re) ->
Node ("assert_trap", [action act; Atom (string re)])
[Node ("assert_trap", [action mode act; Atom (string re)])]
| AssertExhaustion (act, re) ->
Node ("assert_exhaustion", [action act; Atom (string re)])
[Node ("assert_exhaustion", [action mode act; Atom (string re)])]

let command mode cmd =
match cmd.it with
| Module (x_opt, def) -> definition mode x_opt def
| Register (n, x_opt) ->
Node ("register " ^ name n ^ var_opt x_opt, [])
| Action act -> action act
| Module (x_opt, def) -> [definition mode x_opt def]
| Register (n, x_opt) -> [Node ("register " ^ name n ^ var_opt x_opt, [])]
| Action act -> [action mode act]
| Assertion ass -> assertion mode ass
| Meta _ -> assert false

let script mode scr = List.map (command mode) scr
let script mode scr = Lib.List.concat_map (command mode) scr
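
Switching `command` and `assertion` to return lists is what lets a command disappear from the output: an `assert_malformed` on a quoted module has no binary form, so it maps to `[]` in `` `Binary `` mode. The sketch below shows the pattern with a toy command type; the constructor names and output strings are hypothetical and stand in for the interpreter's AST and `Node` values.

```ocaml
(* Toy command type for illustration; not the interpreter's AST. *)
type cmd =
  | Module of string
  | AssertMalformedQuoted of string  (* textual module with no binary form *)
  | Invoke of string

(* Same definition as the helper added to lib.ml in this PR. *)
let rec concat_map f = function
  | [] -> []
  | x :: xs -> f x @ concat_map f xs

(* Each command maps to zero or one output strings. *)
let command mode = function
  | Module m -> ["(module " ^ m ^ ")"]
  | AssertMalformedQuoted _ when mode = `Binary -> []
  | AssertMalformedQuoted m -> ["(assert_malformed (module quote " ^ m ^ "))"]
  | Invoke f -> ["(invoke " ^ f ^ ")"]

let script mode cmds = concat_map (command mode) cmds
```

Returning a list rather than an option keeps the door open for a command expanding to several outputs, although nothing in this diff does so.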
4 changes: 4 additions & 0 deletions interpreter/util/lib.ml
@@ -101,6 +101,10 @@ struct
match f x with
| None -> map_filter f xs
| Some y -> y :: map_filter f xs

let rec concat_map f = function
| [] -> []
| x::xs -> f x @ concat_map f xs
end

module List32 =
1 change: 1 addition & 0 deletions interpreter/util/lib.mli
@@ -21,6 +21,7 @@ sig
val index_of : 'a -> 'a list -> int option
val index_where : ('a -> bool) -> 'a list -> int option
val map_filter : ('a -> 'b option) -> 'a list -> 'b list
val concat_map : ('a -> 'b list) -> 'a list -> 'b list
end

module List32 :