Skip to content

Commit 6194785

Browse files
authored
Spec local initialization (WebAssembly#67)
1 parent f831cff commit 6194785

File tree

14 files changed

+387
-133
lines changed

14 files changed

+387
-133
lines changed

document/core/appendix/algorithm.rst

Lines changed: 83 additions & 23 deletions
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,10 @@ The algorithm is expressed in typed pseudo code whose semantics is intended to b
2121
Data Structures
2222
~~~~~~~~~~~~~~~
2323

24-
Types are representable as a set of enumerations.
24+
Types
25+
.....
26+
27+
Value types are representable as a set of enumerations.
2528

2629
.. code-block:: pseudo
2730
@@ -65,37 +68,64 @@ Equivalence and subtyping checks can be defined on these types.
6568
return matches_heap(t1.heap, t2.heap) && matches_null(t1.null, t2.null)
6669
6770
func matches(t1 : val_type, t2 : val_type) : bool =
68-
return
69-
(is_num t1 && is_num t2 && t1 = t2) ||
70-
(is_vec t1 && is_vec t2 && t1 = t2) ||
71-
(is_ref t1 && is_ref t2 && matches_ref(t1, t2)) ||
72-
t1 = Bot
71+
switch (t1, t2)
72+
case (Ref(_), Ref(_))
73+
return matches_ref(t1, t2)
74+
case (Bot, _)
75+
return true
76+
case (_, _)
77+
return t1 = t2
78+
79+
Context
80+
.......
81+
82+
Validation requires a :ref:`context <context>` for checking uses of :ref:`indices <syntax-index>`.
83+
For the purpose of presenting the algorithm, it is maintained in a set of global variables:
84+
85+
.. code-block:: pseudo
86+
87+
var locals : array(val_type)
88+
var locals_init : array(bool)
89+
var globals : array(global_type)
90+
var funcs : array(func_type)
91+
var tables : array(table_type)
92+
var mems : array(mem_type)
93+
94+
This assumes suitable representations for the various :ref:`types <syntax-type>` besides :code:`val_type`, which are omitted here.
95+
96+
For locals, there is an additional array recording the initialization status of each local.
97+
98+
Stacks
99+
......
73100

74-
The algorithm uses two separate stacks: the *value stack* and the *control stack*.
75-
The former tracks the :ref:`types <syntax-valtype>` of operand values on the :ref:`stack <stack>`,
76-
the latter surrounding :ref:`structured control instructions <syntax-instr-control>` and their associated :ref:`blocks <syntax-instr-control>`.
101+
The algorithm uses three separate stacks: the *value stack*, the *control stack*, and the *initialization stack*.
102+
The value stack tracks the :ref:`types <syntax-valtype>` of operand values on the :ref:`stack <stack>`.
103+
The control stack tracks surrounding :ref:`structured control instructions <syntax-instr-control>` and their associated :ref:`blocks <syntax-instr-control>`.
104+
The initialization stack records all :ref:`locals <syntax-local>` that have been initialized since the beginning of the function.
77105

78106
.. code-block:: pseudo
79107
80108
type val_stack = stack(val_type)
109+
type init_stack = stack(u32)
81110
82111
type ctrl_stack = stack(ctrl_frame)
83112
type ctrl_frame = {
84113
opcode : opcode
85114
start_types : list(val_type)
86115
end_types : list(val_type)
87-
height : nat
116+
val_height : nat
117+
init_height : nat
88118
unreachable : bool
89119
}
90120
91-
For each value, the value stack records its :ref:`value type <syntax-valtype>`.
92-
For each entered block, the control stack records a *control frame* with the originating opcode, the types on the top of the operand stack at the start and end of the block (used to check its result as well as branches), the height of the operand stack at the start of the block (used to check that operands do not underflow the current block), and a flag recording whether the remainder of the block is unreachable (used to handle :ref:`stack-polymorphic <polymorphism>` typing after branches).
121+
For each entered block, the control stack records a *control frame* with the originating opcode, the types on the top of the operand stack at the start and end of the block (used to check its result as well as branches), the height of the operand stack at the start of the block (used to check that operands do not underflow the current block), the height of the initialization stack at the start of the block (used to reset initialization status at the end of the block), and a flag recording whether the remainder of the block is unreachable (used to handle :ref:`stack-polymorphic <polymorphism>` typing after branches).
93122

94-
For the purpose of presenting the algorithm, the operand and control stacks are simply maintained as global variables:
123+
For the purpose of presenting the algorithm, these stacks are simply maintained as global variables:
95124

96125
.. code-block:: pseudo
97126
98127
var vals : val_stack
128+
var inits : init_stack
99129
var ctrls : ctrl_stack
100130
101131
However, these variables are not manipulated directly by the main checking function, but through a set of auxiliary functions:
@@ -123,7 +153,7 @@ However, these variables are not manipulated directly by the main checking funct
123153
func pop_ref() : ref_type =
124154
let actual = pop_val()
125155
error_if(not is_ref(actual))
126-
if actual = Bot then return Ref(Bot, false)
156+
if (actual = Bot) return Ref(Bot, false)
127157
return actual
128158
129159
func push_vals(types : list(val_type)) = foreach (t in types) push_val(t)
@@ -150,25 +180,49 @@ Finally, there are accumulative functions for pushing or popping multiple operan
150180
so that, e.g., :code:`ctrls[0]` accesses the element pushed last.
151181

152182

183+
The initialization stack and the initialization status of locals is manipulated through the following functions:
184+
185+
.. code-block:: pseudo
186+
187+
func get_local(idx : u32) =
188+
error_if(not (locals_init[idx] || ctrls[0].unreachable))
189+
190+
func set_local(idx : u32) =
191+
if (not locals_init[idx])
192+
inits.push(idx)
193+
locals_init[idx] := true
194+
195+
func reset_locals(height : nat) =
196+
while (inits.size() > height)
197+
locals_init[inits.pop()] := false
198+
199+
Getting a local verifies that it is either known to be initialized, or that that the operation is unreachable in the current stack frame.
200+
When a local is set that was not set already,
201+
then its initialization status is updated and the change is recorded in the initialization stack.
202+
Thus, the initialization status of all locals can be reset to a previous state by denoting a specific height in the initialization stack.
203+
204+
The size of the initialization stack is bounded by the number of (non-defaultable) locals in a function, so can be preallocated by an algorithm.
205+
153206
The control stack is likewise manipulated through auxiliary functions:
154207

155208
.. code-block:: pseudo
156209
157210
func push_ctrl(opcode : opcode, in : list(val_type), out : list(val_type)) =
158-
let frame = ctrl_frame(opcode, in, out, vals.size(), false)
211+
let frame = ctrl_frame(opcode, in, out, vals.size(), inits.size(), false)
159212
ctrls.push(frame)
160213
push_vals(in)
161214
162215
func pop_ctrl() : ctrl_frame =
163216
error_if(ctrls.is_empty())
164217
let frame = ctrls[0]
165218
pop_vals(frame.end_types)
166-
error_if(vals.size() =/= frame.height)
219+
error_if(vals.size() =/= frame.val_height)
220+
reset_locals(frame.init_height)
167221
ctrls.pop()
168222
return frame
169223
170224
func label_types(frame : ctrl_frame) : list(val_types) =
171-
return (if frame.opcode == loop then frame.start_types else frame.end_types)
225+
return (if (frame.opcode = loop) frame.start_types else frame.end_types)
172226
173227
func unreachable() =
174228
vals.resize(ctrls[0].height)
@@ -180,6 +234,7 @@ It allocates a new frame record recording them along with the current height of
180234
Popping a frame first checks that the control stack is not empty.
181235
It then verifies that the operand stack contains the right types of values expected at the end of the exited block and pops them off the operand stack.
182236
Afterwards, it checks that the stack has shrunk back to its initial height.
237+
Finally, it undoes all changes to the initialization status of locals that happend inside the block.
183238

184239
The type of the :ref:`label <syntax-label>` associated with a control frame is either that of the stack at the start or the end of the frame, determined by the opcode that it originates from.
185240

@@ -192,6 +247,7 @@ In that case, all existing operand types are purged from the value stack, in ord
192247
However, a polymorphic stack cannot underflow, but instead generates :code:`Bot` types as needed.
193248

194249

250+
195251
.. index:: opcode
196252

197253
Validation of Opcode Sequences
@@ -200,10 +256,6 @@ Validation of Opcode Sequences
200256
The following function shows the validation of a number of representative instructions that manipulate the stack.
201257
Other instructions are checked in a similar manner.
202258

203-
.. note::
204-
Various instructions not shown here will additionally require the presence of a validation :ref:`context <context>` for checking uses of :ref:`indices <syntax-index>`.
205-
That is an easy addition and therefore omitted from this presentation.
206-
207259
.. code-block:: pseudo
208260
209261
func validate(opcode) =
@@ -222,7 +274,7 @@ Other instructions are checked in a similar manner.
222274
let t2 = pop_val()
223275
error_if(not (is_num(t1) && is_num(t2) || is_vec(t1) && is_vec(t2)))
224276
error_if(t1 =/= t2 && t1 =/= Bot && t2 =/= Bot)
225-
push_val(if (t1 == Bot) t2 else t1)
277+
push_val(if (t1 = Bot) t2 else t1)
226278
227279
case (select t)
228280
pop_val(I32)
@@ -238,6 +290,14 @@ Other instructions are checked in a similar manner.
238290
let rt = pop_ref()
239291
push_val(Ref(rt.heap, false))
240292
293+
case (local.get x)
294+
get_local(x)
295+
push_val(locals[x])
296+
297+
case (local.set x)
298+
pop_val(locals[x])
299+
set_local(x)
300+
241301
case (unreachable)
242302
unreachable()
243303
@@ -296,7 +356,7 @@ Other instructions are checked in a similar manner.
296356
let rt = pop_ref()
297357
if (rt.heap =/= Bot)
298358
error_if(not is_def(rt.heap))
299-
let ft = lookup_func_type(rt.heap.idx) // TODO
359+
let ft = funcs[rt.heap.idx]
300360
pop_vals(ft.params)
301361
push_vals(ft.results)
302362

document/core/appendix/changes.rst

Lines changed: 11 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -138,6 +138,17 @@ Added vector type and instructions that manipulate multiple numeric values in pa
138138
* New injection/projection :ref:`vector instructions <syntax-instr-vec>`: :math:`\K{i}\!N\!\K{x}\!M\!\K{.splat}`, :math:`\K{f}\!N\!\K{x}\!M\!\K{.splat}`, :math:`\K{i}\!N\!\K{x}\!M\!\K{.bitmask}`
139139

140140

141+
Release 2.1
142+
~~~~~~~~~~~
143+
144+
.. index:: reference
145+
146+
Typeful References
147+
..................
148+
149+
.. todo:: describe
150+
151+
141152
.. [#proposal-signext]
142153
https://github.com/WebAssembly/spec/tree/main/proposals/sign-extension-ops/
143154

document/core/appendix/index-rules.rst

Lines changed: 18 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -14,8 +14,14 @@ Typing of Static Constructs
1414
Construct Judgement
1515
=============================================== ===============================================================================
1616
:ref:`Limits <valid-limits>` :math:`C \vdashlimits \limits : k`
17+
:ref:`Numeric type <valid-numtype>` :math:`C \vdashnumtype \numtype \ok`
18+
:ref:`Vector type <valid-vectype>` :math:`C \vdashvectype \vectype \ok`
19+
:ref:`Heap type <valid-heaptype>` :math:`C \vdashheaptype \heaptype \ok`
20+
:ref:`Reference type <valid-reftype>` :math:`C \vdashreftype \reftype \ok`
21+
:ref:`Value type <valid-valtype>` :math:`C \vdashvaltype \valtype \ok`
1722
:ref:`Function type <valid-functype>` :math:`C \vdashfunctype \functype \ok`
18-
:ref:`Block type <valid-blocktype>` :math:`C \vdashblocktype \blocktype \ok`
23+
:ref:`Block type <valid-blocktype>` :math:`C \vdashblocktype \blocktype : \instrtype`
24+
:ref:`Instruction type <valid-instrtype>` :math:`C \vdashinstrtype \instrtype \ok`
1925
:ref:`Table type <valid-tabletype>` :math:`C \vdashtabletype \tabletype \ok`
2026
:ref:`Memory type <valid-memtype>` :math:`C \vdashmemtype \memtype \ok`
2127
:ref:`Global type <valid-globaltype>` :math:`C \vdashglobaltype \globaltype \ok`
@@ -24,6 +30,7 @@ Construct Judgement
2430
:ref:`Instruction sequence <valid-instr-seq>` :math:`S;C \vdashinstrseq \instr^\ast : \functype`
2531
:ref:`Expression <valid-expr>` :math:`C \vdashexpr \expr : \resulttype`
2632
:ref:`Function <valid-func>` :math:`C \vdashfunc \func : \functype`
33+
:ref:`Local <valid-local>` :math:`C \vdashlocal \local : \localtype`
2734
:ref:`Table <valid-table>` :math:`C \vdashtable \table : \tabletype`
2835
:ref:`Memory <valid-mem>` :math:`C \vdashmem \mem : \memtype`
2936
:ref:`Global <valid-global>` :math:`C \vdashglobal \global : \globaltype`
@@ -66,6 +73,16 @@ Construct Judgement
6673
=============================================== ===============================================================================
6774

6875

76+
Defaultability
77+
~~~~~~~~~~~~~~
78+
79+
================================================= ===============================================================================
80+
Construct Judgement
81+
================================================= ===============================================================================
82+
:ref:`Defaultable value type <valid-defaultable>` :math:`C \vdashvaltypedefaultable \valtype \defaultable`
83+
================================================= ===============================================================================
84+
85+
6986
Constantness
7087
~~~~~~~~~~~~
7188

document/core/binary/modules.rst

Lines changed: 5 additions & 5 deletions
Original file line numberDiff line numberDiff line change
@@ -405,15 +405,15 @@ denoting *count* locals of the same value type.
405405
\X{size}{:}\Bu32~~\X{code}{:}\Bfunc
406406
&\Rightarrow& \X{code} & (\iff \X{size} = ||\Bfunc||) \\
407407
\production{function} & \Bfunc &::=&
408-
(t^\ast)^\ast{:}\Bvec(\Blocals)~~e{:}\Bexpr
409-
&\Rightarrow& \concat((t^\ast)^\ast), e
410-
& (\iff |\concat((t^\ast)^\ast)| < 2^{32}) \\
408+
(\local^\ast)^\ast{:}\Bvec(\Blocals)~~e{:}\Bexpr
409+
&\Rightarrow& \concat((\local^\ast)^\ast), e
410+
& (\iff |\concat((\local^\ast)^\ast)| < 2^{32}) \\
411411
\production{locals} & \Blocals &::=&
412-
n{:}\Bu32~~t{:}\Bvaltype &\Rightarrow& t^n \\
412+
n{:}\Bu32~~t{:}\Bvaltype &\Rightarrow& \{ \LTYPE~t \}^n \\
413413
\end{array}
414414
415415
Here, :math:`\X{code}` ranges over pairs :math:`(\valtype^\ast, \expr)`.
416-
The meta function :math:`\concat((t^\ast)^\ast)` concatenates all sequences :math:`t_i^\ast` in :math:`(t^\ast)^\ast`.
416+
The meta function :math:`\concat((\local^\ast)^\ast)` concatenates all sequences :math:`\local_i^\ast` in :math:`(\local^\ast)^\ast`.
417417
Any code for which the length of the resulting sequence is out of bounds of the maximum size of a :ref:`vector <syntax-vec>` is malformed.
418418

419419
.. note::

document/core/exec/instructions.rst

Lines changed: 10 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -1043,7 +1043,7 @@ Variable Instructions
10431043

10441044
1. Let :math:`F` be the :ref:`current <exec-notation-textual>` :ref:`frame <syntax-frame>`.
10451045

1046-
2. Assert: due to :ref:`validation <valid-local.get>`, :math:`F.\ALOCALS[x]` exists.
1046+
2. Assert: due to :ref:`validation <valid-local.get>`, :math:`F.\ALOCALS[x]` exists and is non-empty.
10471047

10481048
3. Let :math:`\val` be the value :math:`F.\ALOCALS[x]`.
10491049

@@ -3007,23 +3007,21 @@ Invocation of :ref:`function address <syntax-funcaddr>` :math:`a`
30073007

30083008
3. Let :math:`[t_1^n] \to [t_2^m]` be the :ref:`function type <syntax-functype>` :math:`f.\FITYPE`.
30093009

3010-
4. Let :math:`t^\ast` be the list of :ref:`value types <syntax-valtype>` :math:`f.\FICODE.\FLOCALS`.
3010+
4. Let :math:`\local^\ast` be the list of :ref:`locals <syntax-local>` :math:`f.\FICODE.\FLOCALS`.
30113011

30123012
5. Let :math:`\instr^\ast~\END` be the :ref:`expression <syntax-expr>` :math:`f.\FICODE.\FBODY`.
30133013

30143014
6. Assert: due to :ref:`validation <valid-call>`, :math:`n` values are on the top of the stack.
30153015

30163016
7. Pop the values :math:`\val^n` from the stack.
30173017

3018-
8. Let :math:`\val_0^\ast` be the list of zero values of types :math:`t^\ast`.
3018+
8. Let :math:`F` be the :ref:`frame <syntax-frame>` :math:`\{ \AMODULE~f.\FIMODULE, \ALOCALS~\val^n~(\default_t)^\ast \}`.
30193019

3020-
9. Let :math:`F` be the :ref:`frame <syntax-frame>` :math:`\{ \AMODULE~f.\FIMODULE, \ALOCALS~\val^n~(\default_t)^\ast \}`.
3020+
9. Push the activation of :math:`F` with arity :math:`m` to the stack.
30213021

3022-
10. Push the activation of :math:`F` with arity :math:`m` to the stack.
3022+
10. Let :math:`L` be the :ref:`label <syntax-label>` whose arity is :math:`m` and whose continuation is the end of the function.
30233023

3024-
11. Let :math:`L` be the :ref:`label <syntax-label>` whose arity is :math:`m` and whose continuation is the end of the function.
3025-
3026-
12. :ref:`Enter <exec-instr-seq-enter>` the instruction sequence :math:`\instr^\ast` with label :math:`L`.
3024+
11. :ref:`Enter <exec-instr-seq-enter>` the instruction sequence :math:`\instr^\ast` with label :math:`L`.
30273025

30283026
.. math::
30293027
~\\[-1ex]
@@ -3035,11 +3033,14 @@ Invocation of :ref:`function address <syntax-funcaddr>` :math:`a`
30353033
\begin{array}[t]{@{}r@{~}l@{}}
30363034
(\iff & S.\SFUNCS[a] = f \\
30373035
\wedge & f.\FITYPE = [t_1^n] \to [t_2^m] \\
3038-
\wedge & f.\FICODE = \{ \FTYPE~x, \FLOCALS~t^k, \FBODY~\instr^\ast~\END \} \\
3036+
\wedge & f.\FICODE = \{ \FTYPE~x, \FLOCALS~\{\LTYPE~t\}^k, \FBODY~\instr^\ast~\END \} \\
30393037
\wedge & F = \{ \AMODULE~f.\FIMODULE, ~\ALOCALS~\val^n~(\default_t)^k \})
30403038
\end{array} \\
30413039
\end{array}
30423040
3041+
.. note::
3042+
For non-defaultable types, the respective local is left uninitialized by these rules.
3043+
30433044

30443045
.. _exec-invoke-exit:
30453046

document/core/exec/runtime.rst

Lines changed: 7 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -51,14 +51,16 @@ or *external references* pointing to an uninterpreted form of :ref:`extern addre
5151

5252
.. _default-val:
5353

54-
Each :ref:`value type <syntax-valtype>` has an associated *default value*;
55-
it is the respective value :math:`0` for :ref:`number types <syntax-numtype>`, :math:`0` for :ref:`vector types <syntax-vectype>`, and null for :ref:`reference types <syntax-reftype>`.
54+
:ref:`Value types <syntax-valtype>` can have an associated *default value*;
55+
it is the respective value :math:`0` for :ref:`number types <syntax-numtype>`, :math:`0` for :ref:`vector types <syntax-vectype>`, and null for nullable :ref:`reference types <syntax-reftype>`.
56+
For other references, no default value is defined, :math:`\default_t` hence is an optional value :math:`\val^?`.
5657

5758
.. math::
5859
\begin{array}{lcl@{\qquad}l}
5960
\default_t &=& t{.}\CONST~0 & (\iff t = \numtype) \\
6061
\default_t &=& t{.}\CONST~0 & (\iff t = \vectype) \\
6162
\default_t &=& \REFNULL~t & (\iff t = (\REF~\NULL~\heaptype)) \\
63+
\default_t &=& \epsilon & (\iff t = (\REF~\heaptype)) \\
6264
\end{array}
6365
6466
@@ -493,10 +495,11 @@ and a reference to the function's own :ref:`module instance <syntax-moduleinst>`
493495
\production{(activation)} & \X{activation} &::=&
494496
\FRAME_n\{\frame\} \\
495497
\production{(frame)} & \frame &::=&
496-
\{ \ALOCALS~\val^\ast, \AMODULE~\moduleinst \} \\
498+
\{ \ALOCALS~(\val^?)^\ast, \AMODULE~\moduleinst \} \\
497499
\end{array}
498500
499-
The values of the locals are mutated by respective :ref:`variable instructions <syntax-instr-variable>`.
501+
Locals may be uninitialized, in which case they are empty.
502+
Locals are mutated by respective :ref:`variable instructions <syntax-instr-variable>`.
500503

501504

502505
.. _exec-expand:

document/core/syntax/modules.rst

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -142,7 +142,9 @@ The |MFUNCS| component of a module defines a vector of *functions* with the foll
142142
.. math::
143143
\begin{array}{llll}
144144
\production{function} & \func &::=&
145-
\{ \FTYPE~\typeidx, \FLOCALS~\vec(\valtype), \FBODY~\expr \} \\
145+
\{ \FTYPE~\typeidx, \FLOCALS~\vec(\local), \FBODY~\expr \} \\
146+
\production{local} & \local &::=&
147+
\{ \LTYPE~\valtype \} \\
146148
\end{array}
147149
148150
The |FTYPE| of a function declares its signature by reference to a :ref:`type <syntax-type>` defined in the module.

0 commit comments

Comments
 (0)