tests #30

Closed
wants to merge 35 commits into from

Conversation

ktock
Owner

@ktock ktock commented Jul 14, 2025

No description provided.

ktock added 30 commits July 12, 2025 14:12
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
The Wasm backend is based on the TCI backend and uses a forked TCI to
execute TBs.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
The Wasm backend should implement its own disassembler for Wasm
instructions.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Now that there is a backend for the WebAssembly build (/tcg/wasm32/), the
requirement of --enable-tcg-interpreter in meson.build can be removed.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
WebAssembly instructions vary in size, including single-byte
instructions. This commit sets TCG_TARGET_INSN_UNIT_SIZE to 1 and updates
the TCI fork to use "tcg_insn_unit_tci" (a uint32_t) for 4-byte operations.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit implements and, or and xor operations using Wasm
instructions. Each TCG variable is mapped to a 64-bit Wasm variable. In Wasm,
and/or/xor instructions take their operands from the Wasm stack, so the
operands are first pushed using get instructions. The result is left on the
stack and can be assigned to a variable by popping it with a set instruction.

The Wasm binary format is documented at [1]. In this backend, TCI
instructions are emitted to s->code_ptr, while the corresponding Wasm
instructions are generated into a separate buffer allocated via
tcg_malloc(). These two code buffers must be merged into the final code
buffer before tcg_gen_code returns.

[1] https://webassembly.github.io/spec/core/binary/index.html
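
The get/op/set sequence described above can be sketched as a small byte
emitter. This is only an illustration: WasmBuf, wasm_put and wasm_emit_binop
are hypothetical names, and it assumes the variables are Wasm locals
(local.get 0x20, local.set 0x21); if they are globals, the opcodes would be
global.get (0x23) and global.set (0x24) instead. The i64.and/or/xor opcodes
are taken from the binary format spec.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Opcodes from the Wasm binary format. */
enum {
    WASM_LOCAL_GET = 0x20,
    WASM_LOCAL_SET = 0x21,
    WASM_I64_AND   = 0x83,
    WASM_I64_OR    = 0x84,
    WASM_I64_XOR   = 0x85,
};

/* Hypothetical fixed-size output buffer, not the backend's real type. */
typedef struct {
    uint8_t data[64];
    size_t len;
} WasmBuf;

static void wasm_put(WasmBuf *buf, uint8_t byte)
{
    assert(buf->len < sizeof(buf->data));
    buf->data[buf->len++] = byte;
}

/*
 * Emit "dst = lhs <op> rhs". Variable indices below 128 fit in a single
 * LEB128 byte, which keeps this sketch free of a LEB128 encoder.
 */
static void wasm_emit_binop(WasmBuf *buf, uint8_t op,
                            uint8_t dst, uint8_t lhs, uint8_t rhs)
{
    assert(dst < 128 && lhs < 128 && rhs < 128);
    wasm_put(buf, WASM_LOCAL_GET);  /* push lhs */
    wasm_put(buf, lhs);
    wasm_put(buf, WASM_LOCAL_GET);  /* push rhs */
    wasm_put(buf, rhs);
    wasm_put(buf, op);              /* pop both, push the result */
    wasm_put(buf, WASM_LOCAL_SET);  /* pop the result into dst */
    wasm_put(buf, dst);
}
```

For example, wasm_emit_binop(&buf, WASM_I64_AND, 2, 0, 1) produces the seven
bytes 20 00 20 01 83 21 02.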

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Add, sub and mul operations are implemented using the corresponding
instructions in Wasm.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit implements shl, shr and sar operations using Wasm
instructions. The Wasm backend uses 64-bit variables, so the right shift
operation for 32-bit values needs to extract the lower 32 bits of the operand
before shifting. Additionally, since constant values must be encoded in
LEB128 format, this commit introduces an encoder function implemented
following [1].

[1] https://en.wikipedia.org/wiki/LEB128
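
A minimal sketch of LEB128 encoding, following the algorithm described in
[1]. The function names are hypothetical and are not the ones used in the
commit. The signed variant assumes that >> on a negative int64_t is an
arithmetic shift, which holds for GCC and Clang.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Unsigned LEB128: 7 payload bits per byte, least significant group first.
 * Bit 7 is set on every byte except the last. Writes at most 10 bytes.
 */
static size_t leb128_encode_u64(uint8_t *out, uint64_t v)
{
    size_t n = 0;
    assert(out != NULL);
    do {
        uint8_t byte = v & 0x7f;
        v >>= 7;
        if (v != 0) {
            byte |= 0x80;
        }
        out[n++] = byte;
    } while (v != 0);
    return n;
}

/*
 * Signed LEB128: stop once the remaining value is pure sign extension of
 * the last emitted group (bit 6 of that group carries the sign).
 */
static size_t leb128_encode_s64(uint8_t *out, int64_t v)
{
    size_t n = 0;
    assert(out != NULL);
    for (;;) {
        uint8_t byte = v & 0x7f;
        v >>= 7; /* assumed arithmetic shift */
        if ((v == 0 && !(byte & 0x40)) || (v == -1 && (byte & 0x40))) {
            out[n++] = byte;
            return n;
        }
        out[n++] = byte | 0x80;
    }
}
```

With these, 624485 encodes to E5 8E 26 and -123456 encodes to C0 BB 78,
matching the worked examples in [1].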

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit implements setcond and movcond operations using Wasm's if/else
instructions. Support for TCG_COND_TSTEQ and TCG_COND_TSTNE is not yet
implemented, so TCG_TARGET_HAS_tst is set to 0.
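
As an illustration of the if/else approach, the sketch below emits a movcond
of the form "ret = (c1 cond c2) ? v1 : v2". The function name and the use of
local.get/local.set are assumptions made for the example; the comparison and
block opcodes come from the Wasm binary format. The "if" carries a block type
of i64 so that each arm leaves one value on the stack.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    OP_IF = 0x04, OP_ELSE = 0x05, OP_END = 0x0b,
    OP_LOCAL_GET = 0x20, OP_LOCAL_SET = 0x21,
    OP_I64_EQ = 0x51, OP_I64_NE = 0x52, OP_I64_LT_S = 0x53,
    BLOCKTYPE_I64 = 0x7e,
};

/*
 * Hypothetical emitter: writes 15 bytes to out and returns that count.
 * cmp_op is one of the i64 comparison opcodes, which pop two i64 values
 * and push an i32 truth value that the "if" then consumes.
 */
static size_t emit_movcond(uint8_t *out, uint8_t cmp_op, uint8_t ret,
                           uint8_t c1, uint8_t c2, uint8_t v1, uint8_t v2)
{
    size_t n = 0;
    assert(ret < 128 && c1 < 128 && c2 < 128 && v1 < 128 && v2 < 128);
    out[n++] = OP_LOCAL_GET; out[n++] = c1;
    out[n++] = OP_LOCAL_GET; out[n++] = c2;
    out[n++] = cmp_op;
    out[n++] = OP_IF; out[n++] = BLOCKTYPE_I64;
    out[n++] = OP_LOCAL_GET; out[n++] = v1;   /* value when true */
    out[n++] = OP_ELSE;
    out[n++] = OP_LOCAL_GET; out[n++] = v2;   /* value when false */
    out[n++] = OP_END;
    out[n++] = OP_LOCAL_SET; out[n++] = ret;
    return n;
}
```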

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This implements deposit, sextract and extract operations. The
tcg_out_[s]extract functions are used by several other functions
(e.g. tcg_out_ext*) and are intended to emit TCI code. So they have been
renamed to tcg_tci_out_[s]extract.
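
For reference, the semantics that the emitted code has to reproduce can be
modelled in plain C. This sketch is not the Wasm emitter itself; the model_*
names are hypothetical, and sextract relies on >> being an arithmetic shift
for negative values (true for GCC and Clang).

```c
#include <assert.h>
#include <stdint.h>

/* Return the unsigned field value[start .. start+length-1]. */
static uint64_t model_extract64(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    return (value >> start) & (UINT64_MAX >> (64 - length));
}

/* Same field, sign-extended from its top bit. */
static int64_t model_sextract64(uint64_t value, int start, int length)
{
    assert(start >= 0 && length > 0 && length <= 64 - start);
    /* Move the field to the top, then shift back down arithmetically. */
    return (int64_t)(value << (64 - length - start)) >> (64 - length);
}

/* Replace the field in value with the low bits of field. */
static uint64_t model_deposit64(uint64_t value, int start, int length,
                                uint64_t field)
{
    uint64_t mask;
    assert(start >= 0 && length > 0 && length <= 64 - start);
    mask = (UINT64_MAX >> (64 - length)) << start;
    return (value & ~mask) | ((field << start) & mask);
}
```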

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit implements load and store operations using Wasm memory
instructions. Since Wasm's load/store instructions do not support negative
offsets, address calculations are performed separately before the memory
access.
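
The offset immediate of a Wasm memory instruction is an unsigned integer,
which is why a negative TCG offset has to be folded into the address first.
The sketch below shows one way to do that for a 64-bit load; emit_ld64 and
put_sleb are hypothetical names, the exact instruction sequence used by the
commit may differ, and local.get/local.set are assumed for variable access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum {
    OP_LOCAL_GET = 0x20, OP_LOCAL_SET = 0x21, OP_I64_LOAD = 0x29,
    OP_I64_CONST = 0x42, OP_I64_ADD = 0x7c, OP_I32_WRAP_I64 = 0xa7,
};

/* Signed LEB128; assumes >> on negative values is arithmetic. */
static size_t put_sleb(uint8_t *out, int64_t v)
{
    size_t n = 0;
    for (;;) {
        uint8_t byte = v & 0x7f;
        v >>= 7;
        if ((v == 0 && !(byte & 0x40)) || (v == -1 && (byte & 0x40))) {
            out[n++] = byte;
            return n;
        }
        out[n++] = byte | 0x80;
    }
}

/*
 * Emit "dst = *(uint64_t *)(base + offset)". The (possibly negative)
 * offset is added explicitly, so the static offset immediate stays 0.
 * The caller must provide at least 20 bytes.
 */
static size_t emit_ld64(uint8_t *out, uint8_t dst, uint8_t base,
                        int64_t offset)
{
    size_t n = 0;
    assert(dst < 128 && base < 128);
    out[n++] = OP_LOCAL_GET; out[n++] = base;
    out[n++] = OP_I64_CONST; n += put_sleb(out + n, offset);
    out[n++] = OP_I64_ADD;
    out[n++] = OP_I32_WRAP_I64;   /* wasm32 addresses are i32 */
    out[n++] = OP_I64_LOAD;
    out[n++] = 3;                 /* alignment hint: 2^3 bytes */
    out[n++] = 0;                 /* static offset immediate */
    out[n++] = OP_LOCAL_SET; out[n++] = dst;
    return n;
}
```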

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit implements mov/movi instructions. The tcg_out_mov[i] functions
are used by several other functions and are intended to emit TCI code. So
they have been renamed to tcg_tci_out_mov[i].

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit implements the ext operations using Wasm's extend instructions.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit implements the bswap operation using Wasm instructions.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit implements rem and div operations using Wasm's rem/div
instructions.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit implements andc, orc, eqv, nand and nor operations using Wasm
instructions.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit implements neg, not and ctpop operations using Wasm
instructions.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit implements rot, clz and ctz operations using Wasm instructions.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit implements addc and subb operations using Wasm instructions. A
carry flag is introduced as the 16th variable in the module following the other
15 variables that represent TCG variables.
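
The role of the carry variable can be illustrated with a plain C model of
add-with-carry and subtract-with-borrow on 64-bit values. This is a sketch of
the arithmetic only, with hypothetical names; it does not reproduce the Wasm
instruction sequences the backend emits.

```c
#include <assert.h>
#include <stdint.h>

/* sum = a + b + *carry; *carry receives the carry-out (0 or 1). */
static uint64_t model_addc(uint64_t a, uint64_t b, uint64_t *carry)
{
    uint64_t partial, sum;
    assert(*carry <= 1);
    partial = a + b;
    sum = partial + *carry;
    /* A carry occurs if either addition wrapped around. */
    *carry = (partial < a) | (sum < partial);
    return sum;
}

/* diff = a - b - *borrow; *borrow receives the borrow-out (0 or 1). */
static uint64_t model_subb(uint64_t a, uint64_t b, uint64_t *borrow)
{
    uint64_t partial, diff;
    assert(*borrow <= 1);
    partial = a - b;
    diff = partial - *borrow;
    *borrow = (a < b) | (partial < *borrow);
    return diff;
}
```

Chaining two calls gives a 128-bit operation: the low halves are processed
with the flag cleared, and the high halves consume the flag left behind.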

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Wasm does not support direct jumps to arbitrary code addresses, so
label-based control flow is implemented using Wasm's control flow
instructions. As illustrated in the pseudo-code below, each TB wraps its
instructions inside a large loop. Each run of code delimited by labels is
placed inside an "if" block. Br is implemented by breaking out of the
current block and conditionally entering the target block:

loop
  if
    ... code after label1
  end
  if
    ... code after label2
  end
  ...
end

Each block within the TB is assigned a unique int32 ID. The topmost "if"
block is assigned ID 0, and subsequent blocks are assigned incrementally.

To control br, this commit introduces a 17th Wasm variable BLOCK_PTR_IDX
which holds the ID of the target block. The br instruction sets this
variable to the target block's ID, breaks from the current if block, and
allows the control flow to move forward. Each if block checks whether the
BLOCK_PTR_IDX variable matches its assigned ID. If it does, execution
proceeds within that block.

The start of the global loop and the first if block are generated in
tcg_out_tb_start. To properly close the blocks, this commit also introduces
a new TCG backend callback tcg_out_tb_end which emits the "end" instructions
for the final if block and the loop block in the Wasm backend.

Another new callback tcg_out_label_cb is used to emit block boundaries,
specifically the end of the previous block and the if of the next block, at
label positions. In this callback, the mapping between label IDs and block
IDs is recorded in LabelInfo, which is later used to resolve br
instructions.

Since the block ID for a label might not be known at the time a br
instruction is generated, a placeholder (longer than 32 bits and encoded as
LEB128) is emitted instead. These placeholders are tracked in
BlockPlaceholder and resolved later.
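
The dispatch scheme can be modelled in plain C to make the control flow
concrete. The function below is a hypothetical TB with one label that sums
1..n using a backward branch; block_ptr stands in for BLOCK_PTR_IDX. How
fall-through from one block into the next is handled is an assumption of this
sketch (here the block sets block_ptr to the next ID before leaving).

```c
#include <assert.h>
#include <stdint.h>

static int64_t run_tb_model(int64_t n)
{
    int64_t block_ptr = 0;  /* models BLOCK_PTR_IDX */
    int64_t i = 0, acc = 0;

    assert(n >= 0);
    for (;;) {                      /* the TB-wide "loop" */
        if (block_ptr == 0) {       /* block 0: code before label1 */
            i = 1;
            acc = 0;
            block_ptr = 1;          /* fall through into block 1 */
        }
        if (block_ptr == 1) {       /* block 1: code after label1 */
            if (i <= n) {
                acc += i;
                i++;
                block_ptr = 1;      /* br label1: set the target ID ... */
                continue;           /* ... and go around the loop again */
            }
            block_ptr = 2;          /* forward: just move on */
        }
        if (block_ptr == 2) {       /* block 2: end of the TB */
            return acc;
        }
    }
}
```

A backward branch costs one trip around the outer loop, during which every
earlier "if" is skipped because its ID does not match.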

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
In the Wasm backend, each TB is compiled to a separate Wasm
module. Control transfer between TBs (i.e. from one Wasm module to
another) is handled by the caller of the module.

The goto_tb and goto_ptr operations are implemented by returning
control to the caller using the return instruction. The destination
TB's pointer is passed to the caller via a shared wasmContext
structure which is accessible from both the Wasm module and the caller. This
wasmContext must be provided to the module as an argument which is
accessible as the local variable at index 0.

If the destination TB is the current TB itself, there is no need to
return control to the caller. Instead, execution can jump directly to
the top of the loop within the TB.

The exit_tb operation sets the pointer in wasmContext to 0, indicating that
there is no destination TB.
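
The caller side of this protocol can be sketched as a loop that keeps
invoking TBs until one of them reports no destination. All names here
(ModelCtx, next_tb, run_chain) are hypothetical and only model the idea; the
real wasmContext layout is not shown in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct ModelCtx ModelCtx;
typedef void (*model_tb_func)(ModelCtx *ctx);

struct ModelCtx {
    model_tb_func next_tb;  /* destination TB, or NULL after exit_tb */
    int64_t acc;            /* stands in for guest state */
    int steps;              /* number of TBs executed */
};

/* Three toy TBs: a -> b -> c, where c performs exit_tb. */
static void tb_c(ModelCtx *ctx) { ctx->acc += 100; ctx->next_tb = NULL; }
static void tb_b(ModelCtx *ctx) { ctx->acc += 10;  ctx->next_tb = tb_c; }
static void tb_a(ModelCtx *ctx) { ctx->acc += 1;   ctx->next_tb = tb_b; }

/* The caller transfers control between TBs on their behalf. */
static ModelCtx run_chain(model_tb_func first)
{
    ModelCtx ctx = { .next_tb = first, .acc = 0, .steps = 0 };
    assert(first != NULL);
    while (ctx.next_tb != NULL) {
        model_tb_func tb = ctx.next_tb;
        tb(&ctx);           /* the TB returns after storing its destination */
        ctx.steps++;
    }
    return ctx;
}
```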

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
To call QEMU functions from a TB (i.e. a Wasm module), those functions must
be imported into the module.

Wasm's call instruction can invoke an imported function using a locally
assigned function index. When a call TCG operation is generated, the Wasm
backend assigns a unique ID (starting from 0) to the target function. The
mapping between the function pointer and its assigned ID is recorded in the
HelperInfo structure.

Since Wasm's call instruction requires arguments to be pushed onto the Wasm
stack, the backend retrieves the function arguments from TCG's stack array
and pushes them to the stack before the call. After the function returns,
the result is retrieved from the stack and set in the corresponding TCG
variable.

In our Emscripten build configuration with !has_int128_type, a 128-bit value
is represented by the Int128 struct. These values are passed indirectly via
pointer parameters and returned via a prepended pointer argument, as
described in [1].

[1] https://github.com/WebAssembly/tool-conventions/blob/060cf4073e46931160c2e9ecd43177ee1fe93866/BasicCABI.md#function-arguments-and-return-values
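
A minimal sketch of the ID assignment, assuming that a function already seen
in the TB reuses its earlier ID. ModelHelperInfo and model_helper_id are
hypothetical names; the actual HelperInfo structure may be organised
differently.

```c
#include <assert.h>

typedef void (*model_helper_fn)(void);

enum { MODEL_MAX_HELPERS = 16 };

typedef struct {
    model_helper_fn fn[MODEL_MAX_HELPERS];  /* index == assigned ID */
    int count;
} ModelHelperInfo;

/* Return the ID for fn, assigning the next free one on first use. */
static int model_helper_id(ModelHelperInfo *info, model_helper_fn fn)
{
    assert(info && fn);
    for (int i = 0; i < info->count; i++) {
        if (info->fn[i] == fn) {
            return i;
        }
    }
    if (info->count == MODEL_MAX_HELPERS) {
        return -1;  /* table full */
    }
    info->fn[info->count] = fn;
    return info->count++;
}

/* Two stand-in helpers used to exercise the table. */
static void model_helper_a(void) {}
static void model_helper_b(void) {}
```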

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit adds qemu_ld and qemu_st by calling the helper functions
corresponding to MemOp.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit adds initialization of TCG_AREG0 and TCG_REG_CALL_STACK at the
beginning of each TB. The CPUArchState struct and the stack array are passed
from the caller via the wasmContext structure. Since TB execution begins at
the first block, the BLOCK_PTR_IDX variable is initialized to 0.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit updates tcg_out_tb_start and tcg_out_tb_end to emit Wasm
binaries into the TB code buffer. The generated Wasm binary defines a
function of type wasm_tb_func which takes a wasmContext, executes the TB,
and returns a result. In the Wasm backend, each TB starts with a
wasmTBHeader, followed by the following data:

- TCI code
- Wasm code
- Array of function indices imported into the Wasm instance

The wasmTBHeader contains pointers to each of these elements.

tcg_out_tb_start writes the wasmTBHeader to the code buffer. tcg_out_tb_end
generates the full Wasm executable binary by creating the Wasm module header
following the spec[1][2] and copying the Wasm code body from sub_buf to the
code buffer. The Wasm binary is placed after the TCI code which was emitted
earlier.

Additionally, an array of imported function pointers is appended to the TB.
They are used during Wasm module instantiation. Functions are imported to
Wasm with names like "helper.0", "helper.1", etc., where the number
corresponds to the assigned function IDs.

Each function's type signature must also be encoded in the Wasm module header.
To support this, each call, qemu_ld and qemu_st operation records the target
function's type information to a buffer.

Memory is shared between QEMU and the TBs and is imported to the Wasm module
with the name "env.buffer".

[1] https://webassembly.github.io/spec/core/binary/modules.html
[2] https://github.com/WebAssembly/threads/blob/b2567bff61ee6fbe731934f0ed17a5d48dc9ab01/proposals/threads/Overview.md
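
To give a sense of what the module header looks like at the byte level, the
sketch below writes the 8-byte preamble defined in [1] followed by one
section (ID byte, LEB128 size, payload). The helper names are hypothetical,
and the example function type (i32) -> (i32) is an assumption about
wasm_tb_func on wasm32, not something stated in this message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Unsigned LEB128 for section sizes. */
static size_t put_uleb(uint8_t *out, uint32_t v)
{
    size_t n = 0;
    do {
        uint8_t byte = v & 0x7f;
        v >>= 7;
        out[n++] = v ? (byte | 0x80) : byte;
    } while (v);
    return n;
}

/* Magic "\0asm" followed by version 1 (little-endian u32). */
static size_t emit_preamble(uint8_t *out)
{
    static const uint8_t preamble[8] = {
        0x00, 'a', 's', 'm', 0x01, 0x00, 0x00, 0x00
    };
    assert(out != NULL);
    memcpy(out, preamble, sizeof(preamble));
    return sizeof(preamble);
}

/* A section is: one ID byte, the payload size as LEB128, the payload. */
static size_t emit_section(uint8_t *out, uint8_t id,
                           const uint8_t *payload, uint32_t size)
{
    size_t n = 0;
    assert(out != NULL && payload != NULL);
    out[n++] = id;
    n += put_uleb(out + n, size);
    memcpy(out + n, payload, size);
    return n + size;
}

/* Type section payload: a vector of one function type, (i32) -> (i32). */
static const uint8_t model_tb_func_type[] = {
    0x01,        /* one entry */
    0x60,        /* func type */
    0x01, 0x7f,  /* one parameter: i32 */
    0x01, 0x7f,  /* one result: i32 */
};
```

Calling emit_preamble and then emit_section with section ID 1 (the type
section) yields 16 bytes in total.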

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
instantiate_wasm is a function that instantiates a TB's Wasm binary,
importing the functions as specified by its arguments. Following the header
definition in wasm32/tcg-target.c.inc, QEMU's memory is imported into the
module as "env.buffer", and helper functions are imported as
"helper.<id>". The instantiated Wasm module is imported to QEMU using
Emscripten's "addFunction" feature[1] which returns a function pointer. This
allows QEMU to call this module directly from C code via that pointer.

Note: since Firefox 138, WebAssembly.Module no longer accepts a
SharedArrayBuffer as input [2], as reported by Nicolas Vandeginste in my
downstream fork [3]. This commit ensures that WebAssembly.Module() is passed
a Uint8Array created from the binary data on a SharedArrayBuffer.

[1] https://emscripten.org/docs/porting/connecting_cpp_and_javascript/Interacting-with-code.html#calling-javascript-functions-as-function-pointers-from-c
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1965217
[3] #25

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Emscripten's Fiber coroutine implements coroutine switching using the stack
unwinding and rewinding capabilities of Asyncify [1]. When a coroutine
yields (i.e. switches out), Asyncify unwinds the stack, returning control to
Emscripten's JS code (Fiber.trampoline()), which then performs stack
rewinding to resume execution in the target coroutine. Stack unwinding is
implemented by a sequence of immediate function returns, while rewinding
works by re-entering the functions in the call stack, skipping any code
between the top of the function and the original call position [2].

This commit modifies the Wasm TB modules to support Fiber
coroutines. Assuming the TCG CPU loop is executed by only one coroutine per
thread, a TB module must allow helper functions to unwind and be resumed via
rewinding.

Specifically:

- When a helper returns due to an unwind, the module must immediately return
  to its caller, allowing unwinding to propagate.
- When being called again for a rewind, the module must skip any code
  between the top of the function and the call position that triggered the
  unwind, and directly enter the helper.

To support this:

- TBs now check the Asyncify.state JS object after each helper call. If
  unwinding is in progress, the TB immediately returns control to the
  caller.
- Each function call is preceded by a block boundary and an update of the
  BLOCK_PTR_IDX variable. This enables the TB to re-enter execution at the
  correct point during a rewind, skipping earlier blocks.

Additionally, this commit introduces wasmContext.do_init which is a flag
indicating whether the TB should reset the BLOCK_PTR_IDX variable to 0
(i.e. start from the beginning). In call_wasm_tb, this is always set
(ctx.do_init = 1) to ensure normal TB execution begins at the first
block. Once the TB resets the BLOCK_PTR_IDX variable, it also clears
do_init. During a rewind, the C code does not set ctx.do_init to 1, allowing
the TB to preserve the BLOCK_PTR_IDX value from the previous unwind and
correctly resume execution from the last unwound block.

[1] https://emscripten.org/docs/api_reference/fiber.h.html
[2] https://kripken.github.io/blog/wasm/2019/07/16/asyncify.html#new-asyncify
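
The interaction between do_init, BLOCK_PTR_IDX and an unwinding helper can be
modelled in plain C. This is a simplified sketch with hypothetical names: the
unwinding field stands in for the Asyncify state check, and block_ptr is kept
in the context only so that the model can preserve it between calls.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    int do_init;        /* set by the caller for a normal entry only */
    int unwinding;      /* stands in for the Asyncify state check */
    int64_t block_ptr;  /* models BLOCK_PTR_IDX; survives across calls */
    int yields_left;    /* how many times the helper will still yield */
    int prologue_runs, helper_runs, epilogue_runs;
} ModelCtx;

/* A helper that yields (starts an unwind) while yields_left > 0. */
static void model_helper(ModelCtx *ctx)
{
    if (ctx->yields_left > 0) {
        ctx->yields_left--;
        ctx->unwinding = 1;
        return;
    }
    ctx->helper_runs++;
}

static void model_tb(ModelCtx *ctx)
{
    if (ctx->do_init) {             /* normal entry: start at block 0 */
        ctx->block_ptr = 0;
        ctx->do_init = 0;
    }
    assert(ctx->block_ptr >= 0 && ctx->block_ptr <= 2);
    if (ctx->block_ptr == 0) {
        ctx->prologue_runs++;
        ctx->block_ptr = 1;         /* block boundary before the call */
    }
    if (ctx->block_ptr == 1) {
        model_helper(ctx);
        if (ctx->unwinding) {
            return;                 /* let the unwind propagate */
        }
        ctx->block_ptr = 2;
    }
    if (ctx->block_ptr == 2) {
        ctx->epilogue_runs++;
    }
}
```

On the second (rewinding) call the caller leaves do_init at 0, so execution
re-enters at block 1 and the prologue is not repeated.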

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
ktock added 5 commits July 14, 2025 13:45
This commit enables instantiation of TBs in wasm32.c. Browsers raise
out-of-memory errors if too many Wasm instances are created, so the number of
instances needs to be limited. This commit therefore restricts instantiation
to TBs that are called many times.

This commit adds a counter (or an array of counters if there are multiple
threads) to the TB. Each time a TB is executed on TCI, its counter is
incremented. When the counter reaches a threshold, that TB is instantiated as
Wasm via instantiate_wasm.

The total number of instances is tracked by the instances_global variable
and is capped at MAX_INSTANCES. When a Wasm module is
instantiated, instances_global is incremented and the instance's function
pointer is recorded to an array of wasmInstanceInfo.

Each TB refers to the wasmInstanceInfo via wasmTBHeader's info_ptr (or its
array if there are multiple threads). This allows tcg_qemu_tb_exec to
resolve the instance function pointer from TB.

When a new instantiation risks exceeding the limit, the Wasm backend does not
perform the instantiation (i.e. the TB continues to be executed on TCI).
Instead, it triggers removal of older Wasm instances using Emscripten's
removeFunction function. Once the removal of an instance is detected via the
FinalizationRegistry API[1], instances_global is decremented, which allows
new modules to be instantiated again.

[1] https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/FinalizationRegistry
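
The counting policy can be sketched as follows. The threshold and limit
values, the type names and the removal_requested flag are all hypothetical;
the sketch only models the decision of whether a TB runs on TCI or as Wasm,
not the instantiation itself.

```c
#include <assert.h>

enum { MODEL_THRESHOLD = 3, MODEL_MAX_INSTANCES = 1 };

typedef enum { RUN_TCI, RUN_WASM } RunMode;

typedef struct {
    unsigned counter;   /* executions on TCI so far */
    int instantiated;   /* non-zero once a Wasm instance exists */
} ModelTB;

typedef struct {
    int instances;          /* models instances_global */
    int removal_requested;  /* set when the limit blocks an instantiation */
} ModelPool;

/* Decide how this execution of tb runs, updating the counters. */
static RunMode model_exec(ModelPool *pool, ModelTB *tb)
{
    assert(pool->instances >= 0 && pool->instances <= MODEL_MAX_INSTANCES);
    if (tb->instantiated) {
        return RUN_WASM;
    }
    tb->counter++;
    if (tb->counter >= MODEL_THRESHOLD) {
        if (pool->instances < MODEL_MAX_INSTANCES) {
            pool->instances++;          /* instantiate for later calls */
            tb->instantiated = 1;
        } else {
            pool->removal_requested = 1; /* keep running on TCI for now */
        }
    }
    return RUN_TCI;
}
```

When a finalizer later reports that an instance is gone, the model's
equivalent is decrementing pool->instances, after which a hot TB can be
instantiated on its next execution.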

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
This commit enables qemu_ld and qemu_st to perform TLB lookups, following
the approach used in other backends such as RISC-V. Unlike other backends,
the Wasm backend cannot use ldst labels, as jumping to specific code
addresses (e.g. raddr) is not possible in Wasm. Instead, each TLB lookup is
followed by an if branch: if the lookup succeeds, the memory is accessed
directly; otherwise, a fallback helper function is invoked. Support for
MO_BSWAP is not yet implemented, so has_memory_bswap is set to false.
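
The fast-path/slow-path split can be illustrated with a deliberately
simplified C model. The TLB layout, page size, identity mapping and all names
below are assumptions made for the sketch and do not match QEMU's real softmmu
TLB; the point is only the shape of the branch.

```c
#include <assert.h>
#include <stdint.h>

enum { MODEL_PAGE_BITS = 12, MODEL_TLB_SIZE = 16, MODEL_RAM_PAGES = 4 };

typedef struct {
    int valid;
    uint64_t page;   /* guest page number this entry covers */
    uint8_t *host;   /* host address of that page */
} ModelTlbEntry;

typedef struct {
    ModelTlbEntry tlb[MODEL_TLB_SIZE];
    uint8_t ram[MODEL_RAM_PAGES << MODEL_PAGE_BITS];
    int slow_calls;  /* how often the fallback helper ran */
} ModelCpu;

/* Fallback helper: performs the access and refills the TLB entry. */
static uint8_t model_helper_ld8(ModelCpu *cpu, uint64_t vaddr)
{
    uint64_t page = vaddr >> MODEL_PAGE_BITS;
    ModelTlbEntry *e = &cpu->tlb[page % MODEL_TLB_SIZE];
    assert(vaddr < sizeof(cpu->ram));
    cpu->slow_calls++;
    e->valid = 1;
    e->page = page;
    e->host = cpu->ram + (page << MODEL_PAGE_BITS);  /* identity mapping */
    return cpu->ram[vaddr];
}

/* What the generated code does: look up, then branch on hit or miss. */
static uint8_t model_qemu_ld8(ModelCpu *cpu, uint64_t vaddr)
{
    uint64_t page = vaddr >> MODEL_PAGE_BITS;
    ModelTlbEntry *e = &cpu->tlb[page % MODEL_TLB_SIZE];
    if (e->valid && e->page == page) {
        /* hit: access memory directly */
        return e->host[vaddr & ((1u << MODEL_PAGE_BITS) - 1)];
    }
    /* miss: call the helper */
    return model_helper_ld8(cpu, vaddr);
}
```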

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Emscripten uses the optimization flag at link time to enable optimizations
via Binaryen [1]. While meson.build currently recognizes the -Doptimization
option, it does not propagate it to the linking. This commit updates
meson.build to propagate the optimization flag to the linking when targeting
WebAssembly.

[1] https://emscripten.org/docs/optimizing/Optimizing-Code.html#how-emscripten-optimizes

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Check in CI that the Wasm backend can be built.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
@ktock ktock force-pushed the dev-wasm64-tcg-4 branch from 5c1d42a to e5c68e7 Compare August 2, 2025 02:13
@ktock ktock force-pushed the testwasm64-a-tcg branch from 917b3c3 to a04e7d6 Compare August 7, 2025 07:56
@ktock ktock closed this Aug 7, 2025