Skip to content

Inline context.{get,set} in components#13194

Merged
alexcrichton merged 3 commits intobytecodealliance:mainfrom
alexcrichton:inline-context-intrinsics
Apr 27, 2026
Merged

Inline context.{get,set} in components#13194
alexcrichton merged 3 commits intobytecodealliance:mainfrom
alexcrichton:inline-context-intrinsics

Conversation

@alexcrichton
Copy link
Copy Markdown
Member

This commit reimplements the context.{get,set} intrinsics in the component model, introduced in the component-model-async and component-model-threading proposals. The intent of these intrinsics in WASIp3, for example, are intended to replace the globals used for the stack pointer and TLS base in previous modules, for example. The implementation of loading from a global is a single load instruction, whereas the previous implementation of context.get was a full libcall, which is significantly more expensive. The goal of this PR is to ensure that the transition to using context.get and context.set for high-performance uses retains the same performance as the WASIp2 constructs.

Specifically the storage for context.{get,set} slots have been moved into the VMStoreContext structure which has a known layout to compiled code. There still remains storage within each GuestThread because there's only one store, and the idea is that whenever threads are switched between the switch operation is slightly more expensive now where it has to update and maintain the state in the store. The rationale for this is that it'll be far more often that these values are accessed rather than threads being swapped between.

The implementation chosen in this commit is to model the context.{get,set} intrinsics as UnsafeIntrinsics. This is a bit of a shoehorn where they're not actually unsafe, but all of the plumbing and support for UnsafeIntrinsic is effectively exactly what these want. To avoid duplicating lots of infrastructure that's where these now reside.

The concurrent.rs implementation has been updated to save/restore context from the store, and this additionally updates a few other switch points to ensure that the store never switches away or to a deleted thread. This niche situation happened in a few scenarios with no impact from before, but with the switching implementation having to access threads it became load-bearing that these must be valid.

The end result is that with -Cinlining the context.{get,set} instructions are two instructions instead of a full libcall. One instruction is loading VMStoreContext, which is GVN-able and hoist-able, while the other is the actual load/store. This is the same as the performance of the stack pointer being in an imported global, for example.

This commit reimplements the `context.{get,set}` intrinsics in the
component model, introduced in the component-model-async and
component-model-threading proposals. The intent of these intrinsics in
WASIp3, for example, are intended to replace the `global`s used for the
stack pointer and TLS base in previous modules, for example. The
implementation of loading from a `global` is a single load instruction,
whereas the previous implementation of `context.get` was a full libcall,
which is significantly more expensive. The goal of this PR is to ensure
that the transition to using `context.get` and `context.set` for
high-performance uses retains the same performance as the WASIp2
constructs.

Specifically the storage for `context.{get,set}` slots have been moved
into the `VMStoreContext` structure which has a known layout to compiled
code. There still remains storage within each `GuestThread` because
there's only one store, and the idea is that whenever threads are
switched between the switch operation is slightly more expensive now
where it has to update and maintain the state in the store. The
rationale for this is that it'll be far more often that these values are
accessed rather than threads being swapped between.

The implementation chosen in this commit is to model the
`context.{get,set}` intrinsics as `UnsafeIntrinsic`s. This is a bit of a
shoehorn where they're not actually unsafe, but all of the plumbing and
support for `UnsafeIntrinsic` is effectively exactly what these want. To
avoid duplicating lots of infrastructure that's where these now reside.

The `concurrent.rs` implementation has been updated to save/restore
context from the store, and this additionally updates a few other switch
points to ensure that the store never switches away or to a deleted
thread. This niche situation happened in a few scenarios with no impact
from before, but with the switching implementation having to access
threads it became load-bearing that these must be valid.

The end result is that with `-Cinlining` the `context.{get,set}`
instructions are two instructions instead of a full libcall. One
instruction is loading `VMStoreContext`, which is GVN-able and
hoist-able, while the other is the actual load/store. This is the same
as the performance of the stack pointer being in an imported global,
for example.
@alexcrichton alexcrichton requested review from a team as code owners April 24, 2026 22:33
@alexcrichton alexcrichton requested review from fitzgen and removed request for a team April 24, 2026 22:33
@github-actions github-actions Bot added the wasmtime:api Related to the API of the `wasmtime` crate itself label Apr 25, 2026
| UnsafeIntrinsic::ContextSetI32_1 => ir::types::I32,
_ => unreachable!(),
};
let context_slot_size = 4;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be ty.bytes() or something to future proof for memory64?

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eventually, yes, but not yet unfortunately. The width of the type being loaded (the above match) and the width of the slot are independent and will need to be configured separately. The width above will be generalized to i64 eventually once the type is plumbed through, and the width of the slot will need to be increased when memory64 support is added. I'll add some comments and defensive assertions too.

UnsafeIntrinsic::ContextGetI32_0 | UnsafeIntrinsic::ContextGetI32_1 => {
let context = self.builder.ins().load(
ty,
MemFlags::trusted(),
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless you set the can move flag, we won't actually hoist these out of loops, which you imply we will do in the commit message.

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh the hoistable load was the *mut VMStoreContext pointer load, not this one.

@alexcrichton alexcrichton enabled auto-merge April 27, 2026 20:24
@alexcrichton alexcrichton added this pull request to the merge queue Apr 27, 2026
Merged via the queue into bytecodealliance:main with commit 9dc41f7 Apr 27, 2026
48 checks passed
@alexcrichton alexcrichton deleted the inline-context-intrinsics branch April 27, 2026 21:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

wasmtime:api Related to the API of the `wasmtime` crate itself

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants