-
Notifications
You must be signed in to change notification settings - Fork 5.3k
[Wasm RyuJIT] Add Instruction Encoding for Local Var Declarations and Emit Local Declarations #122425
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
[Wasm RyuJIT] Add Instruction Encoding for Local Var Declarations and Emit Local Declarations #122425
Conversation
…al JIT, since the RyuJIT backend is now emitting Wasm
…-arg-initialization
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR introduces support for WebAssembly local variable declarations in the JIT compiler by adding a new instruction descriptor subtype and emission logic for encoding local declarations in the function prolog.
Key Changes
- Adds
instWasmValueTypeenum and mapping functions to convert between JIT types and WASM types - Introduces
instrDescLclVarDeclinstruction descriptor for encoding local declarations with count and type - Implements local declaration emission in the function prolog via
genWasmLocals()
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 12 comments.
Show a summary per file
| File | Description |
|---|---|
| src/coreclr/jit/wasmtypesdef.h | New header defining WASM value types and type conversion functions |
| src/coreclr/jit/registeropswasm.h | Removes trailing blank line |
| src/coreclr/jit/instrswasm.h | Adds local_cnt and local_decl instructions |
| src/coreclr/jit/emitwasm.h | Declares new emission methods for local declarations |
| src/coreclr/jit/emitwasm.cpp | Implements local declaration emission and descriptor handling |
| src/coreclr/jit/emitfmtswasm.h | Adds IF_LOCAL_CNT and IF_LOCAL_DECL instruction formats |
| src/coreclr/jit/emit.h | Adds instrDescLclVarDecl structure and bitfield flag |
| src/coreclr/jit/codegenwasm.cpp | Adds INS_end instruction to epilog |
| src/coreclr/jit/codegencommon.cpp | Implements prolog local count and declaration emission |
| src/coreclr/jit/codegen.h | Declares genWasmLocals() method |
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
src/coreclr/jit/codegenwasm.cpp
Outdated
| // TODO-WASM-CQ: do not emit "return" in case this is the last block | ||
|
|
||
| instGen(INS_return); | ||
| instGen(INS_end); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this end required at the end of methods? We can have multiple epilogs (though we might want to reconsider)... and there's no guarantee even with one epilog that the epilog will be at the end of the instruction stream.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it is required. If the epilog isn't guaranteed to be the end, where would be a safe place to ensure that the end comes at the actual end of the instruction stream for a method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably at the end of genCodeForBBlist or similar.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
genFnEpilog is a BasicBlock* parameter. I am pretty sure you can just check block->IsLast() and switch whether you return return or end based on that. That would address the TODO above as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might not be the actual last block when we have funclets.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. This case can be detected by checking if the next block is a funclet entry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding this at the end of genCodeForBBList proved a little bit tricky because of instruction groups. It does look like we track the last used instruction group and could grab that and append after we've finished all method code? Will the last used group be guaranteed to be at the end of the method though? Also, what is the best way to detect if the next block is a funclet? For now I've added an IsLast check to genFnEpilog with a comment that we need to revisit if we're doing funclets.
|
Tagging subscribers to 'arch-wasm': @lewing, @pavelsavara |
Co-authored-by: SingleAccretion <62474226+SingleAccretion@users.noreply.github.com>
…cific debug info field on m_debugInfoSize
AndyAyersMS
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly looks good, just one issue with the epilog.
Co-authored-by: Andy Ayers <andya@microsoft.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking pretty good! Just a few comments.
| } | ||
|
|
||
| //------------------------------------------------------------------------ | ||
| // genWasmLocals: generate wasm locals for all locals |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| // genWasmLocals: generate wasm locals for all locals | |
| // genWasmLocals: generate wasm local declarations |
| CORINFO_SIG_INFO* idCallSig; // Used to report native call site signatures to the EE | ||
| BasicBlock* idTargetBlock; // Target block for branches | ||
| #ifdef TARGET_WASM | ||
| int lclOffset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| int lclOffset; | |
| int lclOffset; // Base index of the WASM locals being declared |
| case IF_LOCAL_DECL: | ||
| { | ||
| assert(idIsLclVarDecl()); | ||
| instrDescLclVarDecl* idl = static_cast<instrDescLclVarDecl*>(const_cast<instrDesc*>(this)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| instrDescLclVarDecl* idl = static_cast<instrDescLclVarDecl*>(const_cast<instrDesc*>(this)); | |
| const instrDescLclVarDecl* idl = static_cast<const instrDescLclVarDecl*>(this); |
Avoid the const_cast?
Or you could use the emitGetLclVarDeclType/... helpers you added here by making them take const instrDesc*.
| struct instrDescLclVarDecl : instrDesc | ||
| { | ||
| instrDescLclVarDecl() = delete; | ||
| cnsval_ssize_t lclCnt; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is technically larger than necessary since the local count can't really exceed 32 bits (and cnsval_ssize_t is a signed 64 bit number). Base instrDesc is 16 bytes, this desc is going to be 32 bytes (due to alignment), but if this field is made to be 32 bits, it can fit into 24 bytes.
(Using an unsigned integer might also get rid of some casts)
| { | ||
| cnsval_ssize_t count = emitGetLclVarDeclCount(id); | ||
| WasmValueType valType = emitGetLclVarDeclType(id); | ||
| int offs = id->idDebugOnlyInfo()->lclOffset; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would move this down to the if (m_debugInfoSize > 0) branch. Right now it is going to access nominally uninitialized memory with m_debugInfoSize == 0.
| { | ||
| // No debug info case: just print the count and type of the locals | ||
| printf(" count=%llu type=%s", (uint64_t)count, WasmValueTypeName(valType)); | ||
| break; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| break; |
| INST(label, "label", 0, IF_RAW_ULEB128, 0x00) | ||
| INST(local_cnt, "local.cnt", 0, IF_RAW_ULEB128, 0x00) | ||
| INST(local_decl, "local", 0, IF_LOCAL_DECL, 0x00) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| INST(label, "label", 0, IF_RAW_ULEB128, 0x00) | |
| INST(local_cnt, "local.cnt", 0, IF_RAW_ULEB128, 0x00) | |
| INST(local_decl, "local", 0, IF_LOCAL_DECL, 0x00) | |
| INST(label, "label", 0, IF_RAW_ULEB128, 0x00) | |
| INST(local_cnt, "local.cnt", 0, IF_RAW_ULEB128, 0x00) | |
| INST(local_decl, "local", 0, IF_LOCAL_DECL, 0x00) |
Technically the ambition is to have all these aligned across all the instructions but we can make that happen after the list settles down a bit.
This PR adds a new
instrDescsubtype for encoding Wasm local declarations, which have the form count:u32type:valtype, wherevaltypecan be encoded as a single byte for number and vector types.The PR also adds code to emit local count and local declarations as part of the prolog for a function. Some of this may need to change based on what we decide around calling convention and register (locals) allocation.