[JIT] Enable EGPRs in JIT by adding REX2 encoding to the backend. #106557
Update comments. Merge the REX2 changes into the original legacy emit path bug fix: Set REX2.W with correct mask code. register encoding and prefix emitting logics. Add REX2 prefix emit logic bug fixes Add Stress mode for REX2 encoding and some bug fixes resolve comments: 1. add assertion check for UD opcodes. 2. add checks for EGPRs. Add REX2 to emitOutputAM, and let LEA to be REX2 compatible. Add REX2.X encoding for SIB byte Bug fixes: add REX2 prefix on the path in RI where MOV is specially handled. Enable REX2 encoding for `movups` fixed bugs in REX2 prefix emitting logic when working with map 1 instructions, and enabled REX2 for POPCNT legacy map index-er bug fixes some clean-up Adding initial APX unit testing path. Adding a coredistools dll that has LLVM APX disasm capability. It must be copied into a CORE_ROOT manually. clean up work for REX2 narrow the REX2 scope to `sub` only some clean up based on the comments. bug fix resolve comment
- SV path is mostly for debugging purposes Added encoding unit tests for instructions with immediates
Code refactoring: AddX86PrefixIfNeeded.
… missing in JIT, may indicate these instructions are not being used in JIT, drop them for now.
…lled before adding any prefix.
Refactor REX2 encoding stress logics.
(this will have side effect that the estimated code will go up and mismatch with actual code size.)
Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.
@dotnet/avx512-contrib can we reopen this as a PR ready to review?
@anthonycanino I re-opened it (it wasn't clear to me if your question implied you did not have permission to do so). Either you or @Ruihan-Yin need to update to latest main and resolve the conflicts, then mark it ready-for-review.
Overview
This PR follows up on #104637, which added the initial CPUID and XSAVE updates for APX.
This PR adds REX2 encoding for legacy instructions, which enables the use of EGPRs for `add`, `sub`, etc. Note that this PR focuses on REX2 encoding only: a follow-up PR will enable EGPR support via the register allocator.
Specification
REX2 is a 2-byte prefix with a leading byte of `0xD5`, followed by one payload byte.
Similar to the REX prefix, it provides extension bits for the MODRM.REG field (REX2.R4/R3), the MODRM.R/M field (REX2.B4/B3), and the index register in the SIB byte (REX2.X4/X3). These bits act as the 5th and 4th bits and combine with the 3-bit field in the MODRM or SIB byte to form a 5-bit register number, which can address up to 32 registers.
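To make the bit layout concrete, here is a minimal sketch of how the two prefix bytes could be assembled from 5-bit register numbers. It assumes the payload layout from the APX specification (bit 7 to bit 0: M0, R4, X4, B4, W, R3, X3, B3). `Rex2Fields` and `EncodeRex2` are hypothetical names for illustration and are not the emitter code in this PR.

```cpp
#include <array>
#include <cstdint>

// Hypothetical illustration only; not the JIT emitter's actual code.
// Assumed payload layout (bit 7 .. bit 0): M0 R4 X4 B4 W R3 X3 B3.
struct Rex2Fields
{
    bool     map1;  // M0: false = legacy map 0, true = legacy map 1 (0x0F escape)
    bool     w;     // W: 64-bit operand size
    unsigned reg;   // 5-bit register number carried by MODRM.REG
    unsigned index; // 5-bit register number carried by SIB.INDEX
    unsigned base;  // 5-bit register number carried by MODRM.R/M (or SIB.BASE)
};

std::array<std::uint8_t, 2> EncodeRex2(const Rex2Fields& f)
{
    unsigned payload = 0;
    payload |= (f.map1 ? 1u : 0u) << 7;    // M0
    payload |= ((f.reg >> 4) & 1u) << 6;   // R4: bit 4 of the REG register
    payload |= ((f.index >> 4) & 1u) << 5; // X4: bit 4 of the index register
    payload |= ((f.base >> 4) & 1u) << 4;  // B4: bit 4 of the R/M or base register
    payload |= (f.w ? 1u : 0u) << 3;       // W
    payload |= ((f.reg >> 3) & 1u) << 2;   // R3: bit 3 of the REG register
    payload |= ((f.index >> 3) & 1u) << 1; // X3: bit 3 of the index register
    payload |= ((f.base >> 3) & 1u) << 0;  // B3: bit 3 of the R/M or base register

    // The low 3 bits of each register number still go into MODRM/SIB as usual.
    return {static_cast<std::uint8_t>(0xD5), static_cast<std::uint8_t>(payload)};
}
```

For example, with a 64-bit operand, REG = r17 and R/M = r16, the sketch yields the bytes `0xD5 0x58`: R4, B4 and W are set, while R3 and B3 are clear because bit 3 of both 16 and 17 is zero.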
The REX2 prefix is generally available for legacy map 0 and legacy map 1 instructions, that is, 1-byte opcodes and 2-byte opcodes with the escape byte 0x0F, with some exceptions.
Like VEX/EVEX, REX2 must be the last prefix before the main opcode, so it cannot coexist with REX/VEX/EVEX.
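Because REX and REX2 are mutually exclusive, an encoder has to pick at most one of them per instruction. The sketch below shows one way that choice could look for a REX2-eligible legacy instruction. It is a simplification under stated assumptions: `GprPrefix` and `ChoosePrefix` are hypothetical names, registers are given as plain numbers 0 to 31, and special cases (byte registers that require REX, instructions with a default 64-bit operand size, opcodes excluded from REX2) are ignored.

```cpp
#include <algorithm>
#include <initializer_list>

// Hypothetical illustration only; not the JIT emitter's actual logic.
enum class GprPrefix
{
    None,
    Rex,
    Rex2
};

// Picks the single register-extension prefix for a REX2-eligible legacy instruction.
// regs: register numbers (0-31) used by the operands.
// needs64BitOperand: true when the W bit must be set.
GprPrefix ChoosePrefix(std::initializer_list<unsigned> regs, bool needs64BitOperand)
{
    unsigned highest = 0;
    for (unsigned r : regs)
    {
        highest = std::max(highest, r);
    }

    if (highest >= 16)
    {
        // r16-r31 need bit 4 of the register number, which only REX2 carries.
        return GprPrefix::Rex2;
    }
    if (highest >= 8 || needs64BitOperand)
    {
        // r8-r15 or a W bit fit in the 1-byte REX prefix.
        return GprPrefix::Rex;
    }
    return GprPrefix::None;
}
```

Under these assumptions, an instruction that touches any of r16-r31 always gets REX2 (which also carries the W bit), so REX is never emitted alongside it.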
Design
The bulk of the changes occur in the backend emitter.
As no existing hardware supports APX yet, we added some hacks to bypass the CPUID checks. In this PR, `DOTNET_JitStressRex2Encoding` forces all eligible instructions to be encoded with REX2, regardless of whether any operand is an EGPR. A second switch, `DOTNET_JitBypassAPXCheck`, only bypasses the APX CPUID check, so the JIT encodes REX2 only when needed; this will be more useful once the LSRA changes land.
Testing
We followed a multi-step testing plan to verify the encoding correctness and the semantic correctness.
Testing results will be presented below.
1. Emitter unit tests
In `codegenxarch.cpp`, similar to `genAmd64EmitterUnitTestsSse2`, we used the `JitLateDisasm` feature to insert instructions to encode as emitter unit tests. `LateDisasm` invokes LLVM to disassemble the code stream, which lets us cross-validate the disassembly from the JIT against LLVM. The goal of this step is to verify that the emit paths generate "correct" code that does not trigger #UD or have the wrong semantics.
Note that we are using a custom `coredistools.dll` built on a recent LLVM that supports APX decoding.
2. SuperPMI
In this step, we ran the SuperPMI pipeline over all the MCH files to get the asmdiffs with REX2 on and off. This lets us check for assertion failures or internal errors within the JIT, and since the pipeline also invokes `coredistools.dll`, it verifies the encoding correctness at a larger scope.
To ensure the new changes do not regress the existing code paths in terms of throughput, we ran tpdiff with the base JIT built from the main branch commit the changes are based on, and the diff JIT built with all the REX2 changes.
3. JIT unit tests
The two steps above mainly verify the encoding correctness of the generated binary code. The last step examines the semantic correctness of the generated code: since we are simply forcing all compatible instructions to be encoded with REX2, the original semantics should not change, so we expect exactly the same output with REX2 on and off.
We used the existing CoreCLR unit test set, `JIT`, and ran it in the Intel SDE emulator.
Follow-up plans
This PR is only intended to provide the REX2 encoding functionality to the JIT backend. To make proper use of it, we are preparing another PR with the LSRA updates so that the JIT allocates EGPRs only when needed and generates optimal code.