-
Notifications
You must be signed in to change notification settings - Fork 5k
Ensure the JIT view of InstructionSets is correct in the face of JitConfig switches #62420
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Tagging subscribers to this area: @JulieLeeMSFT Issue DetailsThis resolves #60035 Part of the problem in #60035 was that the VM wasn't tracking There are a few reasons why this isn't being done in the VM as suggested #60035 (comment). Reading these flags in the VM is problematic for a few reasons namely that some of these flags control the "baseline" ISAs (X86Base, SSE, SSE2; ArmBase, AdvSimd) and only influence hardware intrinsics. To that end, the VM must report such ISAs as reported while the JIT still needs to support disabling them and having them correctly impact the actual codegen. Likewise, we have cross-targeting AltJit where the VM won't be correct at all and same-target AltJit where you may want to view disassembly for ISAs your hardware doesn't support (which will become more prevalent as we try to support "latest" ISAs including AvxVnni which was added as "experimental" before hardware existed; or potential future support for SVE and AVX-512 where its unlikely devs may support it locally).
|
CC. @echesakovMSFT, @jkotas |
Are the |
Based on the failures, looks like no. The VM ( I expect there is some issue with the crossgen logic here, likely the same that is causing: #61471. The |
Notably, I'd also think that the JIT and Crossgen don't need to agree here. Crossgen is supposed to mark what ISAs are used by a method. If a method uses an ISA that the hardware doesn't support, it has to rejit at which point whatever the JIT is targeting is fine. If the hardware does support it, then it shouldn't matter what the JIT thinks because the code will still run fine. The one notable exception here may be Therefore, if AVX is turned on and a method uses any SIMD instruction; then it will be emitted using the But aside from that, if the method is crossgen'd for SSE4.1 and the hardware supports SSE4.1, then even if the JIT says "no SSE4.1", the method will still execute fine. |
This works fine for opportunistic ISA uses where the user code cannot observe the functional difference. It breaks down when the code can observe
We have seen patterns like this in real-world code. |
How would you recommend handling the That is, I'm running on x64 and want to get disassembly for |
I would either do nothing, or replicate the environment checks when the JIT runs as cross-compiling altjit. I do not think it makes sense to introduce altjit specific methods on JIT/EE interface. |
faddb15
to
1ea5ea2
Compare
#define EXTERNAL_JitEnableAVX_Default 1 | ||
#else // !(defined(TARGET_AMD64) || defined(TARGET_X86) | ||
#define EXTERNAL_JitEnableAVX_Default 0 | ||
#endif // !(defined(TARGET_AMD64) || defined(TARGET_X86) | ||
RETAIL_CONFIG_DWORD_INFO(EXTERNAL_FeatureSIMD, W("FeatureSIMD"), EXTERNAL_FeatureSIMD_Default, "Enable SIMD intrinsics recognition in System.Numerics.dll and/or System.Numerics.Vectors.dll") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to do something about FeatureSIMD
. Originally it was intended to be a workaround in case the System.Numerics.Vector*
logic had issues (back before hardware intrinsics were even a concept).
However, given how critical SIMD
is to the ABI and even basic operations in the JIT for x64 and Arm64; I think it might be better to actually remove the public facing flag here and leave it only a compile level switch for x86 Unix
and Arm32
(the two targets where FEATURE_SIMD == 0
today).
If we don't remove it, then we need to also have the VM take it into account for certain ISAs (basically everything except X86Base
, Lzcnt
, and ArmBase
; Bmi1
and Bmi2
are "scalar" but not included because they depend on EnableAvx=1
due to the VEX encoding).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
-- If we are fine with removing flags, EXTERNAL_EnableSSE3_4
is also legacy and unnecessary as it has the same meaning as EnableSSE3
#endif | ||
|
||
instructionSetFlags = EnsureInstructionSetFlagsAreValid(instructionSetFlags); | ||
opts.setSupportedISAs(instructionSetFlags); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Commenting here because its in compiler.cpp
, but further down on https://github.com/dotnet/runtime/blob/main/src/coreclr/jit/compiler.cpp#L5608-L5637, we have some logic for if (!info.compMatchedVM)
.
It'd be helpful to understand what this means for crossgen
/r2r
and if its better handled as simply if (info.altJit)
. In particular we want to support:
- The ability to have "correct" instruction sets for the actual target when the VM doesn't match (i.e. cross-compiling for Arm64 on an x64 box; crossgen or otherwise)
- To allow for an AltJit to generate disassembly for hardware you don't have (i.e. see and debug
Arm.AdvSimd
on x64; or evenX86.Avx2
on an x64 box that only supportsAvx
)
1ea5ea2
to
ac8e18a
Compare
@davidwrighton, @MichalStrehovsky there is something I'm not understanding about these flags for NativeAOT. There are failures such as |
Actually, I think I figured it out. I seem to have swapped the order of |
After double checking the results reported by No Environment Variables
COMPlus_EnableHWIntrinsic=0
EnableSSE=0
EnableSSE2=0
etc, for the rest of the available |
Nope. @MichalStrehovsky, I'm definitely going to need help with NativeAOT here. Even debugging in the flags look correct, but its still faulting for reasons I can't determine. |
bd6611e
to
fddd424
Compare
Ok, everything looks to be working as expected now including for CG2/ILC. I've rebased onto main and updated |
Thanks to both David and Michal for assisting here! |
implication ,X86 ,LZCNT ,X86Base | ||
implication ,X86 ,PCLMULQDQ ,SSE2 | ||
implication ,X86 ,POPCNT ,SSE42 | ||
implication ,X86 ,Vector128 ,SSE | ||
implication ,X86 ,Vector256 ,AVX | ||
implication ,X86 ,AVXVNNI ,AVX2 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple new "implications" for x86. Nothing major, just ensuring AvxVnni
and Lzcnt
are correctly dependent on the ISAs they inherit from and ensure Vector128
is correctly marked as being dependent on Sse
implication ,ARM64 ,Vector64 ,AdvSimd | ||
implication ,ARM64 ,Vector128 ,AdvSimd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for ARM64. Just ensuring Vector64/128 are correctly marked as being dependent on AdvSimd.
|
||
"); | ||
|
||
foreach (string architecture in _architectures) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's probably easier to see these changes by looking at https://github.com/dotnet/runtime/pull/62420/files#diff-47b7c76a34a29a49750b1591fcf7ed6196296eb075ca71cc160b03ff5e717f49R21.
This basically just adds InstructionSet_X64
, InstructionSet_X86
, and InstructionSet_ARM64
enums that contain only the ISAs for that architecture. These aren't actually used by anything, they are just returned by three new properties on InstructionSetFlags
to make it simpler to debug.
Without this, you'll end up seeing the ARM64_*
enum values regardless of platform due to how the debugger displays enums with overlapping values. This was making it practically impossible to debug since you'd see ARM64_AdvSimd
rather than X64_SSE
for example.
CI never started, closing and reopening to try and force a refresh. |
Ah, CI is broken due to #63480 |
…tructionSetSupport
…nfoInstructionSet
fddd424
to
41bf5cd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The crossgen2/nativeaot changes look good. Thank you for the debugging improvement!
This resolves #60035
Part of the problem in #60035 was that the VM wasn't tracking
InstructionSet_Vector*
and instead these were added by the JIT. The other problem was thatEnableHWIntrinsic=0
wasn't correctly disabling the baseline ISA for the platform, which lead to an inconsistent world view on some APIs.There are a few reasons why this isn't being done in the VM as suggested #60035 (comment).Reading these flags in the VM is problematic for a few reasons namely that some of these flags control the "baseline" ISAs (X86Base, SSE, SSE2; ArmBase, AdvSimd) and only influence hardware intrinsics. To that end, the VM must report such ISAs as reported while the JIT still needs to support disabling them and having them correctly impact the actual codegen.Likewise, we have cross-targeting AltJit where the VM won't be correct at all and same-target AltJit where you may want to view disassembly for ISAs your hardware doesn't support (which will become more prevalent as we try to support "latest" ISAs including AvxVnni which was added as "experimental" before hardware existed; or potential future support for SVE and AVX-512 where its unlikely devs may support it locally).