-
Notifications
You must be signed in to change notification settings - Fork 5k
JIT: Allow any baseline intrinsics in lowering #115592
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
Just a couple of diffs where the accelerated cast replaces the helper call, though I didn't know any of the SPMI collections had SSE2 disabled... cc @tannergooding, we discussed this a few weeks back, and it turned out not to be too disruptive. |
src/coreclr/jit/hwintrinsicxarch.cpp
Outdated
// | ||
// Return Value: | ||
// true if isa is part of the baseline; otherwise, false | ||
bool HWIntrinsicInfo::isBaselineIsa(CORINFO_InstructionSet isa) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can avoid this and any specialization required in lowering by having rationalization update the opts.compSupportsISA
entries to include these instead.
That is, these baseline
ISAs are always part of the support set of the VM. They don't actually have to be "reported" (technically) as they are part of the required startup checks the VM does regardless.
So we could just "cheat" and have rationalization mark them as part of the support set so that the comp*SupportsIsa
checks return true
from that point onward instead; and all our existing checks "just work".
The above would be a very minimal change.
The more complex, and potentially better long term, change would be removing the VM ability to disable these ISAs altogether. That is, changing the following in codeman.cpp (and equivalent for Arm64):
- if (CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_EnableHWIntrinsic))
- {
- CPUCompileFlags.Set(InstructionSet_X86Base);
- }
-
- if (CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_EnableSSE))
- {
- CPUCompileFlags.Set(InstructionSet_SSE);
- }
-
- if (CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_EnableSSE2))
- {
- CPUCompileFlags.Set(InstructionSet_SSE2);
- }
+ CPUCompileFlags.Set(InstructionSet_X86Base);
+ CPUCompileFlags.Set(InstructionSet_SSE);
+ CPUCompileFlags.Set(InstructionSet_SSE2);
We would then instead check and cache JitConfig.EnableHWIntrinsic() != 0
, JitConfig.EnableSSE() != 0
, and JitConfig.EnableSSE2() != 0
as part of Compiler::compSetProcessor()
, such as in a bool isUserBaselineIsaSupported
field. We can then use that field in Compiler::compSupportsHWIntrinsic
which is only used to detect if use of the ISA for managed code is legal.
This would allow any part of the JIT to use baseline ISAs and only block user code from utilizing it, which seems "better" overall. It still enables users to test their fallbacks while giving the JIT flexibility to target the real baseline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this, but we have quite a lot of asserts in the backend that check for the baseline ISAs, and making them supported unconditionally would render those asserts meaningless. I thought there must be some value to being able to explicitly make the decision to ignore the supported list on a case by case basis.
I do like the simplicity of modifying the supported list post-rationalize, though. If you don't see any value in being able to differentiate the implied ISAs in the backend, I'm happy to make that change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original goal was to ensure that we weren't introducing bugs users couldn't workaround when the support was first being brought online.
At this point, the baseline support is stable and that isn't a concern anymore. Any higher level ISAs can still be opted out of for newer functionality which may have been as extensively used in the wild.
So I think simply allowing any part of the JIT to introduce baseline intrinsic usage will simplify things greatly and likely improve throughput. Users will still be able to ensure BaselineIsa.IsSupported
reports false which will let them test their fallbacks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I opted for the simpler JIT-only option. The change is small, so if there's a compelling reason to move to VM later, it'd be easy enough.
Interesting that the diffs show a larger throughput impact. There are a few additional codegen hits for stack zeroing with this change.
Under the current JIT rules, the config knobs meant for testing HWIntrinsics APIs and fallback code also impact codegen for ordinary casts. Some casts are implemented directly in codegen, where we freely emit SSE/SSE2 code with no ISA checks, while the casts implemented in lowering create HWIntrinsic nodes that end up hitting ISA validation checks in codegen and must, therefore, be gated on
compOpportunisticallyDependsOn
checks.This adds the baseline ISAs to the supported set post-rationalize so that we continue to ensure any HWIntrinsic nodes created in HIR belong to a supported ISA, but allows HWIntrinsic nodes to be introduced in LIR without checks, provided they are part of the guaranteed-supported baseline ISA set.
In addition to simplifying
fgMorphExpandCast
, this change will allowdouble->int/uint
cast helpersulong->floating
fallback cast implementation from codegen to lowering, which will open up additional optimization.