Skip to content

JIT: Allow any baseline intrinsics in lowering #115592

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: main
Choose a base branch
from

Conversation

saucecontrol
Copy link
Member

@saucecontrol saucecontrol commented May 14, 2025

Under the current JIT rules, the config knobs meant for testing HWIntrinsics APIs and fallback code also impact codegen for ordinary casts. Some casts are implemented directly in codegen, where we freely emit SSE/SSE2 code with no ISA checks, while the casts implemented in lowering create HWIntrinsic nodes that end up hitting ISA validation checks in codegen and must, therefore, be gated on compOpportunisticallyDependsOn checks.

This adds the baseline ISAs to the supported set post-rationalize so that we continue to ensure any HWIntrinsic nodes created in HIR belong to a supported ISA, but allows HWIntrinsic nodes to be introduced in LIR without checks, provided they are part of the guaranteed-supported baseline ISA set.

In addition to simplifying fgMorphExpandCast, this change will allow

  • removal of the double->int/uint cast helpers
  • moving ulong->floating fallback cast implementation from codegen to lowering, which will open up additional optimization.

@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label May 14, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label May 14, 2025
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@saucecontrol
Copy link
Member Author

Just a couple of diffs where the accelerated cast replaces the helper call, though I didn't know any of the SPMI collections had SSE2 disabled...

cc @tannergooding, we discussed this a few weeks back, and it turned out not to be too disruptive.

//
// Return Value:
// true if isa is part of the baseline; otherwise, false
bool HWIntrinsicInfo::isBaselineIsa(CORINFO_InstructionSet isa)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can avoid this and any specialization required in lowering by having rationalization update the opts.compSupportsISA entries to include these instead.

That is, these baseline ISAs are always part of the support set of the VM. They don't actually have to be "reported" (technically) as they are part of the required startup checks the VM does regardless.

So we could just "cheat" and have rationalization mark them as part of the support set so that the comp*SupportsIsa checks return true from that point onward instead; and all our existing checks "just work".

The above would be a very minimal change.


The more complex, and potentially better long term, change would be removing the VM ability to disable these ISAs altogether. That is, changing the following in codeman.cpp (and equivalent for Arm64):

- if (CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_EnableHWIntrinsic))
- {
-     CPUCompileFlags.Set(InstructionSet_X86Base);
- }
- 
- if (CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_EnableSSE))
- {
-     CPUCompileFlags.Set(InstructionSet_SSE);
- }
- 
- if (CLRConfig::GetConfigValue(CLRConfig::EXTERNAL_EnableSSE2))
- {
-     CPUCompileFlags.Set(InstructionSet_SSE2);
- }
+ CPUCompileFlags.Set(InstructionSet_X86Base);
+ CPUCompileFlags.Set(InstructionSet_SSE);
+ CPUCompileFlags.Set(InstructionSet_SSE2);

We would then instead check and cache JitConfig.EnableHWIntrinsic() != 0, JitConfig.EnableSSE() != 0, and JitConfig.EnableSSE2() != 0 as part of Compiler::compSetProcessor(), such as in a bool isUserBaselineIsaSupported field. We can then use that field in Compiler::compSupportsHWIntrinsic which is only used to detect if use of the ISA for managed code is legal.

This would allow any part of the JIT to use baseline ISAs and only block user code from utilizing it, which seems "better" overall. It still enables users to test their fallbacks while giving the JIT flexibility to target the real baseline.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought about this, but we have quite a lot of asserts in the backend that check for the baseline ISAs, and making them supported unconditionally would render those asserts meaningless. I thought there must be some value to being able to explicitly make the decision to ignore the supported list on a case by case basis.

I do like the simplicity of modifying the supported list post-rationalize, though. If you don't see any value in being able to differentiate the implied ISAs in the backend, I'm happy to make that change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The original goal was to ensure that we weren't introducing bugs users couldn't workaround when the support was first being brought online.

At this point, the baseline support is stable and that isn't a concern anymore. Any higher level ISAs can still be opted out of for newer functionality which may have been as extensively used in the wild.

So I think simply allowing any part of the JIT to introduce baseline intrinsic usage will simplify things greatly and likely improve throughput. Users will still be able to ensure BaselineIsa.IsSupported reports false which will let them test their fallbacks.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I opted for the simpler JIT-only option. The change is small, so if there's a compelling reason to move to VM later, it'd be easy enough.

Interesting that the diffs show a larger throughput impact. There are a few additional codegen hits for stack zeroing with this change.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants