JIT: Accelerate long -> floating casts on x86 #113930

saucecontrol · 2025-03-26T17:14:01Z

This adds support for using EVEX SIMD conversion instructions to handle long/ulong to float/double casts on x86 rather than going through helper calls. With this, all integral -> floating casts are accelerated on x86 with AVX-512.

Typical diff:

 G_M57389_IG01:        ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref, nogc <-- Prolog IG
-       sub      esp, 8
-       vzeroupper 
-						;; size=6 bbWeight=1 PerfScore 1.25
+						;; size=0 bbWeight=1 PerfScore 0.00
 G_M57389_IG02:        ; bbWeight=1, gcrefRegs=00000000 {}, byrefRegs=00000000 {}, byref
-       push     dword ptr [esp+0x10+0x04]
-       ; npt arg push 0
-       push     dword ptr [esp+0x0C+0x08]
-       ; npt arg push 1
-       call     CORINFO_HELP_LNG2DBL
-       ; gcr arg pop 2
-       fstp     qword ptr [esp]
-       vmovsd   xmm0, qword ptr [esp]
-       vcvtsd2ss xmm0, xmm0, xmm0
+       vmovq    xmm0, qword ptr [esp+0x04]
+       vcvtqq2ps xmm0, xmm0
        vmovd    eax, xmm0
-						;; size=29 bbWeight=1 PerfScore 12.50
+						;; size=16 bbWeight=1 PerfScore 9.00
 G_M57389_IG03:        ; bbWeight=1, epilog, nogc, extend
-       add      esp, 8
        ret      8
-						;; size=6 bbWeight=1 PerfScore 2.25
+						;; size=3 bbWeight=1 PerfScore 2.00

Full diffs
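
For readers unfamiliar with why 32-bit x86 needed a helper here: the old code pushes the long as two dwords, because the value lives in two 32-bit halves. The following is a minimal sketch of how such a pair can be converted to double with 32-bit conversions only. It is not how CORINFO_HELP_LNG2DBL is implemented; the function names are hypothetical, and it assumes IEEE 754 doubles with round-to-nearest and a two's complement target.

```cpp
#include <cstdint>

// Hypothetical helper: converts a signed 64-bit value supplied as two
// 32-bit halves (unsigned low half, signed high half).
inline double LongHalvesToDouble(std::uint32_t lo, std::int32_t hi)
{
    // hi * 2^32 and lo are each exactly representable in a double,
    // so the final addition is the only step that can round.
    return static_cast<double>(hi) * 4294967296.0 + static_cast<double>(lo);
}

// Splits a 64-bit value into halves and converts it with the helper above.
inline double LongToDouble(std::int64_t value)
{
    const auto bits = static_cast<std::uint64_t>(value);
    const auto lo   = static_cast<std::uint32_t>(bits);
    // Reinterpret the upper 32 bits as signed (assumes two's complement).
    const auto hi = static_cast<std::int32_t>(static_cast<std::uint32_t>(bits >> 32));
    return LongHalvesToDouble(lo, hi);
}
```

With vcvtqq2pd the same result comes from one load and one conversion, which is where the size and PerfScore wins in the diff above come from.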

dotnet-policy-service · 2025-03-26T17:14:37Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

saucecontrol · 2025-03-28T21:10:21Z

This is ready for review. To summarize, it simply replaces helper calls with AVX-512 conversion instructions, as follows:

long->double: vcvtqq2pd
long->float: vcvtqq2ps
ulong->double: vcvtuqq2pd
ulong->float: vcvtuqq2pd+vcvtsd2ss (double conversion preserves existing behavior)

Latest diffs show no regressions other than those expected from inlining.
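
To illustrate why the double intermediate in the ulong->float path is observable, here is a small sketch comparing a one-step conversion with a two-step conversion through double. The function names and sample value are chosen for illustration only, and the behavior described assumes an IEEE 754 platform using round-to-nearest-even, where the compiler's integer conversions round correctly.

```cpp
#include <cstdint>

// 2^63 + 2^39 + 1: just above the midpoint between two adjacent floats
// (2^63 and 2^63 + 2^40).
constexpr std::uint64_t kSample = (1ULL << 63) + (1ULL << 39) + 1;

// One rounding step: the value is above the midpoint, so it rounds up.
inline float DirectToFloat(std::uint64_t value)
{
    return static_cast<float>(value);
}

// Two rounding steps: the first drops the trailing +1 and lands exactly on
// the float midpoint; the second then breaks the tie toward the even mantissa.
inline float ViaDouble(std::uint64_t value)
{
    return static_cast<float>(static_cast<double>(value));
}
```

Under those assumptions, DirectToFloat(kSample) yields 2^63 + 2^40 while ViaDouble(kSample) yields 2^63, so switching between the two sequences can change results for some inputs.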

tannergooding · 2025-04-10T15:56:11Z

src/coreclr/jit/decomposelongs.cpp

@@ -137,7 +137,12 @@ GenTree* DecomposeLongs::DecomposeNode(GenTree* tree)
        }
    }

+#if defined(FEATURE_HW_INTRINSICS) && defined(TARGET_X86)
+    if (!tree->TypeIs(TYP_LONG) &&
+        !(tree->OperIs(GT_CAST) && varTypeIsLong(tree->AsCast()->CastOp()) && varTypeIsFloating(tree)))


nit: negated conditions like this can be hard to read. A small comment explaining that we want to handle nodes that produce a long, or a GT_CAST from long to floating, would be beneficial IMO.

Agreed. I actually plan on extending this to handle casts in the opposite direction as well, and that will make this check even more hairy. I'll do something to simplify it then.
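
One way the check could be made easier to read is to name the positive condition and negate it once at the call site (De Morgan's law applied to the original expression). This is only a sketch: Node, VarType, Oper, and the predicate names are hypothetical stand-ins, not the JIT's GenTree API, and only a single long type is modeled.

```cpp
// Hypothetical stand-ins for the JIT's node and type representation.
enum class VarType { Int, Long, Float, Double };
enum class Oper { Add, Cast };

struct Node
{
    Oper    oper;
    VarType type;       // type the node produces
    VarType castOpType; // operand type; only meaningful when oper == Oper::Cast
};

inline bool IsFloating(VarType t)
{
    return t == VarType::Float || t == VarType::Double;
}

// True for casts whose operand is long and whose result is float or double.
inline bool IsLongToFloatingCast(const Node& node)
{
    return node.oper == Oper::Cast && node.castOpType == VarType::Long && IsFloating(node.type);
}

// Positive form of the check: the nodes that decomposition has to visit.
inline bool NeedsLongDecomposition(const Node& node)
{
    return node.type == VarType::Long || IsLongToFloatingCast(node);
}
```

The early-out would then read as "if (!NeedsLongDecomposition(node)) skip", and a later extension to floating -> long casts would add one more named predicate rather than another negated clause.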

tannergooding · 2025-04-10T15:59:55Z

src/coreclr/jit/decomposelongs.cpp

    if (!tree->TypeIs(TYP_LONG))
+#endif // FEATURE_HW_INTRINSICS && TARGET_X86
    {
        return tree->gtNext;
    }


The fact that from this point onwards it can now also be GT_CAST float rather than only some NODE long seems like a tricky thing that might trip people up in the future.

tannergooding · 2025-04-10T16:02:28Z

src/coreclr/jit/decomposelongs.cpp

+        if (m_compiler->compOpportunisticallyDependsOn(InstructionSet_AVX512DQ_VL))
+        {
+            intrinsicId = (dstType == TYP_FLOAT) ? NI_AVX512DQ_VL_ConvertToVector128Single
+                                                 : NI_AVX512DQ_VL_ConvertToVector128Double;
+        }
+        else
+        {
+            assert(m_compiler->compIsaSupportedDebugOnly(InstructionSet_AVX10v1));
+            intrinsicId =
+                (dstType == TYP_FLOAT) ? NI_AVX10v1_ConvertToVector128Single : NI_AVX10v1_ConvertToVector128Double;
+        }


These checks feel like they should be inverted since AVX10v1 is newer, so we should opportunistically check for AVX10v1 and assume otherwise AVX512DQ.VL is supported. This is particularly relevant since the spec for AVX10 is changing to require V512 support.

Actually, with the change to require V512, I think we technically don't even need the AVX10v1 path anymore, since AVX512DQ.VL will always be available if AVX10v1 is available.

CC. @anthonycanino

This is copying the pattern we already use in a lot of other places, but yeah, we can simplify all of them now, I think.
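
The two alternatives discussed in this thread can be sketched as follows. CpuFeatures, ConvertIntrinsic, and both function names are hypothetical and only model the selection logic; they are not the JIT's instruction set or intrinsic identifiers. The simplified form is valid only under the stated assumption that AVX10v1 support implies AVX512DQ.VL support.

```cpp
#include <cassert>

// Hypothetical model of the relevant CPU capabilities.
struct CpuFeatures
{
    bool avx512dqVl;
    bool avx10v1;
};

enum class ConvertIntrinsic
{
    Avx512DqVlToSingle,
    Avx512DqVlToDouble,
    Avx10v1ToSingle,
    Avx10v1ToDouble
};

// Inverted order from the review: probe the newer ISA first and
// assume AVX512DQ.VL otherwise.
inline ConvertIntrinsic SelectNewestFirst(const CpuFeatures& cpu, bool dstIsFloat)
{
    if (cpu.avx10v1)
    {
        return dstIsFloat ? ConvertIntrinsic::Avx10v1ToSingle : ConvertIntrinsic::Avx10v1ToDouble;
    }
    assert(cpu.avx512dqVl);
    return dstIsFloat ? ConvertIntrinsic::Avx512DqVlToSingle : ConvertIntrinsic::Avx512DqVlToDouble;
}

// Simplified form: if AVX10v1 always implies AVX512DQ.VL, one path suffices.
inline ConvertIntrinsic SelectSimplified(const CpuFeatures& cpu, bool dstIsFloat)
{
    assert(cpu.avx512dqVl);
    return dstIsFloat ? ConvertIntrinsic::Avx512DqVlToSingle : ConvertIntrinsic::Avx512DqVlToDouble;
}
```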

tannergooding

Changes LGTM. Just a couple minor bits of feedback

CC. @dotnet/jit-contrib, @EgorBo for secondary review

tannergooding · 2025-04-28T18:45:29Z

Ping @dotnet/jit-contrib, @EgorBo for secondary review

BruceForstall · 2025-04-29T18:53:19Z

/azp run runtime-coreclr outerloop, Fuzzlyn

azure-pipelines · 2025-04-29T18:53:34Z

Azure Pipelines successfully started running 2 pipeline(s).

use SIMD conversion instructions for long -> floating casts

04b1bbe

dotnet-issue-labeler bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Mar 26, 2025

dotnet-policy-service bot added the community-contribution label (Indicates that the PR has been added by a community member) Mar 26, 2025

saucecontrol changed the title ~~JIT: Accelerated long -> floating casts on x86~~ JIT: Accelerate long -> floating casts on x86 Mar 26, 2025

jkotas reviewed Mar 26, 2025

src/tests/JIT/Regression/JitBlue/Runtime_106338/Runtime_106338.cs (outdated, resolved)

saucecontrol commented Mar 26, 2025

src/coreclr/jit/morph.cpp (outdated, resolved)

This was referenced Mar 26, 2025

System.Net.Requests test timeout #113883

Closed

System.Net.Quic tests timeout #107761

Open

saucecontrol added 3 commits March 27, 2025 20:02

Merge remote-tracking branch 'upstream/main' into x86convert

2214900

move transform to DecomposeLongs, restore double intermediate

ab5d919

formatting

1bd047a

build-analysis bot mentioned this pull request Mar 28, 2025

System.TimeoutException : The operation has timed out. dotnet/dnceng#5279

Open

handle constants

c73acd5

saucecontrol marked this pull request as ready for review March 28, 2025 21:10

build-analysis bot mentioned this pull request Mar 28, 2025

[QUIC & HTTP/3] Handshake Timeout on tests #104426

Open

saucecontrol mentioned this pull request Apr 8, 2025

JIT: Speed up floating to integer casts on x86/x64 #114410

Open

Merge remote-tracking branch 'upstream/main' into x86convert

71d055a

tannergooding reviewed Apr 10, 2025

tannergooding approved these changes Apr 10, 2025

Merge branch 'main' into x86convert

da0791c

build-analysis bot mentioned this pull request Apr 29, 2025

System.Security.Cryptography.X509Certificates.Tests.PfxTests.ReadMLKem512PrivateKey_NotSupported failing with CryptographicException #115156

Open
