Skip to content

Comments

JIT: Constant fold SequenceEqual with the help of VN#121985

Open
hez2010 wants to merge 18 commits intodotnet:mainfrom
hez2010:memcmp-fold
Open

JIT: Constant fold SequenceEqual with the help of VN#121985
hez2010 wants to merge 18 commits intodotnet:mainfrom
hez2010:memcmp-fold

Conversation

@hez2010
Copy link
Contributor

@hez2010 hez2010 commented Nov 26, 2025

Try to fold SequenceEqual with the help of VN.

Codegen for

Test.CompareStr();

class Test
{
    public static string Str1 => "abcdef";
    public static string Str2 => "bcdefg";

    [MethodImpl(MethodImplOptions.NoInlining)]
    public static bool CompareStr()
    {
        return Str1.Equals(Str2);
    }
}

Before:

G_M64625_IG01:  ;; offset=0x0000
                                                ;; size=0 bbWeight=1 PerfScore 0.00
G_M64625_IG02:  ;; offset=0x0000
       mov      rax, 0x2DA4A1AEFF0      ; 'abcdef'
       add      rax, 12
       mov      rcx, 0x2DA4A1AF024
       mov      rdx, qword ptr [rax]
       mov      rax, qword ptr [rax+0x04]
       mov      r8, qword ptr [rcx]
       xor      rdx, r8
       xor      rax, qword ptr [rcx+0x04]
       or       rax, rdx
       sete     al
       movzx    rax, al
                                                ;; size=50 bbWeight=1 PerfScore 11.50
G_M64625_IG03:  ;; offset=0x0032
       ret
                                                ;; size=1 bbWeight=1 PerfScore 1.00

After:

G_M64625_IG01:  ;; offset=0x0000
                                                ;; size=0 bbWeight=1 PerfScore 0.00
G_M64625_IG02:  ;; offset=0x0000
       xor      eax, eax
                                                ;; size=2 bbWeight=1 PerfScore 0.25
G_M64625_IG03:  ;; offset=0x0002
       ret
                                                ;; size=1 bbWeight=1 PerfScore 1.00

Copilot AI review requested due to automatic review settings November 26, 2025 12:09
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Nov 26, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Nov 26, 2025
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR adds a JIT optimization to constant fold SequenceEqual operations using Value Numbering (VN). When the length is known at compile time and within unrolling thresholds, the JIT can either fold to a constant (if both memory regions are immutable and known) or unroll into an efficient load/compare chain using XOR and OR operations.

Key Changes

  • Adds optVNBasedFoldExpr_Call_Memcmp to unroll NI_System_SpanHelpers_SequenceEqual for constant-length comparisons
  • Introduces Memcmp to the UnrollKind enum with platform-specific thresholds (4x maxRegSize, 12x on ARM64)
  • Implements XOR+OR accumulation pattern for comparing memory chunks, similar to existing memcpy/memmove optimizations

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
src/coreclr/jit/compiler.h Adds function declaration for optVNBasedFoldExpr_Call_Memcmp, adds Memcmp to UnrollKind enum, and sets unroll thresholds
src/coreclr/jit/assertionprop.cpp Implements the core optimization logic to unroll SequenceEqual, including constant folding and XOR-based comparison chain generation

@EgorBo
Copy link
Member

EgorBo commented Nov 26, 2025

If you want to fold SequenceEquals to true/false, then you can do that in the valuenum phase.
For the unrolling to a set of IND nodes - I don't see what it can catch that LowerCallMemcmp won't handle to be honest, any example?

@EgorBo
Copy link
Member

EgorBo commented Nov 26, 2025

E.g. the point of optVNBasedFoldExpr_Call_Memmove to exist is that it can see RVA const data and emit const values to be memmoved - LowerMemmove cannot do it. In your case you don't take advantage of seeing string's content when you unroll to indirs (JitOptRepeat probably could help, but we cannot use it today)

@hez2010
Copy link
Contributor Author

hez2010 commented Nov 26, 2025

For the unrolling to a set of IND nodes - I don't see what it can catch that LowerCallMemcmp won't handle to be honest, any example?

Yeah. I noticed this as well. Well there was several minor differences on register selection and unaligned loads btw. Removed the IND nodes unrolling.

@hez2010
Copy link
Contributor Author

hez2010 commented Nov 26, 2025

@MihuBot

Copy link
Member

@EgorBo EgorBo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks. I'll move the importervectorization.cpp to that function eventually.

PS: Just Equality to 1/0 could've been done in the VN phase, but this is fine too as it might find more cases after assertions.

@EgorBo
Copy link
Member

EgorBo commented Nov 26, 2025

cc @dotnet/jit-contrib

@AndyAyersMS
Copy link
Member

PS: Just Equality to 1/0 could've been done in the VN phase, but this is fine too as it might find more cases after assertions.

But doing it in VN might unblock RBO, etc...

@EgorBo
Copy link
Member

EgorBo commented Nov 26, 2025

PS: Just Equality to 1/0 could've been done in the VN phase, but this is fine too as it might find more cases after assertions.

But doing it in VN might unblock RBO, etc...

Yeah, pros & cons. Doing this in VN won't handle e.g.

if (len == 42)
{
    SequenceEqual(..,.., len);
}

The best option is to do it in VN and then run JitOptRepeat if SeqEquals's len became a constant. But we can't rely on JitOptRepeat today. Also, if we move it to VN, we'll have to do some work in VN constant propagation that AP does - I think it doesn't handle calls today.

Overall I don't have a preference. Judging by the diffs, it's a fairly niche case.

@hez2010
Copy link
Contributor Author

hez2010 commented Nov 27, 2025

@AndyAyersMS PTAL

@hez2010
Copy link
Contributor Author

hez2010 commented Dec 7, 2025

Anything else I need to do before it can be merged?

@EgorBo
Copy link
Member

EgorBo commented Jan 12, 2026

I think it's fine as is and I'll re-use this function to remove some early string API vectorization to this function problably, let's see what the diffs look like

@EgorBo
Copy link
Member

EgorBo commented Jan 12, 2026

@MihuBot

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated no new comments.

Copy link
Member

@EgorBo EgorBo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Copilot AI review requested due to automatic review settings February 23, 2026 02:31
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 1 comment.

constexpr size_t maxLen = 65536; // Arbitrary threshold to avoid large buffer allocations
if (len > maxLen)
{
JITDUMP("...length is too big (%u bytes) - bail out.\n", (unsigned)len);
Copy link

Copilot AI Feb 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The format specifier %u expects an unsigned int, but (unsigned)len casts a size_t to unsigned. On 64-bit platforms where size_t is 64 bits, this cast could truncate values larger than UINT_MAX. Since maxLen is already limited to 65536, this is not a practical issue here, but for correctness and consistency with other code in the codebase, consider using %zu (which directly accepts size_t) instead of %u with a cast.

Suggested change
JITDUMP("...length is too big (%u bytes) - bail out.\n", (unsigned)len);
JITDUMP("...length is too big (%zu bytes) - bail out.\n", len);

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants