JIT: Constant fold SequenceEqual with the help of VN#121985
JIT: Constant fold SequenceEqual with the help of VN#121985hez2010 wants to merge 18 commits intodotnet:mainfrom
Conversation
|
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch |
There was a problem hiding this comment.
Pull request overview
This PR adds a JIT optimization to constant fold SequenceEqual operations using Value Numbering (VN). When the length is known at compile time and within unrolling thresholds, the JIT can either fold to a constant (if both memory regions are immutable and known) or unroll into an efficient load/compare chain using XOR and OR operations.
Key Changes
- Adds
optVNBasedFoldExpr_Call_Memcmpto unrollNI_System_SpanHelpers_SequenceEqualfor constant-length comparisons - Introduces
Memcmpto theUnrollKindenum with platform-specific thresholds (4x maxRegSize, 12x on ARM64) - Implements XOR+OR accumulation pattern for comparing memory chunks, similar to existing memcpy/memmove optimizations
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| src/coreclr/jit/compiler.h | Adds function declaration for optVNBasedFoldExpr_Call_Memcmp, adds Memcmp to UnrollKind enum, and sets unroll thresholds |
| src/coreclr/jit/assertionprop.cpp | Implements the core optimization logic to unroll SequenceEqual, including constant folding and XOR-based comparison chain generation |
|
If you want to fold SequenceEquals to true/false, then you can do that in the valuenum phase. |
|
E.g. the point of |
Yeah. I noticed this as well. Well there was several minor differences on register selection and unaligned loads btw. Removed the IND nodes unrolling. |
EgorBo
left a comment
There was a problem hiding this comment.
LGTM, thanks. I'll move the importervectorization.cpp to that function eventually.
PS: Just Equality to 1/0 could've been done in the VN phase, but this is fine too as it might find more cases after assertions.
|
cc @dotnet/jit-contrib |
But doing it in VN might unblock RBO, etc... |
Yeah, pros & cons. Doing this in VN won't handle e.g. if (len == 42)
{
SequenceEqual(..,.., len);
}The best option is to do it in VN and then run JitOptRepeat if SeqEquals's len became a constant. But we can't rely on JitOptRepeat today. Also, if we move it to VN, we'll have to do some work in VN constant propagation that AP does - I think it doesn't handle calls today. Overall I don't have a preference. Judging by the diffs, it's a fairly niche case. |
|
@AndyAyersMS PTAL |
|
Anything else I need to do before it can be merged? |
|
I think it's fine as is and I'll re-use this function to remove some early string API vectorization to this function problably, let's see what the diffs look like |
| constexpr size_t maxLen = 65536; // Arbitrary threshold to avoid large buffer allocations | ||
| if (len > maxLen) | ||
| { | ||
| JITDUMP("...length is too big (%u bytes) - bail out.\n", (unsigned)len); |
There was a problem hiding this comment.
The format specifier %u expects an unsigned int, but (unsigned)len casts a size_t to unsigned. On 64-bit platforms where size_t is 64 bits, this cast could truncate values larger than UINT_MAX. Since maxLen is already limited to 65536, this is not a practical issue here, but for correctness and consistency with other code in the codebase, consider using %zu (which directly accepts size_t) instead of %u with a cast.
| JITDUMP("...length is too big (%u bytes) - bail out.\n", (unsigned)len); | |
| JITDUMP("...length is too big (%zu bytes) - bail out.\n", len); |
Try to fold
SequenceEqualwith the help of VN.Codegen for
Before:
After: