amanasifkhalid
Contributor

Addresses some of the regressions in #116486. If a loop has bounds checks in it, it might benefit from cloning, so perhaps we ought to tolerate going a bit over the inversion size limit to enable downstream optimizations. I ought to do something for GDV checks; perhaps as a follow-up, I'll move the checks in loop cloning to something I can reuse here. The 1.25x figure isn't all that scientific -- I found it to be the smallest factor necessary to make a dent in the regressions from less cloning I looked at, and despite the size increases, it's a net PerfScore win across collections.
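
A minimal sketch of the size-budget idea described above, assuming a simple integer size metric. The names `inversionSizeLimit` and `shouldInvert` are hypothetical and are not the JIT's actual identifiers; whether the real comparison is inclusive or exclusive is also an assumption.

```cpp
#include <cassert>

// Hypothetical helper, not the JIT's real API.
// Returns the size budget for inverting a loop: the base limit, or
// 1.25x the base limit when the loop contains bounds checks and so
// might benefit from cloning later on.
inline unsigned inversionSizeLimit(unsigned baseLimit, bool hasBoundsChecks)
{
    assert(baseLimit > 0 && "a zero budget would never allow inversion");
    // 1.25x in integer arithmetic: base + base / 4 (truncates for small bases).
    return hasBoundsChecks ? baseLimit + baseLimit / 4 : baseLimit;
}

// Hypothetical decision function: invert only if the loop fits the budget.
// The inclusive comparison here is an assumption of this sketch.
inline bool shouldInvert(unsigned loopSize, unsigned baseLimit, bool hasBoundsChecks)
{
    return loopSize <= inversionSizeLimit(baseLimit, hasBoundsChecks);
}
```

With a base limit of 100, a loop of size 120 would be inverted only if it contains bounds checks, and a loop of size 126 would be rejected either way.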

@Copilot Copilot AI review requested due to automatic review settings July 26, 2025 00:03
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jul 26, 2025
Contributor

@Copilot Copilot AI left a comment

Pull Request Overview

This PR improves JIT loop inversion logic by allowing slightly oversized loops to be inverted when they contain bounds checks that could benefit from downstream loop cloning optimizations. This addresses performance regressions introduced in #116486 where stricter size limits prevented beneficial optimizations.

Key changes:

  • Modified loop inversion size checking to use a 1.25x multiplier for loops with bounds checks
  • Added logic to detect loops that might benefit from cloning (those with bounds checks)
  • Enhanced debug output for better traceability of inversion decisions
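
For readers less familiar with the two transformations named above, the following is a source-level analogy in C++ rather than the JIT's IR. It assumes `std::vector::at` stands in for a bounds-checked array access and `operator[]` for an unchecked one; the function names are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Original shape: the loop condition is tested at the top of each iteration.
int sumTopTested(const std::vector<int>& a, std::size_t n)
{
    int sum = 0;
    std::size_t i = 0;
    while (i < n)
    {
        sum += a.at(i); // bounds-checked access
        i++;
    }
    return sum;
}

// Inverted shape: one guard up front, then a bottom-tested loop.
int sumInverted(const std::vector<int>& a, std::size_t n)
{
    int sum = 0;
    std::size_t i = 0;
    if (i < n)
    {
        do
        {
            sum += a.at(i); // still bounds-checked
            i++;
        } while (i < n);
    }
    return sum;
}

// Cloned shape: check the bound once, then pick a fast copy of the loop
// with no per-iteration checks or a slow copy that keeps them.
int sumCloned(const std::vector<int>& a, std::size_t n)
{
    int sum = 0;
    std::size_t i = 0;
    if (i < n)
    {
        if (n <= a.size())
        {
            do { sum += a[i]; i++; } while (i < n);    // fast path: unchecked
        }
        else
        {
            do { sum += a.at(i); i++; } while (i < n); // slow path: checked, throws when i reaches a.size()
        }
    }
    return sum;
}
```

All three functions compute the same result for in-range `n`; the cloned form only pays for bounds checks on the slow path.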

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/coreclr/jit/optimizer.cpp | Enhanced loop inversion logic with bounds check detection and liberal size limits |
| src/coreclr/jit/loopcloning.cpp | Added debug output when loops exceed cloning size limits |
| src/coreclr/jit/compiler.hpp | Simplified return logic in optLoopComplexityExceeds and removed debug output |

Comments suppressed due to low confidence (1)

src/coreclr/jit/optimizer.cpp:1915

  • [nitpick] The lambda parameter name 'tree' is generic. Consider renaming it to 'node' to be more consistent with the function name 'countNode'.
        auto countNode = [&mightBenefitFromCloning, &loopSize](GenTree* tree) -> unsigned {
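
As a rough illustration of what a visitor with this capture list could look like, here is a self-contained sketch. Only the lambda's name and its two captured variables come from the line quoted above; `Node`, `NodeKind`, `LoopScan`, `scanLoop`, and the cost of 1 per node are all assumptions made up for this example and do not reflect the real `GenTree` type or the PR's actual body.

```cpp
#include <cassert>
#include <vector>

// Toy stand-ins for JIT IR; hypothetical, not the real GenTree.
enum class NodeKind { Add, Load, BoundsCheck, Compare };
struct Node { NodeKind kind; };

struct LoopScan
{
    unsigned size;                    // accumulated size of the loop
    bool     mightBenefitFromCloning; // true if any bounds check was seen
};

LoopScan scanLoop(const std::vector<Node>& nodes)
{
    unsigned loopSize = 0;
    bool     mightBenefitFromCloning = false;

    // Counts one node and records whether it is a bounds check.
    // A uniform cost of 1 per node is an assumption of this sketch.
    auto countNode = [&mightBenefitFromCloning, &loopSize](const Node& node) -> unsigned {
        if (node.kind == NodeKind::BoundsCheck)
        {
            mightBenefitFromCloning = true;
        }
        const unsigned cost = 1;
        loopSize += cost;
        return cost;
    };

    for (const Node& node : nodes)
    {
        countNode(node);
    }

    return {loopSize, mightBenefitFromCloning};
}
```

The single pass yields both the size and the flag, so the size-limit decision can be made without walking the loop a second time.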

Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@amanasifkhalid
Contributor Author

Diffs show six-digit size increases, though much of that is coming from libraries_tests (ditto the TP regressions). This looks like a net PerfScore win. cc @dotnet/jit-contrib, @AndyAyersMS does this seem like too much trouble to take at this point?

Member

@AndyAyersMS AndyAyersMS left a comment

Do you have specific large regression cases you're targeting here? It would be good to call them out.

Co-authored-by: Andy Ayers <andya@microsoft.com>
@amanasifkhalid
Contributor Author

> Do you have specific large regression cases you're targeting here? It would be good to call them out.

Yes, the largest ones from #116486 are Span.Sorting.BubbleSortArray(Size: 512) (comment) and Benchstone.BenchI.AddArray2 (comment), and considering the frequency with which we loop over arrays in our microbenchmarks, I'm sure there are many more. The largest of these regressions is the former on Viper Ubuntu, which regressed by 80%. It's not nearly as bad as the 10x regression fixed by #117829, but this code pattern seems common enough that we ought to try to address it in .NET 10.

@amanasifkhalid
Copy link
Contributor Author

@AndyAyersMS are you ok with this going in as-is?

Member

@AndyAyersMS AndyAyersMS left a comment

Yes, let's take this.
