JIT: Add post-morph profile repair phase #113896

amanasifkhalid · 2025-03-25T21:00:18Z

Part of #107749. The latter half of the JIT frontend contains numerous optimizations that rely on block weights to compute the profitability of their transformations, and these may benefit from more consistent profile data. Morph and flow opts frequently redirect flow out of blocks and into other ones, and the corresponding tweaks to the profile are usually too local to reduce/increase flow along all of the affected paths. This gives downstream phases inaccurate ideas about which blocks are newly cold/hot, and other profile transformations (such as fgExpandRarelyRunBlocks) can further propagate the inaccuracies. Thus, the point shortly after morph seems like an opportune and cheap place to re-run synthesis and fix the profile. As with the late profile synthesis run, we aren't interested in changing edge likelihoods here -- we just want to propagate changes in block weights through the flowgraph.
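To illustrate what "propagating block weights while leaving edge likelihoods alone" means, here is a minimal sketch over a toy flowgraph. It is not the JIT's synthesis code: ToyEdge, ToyBlock, and PropagateWeights are hypothetical names, and the sketch assumes an acyclic graph whose blocks are already stored in reverse postorder (real synthesis also has to handle loops).

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical toy edge: `likelihood` is the fraction of the source block's
// weight that flows to `target`. Likelihoods are read-only inputs here.
struct ToyEdge
{
    std::size_t target;
    double      likelihood;
};

// Hypothetical toy block: only its outgoing edges matter for this sketch.
struct ToyBlock
{
    std::vector<ToyEdge> succs;
};

// Recompute every block weight from the entry weight.
// Assumptions: blocks[0] is the entry, blocks are in reverse postorder,
// and the graph has no back edges.
std::vector<double> PropagateWeights(const std::vector<ToyBlock>& blocks, double entryWeight)
{
    std::vector<double> weights(blocks.size(), 0.0);
    if (blocks.empty())
    {
        return weights;
    }

    weights[0] = entryWeight;
    for (std::size_t i = 0; i < blocks.size(); i++)
    {
        // All predecessors of block i were visited earlier, so weights[i] is final.
        for (const ToyEdge& edge : blocks[i].succs)
        {
            weights[edge.target] += weights[i] * edge.likelihood;
        }
    }
    return weights;
}
```

In this model, a phase that redirects an edge only needs to update the edge list; rerunning PropagateWeights then carries the change to every downstream block rather than just the immediate neighbors.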

I also included some dead code cleanup I meant to do in a previous PR.

dotnet-policy-service · 2025-03-25T21:01:09Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

amanasifkhalid · 2025-03-26T17:25:33Z

Collections are a bit impoverished at the moment; I'll wait for the next collection run.

amanasifkhalid · 2025-03-27T15:21:21Z

Looking at aspnet, I see large size increases and decreases due to diffs in loop cloning. For example, here's the diffed loop stats for Microsoft.EntityFrameworkCore.Metadata.Conventions.Internal.ConventionDispatcher+ImmediateConventionScope:OnEntityTypeAdded(Microsoft.EntityFrameworkCore.Metadata.Builders.IConventionEntityTypeBuilder):Microsoft.EntityFrameworkCore.Metadata.Builders.IConventionEntityTypeBuilder:this:

LoopsFoundDuringOpts : 2
LoopsInverted : 0
- LoopsCloned : 1
+ LoopsCloned : 0
LoopsUnrolled : 0

And for System.Threading.SpinWait:SpinUntil(System.Func`1[ubyte],int):ubyte:

LoopsFoundDuringOpts : 1
LoopsInverted : 0
- LoopsCloned : 1
+ LoopsCloned : 0
LoopsUnrolled : 0

There are plenty of instances where we clone more as well, like in System.Linq.Enumerable:TryGetSingle[System.__Canon](System.Collections.Generic.IEnumerable`1[System.__Canon],System.Func`2[System.__Canon,ubyte],byref):System.__Canon:

LoopsFoundDuringOpts : 1
LoopsInverted : 2
- LoopsCloned : 0
+ LoopsCloned : 1
LoopsUnrolled : 0

(Note that lexical loop inversion found more loop candidates than the graph-based approach did.) Also System.Collections.Immutable.ImmutableSortedDictionary`2+Node[int,System.ValueTuple`2[System.Nullable`1[int],System.Nullable`1[int]]]:Search(int,System.Collections.Generic.IComparer`1[int]):System.Collections.Immutable.ImmutableSortedDictionary`2+Node[int,System.ValueTuple`2[System.Nullable`1[int],System.Nullable`1[int]]]:this:

LoopsFoundDuringOpts : 1
LoopsInverted : 0
- LoopsCloned : 0
+ LoopsCloned : 1
LoopsUnrolled : 0

Multiple collections have duplicate contexts with diffs in cloning, thus inflating the size diffs in both directions. We also have smaller diffs due to IV opts and CSEs being newly (un)profitable. For example, in System.RuntimeType+RuntimeTypeCache+MemberInfoCache`1[System.__Canon]:MergeWithGlobalListInOrder(System.__Canon[]):this:

- LoopsIVWidened : 1
- WidenedIVs : 1
- UnusedIVsRemoved : 0
- CseCount : 1
+ LoopsIVWidened : 2
+ WidenedIVs : 2
+ UnusedIVsRemoved : 1
+ CseCount : 3

Because synthesis does not mark handler blocks as rarely run, I'm seeing optimizations that use BasicBlock::isRunRarely as a cutoff -- like TLS expansion -- kicking in more often. I'm also seeing fgOptimizeBranch kicking in more/less often due to changes in profitability. All of this seems expected, and we aren't allowing synthesis to change edge likelihoods here, so this pass isn't destructive to the profile.

amanasifkhalid · 2025-03-27T15:23:32Z

cc @dotnet/jit-contrib, @AndyAyersMS PTAL. Diffs show a throughput (TP) cost of up to 0.5% (some of this probably comes from opts like loop cloning kicking in more often now). As a follow-up, I plan to disable fgExpandRarelyRunBlocks when we have PGO data, so we can expect to get some TP back. Thanks!

AndyAyersMS · 2025-03-27T17:34:59Z

Can you look at System.Buffers.SharedArrayPool`1[System.__Canon]:Trim():ubyte:this (Tier1) (benchmarks-pgo) and see if the changes there look reasonable?

The GDV-based cloning heuristics are profile sensitive, and I'd like to be sure we think the new decision here is the right one.

AndyAyersMS · 2025-03-27T17:36:43Z

src/coreclr/jit/compiler.cpp


+        // Re-establish profile consistency, now that inlining and morph have run.
+        //
+        DoPhase(this, PHASE_REPAIR_PROFILE, &Compiler::fgRepairProfile);


We generally use unique phase IDs for each phase, so can you add a new one for this?

amanasifkhalid · 2025-03-27T21:47:27Z

Can you look at

Sure, thanks for pointing this particular example out: I noticed that for the post-importation synthesis calls, if the profile is still inconsistent after recomputing the block weights, we defer the post-phase checks. The importer has a check for re-enabling these checks under the assumption that bad IL can create flow that trips up synthesis, so any code that passes importation should have a reconcilable profile. If later synthesis calls fail to make the profile consistent, the deferred checks are never re-enabled since that's an importer-specific quirk. It turns out the later profile repair phases have been quietly failing to make the profile consistent in several cases, including the above method. In particular, the flow into a call-finally pair doesn't always make it out, and this loss of flow is then propagated to downstream blocks. This might be why we're no longer finding loop cloning profitable in so many cases.

Let me try fixing this in another PR first, since this will affect the pre-layout profile repair phase too.
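For readers unfamiliar with what "consistent" means here, the following is a rough sketch of the kind of check involved, under simplifying assumptions: each non-entry block's weight should match the flow entering it. FlowEdge and IsProfileConsistent are hypothetical names, and the absolute tolerance is an arbitrary choice for illustration; the JIT's actual checks are more involved.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical toy edge: `likelihood` is the fraction of the source block's
// weight that flows along this edge.
struct FlowEdge
{
    std::size_t source;
    std::size_t target;
    double      likelihood;
};

// Returns true if every non-entry block's weight matches its incoming flow.
// Assumptions: block 0 is the entry (its weight is taken as given), and
// `tolerance` is an absolute slack chosen only for this sketch.
bool IsProfileConsistent(const std::vector<double>&   weights,
                         const std::vector<FlowEdge>& edges,
                         double                       tolerance = 1e-6)
{
    std::vector<double> incoming(weights.size(), 0.0);
    for (const FlowEdge& edge : edges)
    {
        incoming[edge.target] += weights[edge.source] * edge.likelihood;
    }

    for (std::size_t i = 1; i < weights.size(); i++)
    {
        if (std::fabs(incoming[i] - weights[i]) > tolerance)
        {
            return false;
        }
    }
    return true;
}
```

In a three-block chain where each edge has likelihood 1.0, weights of {100, 100, 100} pass this check, while {100, 100, 0} fail it. The second case loosely mirrors flow entering a region and not making it out, which then starves the downstream blocks.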

amanasifkhalid · 2025-03-28T16:29:19Z

Let me try fixing this in another PR first

#114016

amanasifkhalid · 2025-04-04T20:15:40Z

/azp run runtime-coreclr libraries-pgo

azure-pipelines · 2025-04-04T20:15:55Z

Azure Pipelines successfully started running 1 pipeline(s).

AndyAyersMS

LGTM

amanasifkhalid · 2025-04-05T00:46:32Z

libraries-pgo dead-lettered on linux-x64. Otherwise, everything is clean.

amanasifkhalid · 2025-04-05T00:47:30Z

Diffs are still big, though the TP impact is lessened somewhat by synthesis being able to reuse the flowgraph annotations from previous phases.

amanasifkhalid added 2 commits March 25, 2025 16:49

Add post-morph profile repair phase

75c6e78

[no-diff] Remove fgConnectFallthrough

60dca0d

Copilot AI review requested due to automatic review settings March 25, 2025 21:00

Copilot AI reviewed Mar 25, 2025


ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 25, 2025

dotnet-policy-service bot assigned amanasifkhalid Mar 25, 2025

This was referenced Mar 25, 2025

System.Net.Quic tests timeout #107761

Closed

System.TimeoutException : The operation has timed out. dotnet/dnceng#5279

Closed

System.Net.Requests test timeout #113883

Closed

Merge branch 'main' into profile-repair-post-morph

89f5857

AndyAyersMS reviewed Mar 27, 2025


amanasifkhalid added 2 commits April 4, 2025 12:05

Merge branch 'main' into profile-repair-post-morph

ea03018

Add unique phase IDs; reuse flowgraph annotations

0a891a4

AndyAyersMS approved these changes Apr 4, 2025


amanasifkhalid merged commit b3e1d4a into dotnet:main Apr 5, 2025
115 of 119 checks passed

amanasifkhalid deleted the profile-repair-post-morph branch April 5, 2025 00:48

amanasifkhalid mentioned this pull request Apr 6, 2025

JIT: Skip rarely-run block expansion during flow opts when we have profile data #114311

Merged

This was referenced Apr 8, 2025

[Perf] Windows/x64: 12 Regressions on 4/5/2025 12:48:28 AM +00:00 #114382

Closed

[Perf] Linux/arm64: 1 Improvement on 4/5/2025 1:56:39 AM +00:00 dotnet/perf-autofiling-issues#53316

Closed

This was referenced Apr 24, 2025

[Perf] Windows/arm64: 1 Improvement on 4/5/2025 5:40:23 AM +00:00 dotnet/perf-autofiling-issues#54352

Closed

[Perf] Windows/arm64: 2 Improvements on 4/5/2025 1:56:39 AM +00:00 dotnet/perf-autofiling-issues#54351

Closed

amanasifkhalid mentioned this pull request Apr 25, 2025

[Perf] Linux/x64: Span.IndexerBench Regression + Related Issues on 2/6/2025 4:29:54 PM +00:00 #112430

Closed

github-actions bot locked and limited conversation to collaborators May 5, 2025
