Skip to content

JIT: Inlining via splitting#124870

Draft
jakobbotsch wants to merge 26 commits intodotnet:mainfrom
jakobbotsch:inlining-via-splitting
Draft

JIT: Inlining via splitting#124870
jakobbotsch wants to merge 26 commits intodotnet:mainfrom
jakobbotsch:inlining-via-splitting

Conversation

@jakobbotsch
Copy link
Member

Reviving my prototype as I have some runtime async work that this would simplify...

Copilot AI review requested due to automatic review settings February 25, 2026 19:59
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Feb 25, 2026
@dotnet-policy-service
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR revives/implements an alternative inlining pipeline in the CoreCLR JIT based on splitting trees/blocks around inline candidates (instead of using GT_RET_EXPR placeholders), with follow-on updates across importer, inliner, and various optimizations that previously depended on GT_RET_EXPR.

Changes:

  • Remove GT_RET_EXPR from the IR and update related utilities (node lists, cloning, side-effect checks, class-handle queries, etc.).
  • Rework inlining to inline directly from GT_CALL sites using statement/tree splitting plus new setup/teardown statement list plumbing.
  • Update guarded devirtualization/fat calli transformations, struct return/retbuf handling, and instrumentation to work without GT_RET_EXPR.

Reviewed changes

Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
src/coreclr/jit/simd.cpp Stop treating GT_RET_EXPR as a call-like SIMD stack value; normalize only GT_CALL.
src/coreclr/jit/lclvars.cpp Add a null-layout guard to struct promotion eligibility checks.
src/coreclr/jit/inline.h Adjust inline policy/debug hooks, InlineResult ctor signature, introduce InlineIRResult, add StatementListBuilder, track retbuf arg info.
src/coreclr/jit/inline.cpp Wire InlineResult construction to accept an InlineContext* and update policy notes accordingly.
src/coreclr/jit/indirectcalltransformer.cpp Refactor call discovery and operand spilling; handle STORE_LCL_VAR(call) statement shapes.
src/coreclr/jit/importervectorization.cpp Remove GT_RET_EXPR-based span/call handling paths.
src/coreclr/jit/importercalls.cpp Rework importer handling for inline/GDV candidates without GT_RET_EXPR; adjust some intrinsic patterns and candidate bookkeeping.
src/coreclr/jit/importer.cpp Remove various GT_RET_EXPR-specific normalization/spill/return handling paths; adapt retbuf and inline return plumbing.
src/coreclr/jit/gtstructs.h Remove RetExpr node struct mapping.
src/coreclr/jit/gtlist.h Remove GTNODE(RET_EXPR, ...) from the node list.
src/coreclr/jit/gentree.h Remove GenTreeRetExpr definition.
src/coreclr/jit/gentree.cpp Remove GT_RET_EXPR handling across layout queries, cloning constraints, leaf display, side-effect logic; extend gtSplitTree signature/behavior.
src/coreclr/jit/fgstmt.cpp Add helpers to splice whole statement lists into blocks (before/at end).
src/coreclr/jit/fgprofile.cpp Instrumentation temporarily links inlinee “return” IR using InlineIRResult instead of GT_RET_EXPR.
src/coreclr/jit/fgopt.cpp Simplify async-call scanning by removing GT_RET_EXPR recursion.
src/coreclr/jit/fginline.cpp Major rewrite: new walker that inlines from GT_CALL via splitting + statement/block insertion; adds setup/teardown list building.
src/coreclr/jit/fgbasic.cpp Add fgSplitBlockBeforeStatement helper.
src/coreclr/jit/debuginfo.cpp Disable (comment out) debug-info validation checks.
src/coreclr/jit/compiler.hpp Remove GT_RET_EXPR from operand visitation cases.
src/coreclr/jit/compiler.h Update signatures (gtSplitTree, inline helpers), add friendship and statement-list splicing APIs.
Comments suppressed due to low confidence (2)

src/coreclr/jit/inline.cpp:636

  • The comment for InlineResult::InlineResult still describes a "stmt" parameter, but the constructor now takes an InlineContext* instead. Update the parameter documentation to reflect the new signature to avoid confusion for future call sites.
// Arguments:
//   compiler      - the compiler instance examining a call for inlining
//   call          - the call in question
//   stmt          - statement containing the call (if known)
//   description   - string describing the context of the decision

src/coreclr/jit/importer.cpp:405

  • In impAppendStmt, the inner if (call->ShouldHaveRetBufArg()) is redundant because it's nested under if (call->TypeIs(TYP_VOID) && call->ShouldHaveRetBufArg()). As written, the else branch is unreachable; consider simplifying to a single path (or, if the two branches were meant to distinguish different shapes, adjust the condition so both are reachable as intended).
            if (call->TypeIs(TYP_VOID) && call->ShouldHaveRetBufArg())
            {
                GenTree* retBuf;
                if (call->ShouldHaveRetBufArg())
                {
                    assert(call->gtArgs.HasRetBuffer());
                    retBuf = call->gtArgs.GetRetBufferArg()->GetNode();
                }
                else
                {
                    assert(!call->gtArgs.HasThisPointer());
                    retBuf = call->gtArgs.GetArgByIndex(0)->GetNode();
                }

Copilot AI review requested due to automatic review settings February 25, 2026 21:43
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 5 comments.

Comments suppressed due to low confidence (3)

src/coreclr/jit/inline.cpp:1314

  • Line 1314 contains a commented-out DebugInfo validation call that appears to be intentionally disabled. This should either be removed if no longer needed, or uncommented if the validation is important. The comment doesn't explain why it's disabled, which makes it unclear whether this is temporary debugging code or intentionally disabled for a reason.
    // DebugInfo(parentContext, context->m_Location).Validate();

src/coreclr/jit/indirectcalltransformer.cpp:997

  • On line 996, lvSingleDef is unconditionally set to false when doesReturnValue is true, but this is done after potentially demoting the call to non-inline in the if block above (lines 968-978). If the call was demoted and the statement was already added, we're still modifying lvSingleDef. While this may be correct (since multiple defs will exist after GDV transformation), the logic would be clearer if this was moved inside the else block starting at line 980 where we know the call remains an inline candidate.
            if (doesReturnValue)
            {
                compiler->lvaGetDesc(stmt->GetRootNode()->AsLclVarCommon())->lvSingleDef = false;
            }

src/coreclr/jit/importer.cpp:789

  • On line 785, the condition checks !srcCall->IsInlineCandidate() before marking the local as DNER. However, this means that if a call IS an inline candidate, the local won't be marked DNER even though it's being used as a hidden buffer struct arg. If the inline later fails, the local may still be in an enregisterable state which could cause issues. The old code didn't have this condition. Consider whether this check is correct or if it should be removed.
            if (destAddr->OperIs(GT_LCL_ADDR) && !srcCall->IsInlineCandidate())
            {
                lvaSetVarDoNotEnregister(destAddr->AsLclVarCommon()->GetLclNum()
                                             DEBUGARG(DoNotEnregisterReason::HiddenBufferStructArg));
            }

Comment on lines +594 to +599
// TODO: move to InlineInfo
struct InlineIRResult
{
GenTree* substExpr;
BasicBlock* substBB;
};
Copy link

Copilot AI Feb 25, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The TODO comment "TODO: move to InlineInfo" on line 594 suggests this is incomplete work. The InlineIRResult struct should likely be moved into InlineInfo rather than being a separate struct that's embedded in InlineCandidateInfo. This would make the structure more logical since the result is specific to each inline attempt.

Copilot uses AI. Check for mistakes.
@jakobbotsch
Copy link
Member Author

jakobbotsch commented Feb 25, 2026

One source of regressions is that for stfld we create IR of the shape

[000011] nACXG---R--                           STORE_BLK struct<System.TimeSpan, 8> (copy)
[000010] ---X-------                         ├──▌  FIELD_ADDR byref  <unknown class>:<unknown field>
[000006] -----------                           └──▌  LCL_VAR   ref    V00 this         
[000009] I-C-G------                         └──▌  CALL      struct System.TimeSpan:FromSeconds(long):System.TimeSpan (exactContextHandle=0x00007FF94C1F7CE1)
[000008] ----------- arg0                       └──▌  CNS_INT   long   10

Note GTF_EXCEPT on [000010]. That's clearly wrong when [000009] is the data that came from the IL stack and should be evaluated before [000010].

We make up for it in morph since we usually fold the NRE side effect of the address computation into the store itself, and when we do that we end up evaluating the call first. But when the call is an inline candidate we now end up spilling [000010] when we split, whereas before it would be a RET_EXPR and the rest of the inlinee statements would end up before the entire store.

Some kind of fix is needed here (as a separate PR). We could set GTF_REVERSE_OPS on the store, but that is not something we typically do during import.

(this issue is similar to #77650)

Copilot AI review requested due to automatic review settings February 26, 2026 13:48
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 1 comment.

Copilot AI review requested due to automatic review settings February 26, 2026 13:58
Copy link
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 2 comments.

Comment on lines 1258 to 1260
// Detach the GT_CALL tree from the original statement by
// hanging a "nothing" node to it. Later the "nothing" node will be removed
// and the original GT_CALL tree will be picked up by the GT_RET_EXPR node.
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment describes the old GT_RET_EXPR-based detachment/reattachment flow ("picked up by the GT_RET_EXPR node"), but GT_RET_EXPR is removed in this PR and the substitution is now represented via InlineCandidateInfo::result. Please update the comment to reflect the new mechanism (or remove it) to avoid confusion when debugging inline failures.

Suggested change
// Detach the GT_CALL tree from the original statement by
// hanging a "nothing" node to it. Later the "nothing" node will be removed
// and the original GT_CALL tree will be picked up by the GT_RET_EXPR node.
// Mark this candidate as having no successful inline substitution by
// clearing the recorded inline result. This ensures the original GT_CALL
// remains in the IR instead of being replaced by an inlined expression.

Copilot uses AI. Check for mistakes.
Comment on lines +343 to +351
JITDUMP("Inlining candidate [%06u] in " FMT_STMT "\n", Compiler::dspTreeID(call), m_statement->GetID());
DISPSTMT(m_statement);

fgWalkResult PostOrderVisit(GenTree** use, GenTree* user)
{
LateDevirtualization(use, user);
return fgWalkResult::WALK_CONTINUE;
}
if (InsertMidStatement(inlineInfo, use))
{
m_nextBlock = m_block;
m_nextStatement = m_statement;
return true;
}
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On successful inlines (including the splitting/CFG rewrite paths), this method returns true without setting m_madeChanges. As a result, fgInline() can incorrectly return MODIFIED_NOTHING even though it mutated IR/CFG, which can break downstream phase gating and debug asserts. Set m_madeChanges = true whenever an inline succeeds/changes IR (e.g., before returning true from TryInline, and/or have fgInline() treat any restart as a modification).

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants