Skip to content

JIT: Flow Graph Modernization and Improved Block Layout #93020

Closed

Description

Overview

The current block layout algorithm in the JIT is based on local permutations of block order. It is complicated and likely far from optimal. We would like to improve the overall block layout algorithm used by the JIT, in particular adopting a global cost-minimizing approach to layout—for instance, one in the style of Young et. al.'s Near-optimal intraprocedural branch alignment. Additional complexities arise in our case because of various EH reporting requirements, so the JIT cannot freely reorder all blocks, but we should be able to apply the global ordering techniques within EH regions.

Before we can tackle this problem there are several important (and sizeable) prerequisites, which we can lump together as "flow graph modernization." There are a lot of details here, but at a high level:

  • Update the flow graph to make fall-through behavior explicit for most phases (until layout). This means disallowing BBJ_NONE and adding explicit fall through successors for BBJ_COND and BBJ_SWITCH. More on this below.
  • Defer most block reordering work until late in the JIT phase pipeline (ideally, perhaps, waiting until after LSRA has run, so we can properly position any new blocks it introduces).
  • Leverage the work done in .NET 8 to make flow edges persistent and ensure that those edges accurately describe successor likelihoods. Possibly make successor edge enumeration a first-class concept.

It is not yet clear how much progress we can make during .NET 9. The list of items below is preliminary and subject to change.

Motivation

Past studies have shown that the two phases that benefit most from block-level PGO data are inlining and block layout. In a previous compiler project, the net benefit from PGO was on the order of 15%, with about 12% attributable to inlining, and 2% to improved layout.

The current JIT is likely seeing a much smaller benefit from layout. The goal here is to ensure that we are using the accurate PGO data to make informed decisions about the ordering of blocks, with the hope of realizing perhaps a 1 or 2% net benefit across a wide range of applications (with some benefiting much more, and others, not at all).

Flow Graph Modernization

Block Layout

  • Look into running existing layout after LSRA. Note there may be some tricky interactions, if say reversing a conditional perturbs the allocation for some reason
  • Building on the work above, leverage block weights and successor edge likelihoods to build a good initial layout via something like a greedy RPO
  • Implement some sort of layout score describing costs of a layout
  • (Possibly) find a way of estimating the optimal layout score
  • (stretch) Implement a scheme to improve layout based on lowering score; one such scheme is described in Near-optimal intraprocedural branch alignment
  • Consider removing Compiler::fgFindInsertPoint, and similar logic that attempts to maintain reasonable orderings before block layout is run (see comment).
  • Consider skipping loop compaction and relying on the new layout algorithm to place loop bodies contiguously.
  • Think about the interaction of layout and hot/cold splitting

cc @amanasifkhalid @dotnet/jit-contrib

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Metadata

Assignees

Labels

User StoryA single user-facing feature. Can be grouped under an epic.area-CodeGen-coreclrCLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Type

No type

Projects

Relationships

None yet

Development

No branches or pull requests

Issue actions