Permute the pass pipeline to coalesce before setting up the matmul (#2956)

Required for #2834 

There are two reasons for this change. First, it tags the layouts with
their memory order very early in the TTGIR pipeline. Second, it moves our
TTGIR pipeline closer to upstream. I am splitting this change out so that
any regressions or undesired behavior caused by the reordering can be
isolated from those caused by changing the DPAS layouts in #2834.

cc #2354
alexbaden authored Dec 9, 2024
1 parent ddecf19 commit cacfe10
Showing 1 changed file with 4 additions and 2 deletions.
6 changes: 4 additions & 2 deletions third_party/intel/backend/compiler.py
@@ -239,15 +239,17 @@ def make_ttgir(mod, metadata, opt, properties):
return XPUBackend.AdvancedPath.make_ttgir(mod, metadata, opt)

passes.ttir.add_convert_to_ttgpuir(pm, "xpu", opt.num_warps, opt.threads_per_warp, opt.num_ctas)
# optimize TTGIR
intel.passes.ttgpuir.add_coalesce(pm)
intel.passes.ttgpuir.add_remove_layout_conversions(pm)

intel.passes.ttgpuir.add_accelerate_matmul(pm)
intel.passes.ttgpuir.add_remove_layout_conversions(pm)
intel.passes.ttgpuir.add_materialize_block_pointer(pm)
if os.getenv("TRITON_INTEL_REWRITE_TENSOR_POINTER", "0") == "1":
    intel.passes.ttgpuir.add_rewrite_tensor_pointer(pm)
intel.passes.ttgpuir.add_pipeline(pm, opt.num_stages, False)

intel.passes.ttgpuir.add_coalesce(pm)
intel.passes.ttgpuir.add_remove_layout_conversions(pm)
passes.ttgpuir.add_optimize_thread_locality(pm)
passes.ttgpuir.add_optimize_dot_operands(pm, True)
passes.common.add_cse(pm)
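
The reordering in this hunk can be sketched in isolation. The snippet below is only an illustration, not Triton code: it models the passes as plain strings taken from the diff, and `ttgir_order` is a hypothetical helper. It assumes the two deleted lines are the later `add_coalesce` / `add_remove_layout_conversions` pair, and it leaves out the optional `add_rewrite_tensor_pointer` pass, which is only added when `TRITON_INTEL_REWRITE_TENSOR_POINTER` is set to `1`.

```python
# Pass names copied from the diff; only the slice of make_ttgir shown above.
# These are plain strings, not calls into the real Triton pass manager.
COALESCE = ["add_coalesce", "add_remove_layout_conversions"]
MATMUL_AND_PIPELINE = [
    "add_accelerate_matmul",
    "add_remove_layout_conversions",
    "add_materialize_block_pointer",
    "add_pipeline",
]
TAIL = ["add_optimize_thread_locality", "add_optimize_dot_operands", "add_cse"]


def ttgir_order(coalesce_first):
    """Hypothetical helper: return the pass order for the shown slice."""
    if coalesce_first:
        # Order after this commit: coalesce runs before the matmul setup.
        return COALESCE + MATMUL_AND_PIPELINE + TAIL
    # Assumed order before this commit: coalesce runs after add_pipeline.
    return MATMUL_AND_PIPELINE + COALESCE + TAIL


before = ttgir_order(coalesce_first=False)
after = ttgir_order(coalesce_first=True)

if __name__ == "__main__":
    for name, order in (("before", before), ("after", after)):
        print(name)
        for p in order:
            print("  ", p)
```

Under these assumptions both orders contain the same passes; only the position of the coalesce pair relative to `add_accelerate_matmul` changes, which is what lets the memory order be attached to the layouts before the matmul layouts are chosen.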