[InstCombine] Always rewrite multi-use GEP for pointer difference #142787

nikic · 2025-06-04T15:14:21Z

Resolve the TODO in OptimizePointerDifference: Always rewrite (non-trivial, multi-use) GEPs to reuse the offset arithmetic, even if there is only one GEP involved in the subtraction.

I plan to extend this code to deeper GEP chains, in which case always rewriting seems like a good idea.
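For readers less familiar with the transform, here is a rough source-level analogy of what the new `sub_multi_use` test checks. It is only a sketch in plain C++, not LLVM code: the names `use`, `last_used`, `sub_multi_use_before`, and `sub_multi_use_after` are hypothetical, and IR-level details such as `inbounds`, `nsw`, and `nuw` are not modelled. The point is that once the byte offset `idx * 4` is computed, both the pointer passed to the extra user and the returned difference can reuse it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the extra user of the GEP result
// (the `call void @use(ptr %p2)` in the test).
static const void *last_used = nullptr;
static void use(const void *p) { last_used = p; }

// Before: the pointer is formed by typed indexing, and the difference is
// recomputed from the two addresses (ptrtoint + sub in the IR).
std::int64_t sub_multi_use_before(const std::int32_t *base, std::int64_t idx) {
  const std::int32_t *p2 = base + idx;
  use(p2);
  auto i1 = reinterpret_cast<std::uintptr_t>(base);
  auto i2 = reinterpret_cast<std::uintptr_t>(p2);
  return static_cast<std::int64_t>(i2 - i1);
}

// After: the byte offset is computed once and reused for both the
// byte-wise pointer arithmetic and the returned difference.
std::int64_t sub_multi_use_after(const std::int32_t *base, std::int64_t idx) {
  std::int64_t offset = idx * static_cast<std::int64_t>(sizeof(std::int32_t));
  const char *p2 = reinterpret_cast<const char *>(base) + offset;
  use(p2);
  return offset;
}

// Convenience check that both forms agree for an in-range, non-negative index.
void check_same(const std::int32_t *base, std::int64_t idx) {
  assert(sub_multi_use_before(base, idx) == sub_multi_use_after(base, idx));
}
```

Calling `check_same(buf, 3)` on an `std::int32_t buf[8]` exercises both forms; the "after" form corresponds to the `shl` feeding both the `i8` GEP and the `ret` in the CHECK lines.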


llvmbot · 2025-06-04T15:15:10Z

@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Changes

Resolve the TODO in OptimizePointerDifference: Always rewrite (non-trivial, multi-use) GEPs to reuse the offset arithmetic, even if there is only one GEP involved in the subtraction.

I plan to extend this code to deeper GEP chains, in which case always rewriting seems like a good idea.

Full diff: https://github.com/llvm/llvm-project/pull/142787.diff

2 Files Affected:

(modified) llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp (+4-7)
(modified) llvm/test/Transforms/InstCombine/sub-gep.ll (+32)

diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
index a9ac5ff9b9c89..fda56c93451e8 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
@@ -2101,21 +2101,18 @@ Value *InstCombinerImpl::OptimizePointerDifference(Value *LHS, Value *RHS,
   if (!GEP1)
     return nullptr;
 
+  // Emit the offset of the GEP as an intptr_t.
   // To avoid duplicating the offset arithmetic, rewrite the GEP to use the
   // computed offset. This may erase the original GEP, so be sure to cache the
   // nowrap flags before emitting the offset.
-  // TODO: We should probably do this even if there is only one GEP.
-  bool RewriteGEPs = GEP2 != nullptr;
-
-  // Emit the offset of the GEP and an intptr_t.
   GEPNoWrapFlags GEP1NW = GEP1->getNoWrapFlags();
-  Value *Result = EmitGEPOffset(GEP1, RewriteGEPs);
+  Value *Result = EmitGEPOffset(GEP1, /*RewriteGEP=*/true);
 
   // If this is a single inbounds GEP and the original sub was nuw,
   // then the final multiplication is also nuw.
   if (auto *I = dyn_cast<Instruction>(Result))
     if (IsNUW && !GEP2 && !Swapped && GEP1NW.isInBounds() &&
-        I->getOpcode() == Instruction::Mul)
+        I->getOpcode() == Instruction::Mul && I->use_empty())
       I->setHasNoUnsignedWrap();
 
   // If we have a 2nd GEP of the same base pointer, subtract the offsets.
@@ -2123,7 +2120,7 @@ Value *InstCombinerImpl::OptimizePointerDifference(Value *LHS, Value *RHS,
   // If both GEPs are nuw and the original sub is nuw, the new sub is also nuw.
   if (GEP2) {
     GEPNoWrapFlags GEP2NW = GEP2->getNoWrapFlags();
-    Value *Offset = EmitGEPOffset(GEP2, RewriteGEPs);
+    Value *Offset = EmitGEPOffset(GEP2, /*RewriteGEP=*/true);
     Result = Builder.CreateSub(Result, Offset, "gepdiff",
                                IsNUW && GEP1NW.hasNoUnsignedWrap() &&
                                    GEP2NW.hasNoUnsignedWrap(),
diff --git a/llvm/test/Transforms/InstCombine/sub-gep.ll b/llvm/test/Transforms/InstCombine/sub-gep.ll
index c86a1a37bd7ad..e6f6498e23389 100644
--- a/llvm/test/Transforms/InstCombine/sub-gep.ll
+++ b/llvm/test/Transforms/InstCombine/sub-gep.ll
@@ -760,6 +760,38 @@ entry:
   ret i64 %ret
 }
 
+declare void @use(ptr)
+
+define i64 @sub_multi_use(ptr %base, i64 %idx) {
+; CHECK-LABEL: @sub_multi_use(
+; CHECK-NEXT:    [[P2_IDX:%.*]] = shl nsw i64 [[IDX:%.*]], 2
+; CHECK-NEXT:    [[P2:%.*]] = getelementptr inbounds i8, ptr [[BASE:%.*]], i64 [[P2_IDX]]
+; CHECK-NEXT:    call void @use(ptr [[P2]])
+; CHECK-NEXT:    ret i64 [[P2_IDX]]
+;
+  %p2 = getelementptr inbounds [0 x i32], ptr %base, i64 0, i64 %idx
+  call void @use(ptr %p2)
+  %i1 = ptrtoint ptr %base to i64
+  %i2 = ptrtoint ptr %p2 to i64
+  %d = sub i64 %i2, %i1
+  ret i64 %d
+}
+
+define i64 @sub_multi_use_nuw(ptr %base, i64 %idx) {
+; CHECK-LABEL: @sub_multi_use_nuw(
+; CHECK-NEXT:    [[P2_IDX:%.*]] = shl nsw i64 [[IDX:%.*]], 2
+; CHECK-NEXT:    [[P2:%.*]] = getelementptr inbounds i8, ptr [[BASE:%.*]], i64 [[P2_IDX]]
+; CHECK-NEXT:    call void @use(ptr [[P2]])
+; CHECK-NEXT:    ret i64 [[P2_IDX]]
+;
+  %p2 = getelementptr inbounds [0 x i32], ptr %base, i64 0, i64 %idx
+  call void @use(ptr %p2)
+  %i1 = ptrtoint ptr %base to i64
+  %i2 = ptrtoint ptr %p2 to i64
+  %d = sub nuw i64 %i2, %i1
+  ret i64 %d
+}
+
 define i1 @_gep_phi1(ptr %str1) {
 ; CHECK-LABEL: @_gep_phi1(
 ; CHECK-NEXT:  entry:

nikic · 2025-06-05T12:45:49Z

> I plan to extend this code to deeper GEP chains, in which case always rewriting seems like a good idea.

Based on the llvm-opt-benchmark results for #142958, this doesn't seem to be strictly necessary in practice, so I've made that change independently of this one. The llvm-opt-benchmark diffs for this one look fairly neutral overall, but it's a bit hard to judge with this kind of change.

nikic added 2 commits June 4, 2025 17:08

[InstCombine] Add test for sub gep with multi-use

6820867

[InstCombine] Always rewrite multi-use GEP for pointer difference

1583757

Resolve the TODO in OptimizePointerDifference: Always rewrite (non-trivial, multi-use) GEPs to reuse the offset arithmetic, even if there is only one GEP.

nikic requested a review from dtcxzyw June 4, 2025 15:14

llvmbot added llvm:instcombine llvm:transforms labels Jun 4, 2025

nikic changed the title ~~Instcombine always rewrite gep~~ [InstCombine] Always rewrite multi-use GEP for pointer difference Jun 4, 2025

nikic mentioned this pull request Jun 4, 2025

Task submission dtcxzyw/llvm-opt-benchmark#1312

Open

dtcxzyw mentioned this pull request Jun 4, 2025

pre-commit: PR142787 dtcxzyw/llvm-opt-benchmark#2397

Open

Clone instruction to add nuw flag

86d30ee

dtcxzyw mentioned this pull request Jun 5, 2025

pre-commit: PR142787 dtcxzyw/llvm-opt-benchmark#2401

Open

Don't bother cloning if nuw is already set (from gep nuw)

11c8483
