[InstCombine] Always rewrite multi-use GEP for pointer difference #142787

Open: wants to merge 4 commits into main

nikic (Contributor) commented Jun 4, 2025

Resolve the TODO in OptimizePointerDifference: Always rewrite (non-trivial, multi-use) GEPs to reuse the offset arithmetic, even if there is only one GEP involved in the subtraction.

I plan to extend this code to deeper GEP chains, in which case always rewriting seems like a good idea.
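
To make the intent concrete, here is a rough source-level analogy in C++ of what the rewrite achieves for the single-GEP case exercised by the new `sub_multi_use` test. It is only a sketch and not the InstCombine implementation: `use`, `last_used`, `diff_before`, and `diff_after` are hypothetical names introduced for illustration, and the element type is assumed to be a 4-byte integer to match the `shl ... 2` in the test's CHECK lines.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical sink standing in for the test's `@use(ptr)`. It gives the
// derived pointer a second use besides the pointer difference.
static const void *last_used = nullptr;
void use(const void *p) { last_used = p; }

// Before: the pointer is formed by indexing, and the difference is computed
// separately from the two addresses.
std::int64_t diff_before(std::int32_t *base, std::int64_t idx) {
  std::int32_t *p2 = base + idx;
  use(p2);
  return reinterpret_cast<std::intptr_t>(p2) -
         reinterpret_cast<std::intptr_t>(base);
}

// After: the byte offset is computed once and reused, both to form the
// pointer (a byte-wise add, like the i8 GEP in the CHECK lines) and as the
// result of the subtraction.
std::int64_t diff_after(std::int32_t *base, std::int64_t idx) {
  std::int64_t byte_off =
      idx * static_cast<std::int64_t>(sizeof(std::int32_t));
  auto *p2 = reinterpret_cast<std::int32_t *>(
      reinterpret_cast<char *>(base) + byte_off);
  use(p2);
  return byte_off;
}
```

Both functions return the same value for in-bounds indices. The second form mirrors the expected output in the test, where the `ret` reuses the same `[[P2_IDX]]` value that feeds the rewritten GEP.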

nikic added 2 commits June 4, 2025 17:08
Resolve the TODO in OptimizePointerDifference: Always rewrite
(non-trivial, multi-use) GEPs to reuse the offset arithmetic,
even if there is only one GEP.
nikic requested a review from dtcxzyw on June 4, 2025 15:14
nikic changed the title from "Instcombine always rewrite gep" to "[InstCombine] Always rewrite multi-use GEP for pointer difference" on Jun 4, 2025
llvmbot (Member) commented Jun 4, 2025

@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Full diff: https://github.com/llvm/llvm-project/pull/142787.diff

2 Files Affected:

  • (modified) llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp (+4-7)
  • (modified) llvm/test/Transforms/InstCombine/sub-gep.ll (+32)
diff --git a/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp b/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
index a9ac5ff9b9c89..fda56c93451e8 100644
--- a/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
+++ b/llvm/lib/Transforms/InstCombine/InstCombineAddSub.cpp
@@ -2101,21 +2101,18 @@ Value *InstCombinerImpl::OptimizePointerDifference(Value *LHS, Value *RHS,
   if (!GEP1)
     return nullptr;
 
+  // Emit the offset of the GEP as an intptr_t.
   // To avoid duplicating the offset arithmetic, rewrite the GEP to use the
   // computed offset. This may erase the original GEP, so be sure to cache the
   // nowrap flags before emitting the offset.
-  // TODO: We should probably do this even if there is only one GEP.
-  bool RewriteGEPs = GEP2 != nullptr;
-
-  // Emit the offset of the GEP and an intptr_t.
   GEPNoWrapFlags GEP1NW = GEP1->getNoWrapFlags();
-  Value *Result = EmitGEPOffset(GEP1, RewriteGEPs);
+  Value *Result = EmitGEPOffset(GEP1, /*RewriteGEP=*/true);
 
   // If this is a single inbounds GEP and the original sub was nuw,
   // then the final multiplication is also nuw.
   if (auto *I = dyn_cast<Instruction>(Result))
     if (IsNUW && !GEP2 && !Swapped && GEP1NW.isInBounds() &&
-        I->getOpcode() == Instruction::Mul)
+        I->getOpcode() == Instruction::Mul && I->use_empty())
       I->setHasNoUnsignedWrap();
 
   // If we have a 2nd GEP of the same base pointer, subtract the offsets.
@@ -2123,7 +2120,7 @@ Value *InstCombinerImpl::OptimizePointerDifference(Value *LHS, Value *RHS,
   // If both GEPs are nuw and the original sub is nuw, the new sub is also nuw.
   if (GEP2) {
     GEPNoWrapFlags GEP2NW = GEP2->getNoWrapFlags();
-    Value *Offset = EmitGEPOffset(GEP2, RewriteGEPs);
+    Value *Offset = EmitGEPOffset(GEP2, /*RewriteGEP=*/true);
     Result = Builder.CreateSub(Result, Offset, "gepdiff",
                                IsNUW && GEP1NW.hasNoUnsignedWrap() &&
                                    GEP2NW.hasNoUnsignedWrap(),
diff --git a/llvm/test/Transforms/InstCombine/sub-gep.ll b/llvm/test/Transforms/InstCombine/sub-gep.ll
index c86a1a37bd7ad..e6f6498e23389 100644
--- a/llvm/test/Transforms/InstCombine/sub-gep.ll
+++ b/llvm/test/Transforms/InstCombine/sub-gep.ll
@@ -760,6 +760,38 @@ entry:
   ret i64 %ret
 }
 
+declare void @use(ptr)
+
+define i64 @sub_multi_use(ptr %base, i64 %idx) {
+; CHECK-LABEL: @sub_multi_use(
+; CHECK-NEXT:    [[P2_IDX:%.*]] = shl nsw i64 [[IDX:%.*]], 2
+; CHECK-NEXT:    [[P2:%.*]] = getelementptr inbounds i8, ptr [[BASE:%.*]], i64 [[P2_IDX]]
+; CHECK-NEXT:    call void @use(ptr [[P2]])
+; CHECK-NEXT:    ret i64 [[P2_IDX]]
+;
+  %p2 = getelementptr inbounds [0 x i32], ptr %base, i64 0, i64 %idx
+  call void @use(ptr %p2)
+  %i1 = ptrtoint ptr %base to i64
+  %i2 = ptrtoint ptr %p2 to i64
+  %d = sub i64 %i2, %i1
+  ret i64 %d
+}
+
+define i64 @sub_multi_use_nuw(ptr %base, i64 %idx) {
+; CHECK-LABEL: @sub_multi_use_nuw(
+; CHECK-NEXT:    [[P2_IDX:%.*]] = shl nsw i64 [[IDX:%.*]], 2
+; CHECK-NEXT:    [[P2:%.*]] = getelementptr inbounds i8, ptr [[BASE:%.*]], i64 [[P2_IDX]]
+; CHECK-NEXT:    call void @use(ptr [[P2]])
+; CHECK-NEXT:    ret i64 [[P2_IDX]]
+;
+  %p2 = getelementptr inbounds [0 x i32], ptr %base, i64 0, i64 %idx
+  call void @use(ptr %p2)
+  %i1 = ptrtoint ptr %base to i64
+  %i2 = ptrtoint ptr %p2 to i64
+  %d = sub nuw i64 %i2, %i1
+  ret i64 %d
+}
+
 define i1 @_gep_phi1(ptr %str1) {
 ; CHECK-LABEL: @_gep_phi1(
 ; CHECK-NEXT:  entry:

nikic (Contributor, Author) commented Jun 5, 2025

> I plan to extend this code to deeper GEP chains, in which case always rewriting seems like a good idea.

Based on the llvm-opt-benchmark results for #142958, this doesn't seem to be strictly necessary in practice, so I've made that change independently of this one. The llvm-opt-benchmark diffs for this one look fairly neutral overall, but it's a bit hard to judge with this kind of change.
