[InstCombine] Infer nusw + nneg -> nuw for getelementptr #111144

Open. nikic wants to merge 1 commit into main.

nikic (Contributor) commented Oct 4, 2024

If the gep is nusw (usually via inbounds) and the offset is non-negative, we can infer nuw.

Unfortunately this inference does have some compile-time overhead: https://llvm-compile-time-tracker.com/compare.php?from=37e5319a12ba47c18049728804d3d1e1b10c4eb4&to=af56d73d6543f05b1e5205b96934e2427bb24d72&stat=instructions:u

Proof: https://alive2.llvm.org/ce/z/ihztLy
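
For intuition only, the core of the argument can be checked exhaustively on a toy model. The sketch below assumes a hypothetical 8-bit address space and models only the final base + offset addition, not the index scaling that the flags also cover; the Alive2 proof above remains the authoritative check. The helper names (`addSatisfiesNusw`, `addSatisfiesNuw`, `countCounterexamples`) are made up for this illustration and are not part of the patch.

```cpp
#include <cstdint>

// Toy model of "nusw" on the final pointer add (assumed 8-bit address space):
// the base is treated as unsigned, the offset as signed, and the exact sum
// must stay within [0, 255].
bool addSatisfiesNusw(uint8_t base, int8_t offset) {
  int sum = int(base) + int(offset);
  return sum >= 0 && sum <= 255;
}

// Toy model of "nuw" on the same add: both operands are treated as unsigned,
// and the exact sum must not exceed 255.
bool addSatisfiesNuw(uint8_t base, int8_t offset) {
  int sum = int(base) + int(uint8_t(offset));
  return sum <= 255;
}

// Counts (base, offset) pairs where the offset is non-negative and nusw
// holds, but nuw does not. The inference is sound for this model only if
// the count is zero.
int countCounterexamples() {
  int count = 0;
  for (int b = 0; b <= 255; ++b) {
    for (int o = -128; o <= 127; ++o) {
      uint8_t base = uint8_t(b);
      int8_t offset = int8_t(o);
      if (offset >= 0 && addSatisfiesNusw(base, offset) &&
          !addSatisfiesNuw(base, offset))
        ++count;
    }
  }
  return count;
}
```

In this model `countCounterexamples()` finds no pair that breaks the inference, because a non-negative offset has the same value under the signed and unsigned interpretations. The non-negativity condition is needed: for base 16 and offset -8 the nusw check passes while the nuw check fails.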

llvmbot added the clang, backend:AMDGPU, backend:PowerPC, backend:SystemZ, PGO, coroutines, llvm:analysis, and llvm:transforms labels on Oct 4, 2024
llvmbot (Collaborator) commented Oct 4, 2024

@llvm/pr-subscribers-backend-amdgpu
@llvm/pr-subscribers-backend-systemz

@llvm/pr-subscribers-backend-powerpc

Author: Nikita Popov (nikic)


Patch is 1.28 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/111144.diff

183 Files Affected:

  • (modified) clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c (+5-5)
  • (modified) clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c (+14-14)
  • (modified) clang/test/CodeGen/aarch64-ls64-inline-asm.c (+10-10)
  • (modified) clang/test/CodeGen/arm64_32-vaarg.c (+6-6)
  • (modified) clang/test/CodeGen/attr-counted-by-pr110385.c (+2-2)
  • (modified) clang/test/CodeGen/attr-counted-by.c (+106-110)
  • (modified) clang/test/CodeGen/math-libcalls-tbaa.c (+7-7)
  • (modified) clang/test/CodeGen/union-tbaa1.c (+2-2)
  • (modified) clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu (+3-3)
  • (modified) clang/test/CodeGenCXX/auto-var-init.cpp (+4-4)
  • (modified) clang/test/CodeGenCXX/microsoft-abi-dynamic-cast.cpp (+9-9)
  • (modified) clang/test/CodeGenCXX/microsoft-abi-typeid.cpp (+1-1)
  • (modified) clang/test/CodeGenOpenCL/amdgpu-nullptr.cl (+2-2)
  • (modified) clang/test/CodeGenOpenCL/builtins-amdgcn.cl (+4-4)
  • (modified) clang/test/CodeGenOpenCLCXX/array-type-infinite-loop.clcpp (+12-12)
  • (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+9)
  • (modified) llvm/test/Analysis/BasicAA/featuretest.ll (+1-1)
  • (modified) llvm/test/Analysis/ValueTracking/phi-known-bits.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/implicit-arg-v5-opt.ll (+6-6)
  • (modified) llvm/test/CodeGen/AMDGPU/reqd-work-group-size.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/vector-alloca-bitcast.ll (+6-6)
  • (modified) llvm/test/Transforms/Coroutines/coro-async.ll (+12-12)
  • (modified) llvm/test/Transforms/Coroutines/coro-retcon-alloca-opaque-ptr.ll (+1-1)
  • (modified) llvm/test/Transforms/Coroutines/coro-retcon-alloca.ll (+1-1)
  • (modified) llvm/test/Transforms/Coroutines/coro-retcon-once-value.ll (+3-3)
  • (modified) llvm/test/Transforms/Coroutines/coro-retcon-resume-values.ll (+10-10)
  • (modified) llvm/test/Transforms/Coroutines/coro-swifterror.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/2007-03-25-BadShiftMask.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/2009-01-08-AlignAlloca.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/X86/x86-addsub-inseltpoison.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/X86/x86-addsub.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/array.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/assume-align.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/assume-loop-align.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/assume-redundant.ll (+5-1)
  • (modified) llvm/test/Transforms/InstCombine/assume.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/call-cast-target.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/cast_phi.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/cast_ptr.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/catchswitch-phi.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/compare-alloca.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/compare-unescaped.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/dependent-ivs.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/extractvalue.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/fmul.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/fsh.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/gep-addrspace.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/gep-canonicalize-constant-indices.ll (+9-9)
  • (modified) llvm/test/Transforms/InstCombine/gep-combine-loop-invariant.ll (+14-14)
  • (modified) llvm/test/Transforms/InstCombine/gep-merge-constant-indices.ll (+12-12)
  • (modified) llvm/test/Transforms/InstCombine/gep-vector-indices.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/gepphigep.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/getelementptr.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/icmp-custom-dl.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/icmp-gep.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/icmp.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/inbounds-gep.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/indexed-gep-compares.ll (+9-5)
  • (modified) llvm/test/Transforms/InstCombine/intptr1.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/intptr7.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/load-bitcast-select.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/mem-par-metadata-memcpy.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/memccpy.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/memcpy_alloca.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/mempcpy.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/memset2.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/opaque-ptr.ll (+10-6)
  • (modified) llvm/test/Transforms/InstCombine/phi-equal-incoming-pointers.ll (+14-14)
  • (modified) llvm/test/Transforms/InstCombine/phi-timeout.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/phi.ll (+22-22)
  • (modified) llvm/test/Transforms/InstCombine/ptr-replace-alloca.ll (+8-8)
  • (modified) llvm/test/Transforms/InstCombine/ptrmask.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/remove-loop-phi-multiply-by-zero.ll (+10-10)
  • (modified) llvm/test/Transforms/InstCombine/select-cmp-br.ll (+8-8)
  • (modified) llvm/test/Transforms/InstCombine/select-gep.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/sink_sideeffecting_instruction.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/snprintf-2.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/snprintf-3.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/snprintf-4.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/snprintf.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/sprintf-1.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/stpncpy-1.ll (+32-32)
  • (modified) llvm/test/Transforms/InstCombine/str-int.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/strlcpy-1.ll (+14-14)
  • (modified) llvm/test/Transforms/InstCombine/strlen-1.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/struct-assign-tbaa-2.ll (+4-2)
  • (modified) llvm/test/Transforms/InstCombine/sub.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/unpack-fca.ll (+26-26)
  • (modified) llvm/test/Transforms/InstCombine/vec_gep_scalar_arg-inseltpoison.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/vec_gep_scalar_arg.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/vscale_gep.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/wcslen-1.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/wcslen-3.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/wcslen-5.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/AArch64/runtime-unroll-generic.ll (+24-24)
  • (modified) llvm/test/Transforms/LoopUnroll/ARM/upperbound.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopUnroll/WebAssembly/basic-unrolling.ll (+31-31)
  • (modified) llvm/test/Transforms/LoopUnroll/peel-loop.ll (+6-6)
  • (modified) llvm/test/Transforms/LoopUnroll/runtime-unroll-remainder.ll (+8-8)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/deterministic-type-shrinkage.ll (+54-54)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-cond-inv-loads.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-gather-scatter.ll (+25-25)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll (+198-198)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-masked-accesses.ll (+95-95)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-phi.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt-epilogue.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt-no-scalar-interleave.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt-too-many-deps.ll (+5-5)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt.ll (+14-14)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/uniform-args-call-variants.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/AMDGPU/packed-math.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopVectorize/ARM/mve-reductions.ll (+12-12)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/float-induction-x86.ll (+36-36)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/interleaving.ll (+44-44)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/invariant-load-gather.ll (+24-24)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/invariant-store-vectorization.ll (+51-51)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/metadata-enable.ll (+474-474)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/pr23997.ll (+8-8)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll (+325-325)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-store-accesses-with-gaps.ll (+22-22)
  • (modified) llvm/test/Transforms/LoopVectorize/extract-last-veclane.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/float-induction.ll (+150-150)
  • (modified) llvm/test/Transforms/LoopVectorize/forked-pointers.ll (+15-15)
  • (modified) llvm/test/Transforms/LoopVectorize/histograms.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/induction.ll (+42-42)
  • (modified) llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll (+5-5)
  • (modified) llvm/test/Transforms/LoopVectorize/invariant-store-vectorization-2.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopVectorize/invariant-store-vectorization.ll (+11-11)
  • (modified) llvm/test/Transforms/LoopVectorize/loop-scalars.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopVectorize/reduction-inloop-uf4.ll (+229-229)
  • (modified) llvm/test/Transforms/LoopVectorize/reduction-inloop.ll (+4-4)
  • (modified) llvm/test/Transforms/LoopVectorize/reduction.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/runtime-check.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/scalar_after_vectorization.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/trunc-reductions.ll (+92-32)
  • (modified) llvm/test/Transforms/LoopVectorize/vector-geps.ll (+5-5)
  • (modified) llvm/test/Transforms/LoopVersioningLICM/loopversioningLICM1.ll (+13-13)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/hoist-runtime-checks.ll (+5-5)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/hoisting-sinking-required-for-vectorization.ll (+10-10)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/indvars-vectorization.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/loopflatten.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/matrix-extract-insert.ll (+23-23)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/quant_4x4.ll (+48-48)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/sinking-vs-if-conversion.ll (+7-7)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/slpordering.ll (+8-8)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/excessive-unrolling.ll (+12-12)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/hoist-load-of-baseptr.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/merge-functions2.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/merge-functions3.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/pixel-splat.ll (+4-4)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/pr50555.ll (+5-5)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/simplifycfg-late.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/speculation-vs-tbaa.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/spurious-peeling.ll (+4-4)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll (+24-24)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vec-load-combine.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vec-shift.ll (+4-4)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vector-reduction-known-first-value.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/basic.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/bitcast-store-branch.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/dce-after-argument-promotion-loads.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/gvn-replacement-vs-hoist.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/loop-access-checks.ll (+4-4)
  • (modified) llvm/test/Transforms/PhaseOrdering/lto-licm.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/pr39282.ll (+10-10)
  • (modified) llvm/test/Transforms/PhaseOrdering/pr98799-inline-simplifycfg-ub.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/scev-custom-dl.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/simplifycfg-options.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/single-iteration-loop-sroa.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/switch_with_geps.ll (+2-2)
  • (modified) llvm/test/Transforms/RewriteStatepointsForGC/intrinsics.ll (+12-12)
  • (modified) llvm/test/Transforms/SLPVectorizer/AArch64/gather-cost.ll (+6-6)
  • (modified) llvm/test/Transforms/SLPVectorizer/AArch64/getelementptr.ll (+2-2)
  • (modified) llvm/test/Transforms/SLPVectorizer/AArch64/getelementptr2.ll (+2-2)
  • (modified) llvm/test/Transforms/SLPVectorizer/AArch64/loadorder.ll (+32-32)
  • (modified) llvm/test/Transforms/SLPVectorizer/WebAssembly/no-vectorize-rotate.ll (+1-1)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/minimum-sizes.ll (+4-4)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/opt.ll (+3-3)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/pr47629-inseltpoison.ll (+131-131)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/pr47629.ll (+131-131)
  • (modified) llvm/test/Transforms/SampleProfile/pseudo-probe-instcombine.ll (+5-5)
  • (modified) llvm/test/Transforms/SimpleLoopUnswitch/AMDGPU/uniform-unswitch.ll (+1-1)
  • (modified) llvm/test/Transforms/SimplifyCFG/Hexagon/switch-to-lookup-table.ll (+1-1)
diff --git a/clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c b/clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c
index 5422d993ff1575..08ff936a0a797b 100644
--- a/clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c
+++ b/clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c
@@ -25,13 +25,13 @@ void test1(unsigned char *vqp, unsigned char *vpp, vector unsigned char vc, unsi
 // CHECK-NEXT:    [[TMP2:%.*]] = extractvalue { <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> } [[TMP1]], 0
 // CHECK-NEXT:    store <16 x i8> [[TMP2]], ptr [[RESP:%.*]], align 16
 // CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> } [[TMP1]], 1
-// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 16
+// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 16
 // CHECK-NEXT:    store <16 x i8> [[TMP3]], ptr [[TMP4]], align 16
 // CHECK-NEXT:    [[TMP5:%.*]] = extractvalue { <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> } [[TMP1]], 2
-// CHECK-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 32
+// CHECK-NEXT:    [[TMP6:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 32
 // CHECK-NEXT:    store <16 x i8> [[TMP5]], ptr [[TMP6]], align 16
 // CHECK-NEXT:    [[TMP7:%.*]] = extractvalue { <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> } [[TMP1]], 3
-// CHECK-NEXT:    [[TMP8:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 48
+// CHECK-NEXT:    [[TMP8:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 48
 // CHECK-NEXT:    store <16 x i8> [[TMP7]], ptr [[TMP8]], align 16
 // CHECK-NEXT:    ret void
 //
@@ -60,7 +60,7 @@ void test3(unsigned char *vqp, unsigned char *vpp, vector unsigned char vc, unsi
 // CHECK-NEXT:    [[TMP2:%.*]] = extractvalue { <16 x i8>, <16 x i8> } [[TMP1]], 0
 // CHECK-NEXT:    store <16 x i8> [[TMP2]], ptr [[RESP:%.*]], align 16
 // CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <16 x i8>, <16 x i8> } [[TMP1]], 1
-// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 16
+// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 16
 // CHECK-NEXT:    store <16 x i8> [[TMP3]], ptr [[TMP4]], align 16
 // CHECK-NEXT:    ret void
 //
@@ -1072,7 +1072,7 @@ void test76(unsigned char *vqp, unsigned char *vpp, vector unsigned char vc, uns
 // CHECK-NEXT:    [[TMP2:%.*]] = extractvalue { <16 x i8>, <16 x i8> } [[TMP1]], 0
 // CHECK-NEXT:    store <16 x i8> [[TMP2]], ptr [[RESP:%.*]], align 16
 // CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <16 x i8>, <16 x i8> } [[TMP1]], 1
-// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 16
+// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 16
 // CHECK-NEXT:    store <16 x i8> [[TMP3]], ptr [[TMP4]], align 16
 // CHECK-NEXT:    ret void
 //
diff --git a/clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c b/clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c
index 6194c9b1804fb0..2d9629eff3c98c 100644
--- a/clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c
+++ b/clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c
@@ -48,21 +48,21 @@ void test_indexing(struct Foo *f) {
 
 void test_indexing_2(struct Foo *f) {
   // X64-LABEL: define void @test_indexing_2(ptr noundef %f)
-  // X64: getelementptr inbounds i8, ptr addrspace(1) {{%[0-9]}}, i32 16
-  // X64: getelementptr inbounds i8, ptr {{%[0-9]}}, i64 24
+  // X64: getelementptr inbounds nuw i8, ptr addrspace(1) {{%[0-9]}}, i32 16
+  // X64: getelementptr inbounds nuw i8, ptr {{%[0-9]}}, i64 24
   f->cp64 = ((char *** __ptr32 *)1028)[1][2][3];
   use_foo(f);
 }
 
 unsigned long* test_misc() {
   // X64-LABEL: define ptr @test_misc()
-  // X64: %arrayidx = getelementptr inbounds i8, ptr addrspace(1) %0, i32 88
+  // X64: %arrayidx = getelementptr inbounds nuw i8, ptr addrspace(1) %0, i32 88
   // X64-NEXT: %1 = load ptr, ptr addrspace(1) %arrayidx
-  // X64-NEXT: %arrayidx1 = getelementptr inbounds i8, ptr %1, i64 8
+  // X64-NEXT: %arrayidx1 = getelementptr inbounds nuw i8, ptr %1, i64 8
   // X64-NEXT: %2 = load ptr, ptr %arrayidx1
-  // X64-NEXT: %arrayidx2 = getelementptr inbounds i8, ptr %2, i64 904
+  // X64-NEXT: %arrayidx2 = getelementptr inbounds nuw i8, ptr %2, i64 904
   // X64-NEXT: %3 = load ptr, ptr %arrayidx2
-  // X64-NEXT: %arrayidx3 = getelementptr inbounds i8, ptr %3, i64 1192
+  // X64-NEXT: %arrayidx3 = getelementptr inbounds nuw i8, ptr %3, i64 1192
   unsigned long* x = (unsigned long*)((char***** __ptr32*)1208)[0][11][1][113][149];
   return x;
 }
@@ -71,9 +71,9 @@ char* __ptr32* __ptr32 test_misc_2() {
   // X64-LABEL: define ptr addrspace(1) @test_misc_2()
   // X64: br i1 %cmp, label %if.then, label %if.end
   // X64: %1 = load ptr addrspace(1), ptr inttoptr (i64 16 to ptr)
-  // X64-NEXT: %arrayidx = getelementptr inbounds i8, ptr addrspace(1) %1, i32 544
+  // X64-NEXT: %arrayidx = getelementptr inbounds nuw i8, ptr addrspace(1) %1, i32 544
   // X64-NEXT: %2 = load ptr addrspace(1), ptr addrspace(1) %arrayidx
-  // X64-NEXT: %arrayidx1 = getelementptr inbounds i8, ptr addrspace(1) %2, i32 24
+  // X64-NEXT: %arrayidx1 = getelementptr inbounds nuw i8, ptr addrspace(1) %2, i32 24
   // X64-NEXT: %3 = load ptr addrspace(1), ptr addrspace(1) %arrayidx1
   // X64-NEXT: store ptr addrspace(1) %3, ptr @test_misc_2.res
   // X64: ret ptr addrspace(1)
@@ -88,7 +88,7 @@ unsigned short test_misc_3() {
   // X64-LABEL: define zeroext i16 @test_misc_3()
   // X64: %0 = load ptr addrspace(1), ptr inttoptr (i64 548 to ptr)
   // X64-NEXT: %1 = addrspacecast ptr addrspace(1) %0 to ptr
-  // X64-NEXT: %arrayidx = getelementptr inbounds i8, ptr %1, i64 36
+  // X64-NEXT: %arrayidx = getelementptr inbounds nuw i8, ptr %1, i64 36
   // X64-NEXT: %2 = load i16, ptr %arrayidx, align 2
   // X64-NEXT: ret i16 %2
   unsigned short this_asid = ((unsigned short*)(*(char* __ptr32*)(0x224)))[18];
@@ -97,10 +97,10 @@ unsigned short test_misc_3() {
 
 int test_misc_4() {
   // X64-LABEL: define signext range(i32 0, 2) i32 @test_misc_4()
-  // X64: getelementptr inbounds i8, ptr addrspace(1) {{%[0-9]}}, i32 88
-  // X64: getelementptr inbounds i8, ptr {{%[0-9]}}, i64 8
-  // X64: getelementptr inbounds i8, ptr {{%[0-9]}}, i64 984
-  // X64: getelementptr inbounds i8, ptr %3, i64 80
+  // X64: getelementptr inbounds nuw i8, ptr addrspace(1) {{%[0-9]}}, i32 88
+  // X64: getelementptr inbounds nuw i8, ptr {{%[0-9]}}, i64 8
+  // X64: getelementptr inbounds nuw i8, ptr {{%[0-9]}}, i64 984
+  // X64: getelementptr inbounds nuw i8, ptr %3, i64 80
   // X64: icmp sgt i32 {{.*[0-9]}}, 67240703
   // X64: ret i32
   int a = (*(int*)(80 + ((char**** __ptr32*)1208)[0][11][1][123]) > 0x040202FF);
@@ -189,7 +189,7 @@ int test_function_ptr32_is_32bit() {
 int get_processor_count() {
   // X64-LABEL: define signext range(i32 -128, 128) i32 @get_processor_count()
   // X64: load ptr addrspace(1), ptr inttoptr (i64 16 to ptr)
-  // X64-NEXT: [[ARR_IDX1:%[a-z].*]] = getelementptr inbounds i8, ptr addrspace(1) %0, i32 660
+  // X64-NEXT: [[ARR_IDX1:%[a-z].*]] = getelementptr inbounds nuw i8, ptr addrspace(1) %0, i32 660
   // X64: load ptr addrspace(1), ptr addrspace(1) [[ARR_IDX1]]
   // X64: load i8, ptr addrspace(1) {{%[a-z].*}}
   // X64: sext i8 {{%[0-9]}} to i32
diff --git a/clang/test/CodeGen/aarch64-ls64-inline-asm.c b/clang/test/CodeGen/aarch64-ls64-inline-asm.c
index a01393525bcd42..8aa0684dba14d0 100644
--- a/clang/test/CodeGen/aarch64-ls64-inline-asm.c
+++ b/clang/test/CodeGen/aarch64-ls64-inline-asm.c
@@ -5,7 +5,7 @@ struct foo { unsigned long long x[8]; };
 
 // CHECK-LABEL: @load(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:    [[TMP0:%.*]] = tail call i512 asm sideeffect "ld64b $0,[$1]", "=r,r,~{memory}"(ptr [[ADDR:%.*]]) #[[ATTR1:[0-9]+]], !srcloc !2
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call i512 asm sideeffect "ld64b $0,[$1]", "=r,r,~{memory}"(ptr [[ADDR:%.*]]) #[[ATTR1:[0-9]+]], !srcloc [[META2:![0-9]+]]
 // CHECK-NEXT:    store i512 [[TMP0]], ptr [[OUTPUT:%.*]], align 8
 // CHECK-NEXT:    ret void
 //
@@ -17,7 +17,7 @@ void load(struct foo *output, void *addr)
 // CHECK-LABEL: @store(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    [[TMP0:%.*]] = load i512, ptr [[INPUT:%.*]], align 8
-// CHECK-NEXT:    tail call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[TMP0]], ptr [[ADDR:%.*]]) #[[ATTR1]], !srcloc !3
+// CHECK-NEXT:    tail call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[TMP0]], ptr [[ADDR:%.*]]) #[[ATTR1]], !srcloc [[META3:![0-9]+]]
 // CHECK-NEXT:    ret void
 //
 void store(const struct foo *input, void *addr)
@@ -29,25 +29,25 @@ void store(const struct foo *input, void *addr)
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[IN:%.*]], align 4, !tbaa [[TBAA4:![0-9]+]]
 // CHECK-NEXT:    [[CONV:%.*]] = sext i32 [[TMP0]] to i64
-// CHECK-NEXT:    [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 4
+// CHECK-NEXT:    [[ARRAYIDX1:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 4
 // CHECK-NEXT:    [[TMP1:%.*]] = load i32, ptr [[ARRAYIDX1]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV2:%.*]] = sext i32 [[TMP1]] to i64
-// CHECK-NEXT:    [[ARRAYIDX4:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 16
+// CHECK-NEXT:    [[ARRAYIDX4:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 16
 // CHECK-NEXT:    [[TMP2:%.*]] = load i32, ptr [[ARRAYIDX4]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV5:%.*]] = sext i32 [[TMP2]] to i64
-// CHECK-NEXT:    [[ARRAYIDX7:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 64
+// CHECK-NEXT:    [[ARRAYIDX7:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 64
 // CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[ARRAYIDX7]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV8:%.*]] = sext i32 [[TMP3]] to i64
-// CHECK-NEXT:    [[ARRAYIDX10:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 100
+// CHECK-NEXT:    [[ARRAYIDX10:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 100
 // CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[ARRAYIDX10]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV11:%.*]] = sext i32 [[TMP4]] to i64
-// CHECK-NEXT:    [[ARRAYIDX13:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 144
+// CHECK-NEXT:    [[ARRAYIDX13:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 144
 // CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[ARRAYIDX13]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV14:%.*]] = sext i32 [[TMP5]] to i64
-// CHECK-NEXT:    [[ARRAYIDX16:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 196
+// CHECK-NEXT:    [[ARRAYIDX16:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 196
 // CHECK-NEXT:    [[TMP6:%.*]] = load i32, ptr [[ARRAYIDX16]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV17:%.*]] = sext i32 [[TMP6]] to i64
-// CHECK-NEXT:    [[ARRAYIDX19:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 256
+// CHECK-NEXT:    [[ARRAYIDX19:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 256
 // CHECK-NEXT:    [[TMP7:%.*]] = load i32, ptr [[ARRAYIDX19]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV20:%.*]] = sext i32 [[TMP7]] to i64
 // CHECK-NEXT:    [[S_SROA_10_0_INSERT_EXT:%.*]] = zext i64 [[CONV20]] to i512
@@ -72,7 +72,7 @@ void store(const struct foo *input, void *addr)
 // CHECK-NEXT:    [[S_SROA_0_0_INSERT_EXT:%.*]] = zext i64 [[CONV]] to i512
 // CHECK-NEXT:    [[S_SROA_0_0_INSERT_MASK:%.*]] = or disjoint i512 [[S_SROA_4_0_INSERT_MASK]], [[S_SROA_4_0_INSERT_SHIFT]]
 // CHECK-NEXT:    [[S_SROA_0_0_INSERT_INSERT:%.*]] = or i512 [[S_SROA_0_0_INSERT_MASK]], [[S_SROA_0_0_INSERT_EXT]]
-// CHECK-NEXT:    tail call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[S_SROA_0_0_INSERT_INSERT]], ptr [[ADDR:%.*]]) #[[ATTR1]], !srcloc !8
+// CHECK-NEXT:    tail call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[S_SROA_0_0_INSERT_INSERT]], ptr [[ADDR:%.*]]) #[[ATTR1]], !srcloc [[META8:![0-9]+]]
 // CHECK-NEXT:    ret void
 //
 void store2(int *in, void *addr)
diff --git a/clang/test/CodeGen/arm64_32-vaarg.c b/clang/test/CodeGen/arm64_32-vaarg.c
index 3f1f4443436da1..72c23d4967d2d3 100644
--- a/clang/test/CodeGen/arm64_32-vaarg.c
+++ b/clang/test/CodeGen/arm64_32-vaarg.c
@@ -10,7 +10,7 @@ typedef struct {
 int test_int(OneInt input, va_list *mylist) {
 // CHECK-LABEL: define{{.*}} i32 @test_int(i32 %input
 // CHECK: [[START:%.*]] = load ptr, ptr %mylist
-// CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[START]], i32 4
+// CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[START]], i32 4
 // CHECK: store ptr [[NEXT]], ptr %mylist
 
 // CHECK: [[RES:%.*]] = load i32, ptr [[START]]
@@ -28,9 +28,9 @@ typedef struct {
 long long test_longlong(OneLongLong input, va_list *mylist) {
   // CHECK-LABEL: define{{.*}} i64 @test_longlong(i64 %input
   // CHECK: [[STARTPTR:%.*]] = load ptr, ptr %mylist
-  // CHECK: [[ALIGN_TMP:%.+]] = getelementptr inbounds i8, ptr [[STARTPTR]], i32 7
+  // CHECK: [[ALIGN_TMP:%.+]] = getelementptr inbounds nuw i8, ptr [[STARTPTR]], i32 7
   // CHECK: [[ALIGNED_ADDR:%.+]] = tail call align 8 ptr @llvm.ptrmask.p0.i32(ptr nonnull [[ALIGN_TMP]], i32 -8)
-  // CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[ALIGNED_ADDR]], i32 8
+  // CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[ALIGNED_ADDR]], i32 8
   // CHECK: store ptr [[NEXT]], ptr %mylist
 
   // CHECK: [[RES:%.*]] = load i64, ptr [[ALIGNED_ADDR]]
@@ -49,7 +49,7 @@ float test_hfa(va_list *mylist) {
 // CHECK-LABEL: define{{.*}} float @test_hfa
 // CHECK: [[START:%.*]] = load ptr, ptr %mylist
 
-// CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[START]], i32 16
+// CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[START]], i32 16
 // CHECK: store ptr [[NEXT]], ptr %mylist
 
 // CHECK: [[RES:%.*]] = load float, ptr [[START]]
@@ -76,7 +76,7 @@ typedef struct {
 long long test_bigstruct(BigStruct input, va_list *mylist) {
 // CHECK-LABEL: define{{.*}} i64 @test_bigstruct(ptr
 // CHECK: [[START:%.*]] = load ptr, ptr %mylist
-// CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[START]], i32 4
+// CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[START]], i32 4
 // CHECK: store ptr [[NEXT]], ptr %mylist
 
 // CHECK: [[ADDR:%.*]] = load ptr, ptr [[START]]
@@ -97,7 +97,7 @@ short test_threeshorts(ThreeShorts input, va_list *mylist) {
 // CHECK-LABEL: define{{.*}} signext i16 @test_threeshorts([2 x i32] %input
 
 // CHECK: [[START:%.*]] = load ptr, ptr %mylist
-// CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[START]], i32 8
+// CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[START]], i32 8
 // CHECK: store ptr [[NEXT]], ptr %mylist
 
 // CHECK: [[RES:%.*]] = load i16, ptr [[START]]
diff --git a/clang/test/CodeGen/attr-counted-by-pr110385.c b/clang/test/CodeGen/attr-counted-by-pr110385.c
index e120dcc583578d..c2ff032334fe27 100644
--- a/clang/test/CodeGen/attr-counted-by-pr110385.c
+++ b/clang/test/CodeGen/attr-counted-by-pr110385.c
@@ -31,7 +31,7 @@ void init(void * __attribute__((pass_dynamic_object_size(0))));
 // CHECK-NEXT:    [[GROWABLE:%.*]] = getelementptr inbounds nuw i8, ptr [[FOO]], i64 8
 // CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[GROWABLE]], align 8, !tbaa [[TBAA2:![0-9]+]]
 // CHECK-NEXT:    [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP0]], i64 12
-// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[TMP0]], i64 8
+// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP0]], i64 8
 // CHECK-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
 // CHECK-NEXT:    [[TMP1:%.*]] = sext i32 [[DOT_COUNTED_BY_LOAD]] to i64
 // CHECK-NEXT:    [[TMP2:%.*]] = shl nsw i64 [[TMP1]], 1
@@ -48,7 +48,7 @@ void test1(struct bucket *foo) {
 // CHECK-SAME: ptr noundef [[FOO:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[FOO]], i64 16
-// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[FOO]], i64 12
+// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[FOO]], i64 12
 // CHECK-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = sext i32 [[DOT_COUNTED_BY_LOAD]] to i64
 // CHECK-NEXT:    [[TMP1:%.*]] = shl nsw i64 [[TMP0]], 1
diff --git a/clang/test/CodeGen/attr-counted-by.c b/clang/test/CodeGen/attr-counted-by.c
index 4a130c5e3d401f..1028bffaf896d7 100644
--- a/clang/test/CodeGen/attr-counted-by.c
+++ b/clang/test/CodeGen/attr-counted-by.c
@@ -60,13 +60,13 @@ struct anon_struct {
 // SANITIZE-WITH-ATTR-SAME: ptr noundef [[P:%.*]], i32 noundef [[INDEX:%.*]], i32 noundef [[VAL:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SANITIZE-WITH-ATTR-NEXT:  entry:
 // SANITIZE-WITH-ATTR-NEXT:    [[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64
-// SANITIZE-WITH-ATTR-NEXT:    [[DOTCOUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 8
+// SANITIZE-WITH-ATTR-NEXT:    [[DOTCOUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 8
 // SANITIZE-WITH-ATTR-NEXT:    [[DOTCOUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOTCOUNTED_BY_GEP]], align 4
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP0:%.*]] = zext i32 [[DOTCOUNTED_BY_LOAD]] to i64, !nosanitize [[META2:![0-9]+]]
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP1:%.*]] = icmp ult i64 [[IDXPROM]], [[TMP0]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    br i1 [[TMP1]], label [[CONT3:%.*]], label [[HANDLER_OUT_OF_BOUNDS:%.*]], !prof [[PROF3:![0-9]+]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR:       handler.out_of_bounds:
-// SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB1:[0-9]+]], i64 [[IDXPROM]]) #[[ATTR10:[0-9]+]], !nosanitize [[META2]]
+// SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB1:[0-9]+]], i64 [[IDXPROM]]) #[[ATTR9:[0-9]+]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    unreachable, !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR:       cont3:
 // SANITIZE-WITH-ATTR-NEXT:    [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
@@ -108,13 +108,13 @@ void test1(struct annotated *p, int index, int val) {
 // SANITIZE-WITH-ATTR-LABEL: define dso_local void @test2(
 // SANITIZE-WITH-ATTR-SAME: ptr noundef [[P:%.*]], i64 noundef [[INDEX:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SANITIZE-WITH-ATTR-NEXT:  entry:
-// SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 8
+// SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 8
 // SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP0:%.*]] = zext i32 [[DOT_COUNTED_BY_LOAD]] to i64, !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP1:%.*]] = icmp ult i64 [[INDEX]], [[TMP0]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    br i1 [[TMP1]], label [[CONT3:%.*]], label [[HANDLER_OUT_OF_BOUNDS:%.*]], !prof [[PROF3]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR:       handler.out_of_bounds:
-// SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB3:[0-9]+]], i64 [[INDEX]]) #[[ATTR10]], !nosanitize [[META2]]
+// SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB3:[0-9]+]], i64 [[INDEX]]) #[[ATTR9]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    unreachable, !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR:       cont3:
 // SANITIZE-WITH-ATTR-NEXT:    [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
@@ -128,7 +128,7 @@ void test1(struct annotated *p, int index, int val) {
 // NO-SANITIZE-WITH-ATTR-LABEL: define dso_local void @test2(
 // NO-SANITIZE-WITH-ATTR-SAME: ptr nocapture noundef [[P:%.*]], i64 noundef [[INDEX:%.*]]) local_unnamed_addr #[[ATTR1:[0-9]+]] {
 // NO-SANITIZE-WITH-ATTR-NEXT:  entry:
-// NO-SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 8
+// NO-SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 8
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[TMP0:%.*]] = shl i32 [[DOT_COUNTED_BY_LOAD]], 2
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[DO...
[truncated]

@llvmbot
Collaborator

llvmbot commented Oct 4, 2024

@llvm/pr-subscribers-llvm-transforms

Author: Nikita Popov (nikic)

Changes

If the gep is nusw (usually via inbounds) and the offset is non-negative, we can infer nuw.

Unfortunately this inference does have some compile-time overhead: https://llvm-compile-time-tracker.com/compare.php?from=37e5319a12ba47c18049728804d3d1e1b10c4eb4&to=af56d73d6543f05b1e5205b96934e2427bb24d72&stat=instructions:u

Proof: https://alive2.llvm.org/ce/z/ihztLy
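
For readers who want an intuition for why the inference holds, here is a minimal sketch in a toy model. It is not the code from this patch and not a substitute for the Alive2 proof above. It assumes an 8-bit address space and a byte-typed gep with a single offset, so the only arithmetic is base + offset. The function names are hypothetical and do not exist in LLVM. In this model, nusw means the base (read as unsigned) plus the offset (read as signed) stays inside the address space, and nuw means the same sum with both operands read as unsigned stays inside it.

```cpp
#include <cassert>
#include <cstdint>

// Toy model (assumption): 8-bit addresses, gep over i8 with one offset.

// nusw-style check: base is unsigned, offset is signed; the exact sum
// must stay within [0, 255].
bool addNoUnsignedSignedWrap(uint8_t base, int8_t offset) {
  int exact = int(base) + int(offset);
  return exact >= 0 && exact <= 255;
}

// nuw-style check: both operands are unsigned; the exact sum must stay
// within [0, 255].
bool addNoUnsignedWrap(uint8_t base, uint8_t offset) {
  int exact = int(base) + int(offset);
  return exact <= 255;
}

// Exhaustively count cases where the offset is non-negative, the
// nusw-style check passes, and the nuw-style check fails.
int countCounterexamples() {
  int bad = 0;
  for (int b = 0; b <= 255; ++b)
    for (int o = 0; o <= 127; ++o) // non-negative signed offsets only
      if (addNoUnsignedSignedWrap(uint8_t(b), int8_t(o)) &&
          !addNoUnsignedWrap(uint8_t(b), uint8_t(o)))
        ++bad;
  return bad;
}
```

In this model `countCounterexamples()` finds no counterexample, because a non-negative offset has the same value under the signed and unsigned readings. The non-negativity condition matters: with base 10 and offset -1 the nusw-style check passes (the sum is 9), but the same bit pattern read as unsigned is 255, and 10 + 255 leaves the 8-bit range, so nuw would not hold. Real geps also scale indices by the element size, which the sketch leaves out.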


Patch is 1.28 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/111144.diff

183 Files Affected:

  • (modified) clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c (+5-5)
  • (modified) clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c (+14-14)
  • (modified) clang/test/CodeGen/aarch64-ls64-inline-asm.c (+10-10)
  • (modified) clang/test/CodeGen/arm64_32-vaarg.c (+6-6)
  • (modified) clang/test/CodeGen/attr-counted-by-pr110385.c (+2-2)
  • (modified) clang/test/CodeGen/attr-counted-by.c (+106-110)
  • (modified) clang/test/CodeGen/math-libcalls-tbaa.c (+7-7)
  • (modified) clang/test/CodeGen/union-tbaa1.c (+2-2)
  • (modified) clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu (+3-3)
  • (modified) clang/test/CodeGenCXX/auto-var-init.cpp (+4-4)
  • (modified) clang/test/CodeGenCXX/microsoft-abi-dynamic-cast.cpp (+9-9)
  • (modified) clang/test/CodeGenCXX/microsoft-abi-typeid.cpp (+1-1)
  • (modified) clang/test/CodeGenOpenCL/amdgpu-nullptr.cl (+2-2)
  • (modified) clang/test/CodeGenOpenCL/builtins-amdgcn.cl (+4-4)
  • (modified) clang/test/CodeGenOpenCLCXX/array-type-infinite-loop.clcpp (+12-12)
  • (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+9)
  • (modified) llvm/test/Analysis/BasicAA/featuretest.ll (+1-1)
  • (modified) llvm/test/Analysis/ValueTracking/phi-known-bits.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/implicit-arg-v5-opt.ll (+6-6)
  • (modified) llvm/test/CodeGen/AMDGPU/reqd-work-group-size.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/vector-alloca-bitcast.ll (+6-6)
  • (modified) llvm/test/Transforms/Coroutines/coro-async.ll (+12-12)
  • (modified) llvm/test/Transforms/Coroutines/coro-retcon-alloca-opaque-ptr.ll (+1-1)
  • (modified) llvm/test/Transforms/Coroutines/coro-retcon-alloca.ll (+1-1)
  • (modified) llvm/test/Transforms/Coroutines/coro-retcon-once-value.ll (+3-3)
  • (modified) llvm/test/Transforms/Coroutines/coro-retcon-resume-values.ll (+10-10)
  • (modified) llvm/test/Transforms/Coroutines/coro-swifterror.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/2007-03-25-BadShiftMask.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/2009-01-08-AlignAlloca.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/X86/x86-addsub-inseltpoison.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/X86/x86-addsub.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/array.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/assume-align.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/assume-loop-align.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/assume-redundant.ll (+5-1)
  • (modified) llvm/test/Transforms/InstCombine/assume.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/call-cast-target.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/cast_phi.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/cast_ptr.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/catchswitch-phi.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/compare-alloca.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/compare-unescaped.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/dependent-ivs.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/extractvalue.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/fmul.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/fsh.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/gep-addrspace.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/gep-canonicalize-constant-indices.ll (+9-9)
  • (modified) llvm/test/Transforms/InstCombine/gep-combine-loop-invariant.ll (+14-14)
  • (modified) llvm/test/Transforms/InstCombine/gep-merge-constant-indices.ll (+12-12)
  • (modified) llvm/test/Transforms/InstCombine/gep-vector-indices.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/gepphigep.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/getelementptr.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/icmp-custom-dl.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/icmp-gep.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/icmp.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/inbounds-gep.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/indexed-gep-compares.ll (+9-5)
  • (modified) llvm/test/Transforms/InstCombine/intptr1.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/intptr7.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/load-bitcast-select.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/mem-par-metadata-memcpy.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/memccpy.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/memcpy_alloca.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/mempcpy.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/memset2.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/opaque-ptr.ll (+10-6)
  • (modified) llvm/test/Transforms/InstCombine/phi-equal-incoming-pointers.ll (+14-14)
  • (modified) llvm/test/Transforms/InstCombine/phi-timeout.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/phi.ll (+22-22)
  • (modified) llvm/test/Transforms/InstCombine/ptr-replace-alloca.ll (+8-8)
  • (modified) llvm/test/Transforms/InstCombine/ptrmask.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/remove-loop-phi-multiply-by-zero.ll (+10-10)
  • (modified) llvm/test/Transforms/InstCombine/select-cmp-br.ll (+8-8)
  • (modified) llvm/test/Transforms/InstCombine/select-gep.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/sink_sideeffecting_instruction.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/snprintf-2.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/snprintf-3.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/snprintf-4.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/snprintf.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/sprintf-1.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/stpncpy-1.ll (+32-32)
  • (modified) llvm/test/Transforms/InstCombine/str-int.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/strlcpy-1.ll (+14-14)
  • (modified) llvm/test/Transforms/InstCombine/strlen-1.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/struct-assign-tbaa-2.ll (+4-2)
  • (modified) llvm/test/Transforms/InstCombine/sub.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/unpack-fca.ll (+26-26)
  • (modified) llvm/test/Transforms/InstCombine/vec_gep_scalar_arg-inseltpoison.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/vec_gep_scalar_arg.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/vscale_gep.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/wcslen-1.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/wcslen-3.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/wcslen-5.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/AArch64/runtime-unroll-generic.ll (+24-24)
  • (modified) llvm/test/Transforms/LoopUnroll/ARM/upperbound.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopUnroll/WebAssembly/basic-unrolling.ll (+31-31)
  • (modified) llvm/test/Transforms/LoopUnroll/peel-loop.ll (+6-6)
  • (modified) llvm/test/Transforms/LoopUnroll/runtime-unroll-remainder.ll (+8-8)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/deterministic-type-shrinkage.ll (+54-54)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-cond-inv-loads.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-gather-scatter.ll (+25-25)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll (+198-198)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-masked-accesses.ll (+95-95)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-phi.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt-epilogue.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt-no-scalar-interleave.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt-too-many-deps.ll (+5-5)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt.ll (+14-14)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/uniform-args-call-variants.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/AMDGPU/packed-math.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopVectorize/ARM/mve-reductions.ll (+12-12)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/float-induction-x86.ll (+36-36)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/interleaving.ll (+44-44)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/invariant-load-gather.ll (+24-24)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/invariant-store-vectorization.ll (+51-51)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/metadata-enable.ll (+474-474)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/pr23997.ll (+8-8)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll (+325-325)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-store-accesses-with-gaps.ll (+22-22)
  • (modified) llvm/test/Transforms/LoopVectorize/extract-last-veclane.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/float-induction.ll (+150-150)
  • (modified) llvm/test/Transforms/LoopVectorize/forked-pointers.ll (+15-15)
  • (modified) llvm/test/Transforms/LoopVectorize/histograms.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/induction.ll (+42-42)
  • (modified) llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll (+5-5)
  • (modified) llvm/test/Transforms/LoopVectorize/invariant-store-vectorization-2.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopVectorize/invariant-store-vectorization.ll (+11-11)
  • (modified) llvm/test/Transforms/LoopVectorize/loop-scalars.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopVectorize/reduction-inloop-uf4.ll (+229-229)
  • (modified) llvm/test/Transforms/LoopVectorize/reduction-inloop.ll (+4-4)
  • (modified) llvm/test/Transforms/LoopVectorize/reduction.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/runtime-check.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/scalar_after_vectorization.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/trunc-reductions.ll (+92-32)
  • (modified) llvm/test/Transforms/LoopVectorize/vector-geps.ll (+5-5)
  • (modified) llvm/test/Transforms/LoopVersioningLICM/loopversioningLICM1.ll (+13-13)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/hoist-runtime-checks.ll (+5-5)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/hoisting-sinking-required-for-vectorization.ll (+10-10)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/indvars-vectorization.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/loopflatten.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/matrix-extract-insert.ll (+23-23)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/quant_4x4.ll (+48-48)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/sinking-vs-if-conversion.ll (+7-7)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/slpordering.ll (+8-8)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/excessive-unrolling.ll (+12-12)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/hoist-load-of-baseptr.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/merge-functions2.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/merge-functions3.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/pixel-splat.ll (+4-4)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/pr50555.ll (+5-5)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/simplifycfg-late.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/speculation-vs-tbaa.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/spurious-peeling.ll (+4-4)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll (+24-24)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vec-load-combine.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vec-shift.ll (+4-4)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vector-reduction-known-first-value.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/basic.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/bitcast-store-branch.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/dce-after-argument-promotion-loads.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/gvn-replacement-vs-hoist.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/loop-access-checks.ll (+4-4)
  • (modified) llvm/test/Transforms/PhaseOrdering/lto-licm.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/pr39282.ll (+10-10)
  • (modified) llvm/test/Transforms/PhaseOrdering/pr98799-inline-simplifycfg-ub.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/scev-custom-dl.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/simplifycfg-options.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/single-iteration-loop-sroa.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/switch_with_geps.ll (+2-2)
  • (modified) llvm/test/Transforms/RewriteStatepointsForGC/intrinsics.ll (+12-12)
  • (modified) llvm/test/Transforms/SLPVectorizer/AArch64/gather-cost.ll (+6-6)
  • (modified) llvm/test/Transforms/SLPVectorizer/AArch64/getelementptr.ll (+2-2)
  • (modified) llvm/test/Transforms/SLPVectorizer/AArch64/getelementptr2.ll (+2-2)
  • (modified) llvm/test/Transforms/SLPVectorizer/AArch64/loadorder.ll (+32-32)
  • (modified) llvm/test/Transforms/SLPVectorizer/WebAssembly/no-vectorize-rotate.ll (+1-1)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/minimum-sizes.ll (+4-4)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/opt.ll (+3-3)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/pr47629-inseltpoison.ll (+131-131)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/pr47629.ll (+131-131)
  • (modified) llvm/test/Transforms/SampleProfile/pseudo-probe-instcombine.ll (+5-5)
  • (modified) llvm/test/Transforms/SimpleLoopUnswitch/AMDGPU/uniform-unswitch.ll (+1-1)
  • (modified) llvm/test/Transforms/SimplifyCFG/Hexagon/switch-to-lookup-table.ll (+1-1)
diff --git a/clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c b/clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c
index 5422d993ff1575..08ff936a0a797b 100644
--- a/clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c
+++ b/clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c
@@ -25,13 +25,13 @@ void test1(unsigned char *vqp, unsigned char *vpp, vector unsigned char vc, unsi
 // CHECK-NEXT:    [[TMP2:%.*]] = extractvalue { <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> } [[TMP1]], 0
 // CHECK-NEXT:    store <16 x i8> [[TMP2]], ptr [[RESP:%.*]], align 16
 // CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> } [[TMP1]], 1
-// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 16
+// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 16
 // CHECK-NEXT:    store <16 x i8> [[TMP3]], ptr [[TMP4]], align 16
 // CHECK-NEXT:    [[TMP5:%.*]] = extractvalue { <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> } [[TMP1]], 2
-// CHECK-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 32
+// CHECK-NEXT:    [[TMP6:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 32
 // CHECK-NEXT:    store <16 x i8> [[TMP5]], ptr [[TMP6]], align 16
 // CHECK-NEXT:    [[TMP7:%.*]] = extractvalue { <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> } [[TMP1]], 3
-// CHECK-NEXT:    [[TMP8:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 48
+// CHECK-NEXT:    [[TMP8:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 48
 // CHECK-NEXT:    store <16 x i8> [[TMP7]], ptr [[TMP8]], align 16
 // CHECK-NEXT:    ret void
 //
@@ -60,7 +60,7 @@ void test3(unsigned char *vqp, unsigned char *vpp, vector unsigned char vc, unsi
 // CHECK-NEXT:    [[TMP2:%.*]] = extractvalue { <16 x i8>, <16 x i8> } [[TMP1]], 0
 // CHECK-NEXT:    store <16 x i8> [[TMP2]], ptr [[RESP:%.*]], align 16
 // CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <16 x i8>, <16 x i8> } [[TMP1]], 1
-// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 16
+// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 16
 // CHECK-NEXT:    store <16 x i8> [[TMP3]], ptr [[TMP4]], align 16
 // CHECK-NEXT:    ret void
 //
@@ -1072,7 +1072,7 @@ void test76(unsigned char *vqp, unsigned char *vpp, vector unsigned char vc, uns
 // CHECK-NEXT:    [[TMP2:%.*]] = extractvalue { <16 x i8>, <16 x i8> } [[TMP1]], 0
 // CHECK-NEXT:    store <16 x i8> [[TMP2]], ptr [[RESP:%.*]], align 16
 // CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <16 x i8>, <16 x i8> } [[TMP1]], 1
-// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 16
+// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 16
 // CHECK-NEXT:    store <16 x i8> [[TMP3]], ptr [[TMP4]], align 16
 // CHECK-NEXT:    ret void
 //
diff --git a/clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c b/clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c
index 6194c9b1804fb0..2d9629eff3c98c 100644
--- a/clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c
+++ b/clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c
@@ -48,21 +48,21 @@ void test_indexing(struct Foo *f) {
 
 void test_indexing_2(struct Foo *f) {
   // X64-LABEL: define void @test_indexing_2(ptr noundef %f)
-  // X64: getelementptr inbounds i8, ptr addrspace(1) {{%[0-9]}}, i32 16
-  // X64: getelementptr inbounds i8, ptr {{%[0-9]}}, i64 24
+  // X64: getelementptr inbounds nuw i8, ptr addrspace(1) {{%[0-9]}}, i32 16
+  // X64: getelementptr inbounds nuw i8, ptr {{%[0-9]}}, i64 24
   f->cp64 = ((char *** __ptr32 *)1028)[1][2][3];
   use_foo(f);
 }
 
 unsigned long* test_misc() {
   // X64-LABEL: define ptr @test_misc()
-  // X64: %arrayidx = getelementptr inbounds i8, ptr addrspace(1) %0, i32 88
+  // X64: %arrayidx = getelementptr inbounds nuw i8, ptr addrspace(1) %0, i32 88
   // X64-NEXT: %1 = load ptr, ptr addrspace(1) %arrayidx
-  // X64-NEXT: %arrayidx1 = getelementptr inbounds i8, ptr %1, i64 8
+  // X64-NEXT: %arrayidx1 = getelementptr inbounds nuw i8, ptr %1, i64 8
   // X64-NEXT: %2 = load ptr, ptr %arrayidx1
-  // X64-NEXT: %arrayidx2 = getelementptr inbounds i8, ptr %2, i64 904
+  // X64-NEXT: %arrayidx2 = getelementptr inbounds nuw i8, ptr %2, i64 904
   // X64-NEXT: %3 = load ptr, ptr %arrayidx2
-  // X64-NEXT: %arrayidx3 = getelementptr inbounds i8, ptr %3, i64 1192
+  // X64-NEXT: %arrayidx3 = getelementptr inbounds nuw i8, ptr %3, i64 1192
   unsigned long* x = (unsigned long*)((char***** __ptr32*)1208)[0][11][1][113][149];
   return x;
 }
@@ -71,9 +71,9 @@ char* __ptr32* __ptr32 test_misc_2() {
   // X64-LABEL: define ptr addrspace(1) @test_misc_2()
   // X64: br i1 %cmp, label %if.then, label %if.end
   // X64: %1 = load ptr addrspace(1), ptr inttoptr (i64 16 to ptr)
-  // X64-NEXT: %arrayidx = getelementptr inbounds i8, ptr addrspace(1) %1, i32 544
+  // X64-NEXT: %arrayidx = getelementptr inbounds nuw i8, ptr addrspace(1) %1, i32 544
   // X64-NEXT: %2 = load ptr addrspace(1), ptr addrspace(1) %arrayidx
-  // X64-NEXT: %arrayidx1 = getelementptr inbounds i8, ptr addrspace(1) %2, i32 24
+  // X64-NEXT: %arrayidx1 = getelementptr inbounds nuw i8, ptr addrspace(1) %2, i32 24
   // X64-NEXT: %3 = load ptr addrspace(1), ptr addrspace(1) %arrayidx1
   // X64-NEXT: store ptr addrspace(1) %3, ptr @test_misc_2.res
   // X64: ret ptr addrspace(1)
@@ -88,7 +88,7 @@ unsigned short test_misc_3() {
   // X64-LABEL: define zeroext i16 @test_misc_3()
   // X64: %0 = load ptr addrspace(1), ptr inttoptr (i64 548 to ptr)
   // X64-NEXT: %1 = addrspacecast ptr addrspace(1) %0 to ptr
-  // X64-NEXT: %arrayidx = getelementptr inbounds i8, ptr %1, i64 36
+  // X64-NEXT: %arrayidx = getelementptr inbounds nuw i8, ptr %1, i64 36
   // X64-NEXT: %2 = load i16, ptr %arrayidx, align 2
   // X64-NEXT: ret i16 %2
   unsigned short this_asid = ((unsigned short*)(*(char* __ptr32*)(0x224)))[18];
@@ -97,10 +97,10 @@ unsigned short test_misc_3() {
 
 int test_misc_4() {
   // X64-LABEL: define signext range(i32 0, 2) i32 @test_misc_4()
-  // X64: getelementptr inbounds i8, ptr addrspace(1) {{%[0-9]}}, i32 88
-  // X64: getelementptr inbounds i8, ptr {{%[0-9]}}, i64 8
-  // X64: getelementptr inbounds i8, ptr {{%[0-9]}}, i64 984
-  // X64: getelementptr inbounds i8, ptr %3, i64 80
+  // X64: getelementptr inbounds nuw i8, ptr addrspace(1) {{%[0-9]}}, i32 88
+  // X64: getelementptr inbounds nuw i8, ptr {{%[0-9]}}, i64 8
+  // X64: getelementptr inbounds nuw i8, ptr {{%[0-9]}}, i64 984
+  // X64: getelementptr inbounds nuw i8, ptr %3, i64 80
   // X64: icmp sgt i32 {{.*[0-9]}}, 67240703
   // X64: ret i32
   int a = (*(int*)(80 + ((char**** __ptr32*)1208)[0][11][1][123]) > 0x040202FF);
@@ -189,7 +189,7 @@ int test_function_ptr32_is_32bit() {
 int get_processor_count() {
   // X64-LABEL: define signext range(i32 -128, 128) i32 @get_processor_count()
   // X64: load ptr addrspace(1), ptr inttoptr (i64 16 to ptr)
-  // X64-NEXT: [[ARR_IDX1:%[a-z].*]] = getelementptr inbounds i8, ptr addrspace(1) %0, i32 660
+  // X64-NEXT: [[ARR_IDX1:%[a-z].*]] = getelementptr inbounds nuw i8, ptr addrspace(1) %0, i32 660
   // X64: load ptr addrspace(1), ptr addrspace(1) [[ARR_IDX1]]
   // X64: load i8, ptr addrspace(1) {{%[a-z].*}}
   // X64: sext i8 {{%[0-9]}} to i32
diff --git a/clang/test/CodeGen/aarch64-ls64-inline-asm.c b/clang/test/CodeGen/aarch64-ls64-inline-asm.c
index a01393525bcd42..8aa0684dba14d0 100644
--- a/clang/test/CodeGen/aarch64-ls64-inline-asm.c
+++ b/clang/test/CodeGen/aarch64-ls64-inline-asm.c
@@ -5,7 +5,7 @@ struct foo { unsigned long long x[8]; };
 
 // CHECK-LABEL: @load(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:    [[TMP0:%.*]] = tail call i512 asm sideeffect "ld64b $0,[$1]", "=r,r,~{memory}"(ptr [[ADDR:%.*]]) #[[ATTR1:[0-9]+]], !srcloc !2
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call i512 asm sideeffect "ld64b $0,[$1]", "=r,r,~{memory}"(ptr [[ADDR:%.*]]) #[[ATTR1:[0-9]+]], !srcloc [[META2:![0-9]+]]
 // CHECK-NEXT:    store i512 [[TMP0]], ptr [[OUTPUT:%.*]], align 8
 // CHECK-NEXT:    ret void
 //
@@ -17,7 +17,7 @@ void load(struct foo *output, void *addr)
 // CHECK-LABEL: @store(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    [[TMP0:%.*]] = load i512, ptr [[INPUT:%.*]], align 8
-// CHECK-NEXT:    tail call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[TMP0]], ptr [[ADDR:%.*]]) #[[ATTR1]], !srcloc !3
+// CHECK-NEXT:    tail call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[TMP0]], ptr [[ADDR:%.*]]) #[[ATTR1]], !srcloc [[META3:![0-9]+]]
 // CHECK-NEXT:    ret void
 //
 void store(const struct foo *input, void *addr)
@@ -29,25 +29,25 @@ void store(const struct foo *input, void *addr)
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[IN:%.*]], align 4, !tbaa [[TBAA4:![0-9]+]]
 // CHECK-NEXT:    [[CONV:%.*]] = sext i32 [[TMP0]] to i64
-// CHECK-NEXT:    [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 4
+// CHECK-NEXT:    [[ARRAYIDX1:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 4
 // CHECK-NEXT:    [[TMP1:%.*]] = load i32, ptr [[ARRAYIDX1]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV2:%.*]] = sext i32 [[TMP1]] to i64
-// CHECK-NEXT:    [[ARRAYIDX4:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 16
+// CHECK-NEXT:    [[ARRAYIDX4:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 16
 // CHECK-NEXT:    [[TMP2:%.*]] = load i32, ptr [[ARRAYIDX4]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV5:%.*]] = sext i32 [[TMP2]] to i64
-// CHECK-NEXT:    [[ARRAYIDX7:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 64
+// CHECK-NEXT:    [[ARRAYIDX7:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 64
 // CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[ARRAYIDX7]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV8:%.*]] = sext i32 [[TMP3]] to i64
-// CHECK-NEXT:    [[ARRAYIDX10:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 100
+// CHECK-NEXT:    [[ARRAYIDX10:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 100
 // CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[ARRAYIDX10]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV11:%.*]] = sext i32 [[TMP4]] to i64
-// CHECK-NEXT:    [[ARRAYIDX13:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 144
+// CHECK-NEXT:    [[ARRAYIDX13:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 144
 // CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[ARRAYIDX13]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV14:%.*]] = sext i32 [[TMP5]] to i64
-// CHECK-NEXT:    [[ARRAYIDX16:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 196
+// CHECK-NEXT:    [[ARRAYIDX16:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 196
 // CHECK-NEXT:    [[TMP6:%.*]] = load i32, ptr [[ARRAYIDX16]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV17:%.*]] = sext i32 [[TMP6]] to i64
-// CHECK-NEXT:    [[ARRAYIDX19:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 256
+// CHECK-NEXT:    [[ARRAYIDX19:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 256
 // CHECK-NEXT:    [[TMP7:%.*]] = load i32, ptr [[ARRAYIDX19]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV20:%.*]] = sext i32 [[TMP7]] to i64
 // CHECK-NEXT:    [[S_SROA_10_0_INSERT_EXT:%.*]] = zext i64 [[CONV20]] to i512
@@ -72,7 +72,7 @@ void store(const struct foo *input, void *addr)
 // CHECK-NEXT:    [[S_SROA_0_0_INSERT_EXT:%.*]] = zext i64 [[CONV]] to i512
 // CHECK-NEXT:    [[S_SROA_0_0_INSERT_MASK:%.*]] = or disjoint i512 [[S_SROA_4_0_INSERT_MASK]], [[S_SROA_4_0_INSERT_SHIFT]]
 // CHECK-NEXT:    [[S_SROA_0_0_INSERT_INSERT:%.*]] = or i512 [[S_SROA_0_0_INSERT_MASK]], [[S_SROA_0_0_INSERT_EXT]]
-// CHECK-NEXT:    tail call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[S_SROA_0_0_INSERT_INSERT]], ptr [[ADDR:%.*]]) #[[ATTR1]], !srcloc !8
+// CHECK-NEXT:    tail call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[S_SROA_0_0_INSERT_INSERT]], ptr [[ADDR:%.*]]) #[[ATTR1]], !srcloc [[META8:![0-9]+]]
 // CHECK-NEXT:    ret void
 //
 void store2(int *in, void *addr)
diff --git a/clang/test/CodeGen/arm64_32-vaarg.c b/clang/test/CodeGen/arm64_32-vaarg.c
index 3f1f4443436da1..72c23d4967d2d3 100644
--- a/clang/test/CodeGen/arm64_32-vaarg.c
+++ b/clang/test/CodeGen/arm64_32-vaarg.c
@@ -10,7 +10,7 @@ typedef struct {
 int test_int(OneInt input, va_list *mylist) {
 // CHECK-LABEL: define{{.*}} i32 @test_int(i32 %input
 // CHECK: [[START:%.*]] = load ptr, ptr %mylist
-// CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[START]], i32 4
+// CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[START]], i32 4
 // CHECK: store ptr [[NEXT]], ptr %mylist
 
 // CHECK: [[RES:%.*]] = load i32, ptr [[START]]
@@ -28,9 +28,9 @@ typedef struct {
 long long test_longlong(OneLongLong input, va_list *mylist) {
   // CHECK-LABEL: define{{.*}} i64 @test_longlong(i64 %input
   // CHECK: [[STARTPTR:%.*]] = load ptr, ptr %mylist
-  // CHECK: [[ALIGN_TMP:%.+]] = getelementptr inbounds i8, ptr [[STARTPTR]], i32 7
+  // CHECK: [[ALIGN_TMP:%.+]] = getelementptr inbounds nuw i8, ptr [[STARTPTR]], i32 7
   // CHECK: [[ALIGNED_ADDR:%.+]] = tail call align 8 ptr @llvm.ptrmask.p0.i32(ptr nonnull [[ALIGN_TMP]], i32 -8)
-  // CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[ALIGNED_ADDR]], i32 8
+  // CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[ALIGNED_ADDR]], i32 8
   // CHECK: store ptr [[NEXT]], ptr %mylist
 
   // CHECK: [[RES:%.*]] = load i64, ptr [[ALIGNED_ADDR]]
@@ -49,7 +49,7 @@ float test_hfa(va_list *mylist) {
 // CHECK-LABEL: define{{.*}} float @test_hfa
 // CHECK: [[START:%.*]] = load ptr, ptr %mylist
 
-// CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[START]], i32 16
+// CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[START]], i32 16
 // CHECK: store ptr [[NEXT]], ptr %mylist
 
 // CHECK: [[RES:%.*]] = load float, ptr [[START]]
@@ -76,7 +76,7 @@ typedef struct {
 long long test_bigstruct(BigStruct input, va_list *mylist) {
 // CHECK-LABEL: define{{.*}} i64 @test_bigstruct(ptr
 // CHECK: [[START:%.*]] = load ptr, ptr %mylist
-// CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[START]], i32 4
+// CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[START]], i32 4
 // CHECK: store ptr [[NEXT]], ptr %mylist
 
 // CHECK: [[ADDR:%.*]] = load ptr, ptr [[START]]
@@ -97,7 +97,7 @@ short test_threeshorts(ThreeShorts input, va_list *mylist) {
 // CHECK-LABEL: define{{.*}} signext i16 @test_threeshorts([2 x i32] %input
 
 // CHECK: [[START:%.*]] = load ptr, ptr %mylist
-// CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[START]], i32 8
+// CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[START]], i32 8
 // CHECK: store ptr [[NEXT]], ptr %mylist
 
 // CHECK: [[RES:%.*]] = load i16, ptr [[START]]
diff --git a/clang/test/CodeGen/attr-counted-by-pr110385.c b/clang/test/CodeGen/attr-counted-by-pr110385.c
index e120dcc583578d..c2ff032334fe27 100644
--- a/clang/test/CodeGen/attr-counted-by-pr110385.c
+++ b/clang/test/CodeGen/attr-counted-by-pr110385.c
@@ -31,7 +31,7 @@ void init(void * __attribute__((pass_dynamic_object_size(0))));
 // CHECK-NEXT:    [[GROWABLE:%.*]] = getelementptr inbounds nuw i8, ptr [[FOO]], i64 8
 // CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[GROWABLE]], align 8, !tbaa [[TBAA2:![0-9]+]]
 // CHECK-NEXT:    [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP0]], i64 12
-// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[TMP0]], i64 8
+// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP0]], i64 8
 // CHECK-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
 // CHECK-NEXT:    [[TMP1:%.*]] = sext i32 [[DOT_COUNTED_BY_LOAD]] to i64
 // CHECK-NEXT:    [[TMP2:%.*]] = shl nsw i64 [[TMP1]], 1
@@ -48,7 +48,7 @@ void test1(struct bucket *foo) {
 // CHECK-SAME: ptr noundef [[FOO:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[FOO]], i64 16
-// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[FOO]], i64 12
+// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[FOO]], i64 12
 // CHECK-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = sext i32 [[DOT_COUNTED_BY_LOAD]] to i64
 // CHECK-NEXT:    [[TMP1:%.*]] = shl nsw i64 [[TMP0]], 1
diff --git a/clang/test/CodeGen/attr-counted-by.c b/clang/test/CodeGen/attr-counted-by.c
index 4a130c5e3d401f..1028bffaf896d7 100644
--- a/clang/test/CodeGen/attr-counted-by.c
+++ b/clang/test/CodeGen/attr-counted-by.c
@@ -60,13 +60,13 @@ struct anon_struct {
 // SANITIZE-WITH-ATTR-SAME: ptr noundef [[P:%.*]], i32 noundef [[INDEX:%.*]], i32 noundef [[VAL:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SANITIZE-WITH-ATTR-NEXT:  entry:
 // SANITIZE-WITH-ATTR-NEXT:    [[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64
-// SANITIZE-WITH-ATTR-NEXT:    [[DOTCOUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 8
+// SANITIZE-WITH-ATTR-NEXT:    [[DOTCOUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 8
 // SANITIZE-WITH-ATTR-NEXT:    [[DOTCOUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOTCOUNTED_BY_GEP]], align 4
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP0:%.*]] = zext i32 [[DOTCOUNTED_BY_LOAD]] to i64, !nosanitize [[META2:![0-9]+]]
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP1:%.*]] = icmp ult i64 [[IDXPROM]], [[TMP0]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    br i1 [[TMP1]], label [[CONT3:%.*]], label [[HANDLER_OUT_OF_BOUNDS:%.*]], !prof [[PROF3:![0-9]+]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR:       handler.out_of_bounds:
-// SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB1:[0-9]+]], i64 [[IDXPROM]]) #[[ATTR10:[0-9]+]], !nosanitize [[META2]]
+// SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB1:[0-9]+]], i64 [[IDXPROM]]) #[[ATTR9:[0-9]+]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    unreachable, !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR:       cont3:
 // SANITIZE-WITH-ATTR-NEXT:    [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
@@ -108,13 +108,13 @@ void test1(struct annotated *p, int index, int val) {
 // SANITIZE-WITH-ATTR-LABEL: define dso_local void @test2(
 // SANITIZE-WITH-ATTR-SAME: ptr noundef [[P:%.*]], i64 noundef [[INDEX:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SANITIZE-WITH-ATTR-NEXT:  entry:
-// SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 8
+// SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 8
 // SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP0:%.*]] = zext i32 [[DOT_COUNTED_BY_LOAD]] to i64, !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP1:%.*]] = icmp ult i64 [[INDEX]], [[TMP0]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    br i1 [[TMP1]], label [[CONT3:%.*]], label [[HANDLER_OUT_OF_BOUNDS:%.*]], !prof [[PROF3]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR:       handler.out_of_bounds:
-// SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB3:[0-9]+]], i64 [[INDEX]]) #[[ATTR10]], !nosanitize [[META2]]
+// SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB3:[0-9]+]], i64 [[INDEX]]) #[[ATTR9]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    unreachable, !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR:       cont3:
 // SANITIZE-WITH-ATTR-NEXT:    [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
@@ -128,7 +128,7 @@ void test1(struct annotated *p, int index, int val) {
 // NO-SANITIZE-WITH-ATTR-LABEL: define dso_local void @test2(
 // NO-SANITIZE-WITH-ATTR-SAME: ptr nocapture noundef [[P:%.*]], i64 noundef [[INDEX:%.*]]) local_unnamed_addr #[[ATTR1:[0-9]+]] {
 // NO-SANITIZE-WITH-ATTR-NEXT:  entry:
-// NO-SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 8
+// NO-SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 8
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[TMP0:%.*]] = shl i32 [[DOT_COUNTED_BY_LOAD]], 2
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[DO...
[truncated]

@llvmbot
Collaborator

llvmbot commented Oct 4, 2024

@llvm/pr-subscribers-coroutines

Author: Nikita Popov (nikic)

Changes

If the gep is nusw (usually via inbounds) and the offset is non-negative, we can infer nuw.

Unfortunately this inference does have some compile-time overhead: https://llvm-compile-time-tracker.com/compare.php?from=37e5319a12ba47c18049728804d3d1e1b10c4eb4&to=af56d73d6543f05b1e5205b96934e2427bb24d72&stat=instructions:u

Proof: https://alive2.llvm.org/ce/z/ihztLy
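
For intuition only, the arithmetic behind the inference can be checked exhaustively on a toy 8-bit address space. This is a sketch under assumptions, not the patch's code: the helper names `holdsNusw`, `holdsNuw`, and `countCounterexamples` are hypothetical, the 8-bit width is chosen purely so the loop is small, and the model covers only the final address addition (not the offset multiplication). The Alive2 link above remains the actual proof.

```cpp
#include <cassert>
#include <cstdint>

// Toy model (assumption): addresses are 8-bit unsigned, offsets are 8-bit signed.

// nusw: the address is read as unsigned, the offset as signed, and the
// sum must stay inside the unsigned address range [0, 255].
bool holdsNusw(std::uint8_t base, std::int8_t offset) {
  int wide = static_cast<int>(base) + static_cast<int>(offset);
  return wide >= 0 && wide <= 255;
}

// nuw: both the address and the offset are read as unsigned, and the
// sum must not exceed 255.
bool holdsNuw(std::uint8_t base, std::int8_t offset) {
  int wide = static_cast<int>(base) +
             static_cast<int>(static_cast<std::uint8_t>(offset));
  return wide <= 255;
}

// Counts (base, offset) pairs with a non-negative offset where nusw holds
// but nuw does not. The inference is sound in this model if the count is 0.
int countCounterexamples() {
  int bad = 0;
  for (int b = 0; b <= 255; ++b) {
    for (int o = 0; o <= 127; ++o) {
      auto base = static_cast<std::uint8_t>(b);
      auto offset = static_cast<std::int8_t>(o);
      if (holdsNusw(base, offset) && !holdsNuw(base, offset))
        ++bad;
    }
  }
  return bad;
}
```

With a non-negative offset the signed and unsigned readings of the offset coincide, so the two conditions collapse into the same check. A negative offset such as -1 shows why the non-negative requirement matters: from base 10 the signed sum is 9, which is in range, while the unsigned reading adds 255 and wraps.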


Patch is 1.28 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/111144.diff

183 Files Affected:

  • (modified) clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c (+5-5)
  • (modified) clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c (+14-14)
  • (modified) clang/test/CodeGen/aarch64-ls64-inline-asm.c (+10-10)
  • (modified) clang/test/CodeGen/arm64_32-vaarg.c (+6-6)
  • (modified) clang/test/CodeGen/attr-counted-by-pr110385.c (+2-2)
  • (modified) clang/test/CodeGen/attr-counted-by.c (+106-110)
  • (modified) clang/test/CodeGen/math-libcalls-tbaa.c (+7-7)
  • (modified) clang/test/CodeGen/union-tbaa1.c (+2-2)
  • (modified) clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu (+3-3)
  • (modified) clang/test/CodeGenCXX/auto-var-init.cpp (+4-4)
  • (modified) clang/test/CodeGenCXX/microsoft-abi-dynamic-cast.cpp (+9-9)
  • (modified) clang/test/CodeGenCXX/microsoft-abi-typeid.cpp (+1-1)
  • (modified) clang/test/CodeGenOpenCL/amdgpu-nullptr.cl (+2-2)
  • (modified) clang/test/CodeGenOpenCL/builtins-amdgcn.cl (+4-4)
  • (modified) clang/test/CodeGenOpenCLCXX/array-type-infinite-loop.clcpp (+12-12)
  • (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+9)
  • (modified) llvm/test/Analysis/BasicAA/featuretest.ll (+1-1)
  • (modified) llvm/test/Analysis/ValueTracking/phi-known-bits.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/implicit-arg-v5-opt.ll (+6-6)
  • (modified) llvm/test/CodeGen/AMDGPU/reqd-work-group-size.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/vector-alloca-bitcast.ll (+6-6)
  • (modified) llvm/test/Transforms/Coroutines/coro-async.ll (+12-12)
  • (modified) llvm/test/Transforms/Coroutines/coro-retcon-alloca-opaque-ptr.ll (+1-1)
  • (modified) llvm/test/Transforms/Coroutines/coro-retcon-alloca.ll (+1-1)
  • (modified) llvm/test/Transforms/Coroutines/coro-retcon-once-value.ll (+3-3)
  • (modified) llvm/test/Transforms/Coroutines/coro-retcon-resume-values.ll (+10-10)
  • (modified) llvm/test/Transforms/Coroutines/coro-swifterror.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/2007-03-25-BadShiftMask.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/2009-01-08-AlignAlloca.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/X86/x86-addsub-inseltpoison.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/X86/x86-addsub.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/array.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/assume-align.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/assume-loop-align.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/assume-redundant.ll (+5-1)
  • (modified) llvm/test/Transforms/InstCombine/assume.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/call-cast-target.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/cast_phi.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/cast_ptr.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/catchswitch-phi.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/compare-alloca.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/compare-unescaped.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/dependent-ivs.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/extractvalue.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/fmul.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/fsh.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/gep-addrspace.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/gep-canonicalize-constant-indices.ll (+9-9)
  • (modified) llvm/test/Transforms/InstCombine/gep-combine-loop-invariant.ll (+14-14)
  • (modified) llvm/test/Transforms/InstCombine/gep-merge-constant-indices.ll (+12-12)
  • (modified) llvm/test/Transforms/InstCombine/gep-vector-indices.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/gepphigep.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/getelementptr.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/icmp-custom-dl.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/icmp-gep.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/icmp.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/inbounds-gep.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/indexed-gep-compares.ll (+9-5)
  • (modified) llvm/test/Transforms/InstCombine/intptr1.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/intptr7.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/load-bitcast-select.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/mem-par-metadata-memcpy.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/memccpy.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/memcpy_alloca.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/mempcpy.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/memset2.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/opaque-ptr.ll (+10-6)
  • (modified) llvm/test/Transforms/InstCombine/phi-equal-incoming-pointers.ll (+14-14)
  • (modified) llvm/test/Transforms/InstCombine/phi-timeout.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/phi.ll (+22-22)
  • (modified) llvm/test/Transforms/InstCombine/ptr-replace-alloca.ll (+8-8)
  • (modified) llvm/test/Transforms/InstCombine/ptrmask.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/remove-loop-phi-multiply-by-zero.ll (+10-10)
  • (modified) llvm/test/Transforms/InstCombine/select-cmp-br.ll (+8-8)
  • (modified) llvm/test/Transforms/InstCombine/select-gep.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/sink_sideeffecting_instruction.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/snprintf-2.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/snprintf-3.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/snprintf-4.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/snprintf.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/sprintf-1.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/stpncpy-1.ll (+32-32)
  • (modified) llvm/test/Transforms/InstCombine/str-int.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/strlcpy-1.ll (+14-14)
  • (modified) llvm/test/Transforms/InstCombine/strlen-1.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/struct-assign-tbaa-2.ll (+4-2)
  • (modified) llvm/test/Transforms/InstCombine/sub.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/unpack-fca.ll (+26-26)
  • (modified) llvm/test/Transforms/InstCombine/vec_gep_scalar_arg-inseltpoison.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/vec_gep_scalar_arg.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/vscale_gep.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/wcslen-1.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/wcslen-3.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/wcslen-5.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/AArch64/runtime-unroll-generic.ll (+24-24)
  • (modified) llvm/test/Transforms/LoopUnroll/ARM/upperbound.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopUnroll/WebAssembly/basic-unrolling.ll (+31-31)
  • (modified) llvm/test/Transforms/LoopUnroll/peel-loop.ll (+6-6)
  • (modified) llvm/test/Transforms/LoopUnroll/runtime-unroll-remainder.ll (+8-8)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/deterministic-type-shrinkage.ll (+54-54)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-cond-inv-loads.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-gather-scatter.ll (+25-25)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll (+198-198)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-masked-accesses.ll (+95-95)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-phi.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt-epilogue.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt-no-scalar-interleave.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt-too-many-deps.ll (+5-5)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt.ll (+14-14)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/uniform-args-call-variants.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/AMDGPU/packed-math.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopVectorize/ARM/mve-reductions.ll (+12-12)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/float-induction-x86.ll (+36-36)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/interleaving.ll (+44-44)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/invariant-load-gather.ll (+24-24)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/invariant-store-vectorization.ll (+51-51)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/metadata-enable.ll (+474-474)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/pr23997.ll (+8-8)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll (+325-325)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-store-accesses-with-gaps.ll (+22-22)
  • (modified) llvm/test/Transforms/LoopVectorize/extract-last-veclane.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/float-induction.ll (+150-150)
  • (modified) llvm/test/Transforms/LoopVectorize/forked-pointers.ll (+15-15)
  • (modified) llvm/test/Transforms/LoopVectorize/histograms.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/induction.ll (+42-42)
  • (modified) llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll (+5-5)
  • (modified) llvm/test/Transforms/LoopVectorize/invariant-store-vectorization-2.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopVectorize/invariant-store-vectorization.ll (+11-11)
  • (modified) llvm/test/Transforms/LoopVectorize/loop-scalars.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopVectorize/reduction-inloop-uf4.ll (+229-229)
  • (modified) llvm/test/Transforms/LoopVectorize/reduction-inloop.ll (+4-4)
  • (modified) llvm/test/Transforms/LoopVectorize/reduction.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/runtime-check.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/scalar_after_vectorization.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/trunc-reductions.ll (+92-32)
  • (modified) llvm/test/Transforms/LoopVectorize/vector-geps.ll (+5-5)
  • (modified) llvm/test/Transforms/LoopVersioningLICM/loopversioningLICM1.ll (+13-13)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/hoist-runtime-checks.ll (+5-5)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/hoisting-sinking-required-for-vectorization.ll (+10-10)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/indvars-vectorization.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/loopflatten.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/matrix-extract-insert.ll (+23-23)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/quant_4x4.ll (+48-48)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/sinking-vs-if-conversion.ll (+7-7)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/slpordering.ll (+8-8)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/excessive-unrolling.ll (+12-12)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/hoist-load-of-baseptr.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/merge-functions2.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/merge-functions3.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/pixel-splat.ll (+4-4)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/pr50555.ll (+5-5)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/simplifycfg-late.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/speculation-vs-tbaa.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/spurious-peeling.ll (+4-4)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll (+24-24)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vec-load-combine.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vec-shift.ll (+4-4)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vector-reduction-known-first-value.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/basic.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/bitcast-store-branch.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/dce-after-argument-promotion-loads.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/gvn-replacement-vs-hoist.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/loop-access-checks.ll (+4-4)
  • (modified) llvm/test/Transforms/PhaseOrdering/lto-licm.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/pr39282.ll (+10-10)
  • (modified) llvm/test/Transforms/PhaseOrdering/pr98799-inline-simplifycfg-ub.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/scev-custom-dl.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/simplifycfg-options.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/single-iteration-loop-sroa.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/switch_with_geps.ll (+2-2)
  • (modified) llvm/test/Transforms/RewriteStatepointsForGC/intrinsics.ll (+12-12)
  • (modified) llvm/test/Transforms/SLPVectorizer/AArch64/gather-cost.ll (+6-6)
  • (modified) llvm/test/Transforms/SLPVectorizer/AArch64/getelementptr.ll (+2-2)
  • (modified) llvm/test/Transforms/SLPVectorizer/AArch64/getelementptr2.ll (+2-2)
  • (modified) llvm/test/Transforms/SLPVectorizer/AArch64/loadorder.ll (+32-32)
  • (modified) llvm/test/Transforms/SLPVectorizer/WebAssembly/no-vectorize-rotate.ll (+1-1)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/minimum-sizes.ll (+4-4)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/opt.ll (+3-3)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/pr47629-inseltpoison.ll (+131-131)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/pr47629.ll (+131-131)
  • (modified) llvm/test/Transforms/SampleProfile/pseudo-probe-instcombine.ll (+5-5)
  • (modified) llvm/test/Transforms/SimpleLoopUnswitch/AMDGPU/uniform-unswitch.ll (+1-1)
  • (modified) llvm/test/Transforms/SimplifyCFG/Hexagon/switch-to-lookup-table.ll (+1-1)
diff --git a/clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c b/clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c
index 5422d993ff1575..08ff936a0a797b 100644
--- a/clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c
+++ b/clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c
@@ -25,13 +25,13 @@ void test1(unsigned char *vqp, unsigned char *vpp, vector unsigned char vc, unsi
 // CHECK-NEXT:    [[TMP2:%.*]] = extractvalue { <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> } [[TMP1]], 0
 // CHECK-NEXT:    store <16 x i8> [[TMP2]], ptr [[RESP:%.*]], align 16
 // CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> } [[TMP1]], 1
-// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 16
+// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 16
 // CHECK-NEXT:    store <16 x i8> [[TMP3]], ptr [[TMP4]], align 16
 // CHECK-NEXT:    [[TMP5:%.*]] = extractvalue { <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> } [[TMP1]], 2
-// CHECK-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 32
+// CHECK-NEXT:    [[TMP6:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 32
 // CHECK-NEXT:    store <16 x i8> [[TMP5]], ptr [[TMP6]], align 16
 // CHECK-NEXT:    [[TMP7:%.*]] = extractvalue { <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> } [[TMP1]], 3
-// CHECK-NEXT:    [[TMP8:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 48
+// CHECK-NEXT:    [[TMP8:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 48
 // CHECK-NEXT:    store <16 x i8> [[TMP7]], ptr [[TMP8]], align 16
 // CHECK-NEXT:    ret void
 //
@@ -60,7 +60,7 @@ void test3(unsigned char *vqp, unsigned char *vpp, vector unsigned char vc, unsi
 // CHECK-NEXT:    [[TMP2:%.*]] = extractvalue { <16 x i8>, <16 x i8> } [[TMP1]], 0
 // CHECK-NEXT:    store <16 x i8> [[TMP2]], ptr [[RESP:%.*]], align 16
 // CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <16 x i8>, <16 x i8> } [[TMP1]], 1
-// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 16
+// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 16
 // CHECK-NEXT:    store <16 x i8> [[TMP3]], ptr [[TMP4]], align 16
 // CHECK-NEXT:    ret void
 //
@@ -1072,7 +1072,7 @@ void test76(unsigned char *vqp, unsigned char *vpp, vector unsigned char vc, uns
 // CHECK-NEXT:    [[TMP2:%.*]] = extractvalue { <16 x i8>, <16 x i8> } [[TMP1]], 0
 // CHECK-NEXT:    store <16 x i8> [[TMP2]], ptr [[RESP:%.*]], align 16
 // CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <16 x i8>, <16 x i8> } [[TMP1]], 1
-// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 16
+// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 16
 // CHECK-NEXT:    store <16 x i8> [[TMP3]], ptr [[TMP4]], align 16
 // CHECK-NEXT:    ret void
 //
diff --git a/clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c b/clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c
index 6194c9b1804fb0..2d9629eff3c98c 100644
--- a/clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c
+++ b/clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c
@@ -48,21 +48,21 @@ void test_indexing(struct Foo *f) {
 
 void test_indexing_2(struct Foo *f) {
   // X64-LABEL: define void @test_indexing_2(ptr noundef %f)
-  // X64: getelementptr inbounds i8, ptr addrspace(1) {{%[0-9]}}, i32 16
-  // X64: getelementptr inbounds i8, ptr {{%[0-9]}}, i64 24
+  // X64: getelementptr inbounds nuw i8, ptr addrspace(1) {{%[0-9]}}, i32 16
+  // X64: getelementptr inbounds nuw i8, ptr {{%[0-9]}}, i64 24
   f->cp64 = ((char *** __ptr32 *)1028)[1][2][3];
   use_foo(f);
 }
 
 unsigned long* test_misc() {
   // X64-LABEL: define ptr @test_misc()
-  // X64: %arrayidx = getelementptr inbounds i8, ptr addrspace(1) %0, i32 88
+  // X64: %arrayidx = getelementptr inbounds nuw i8, ptr addrspace(1) %0, i32 88
   // X64-NEXT: %1 = load ptr, ptr addrspace(1) %arrayidx
-  // X64-NEXT: %arrayidx1 = getelementptr inbounds i8, ptr %1, i64 8
+  // X64-NEXT: %arrayidx1 = getelementptr inbounds nuw i8, ptr %1, i64 8
   // X64-NEXT: %2 = load ptr, ptr %arrayidx1
-  // X64-NEXT: %arrayidx2 = getelementptr inbounds i8, ptr %2, i64 904
+  // X64-NEXT: %arrayidx2 = getelementptr inbounds nuw i8, ptr %2, i64 904
   // X64-NEXT: %3 = load ptr, ptr %arrayidx2
-  // X64-NEXT: %arrayidx3 = getelementptr inbounds i8, ptr %3, i64 1192
+  // X64-NEXT: %arrayidx3 = getelementptr inbounds nuw i8, ptr %3, i64 1192
   unsigned long* x = (unsigned long*)((char***** __ptr32*)1208)[0][11][1][113][149];
   return x;
 }
@@ -71,9 +71,9 @@ char* __ptr32* __ptr32 test_misc_2() {
   // X64-LABEL: define ptr addrspace(1) @test_misc_2()
   // X64: br i1 %cmp, label %if.then, label %if.end
   // X64: %1 = load ptr addrspace(1), ptr inttoptr (i64 16 to ptr)
-  // X64-NEXT: %arrayidx = getelementptr inbounds i8, ptr addrspace(1) %1, i32 544
+  // X64-NEXT: %arrayidx = getelementptr inbounds nuw i8, ptr addrspace(1) %1, i32 544
   // X64-NEXT: %2 = load ptr addrspace(1), ptr addrspace(1) %arrayidx
-  // X64-NEXT: %arrayidx1 = getelementptr inbounds i8, ptr addrspace(1) %2, i32 24
+  // X64-NEXT: %arrayidx1 = getelementptr inbounds nuw i8, ptr addrspace(1) %2, i32 24
   // X64-NEXT: %3 = load ptr addrspace(1), ptr addrspace(1) %arrayidx1
   // X64-NEXT: store ptr addrspace(1) %3, ptr @test_misc_2.res
   // X64: ret ptr addrspace(1)
@@ -88,7 +88,7 @@ unsigned short test_misc_3() {
   // X64-LABEL: define zeroext i16 @test_misc_3()
   // X64: %0 = load ptr addrspace(1), ptr inttoptr (i64 548 to ptr)
   // X64-NEXT: %1 = addrspacecast ptr addrspace(1) %0 to ptr
-  // X64-NEXT: %arrayidx = getelementptr inbounds i8, ptr %1, i64 36
+  // X64-NEXT: %arrayidx = getelementptr inbounds nuw i8, ptr %1, i64 36
   // X64-NEXT: %2 = load i16, ptr %arrayidx, align 2
   // X64-NEXT: ret i16 %2
   unsigned short this_asid = ((unsigned short*)(*(char* __ptr32*)(0x224)))[18];
@@ -97,10 +97,10 @@ unsigned short test_misc_3() {
 
 int test_misc_4() {
   // X64-LABEL: define signext range(i32 0, 2) i32 @test_misc_4()
-  // X64: getelementptr inbounds i8, ptr addrspace(1) {{%[0-9]}}, i32 88
-  // X64: getelementptr inbounds i8, ptr {{%[0-9]}}, i64 8
-  // X64: getelementptr inbounds i8, ptr {{%[0-9]}}, i64 984
-  // X64: getelementptr inbounds i8, ptr %3, i64 80
+  // X64: getelementptr inbounds nuw i8, ptr addrspace(1) {{%[0-9]}}, i32 88
+  // X64: getelementptr inbounds nuw i8, ptr {{%[0-9]}}, i64 8
+  // X64: getelementptr inbounds nuw i8, ptr {{%[0-9]}}, i64 984
+  // X64: getelementptr inbounds nuw i8, ptr %3, i64 80
   // X64: icmp sgt i32 {{.*[0-9]}}, 67240703
   // X64: ret i32
   int a = (*(int*)(80 + ((char**** __ptr32*)1208)[0][11][1][123]) > 0x040202FF);
@@ -189,7 +189,7 @@ int test_function_ptr32_is_32bit() {
 int get_processor_count() {
   // X64-LABEL: define signext range(i32 -128, 128) i32 @get_processor_count()
   // X64: load ptr addrspace(1), ptr inttoptr (i64 16 to ptr)
-  // X64-NEXT: [[ARR_IDX1:%[a-z].*]] = getelementptr inbounds i8, ptr addrspace(1) %0, i32 660
+  // X64-NEXT: [[ARR_IDX1:%[a-z].*]] = getelementptr inbounds nuw i8, ptr addrspace(1) %0, i32 660
   // X64: load ptr addrspace(1), ptr addrspace(1) [[ARR_IDX1]]
   // X64: load i8, ptr addrspace(1) {{%[a-z].*}}
   // X64: sext i8 {{%[0-9]}} to i32
diff --git a/clang/test/CodeGen/aarch64-ls64-inline-asm.c b/clang/test/CodeGen/aarch64-ls64-inline-asm.c
index a01393525bcd42..8aa0684dba14d0 100644
--- a/clang/test/CodeGen/aarch64-ls64-inline-asm.c
+++ b/clang/test/CodeGen/aarch64-ls64-inline-asm.c
@@ -5,7 +5,7 @@ struct foo { unsigned long long x[8]; };
 
 // CHECK-LABEL: @load(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:    [[TMP0:%.*]] = tail call i512 asm sideeffect "ld64b $0,[$1]", "=r,r,~{memory}"(ptr [[ADDR:%.*]]) #[[ATTR1:[0-9]+]], !srcloc !2
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call i512 asm sideeffect "ld64b $0,[$1]", "=r,r,~{memory}"(ptr [[ADDR:%.*]]) #[[ATTR1:[0-9]+]], !srcloc [[META2:![0-9]+]]
 // CHECK-NEXT:    store i512 [[TMP0]], ptr [[OUTPUT:%.*]], align 8
 // CHECK-NEXT:    ret void
 //
@@ -17,7 +17,7 @@ void load(struct foo *output, void *addr)
 // CHECK-LABEL: @store(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    [[TMP0:%.*]] = load i512, ptr [[INPUT:%.*]], align 8
-// CHECK-NEXT:    tail call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[TMP0]], ptr [[ADDR:%.*]]) #[[ATTR1]], !srcloc !3
+// CHECK-NEXT:    tail call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[TMP0]], ptr [[ADDR:%.*]]) #[[ATTR1]], !srcloc [[META3:![0-9]+]]
 // CHECK-NEXT:    ret void
 //
 void store(const struct foo *input, void *addr)
@@ -29,25 +29,25 @@ void store(const struct foo *input, void *addr)
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[IN:%.*]], align 4, !tbaa [[TBAA4:![0-9]+]]
 // CHECK-NEXT:    [[CONV:%.*]] = sext i32 [[TMP0]] to i64
-// CHECK-NEXT:    [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 4
+// CHECK-NEXT:    [[ARRAYIDX1:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 4
 // CHECK-NEXT:    [[TMP1:%.*]] = load i32, ptr [[ARRAYIDX1]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV2:%.*]] = sext i32 [[TMP1]] to i64
-// CHECK-NEXT:    [[ARRAYIDX4:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 16
+// CHECK-NEXT:    [[ARRAYIDX4:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 16
 // CHECK-NEXT:    [[TMP2:%.*]] = load i32, ptr [[ARRAYIDX4]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV5:%.*]] = sext i32 [[TMP2]] to i64
-// CHECK-NEXT:    [[ARRAYIDX7:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 64
+// CHECK-NEXT:    [[ARRAYIDX7:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 64
 // CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[ARRAYIDX7]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV8:%.*]] = sext i32 [[TMP3]] to i64
-// CHECK-NEXT:    [[ARRAYIDX10:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 100
+// CHECK-NEXT:    [[ARRAYIDX10:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 100
 // CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[ARRAYIDX10]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV11:%.*]] = sext i32 [[TMP4]] to i64
-// CHECK-NEXT:    [[ARRAYIDX13:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 144
+// CHECK-NEXT:    [[ARRAYIDX13:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 144
 // CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[ARRAYIDX13]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV14:%.*]] = sext i32 [[TMP5]] to i64
-// CHECK-NEXT:    [[ARRAYIDX16:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 196
+// CHECK-NEXT:    [[ARRAYIDX16:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 196
 // CHECK-NEXT:    [[TMP6:%.*]] = load i32, ptr [[ARRAYIDX16]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV17:%.*]] = sext i32 [[TMP6]] to i64
-// CHECK-NEXT:    [[ARRAYIDX19:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 256
+// CHECK-NEXT:    [[ARRAYIDX19:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 256
 // CHECK-NEXT:    [[TMP7:%.*]] = load i32, ptr [[ARRAYIDX19]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV20:%.*]] = sext i32 [[TMP7]] to i64
 // CHECK-NEXT:    [[S_SROA_10_0_INSERT_EXT:%.*]] = zext i64 [[CONV20]] to i512
@@ -72,7 +72,7 @@ void store(const struct foo *input, void *addr)
 // CHECK-NEXT:    [[S_SROA_0_0_INSERT_EXT:%.*]] = zext i64 [[CONV]] to i512
 // CHECK-NEXT:    [[S_SROA_0_0_INSERT_MASK:%.*]] = or disjoint i512 [[S_SROA_4_0_INSERT_MASK]], [[S_SROA_4_0_INSERT_SHIFT]]
 // CHECK-NEXT:    [[S_SROA_0_0_INSERT_INSERT:%.*]] = or i512 [[S_SROA_0_0_INSERT_MASK]], [[S_SROA_0_0_INSERT_EXT]]
-// CHECK-NEXT:    tail call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[S_SROA_0_0_INSERT_INSERT]], ptr [[ADDR:%.*]]) #[[ATTR1]], !srcloc !8
+// CHECK-NEXT:    tail call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[S_SROA_0_0_INSERT_INSERT]], ptr [[ADDR:%.*]]) #[[ATTR1]], !srcloc [[META8:![0-9]+]]
 // CHECK-NEXT:    ret void
 //
 void store2(int *in, void *addr)
diff --git a/clang/test/CodeGen/arm64_32-vaarg.c b/clang/test/CodeGen/arm64_32-vaarg.c
index 3f1f4443436da1..72c23d4967d2d3 100644
--- a/clang/test/CodeGen/arm64_32-vaarg.c
+++ b/clang/test/CodeGen/arm64_32-vaarg.c
@@ -10,7 +10,7 @@ typedef struct {
 int test_int(OneInt input, va_list *mylist) {
 // CHECK-LABEL: define{{.*}} i32 @test_int(i32 %input
 // CHECK: [[START:%.*]] = load ptr, ptr %mylist
-// CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[START]], i32 4
+// CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[START]], i32 4
 // CHECK: store ptr [[NEXT]], ptr %mylist
 
 // CHECK: [[RES:%.*]] = load i32, ptr [[START]]
@@ -28,9 +28,9 @@ typedef struct {
 long long test_longlong(OneLongLong input, va_list *mylist) {
   // CHECK-LABEL: define{{.*}} i64 @test_longlong(i64 %input
   // CHECK: [[STARTPTR:%.*]] = load ptr, ptr %mylist
-  // CHECK: [[ALIGN_TMP:%.+]] = getelementptr inbounds i8, ptr [[STARTPTR]], i32 7
+  // CHECK: [[ALIGN_TMP:%.+]] = getelementptr inbounds nuw i8, ptr [[STARTPTR]], i32 7
   // CHECK: [[ALIGNED_ADDR:%.+]] = tail call align 8 ptr @llvm.ptrmask.p0.i32(ptr nonnull [[ALIGN_TMP]], i32 -8)
-  // CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[ALIGNED_ADDR]], i32 8
+  // CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[ALIGNED_ADDR]], i32 8
   // CHECK: store ptr [[NEXT]], ptr %mylist
 
   // CHECK: [[RES:%.*]] = load i64, ptr [[ALIGNED_ADDR]]
@@ -49,7 +49,7 @@ float test_hfa(va_list *mylist) {
 // CHECK-LABEL: define{{.*}} float @test_hfa
 // CHECK: [[START:%.*]] = load ptr, ptr %mylist
 
-// CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[START]], i32 16
+// CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[START]], i32 16
 // CHECK: store ptr [[NEXT]], ptr %mylist
 
 // CHECK: [[RES:%.*]] = load float, ptr [[START]]
@@ -76,7 +76,7 @@ typedef struct {
 long long test_bigstruct(BigStruct input, va_list *mylist) {
 // CHECK-LABEL: define{{.*}} i64 @test_bigstruct(ptr
 // CHECK: [[START:%.*]] = load ptr, ptr %mylist
-// CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[START]], i32 4
+// CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[START]], i32 4
 // CHECK: store ptr [[NEXT]], ptr %mylist
 
 // CHECK: [[ADDR:%.*]] = load ptr, ptr [[START]]
@@ -97,7 +97,7 @@ short test_threeshorts(ThreeShorts input, va_list *mylist) {
 // CHECK-LABEL: define{{.*}} signext i16 @test_threeshorts([2 x i32] %input
 
 // CHECK: [[START:%.*]] = load ptr, ptr %mylist
-// CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[START]], i32 8
+// CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[START]], i32 8
 // CHECK: store ptr [[NEXT]], ptr %mylist
 
 // CHECK: [[RES:%.*]] = load i16, ptr [[START]]
diff --git a/clang/test/CodeGen/attr-counted-by-pr110385.c b/clang/test/CodeGen/attr-counted-by-pr110385.c
index e120dcc583578d..c2ff032334fe27 100644
--- a/clang/test/CodeGen/attr-counted-by-pr110385.c
+++ b/clang/test/CodeGen/attr-counted-by-pr110385.c
@@ -31,7 +31,7 @@ void init(void * __attribute__((pass_dynamic_object_size(0))));
 // CHECK-NEXT:    [[GROWABLE:%.*]] = getelementptr inbounds nuw i8, ptr [[FOO]], i64 8
 // CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[GROWABLE]], align 8, !tbaa [[TBAA2:![0-9]+]]
 // CHECK-NEXT:    [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP0]], i64 12
-// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[TMP0]], i64 8
+// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP0]], i64 8
 // CHECK-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
 // CHECK-NEXT:    [[TMP1:%.*]] = sext i32 [[DOT_COUNTED_BY_LOAD]] to i64
 // CHECK-NEXT:    [[TMP2:%.*]] = shl nsw i64 [[TMP1]], 1
@@ -48,7 +48,7 @@ void test1(struct bucket *foo) {
 // CHECK-SAME: ptr noundef [[FOO:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[FOO]], i64 16
-// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[FOO]], i64 12
+// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[FOO]], i64 12
 // CHECK-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = sext i32 [[DOT_COUNTED_BY_LOAD]] to i64
 // CHECK-NEXT:    [[TMP1:%.*]] = shl nsw i64 [[TMP0]], 1
diff --git a/clang/test/CodeGen/attr-counted-by.c b/clang/test/CodeGen/attr-counted-by.c
index 4a130c5e3d401f..1028bffaf896d7 100644
--- a/clang/test/CodeGen/attr-counted-by.c
+++ b/clang/test/CodeGen/attr-counted-by.c
@@ -60,13 +60,13 @@ struct anon_struct {
 // SANITIZE-WITH-ATTR-SAME: ptr noundef [[P:%.*]], i32 noundef [[INDEX:%.*]], i32 noundef [[VAL:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SANITIZE-WITH-ATTR-NEXT:  entry:
 // SANITIZE-WITH-ATTR-NEXT:    [[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64
-// SANITIZE-WITH-ATTR-NEXT:    [[DOTCOUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 8
+// SANITIZE-WITH-ATTR-NEXT:    [[DOTCOUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 8
 // SANITIZE-WITH-ATTR-NEXT:    [[DOTCOUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOTCOUNTED_BY_GEP]], align 4
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP0:%.*]] = zext i32 [[DOTCOUNTED_BY_LOAD]] to i64, !nosanitize [[META2:![0-9]+]]
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP1:%.*]] = icmp ult i64 [[IDXPROM]], [[TMP0]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    br i1 [[TMP1]], label [[CONT3:%.*]], label [[HANDLER_OUT_OF_BOUNDS:%.*]], !prof [[PROF3:![0-9]+]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR:       handler.out_of_bounds:
-// SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB1:[0-9]+]], i64 [[IDXPROM]]) #[[ATTR10:[0-9]+]], !nosanitize [[META2]]
+// SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB1:[0-9]+]], i64 [[IDXPROM]]) #[[ATTR9:[0-9]+]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    unreachable, !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR:       cont3:
 // SANITIZE-WITH-ATTR-NEXT:    [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
@@ -108,13 +108,13 @@ void test1(struct annotated *p, int index, int val) {
 // SANITIZE-WITH-ATTR-LABEL: define dso_local void @test2(
 // SANITIZE-WITH-ATTR-SAME: ptr noundef [[P:%.*]], i64 noundef [[INDEX:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SANITIZE-WITH-ATTR-NEXT:  entry:
-// SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 8
+// SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 8
 // SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP0:%.*]] = zext i32 [[DOT_COUNTED_BY_LOAD]] to i64, !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP1:%.*]] = icmp ult i64 [[INDEX]], [[TMP0]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    br i1 [[TMP1]], label [[CONT3:%.*]], label [[HANDLER_OUT_OF_BOUNDS:%.*]], !prof [[PROF3]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR:       handler.out_of_bounds:
-// SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB3:[0-9]+]], i64 [[INDEX]]) #[[ATTR10]], !nosanitize [[META2]]
+// SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB3:[0-9]+]], i64 [[INDEX]]) #[[ATTR9]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    unreachable, !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR:       cont3:
 // SANITIZE-WITH-ATTR-NEXT:    [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
@@ -128,7 +128,7 @@ void test1(struct annotated *p, int index, int val) {
 // NO-SANITIZE-WITH-ATTR-LABEL: define dso_local void @test2(
 // NO-SANITIZE-WITH-ATTR-SAME: ptr nocapture noundef [[P:%.*]], i64 noundef [[INDEX:%.*]]) local_unnamed_addr #[[ATTR1:[0-9]+]] {
 // NO-SANITIZE-WITH-ATTR-NEXT:  entry:
-// NO-SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 8
+// NO-SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 8
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[TMP0:%.*]] = shl i32 [[DOT_COUNTED_BY_LOAD]], 2
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[DO...
[truncated]

@llvmbot
Collaborator

llvmbot commented Oct 4, 2024

@llvm/pr-subscribers-pgo

Author: Nikita Popov (nikic)

Changes

If the gep is nusw (usually via inbounds) and the offset is non-negative, we can infer nuw.

Unfortunately this inference does have some compile-time overhead: https://llvm-compile-time-tracker.com/compare.php?from=37e5319a12ba47c18049728804d3d1e1b10c4eb4&to=af56d73d6543f05b1e5205b96934e2427bb24d72&stat=instructions:u

Proof: https://alive2.llvm.org/ce/z/ihztLy
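
For intuition only, the arithmetic behind the inference can be sketched in a toy 8-bit model. This is not the code from the patch. The helper names `holdsNusw`, `holdsNuw` and `nuswAndNonNegImpliesNuw` are hypothetical, and the model covers only the final base-plus-offset addition, not the scaling of indices. It assumes nusw means the exact sum of an unsigned base and a signed offset stays in range, and nuw means the same with the offset reinterpreted as unsigned.

```cpp
#include <cassert>
#include <cstdint>

// Toy 8-bit model (hypothetical helpers, not LLVM APIs).
// nusw: unsigned base plus signed offset, computed exactly, stays in [0, 255].
constexpr bool holdsNusw(uint8_t base, int8_t offset) {
  int exact = int(base) + int(offset);
  return exact >= 0 && exact <= 255;
}

// nuw: unsigned base plus the offset reinterpreted as unsigned stays in [0, 255].
constexpr bool holdsNuw(uint8_t base, int8_t offset) {
  int exact = int(base) + int(uint8_t(offset));
  return exact <= 255;
}

// Exhaustive check over all bases and all non-negative offsets:
// whenever nusw holds, nuw holds as well.
bool nuswAndNonNegImpliesNuw() {
  for (int b = 0; b <= 255; ++b)
    for (int o = 0; o <= 127; ++o)
      if (holdsNusw(uint8_t(b), int8_t(o)) && !holdsNuw(uint8_t(b), int8_t(o)))
        return false;
  return true;
}

// A negative offset shows why the non-negative condition is needed:
// 10 + (-1) is in range, but 10 + 255 wraps as an unsigned addition.
static_assert(holdsNusw(10, -1) && !holdsNuw(10, -1),
              "nusw alone does not imply nuw for negative offsets");
```

For a non-negative offset, sign extension and zero extension agree, so both predicates compute the same sum. That is why the exhaustive check finds no counterexample in this model.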


Patch is 1.28 MiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/111144.diff

183 Files Affected:

  • (modified) clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c (+5-5)
  • (modified) clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c (+14-14)
  • (modified) clang/test/CodeGen/aarch64-ls64-inline-asm.c (+10-10)
  • (modified) clang/test/CodeGen/arm64_32-vaarg.c (+6-6)
  • (modified) clang/test/CodeGen/attr-counted-by-pr110385.c (+2-2)
  • (modified) clang/test/CodeGen/attr-counted-by.c (+106-110)
  • (modified) clang/test/CodeGen/math-libcalls-tbaa.c (+7-7)
  • (modified) clang/test/CodeGen/union-tbaa1.c (+2-2)
  • (modified) clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu (+3-3)
  • (modified) clang/test/CodeGenCXX/auto-var-init.cpp (+4-4)
  • (modified) clang/test/CodeGenCXX/microsoft-abi-dynamic-cast.cpp (+9-9)
  • (modified) clang/test/CodeGenCXX/microsoft-abi-typeid.cpp (+1-1)
  • (modified) clang/test/CodeGenOpenCL/amdgpu-nullptr.cl (+2-2)
  • (modified) clang/test/CodeGenOpenCL/builtins-amdgcn.cl (+4-4)
  • (modified) clang/test/CodeGenOpenCLCXX/array-type-infinite-loop.clcpp (+12-12)
  • (modified) llvm/lib/Transforms/InstCombine/InstructionCombining.cpp (+9)
  • (modified) llvm/test/Analysis/BasicAA/featuretest.ll (+1-1)
  • (modified) llvm/test/Analysis/ValueTracking/phi-known-bits.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/implicit-arg-v5-opt.ll (+6-6)
  • (modified) llvm/test/CodeGen/AMDGPU/reqd-work-group-size.ll (+3-3)
  • (modified) llvm/test/CodeGen/AMDGPU/vector-alloca-bitcast.ll (+6-6)
  • (modified) llvm/test/Transforms/Coroutines/coro-async.ll (+12-12)
  • (modified) llvm/test/Transforms/Coroutines/coro-retcon-alloca-opaque-ptr.ll (+1-1)
  • (modified) llvm/test/Transforms/Coroutines/coro-retcon-alloca.ll (+1-1)
  • (modified) llvm/test/Transforms/Coroutines/coro-retcon-once-value.ll (+3-3)
  • (modified) llvm/test/Transforms/Coroutines/coro-retcon-resume-values.ll (+10-10)
  • (modified) llvm/test/Transforms/Coroutines/coro-swifterror.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/2007-03-25-BadShiftMask.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/2009-01-08-AlignAlloca.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/X86/x86-addsub-inseltpoison.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/X86/x86-addsub.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/array.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/assume-align.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/assume-loop-align.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/assume-redundant.ll (+5-1)
  • (modified) llvm/test/Transforms/InstCombine/assume.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/call-cast-target.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/cast_phi.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/cast_ptr.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/catchswitch-phi.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/compare-alloca.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/compare-unescaped.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/dependent-ivs.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/extractvalue.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/fmul.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/fsh.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/gep-addrspace.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/gep-canonicalize-constant-indices.ll (+9-9)
  • (modified) llvm/test/Transforms/InstCombine/gep-combine-loop-invariant.ll (+14-14)
  • (modified) llvm/test/Transforms/InstCombine/gep-merge-constant-indices.ll (+12-12)
  • (modified) llvm/test/Transforms/InstCombine/gep-vector-indices.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/gepphigep.ll (+5-5)
  • (modified) llvm/test/Transforms/InstCombine/getelementptr.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/icmp-custom-dl.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/icmp-gep.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/icmp.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/inbounds-gep.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/indexed-gep-compares.ll (+9-5)
  • (modified) llvm/test/Transforms/InstCombine/intptr1.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/intptr7.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/load-bitcast-select.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/mem-par-metadata-memcpy.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/memccpy.ll (+6-6)
  • (modified) llvm/test/Transforms/InstCombine/memcpy_alloca.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/mempcpy.ll (+2-2)
  • (modified) llvm/test/Transforms/InstCombine/memset2.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/opaque-ptr.ll (+10-6)
  • (modified) llvm/test/Transforms/InstCombine/phi-equal-incoming-pointers.ll (+14-14)
  • (modified) llvm/test/Transforms/InstCombine/phi-timeout.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/phi.ll (+22-22)
  • (modified) llvm/test/Transforms/InstCombine/ptr-replace-alloca.ll (+8-8)
  • (modified) llvm/test/Transforms/InstCombine/ptrmask.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/remove-loop-phi-multiply-by-zero.ll (+10-10)
  • (modified) llvm/test/Transforms/InstCombine/select-cmp-br.ll (+8-8)
  • (modified) llvm/test/Transforms/InstCombine/select-gep.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/sink_sideeffecting_instruction.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/snprintf-2.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/snprintf-3.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/snprintf-4.ll (+4-4)
  • (modified) llvm/test/Transforms/InstCombine/snprintf.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/sprintf-1.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/stpncpy-1.ll (+32-32)
  • (modified) llvm/test/Transforms/InstCombine/str-int.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/strlcpy-1.ll (+14-14)
  • (modified) llvm/test/Transforms/InstCombine/strlen-1.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/struct-assign-tbaa-2.ll (+4-2)
  • (modified) llvm/test/Transforms/InstCombine/sub.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/unpack-fca.ll (+26-26)
  • (modified) llvm/test/Transforms/InstCombine/vec_gep_scalar_arg-inseltpoison.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/vec_gep_scalar_arg.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/vscale_gep.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/wcslen-1.ll (+3-3)
  • (modified) llvm/test/Transforms/InstCombine/wcslen-3.ll (+1-1)
  • (modified) llvm/test/Transforms/InstCombine/wcslen-5.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopUnroll/AArch64/runtime-unroll-generic.ll (+24-24)
  • (modified) llvm/test/Transforms/LoopUnroll/ARM/upperbound.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopUnroll/WebAssembly/basic-unrolling.ll (+31-31)
  • (modified) llvm/test/Transforms/LoopUnroll/peel-loop.ll (+6-6)
  • (modified) llvm/test/Transforms/LoopUnroll/runtime-unroll-remainder.ll (+8-8)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/deterministic-type-shrinkage.ll (+54-54)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-cond-inv-loads.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-gather-scatter.ll (+25-25)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-accesses.ll (+198-198)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-interleaved-masked-accesses.ll (+95-95)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve-widen-phi.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt-epilogue.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt-no-scalar-interleave.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt-too-many-deps.ll (+5-5)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/sve2-histcnt.ll (+14-14)
  • (modified) llvm/test/Transforms/LoopVectorize/AArch64/uniform-args-call-variants.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/AMDGPU/packed-math.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopVectorize/ARM/mve-reductions.ll (+12-12)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/float-induction-x86.ll (+36-36)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/interleaving.ll (+44-44)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/invariant-load-gather.ll (+24-24)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/invariant-store-vectorization.ll (+51-51)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/metadata-enable.ll (+474-474)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/pr23997.ll (+8-8)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-accesses-masked-group.ll (+325-325)
  • (modified) llvm/test/Transforms/LoopVectorize/X86/x86-interleaved-store-accesses-with-gaps.ll (+22-22)
  • (modified) llvm/test/Transforms/LoopVectorize/extract-last-veclane.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/float-induction.ll (+150-150)
  • (modified) llvm/test/Transforms/LoopVectorize/forked-pointers.ll (+15-15)
  • (modified) llvm/test/Transforms/LoopVectorize/histograms.ll (+2-2)
  • (modified) llvm/test/Transforms/LoopVectorize/induction.ll (+42-42)
  • (modified) llvm/test/Transforms/LoopVectorize/interleaved-accesses.ll (+5-5)
  • (modified) llvm/test/Transforms/LoopVectorize/invariant-store-vectorization-2.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopVectorize/invariant-store-vectorization.ll (+11-11)
  • (modified) llvm/test/Transforms/LoopVectorize/loop-scalars.ll (+3-3)
  • (modified) llvm/test/Transforms/LoopVectorize/reduction-inloop-uf4.ll (+229-229)
  • (modified) llvm/test/Transforms/LoopVectorize/reduction-inloop.ll (+4-4)
  • (modified) llvm/test/Transforms/LoopVectorize/reduction.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/runtime-check.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/scalar_after_vectorization.ll (+1-1)
  • (modified) llvm/test/Transforms/LoopVectorize/trunc-reductions.ll (+92-32)
  • (modified) llvm/test/Transforms/LoopVectorize/vector-geps.ll (+5-5)
  • (modified) llvm/test/Transforms/LoopVersioningLICM/loopversioningLICM1.ll (+13-13)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/hoist-runtime-checks.ll (+5-5)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/hoisting-sinking-required-for-vectorization.ll (+10-10)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/indvars-vectorization.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/loopflatten.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/matrix-extract-insert.ll (+23-23)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/quant_4x4.ll (+48-48)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/sinking-vs-if-conversion.ll (+7-7)
  • (modified) llvm/test/Transforms/PhaseOrdering/AArch64/slpordering.ll (+8-8)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/excessive-unrolling.ll (+12-12)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/hoist-load-of-baseptr.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/merge-functions2.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/merge-functions3.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/pixel-splat.ll (+4-4)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/pr50555.ll (+5-5)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/simplifycfg-late.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/speculation-vs-tbaa.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/spurious-peeling.ll (+4-4)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vdiv.ll (+24-24)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vec-load-combine.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vec-shift.ll (+4-4)
  • (modified) llvm/test/Transforms/PhaseOrdering/X86/vector-reduction-known-first-value.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/basic.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/bitcast-store-branch.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/dce-after-argument-promotion-loads.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/gvn-replacement-vs-hoist.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/loop-access-checks.ll (+4-4)
  • (modified) llvm/test/Transforms/PhaseOrdering/lto-licm.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/pr39282.ll (+10-10)
  • (modified) llvm/test/Transforms/PhaseOrdering/pr98799-inline-simplifycfg-ub.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/scev-custom-dl.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/simplifycfg-options.ll (+1-1)
  • (modified) llvm/test/Transforms/PhaseOrdering/single-iteration-loop-sroa.ll (+2-2)
  • (modified) llvm/test/Transforms/PhaseOrdering/switch_with_geps.ll (+2-2)
  • (modified) llvm/test/Transforms/RewriteStatepointsForGC/intrinsics.ll (+12-12)
  • (modified) llvm/test/Transforms/SLPVectorizer/AArch64/gather-cost.ll (+6-6)
  • (modified) llvm/test/Transforms/SLPVectorizer/AArch64/getelementptr.ll (+2-2)
  • (modified) llvm/test/Transforms/SLPVectorizer/AArch64/getelementptr2.ll (+2-2)
  • (modified) llvm/test/Transforms/SLPVectorizer/AArch64/loadorder.ll (+32-32)
  • (modified) llvm/test/Transforms/SLPVectorizer/WebAssembly/no-vectorize-rotate.ll (+1-1)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/minimum-sizes.ll (+4-4)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/opt.ll (+3-3)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/pr47629-inseltpoison.ll (+131-131)
  • (modified) llvm/test/Transforms/SLPVectorizer/X86/pr47629.ll (+131-131)
  • (modified) llvm/test/Transforms/SampleProfile/pseudo-probe-instcombine.ll (+5-5)
  • (modified) llvm/test/Transforms/SimpleLoopUnswitch/AMDGPU/uniform-unswitch.ll (+1-1)
  • (modified) llvm/test/Transforms/SimplifyCFG/Hexagon/switch-to-lookup-table.ll (+1-1)
diff --git a/clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c b/clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c
index 5422d993ff1575..08ff936a0a797b 100644
--- a/clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c
+++ b/clang/test/CodeGen/PowerPC/builtins-ppc-pair-mma.c
@@ -25,13 +25,13 @@ void test1(unsigned char *vqp, unsigned char *vpp, vector unsigned char vc, unsi
 // CHECK-NEXT:    [[TMP2:%.*]] = extractvalue { <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> } [[TMP1]], 0
 // CHECK-NEXT:    store <16 x i8> [[TMP2]], ptr [[RESP:%.*]], align 16
 // CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> } [[TMP1]], 1
-// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 16
+// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 16
 // CHECK-NEXT:    store <16 x i8> [[TMP3]], ptr [[TMP4]], align 16
 // CHECK-NEXT:    [[TMP5:%.*]] = extractvalue { <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> } [[TMP1]], 2
-// CHECK-NEXT:    [[TMP6:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 32
+// CHECK-NEXT:    [[TMP6:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 32
 // CHECK-NEXT:    store <16 x i8> [[TMP5]], ptr [[TMP6]], align 16
 // CHECK-NEXT:    [[TMP7:%.*]] = extractvalue { <16 x i8>, <16 x i8>, <16 x i8>, <16 x i8> } [[TMP1]], 3
-// CHECK-NEXT:    [[TMP8:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 48
+// CHECK-NEXT:    [[TMP8:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 48
 // CHECK-NEXT:    store <16 x i8> [[TMP7]], ptr [[TMP8]], align 16
 // CHECK-NEXT:    ret void
 //
@@ -60,7 +60,7 @@ void test3(unsigned char *vqp, unsigned char *vpp, vector unsigned char vc, unsi
 // CHECK-NEXT:    [[TMP2:%.*]] = extractvalue { <16 x i8>, <16 x i8> } [[TMP1]], 0
 // CHECK-NEXT:    store <16 x i8> [[TMP2]], ptr [[RESP:%.*]], align 16
 // CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <16 x i8>, <16 x i8> } [[TMP1]], 1
-// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 16
+// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 16
 // CHECK-NEXT:    store <16 x i8> [[TMP3]], ptr [[TMP4]], align 16
 // CHECK-NEXT:    ret void
 //
@@ -1072,7 +1072,7 @@ void test76(unsigned char *vqp, unsigned char *vpp, vector unsigned char vc, uns
 // CHECK-NEXT:    [[TMP2:%.*]] = extractvalue { <16 x i8>, <16 x i8> } [[TMP1]], 0
 // CHECK-NEXT:    store <16 x i8> [[TMP2]], ptr [[RESP:%.*]], align 16
 // CHECK-NEXT:    [[TMP3:%.*]] = extractvalue { <16 x i8>, <16 x i8> } [[TMP1]], 1
-// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds i8, ptr [[RESP]], i64 16
+// CHECK-NEXT:    [[TMP4:%.*]] = getelementptr inbounds nuw i8, ptr [[RESP]], i64 16
 // CHECK-NEXT:    store <16 x i8> [[TMP3]], ptr [[TMP4]], align 16
 // CHECK-NEXT:    ret void
 //
diff --git a/clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c b/clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c
index 6194c9b1804fb0..2d9629eff3c98c 100644
--- a/clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c
+++ b/clang/test/CodeGen/SystemZ/zos-mixed-ptr-sizes.c
@@ -48,21 +48,21 @@ void test_indexing(struct Foo *f) {
 
 void test_indexing_2(struct Foo *f) {
   // X64-LABEL: define void @test_indexing_2(ptr noundef %f)
-  // X64: getelementptr inbounds i8, ptr addrspace(1) {{%[0-9]}}, i32 16
-  // X64: getelementptr inbounds i8, ptr {{%[0-9]}}, i64 24
+  // X64: getelementptr inbounds nuw i8, ptr addrspace(1) {{%[0-9]}}, i32 16
+  // X64: getelementptr inbounds nuw i8, ptr {{%[0-9]}}, i64 24
   f->cp64 = ((char *** __ptr32 *)1028)[1][2][3];
   use_foo(f);
 }
 
 unsigned long* test_misc() {
   // X64-LABEL: define ptr @test_misc()
-  // X64: %arrayidx = getelementptr inbounds i8, ptr addrspace(1) %0, i32 88
+  // X64: %arrayidx = getelementptr inbounds nuw i8, ptr addrspace(1) %0, i32 88
   // X64-NEXT: %1 = load ptr, ptr addrspace(1) %arrayidx
-  // X64-NEXT: %arrayidx1 = getelementptr inbounds i8, ptr %1, i64 8
+  // X64-NEXT: %arrayidx1 = getelementptr inbounds nuw i8, ptr %1, i64 8
   // X64-NEXT: %2 = load ptr, ptr %arrayidx1
-  // X64-NEXT: %arrayidx2 = getelementptr inbounds i8, ptr %2, i64 904
+  // X64-NEXT: %arrayidx2 = getelementptr inbounds nuw i8, ptr %2, i64 904
   // X64-NEXT: %3 = load ptr, ptr %arrayidx2
-  // X64-NEXT: %arrayidx3 = getelementptr inbounds i8, ptr %3, i64 1192
+  // X64-NEXT: %arrayidx3 = getelementptr inbounds nuw i8, ptr %3, i64 1192
   unsigned long* x = (unsigned long*)((char***** __ptr32*)1208)[0][11][1][113][149];
   return x;
 }
@@ -71,9 +71,9 @@ char* __ptr32* __ptr32 test_misc_2() {
   // X64-LABEL: define ptr addrspace(1) @test_misc_2()
   // X64: br i1 %cmp, label %if.then, label %if.end
   // X64: %1 = load ptr addrspace(1), ptr inttoptr (i64 16 to ptr)
-  // X64-NEXT: %arrayidx = getelementptr inbounds i8, ptr addrspace(1) %1, i32 544
+  // X64-NEXT: %arrayidx = getelementptr inbounds nuw i8, ptr addrspace(1) %1, i32 544
   // X64-NEXT: %2 = load ptr addrspace(1), ptr addrspace(1) %arrayidx
-  // X64-NEXT: %arrayidx1 = getelementptr inbounds i8, ptr addrspace(1) %2, i32 24
+  // X64-NEXT: %arrayidx1 = getelementptr inbounds nuw i8, ptr addrspace(1) %2, i32 24
   // X64-NEXT: %3 = load ptr addrspace(1), ptr addrspace(1) %arrayidx1
   // X64-NEXT: store ptr addrspace(1) %3, ptr @test_misc_2.res
   // X64: ret ptr addrspace(1)
@@ -88,7 +88,7 @@ unsigned short test_misc_3() {
   // X64-LABEL: define zeroext i16 @test_misc_3()
   // X64: %0 = load ptr addrspace(1), ptr inttoptr (i64 548 to ptr)
   // X64-NEXT: %1 = addrspacecast ptr addrspace(1) %0 to ptr
-  // X64-NEXT: %arrayidx = getelementptr inbounds i8, ptr %1, i64 36
+  // X64-NEXT: %arrayidx = getelementptr inbounds nuw i8, ptr %1, i64 36
   // X64-NEXT: %2 = load i16, ptr %arrayidx, align 2
   // X64-NEXT: ret i16 %2
   unsigned short this_asid = ((unsigned short*)(*(char* __ptr32*)(0x224)))[18];
@@ -97,10 +97,10 @@ unsigned short test_misc_3() {
 
 int test_misc_4() {
   // X64-LABEL: define signext range(i32 0, 2) i32 @test_misc_4()
-  // X64: getelementptr inbounds i8, ptr addrspace(1) {{%[0-9]}}, i32 88
-  // X64: getelementptr inbounds i8, ptr {{%[0-9]}}, i64 8
-  // X64: getelementptr inbounds i8, ptr {{%[0-9]}}, i64 984
-  // X64: getelementptr inbounds i8, ptr %3, i64 80
+  // X64: getelementptr inbounds nuw i8, ptr addrspace(1) {{%[0-9]}}, i32 88
+  // X64: getelementptr inbounds nuw i8, ptr {{%[0-9]}}, i64 8
+  // X64: getelementptr inbounds nuw i8, ptr {{%[0-9]}}, i64 984
+  // X64: getelementptr inbounds nuw i8, ptr %3, i64 80
   // X64: icmp sgt i32 {{.*[0-9]}}, 67240703
   // X64: ret i32
   int a = (*(int*)(80 + ((char**** __ptr32*)1208)[0][11][1][123]) > 0x040202FF);
@@ -189,7 +189,7 @@ int test_function_ptr32_is_32bit() {
 int get_processor_count() {
   // X64-LABEL: define signext range(i32 -128, 128) i32 @get_processor_count()
   // X64: load ptr addrspace(1), ptr inttoptr (i64 16 to ptr)
-  // X64-NEXT: [[ARR_IDX1:%[a-z].*]] = getelementptr inbounds i8, ptr addrspace(1) %0, i32 660
+  // X64-NEXT: [[ARR_IDX1:%[a-z].*]] = getelementptr inbounds nuw i8, ptr addrspace(1) %0, i32 660
   // X64: load ptr addrspace(1), ptr addrspace(1) [[ARR_IDX1]]
   // X64: load i8, ptr addrspace(1) {{%[a-z].*}}
   // X64: sext i8 {{%[0-9]}} to i32
diff --git a/clang/test/CodeGen/aarch64-ls64-inline-asm.c b/clang/test/CodeGen/aarch64-ls64-inline-asm.c
index a01393525bcd42..8aa0684dba14d0 100644
--- a/clang/test/CodeGen/aarch64-ls64-inline-asm.c
+++ b/clang/test/CodeGen/aarch64-ls64-inline-asm.c
@@ -5,7 +5,7 @@ struct foo { unsigned long long x[8]; };
 
 // CHECK-LABEL: @load(
 // CHECK-NEXT:  entry:
-// CHECK-NEXT:    [[TMP0:%.*]] = tail call i512 asm sideeffect "ld64b $0,[$1]", "=r,r,~{memory}"(ptr [[ADDR:%.*]]) #[[ATTR1:[0-9]+]], !srcloc !2
+// CHECK-NEXT:    [[TMP0:%.*]] = tail call i512 asm sideeffect "ld64b $0,[$1]", "=r,r,~{memory}"(ptr [[ADDR:%.*]]) #[[ATTR1:[0-9]+]], !srcloc [[META2:![0-9]+]]
 // CHECK-NEXT:    store i512 [[TMP0]], ptr [[OUTPUT:%.*]], align 8
 // CHECK-NEXT:    ret void
 //
@@ -17,7 +17,7 @@ void load(struct foo *output, void *addr)
 // CHECK-LABEL: @store(
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    [[TMP0:%.*]] = load i512, ptr [[INPUT:%.*]], align 8
-// CHECK-NEXT:    tail call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[TMP0]], ptr [[ADDR:%.*]]) #[[ATTR1]], !srcloc !3
+// CHECK-NEXT:    tail call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[TMP0]], ptr [[ADDR:%.*]]) #[[ATTR1]], !srcloc [[META3:![0-9]+]]
 // CHECK-NEXT:    ret void
 //
 void store(const struct foo *input, void *addr)
@@ -29,25 +29,25 @@ void store(const struct foo *input, void *addr)
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    [[TMP0:%.*]] = load i32, ptr [[IN:%.*]], align 4, !tbaa [[TBAA4:![0-9]+]]
 // CHECK-NEXT:    [[CONV:%.*]] = sext i32 [[TMP0]] to i64
-// CHECK-NEXT:    [[ARRAYIDX1:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 4
+// CHECK-NEXT:    [[ARRAYIDX1:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 4
 // CHECK-NEXT:    [[TMP1:%.*]] = load i32, ptr [[ARRAYIDX1]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV2:%.*]] = sext i32 [[TMP1]] to i64
-// CHECK-NEXT:    [[ARRAYIDX4:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 16
+// CHECK-NEXT:    [[ARRAYIDX4:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 16
 // CHECK-NEXT:    [[TMP2:%.*]] = load i32, ptr [[ARRAYIDX4]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV5:%.*]] = sext i32 [[TMP2]] to i64
-// CHECK-NEXT:    [[ARRAYIDX7:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 64
+// CHECK-NEXT:    [[ARRAYIDX7:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 64
 // CHECK-NEXT:    [[TMP3:%.*]] = load i32, ptr [[ARRAYIDX7]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV8:%.*]] = sext i32 [[TMP3]] to i64
-// CHECK-NEXT:    [[ARRAYIDX10:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 100
+// CHECK-NEXT:    [[ARRAYIDX10:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 100
 // CHECK-NEXT:    [[TMP4:%.*]] = load i32, ptr [[ARRAYIDX10]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV11:%.*]] = sext i32 [[TMP4]] to i64
-// CHECK-NEXT:    [[ARRAYIDX13:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 144
+// CHECK-NEXT:    [[ARRAYIDX13:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 144
 // CHECK-NEXT:    [[TMP5:%.*]] = load i32, ptr [[ARRAYIDX13]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV14:%.*]] = sext i32 [[TMP5]] to i64
-// CHECK-NEXT:    [[ARRAYIDX16:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 196
+// CHECK-NEXT:    [[ARRAYIDX16:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 196
 // CHECK-NEXT:    [[TMP6:%.*]] = load i32, ptr [[ARRAYIDX16]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV17:%.*]] = sext i32 [[TMP6]] to i64
-// CHECK-NEXT:    [[ARRAYIDX19:%.*]] = getelementptr inbounds i8, ptr [[IN]], i64 256
+// CHECK-NEXT:    [[ARRAYIDX19:%.*]] = getelementptr inbounds nuw i8, ptr [[IN]], i64 256
 // CHECK-NEXT:    [[TMP7:%.*]] = load i32, ptr [[ARRAYIDX19]], align 4, !tbaa [[TBAA4]]
 // CHECK-NEXT:    [[CONV20:%.*]] = sext i32 [[TMP7]] to i64
 // CHECK-NEXT:    [[S_SROA_10_0_INSERT_EXT:%.*]] = zext i64 [[CONV20]] to i512
@@ -72,7 +72,7 @@ void store(const struct foo *input, void *addr)
 // CHECK-NEXT:    [[S_SROA_0_0_INSERT_EXT:%.*]] = zext i64 [[CONV]] to i512
 // CHECK-NEXT:    [[S_SROA_0_0_INSERT_MASK:%.*]] = or disjoint i512 [[S_SROA_4_0_INSERT_MASK]], [[S_SROA_4_0_INSERT_SHIFT]]
 // CHECK-NEXT:    [[S_SROA_0_0_INSERT_INSERT:%.*]] = or i512 [[S_SROA_0_0_INSERT_MASK]], [[S_SROA_0_0_INSERT_EXT]]
-// CHECK-NEXT:    tail call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[S_SROA_0_0_INSERT_INSERT]], ptr [[ADDR:%.*]]) #[[ATTR1]], !srcloc !8
+// CHECK-NEXT:    tail call void asm sideeffect "st64b $0,[$1]", "r,r,~{memory}"(i512 [[S_SROA_0_0_INSERT_INSERT]], ptr [[ADDR:%.*]]) #[[ATTR1]], !srcloc [[META8:![0-9]+]]
 // CHECK-NEXT:    ret void
 //
 void store2(int *in, void *addr)
diff --git a/clang/test/CodeGen/arm64_32-vaarg.c b/clang/test/CodeGen/arm64_32-vaarg.c
index 3f1f4443436da1..72c23d4967d2d3 100644
--- a/clang/test/CodeGen/arm64_32-vaarg.c
+++ b/clang/test/CodeGen/arm64_32-vaarg.c
@@ -10,7 +10,7 @@ typedef struct {
 int test_int(OneInt input, va_list *mylist) {
 // CHECK-LABEL: define{{.*}} i32 @test_int(i32 %input
 // CHECK: [[START:%.*]] = load ptr, ptr %mylist
-// CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[START]], i32 4
+// CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[START]], i32 4
 // CHECK: store ptr [[NEXT]], ptr %mylist
 
 // CHECK: [[RES:%.*]] = load i32, ptr [[START]]
@@ -28,9 +28,9 @@ typedef struct {
 long long test_longlong(OneLongLong input, va_list *mylist) {
   // CHECK-LABEL: define{{.*}} i64 @test_longlong(i64 %input
   // CHECK: [[STARTPTR:%.*]] = load ptr, ptr %mylist
-  // CHECK: [[ALIGN_TMP:%.+]] = getelementptr inbounds i8, ptr [[STARTPTR]], i32 7
+  // CHECK: [[ALIGN_TMP:%.+]] = getelementptr inbounds nuw i8, ptr [[STARTPTR]], i32 7
   // CHECK: [[ALIGNED_ADDR:%.+]] = tail call align 8 ptr @llvm.ptrmask.p0.i32(ptr nonnull [[ALIGN_TMP]], i32 -8)
-  // CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[ALIGNED_ADDR]], i32 8
+  // CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[ALIGNED_ADDR]], i32 8
   // CHECK: store ptr [[NEXT]], ptr %mylist
 
   // CHECK: [[RES:%.*]] = load i64, ptr [[ALIGNED_ADDR]]
@@ -49,7 +49,7 @@ float test_hfa(va_list *mylist) {
 // CHECK-LABEL: define{{.*}} float @test_hfa
 // CHECK: [[START:%.*]] = load ptr, ptr %mylist
 
-// CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[START]], i32 16
+// CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[START]], i32 16
 // CHECK: store ptr [[NEXT]], ptr %mylist
 
 // CHECK: [[RES:%.*]] = load float, ptr [[START]]
@@ -76,7 +76,7 @@ typedef struct {
 long long test_bigstruct(BigStruct input, va_list *mylist) {
 // CHECK-LABEL: define{{.*}} i64 @test_bigstruct(ptr
 // CHECK: [[START:%.*]] = load ptr, ptr %mylist
-// CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[START]], i32 4
+// CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[START]], i32 4
 // CHECK: store ptr [[NEXT]], ptr %mylist
 
 // CHECK: [[ADDR:%.*]] = load ptr, ptr [[START]]
@@ -97,7 +97,7 @@ short test_threeshorts(ThreeShorts input, va_list *mylist) {
 // CHECK-LABEL: define{{.*}} signext i16 @test_threeshorts([2 x i32] %input
 
 // CHECK: [[START:%.*]] = load ptr, ptr %mylist
-// CHECK: [[NEXT:%.*]] = getelementptr inbounds i8, ptr [[START]], i32 8
+// CHECK: [[NEXT:%.*]] = getelementptr inbounds nuw i8, ptr [[START]], i32 8
 // CHECK: store ptr [[NEXT]], ptr %mylist
 
 // CHECK: [[RES:%.*]] = load i16, ptr [[START]]
diff --git a/clang/test/CodeGen/attr-counted-by-pr110385.c b/clang/test/CodeGen/attr-counted-by-pr110385.c
index e120dcc583578d..c2ff032334fe27 100644
--- a/clang/test/CodeGen/attr-counted-by-pr110385.c
+++ b/clang/test/CodeGen/attr-counted-by-pr110385.c
@@ -31,7 +31,7 @@ void init(void * __attribute__((pass_dynamic_object_size(0))));
 // CHECK-NEXT:    [[GROWABLE:%.*]] = getelementptr inbounds nuw i8, ptr [[FOO]], i64 8
 // CHECK-NEXT:    [[TMP0:%.*]] = load ptr, ptr [[GROWABLE]], align 8, !tbaa [[TBAA2:![0-9]+]]
 // CHECK-NEXT:    [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP0]], i64 12
-// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[TMP0]], i64 8
+// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[TMP0]], i64 8
 // CHECK-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
 // CHECK-NEXT:    [[TMP1:%.*]] = sext i32 [[DOT_COUNTED_BY_LOAD]] to i64
 // CHECK-NEXT:    [[TMP2:%.*]] = shl nsw i64 [[TMP1]], 1
@@ -48,7 +48,7 @@ void test1(struct bucket *foo) {
 // CHECK-SAME: ptr noundef [[FOO:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // CHECK-NEXT:  entry:
 // CHECK-NEXT:    [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[FOO]], i64 16
-// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[FOO]], i64 12
+// CHECK-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[FOO]], i64 12
 // CHECK-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
 // CHECK-NEXT:    [[TMP0:%.*]] = sext i32 [[DOT_COUNTED_BY_LOAD]] to i64
 // CHECK-NEXT:    [[TMP1:%.*]] = shl nsw i64 [[TMP0]], 1
diff --git a/clang/test/CodeGen/attr-counted-by.c b/clang/test/CodeGen/attr-counted-by.c
index 4a130c5e3d401f..1028bffaf896d7 100644
--- a/clang/test/CodeGen/attr-counted-by.c
+++ b/clang/test/CodeGen/attr-counted-by.c
@@ -60,13 +60,13 @@ struct anon_struct {
 // SANITIZE-WITH-ATTR-SAME: ptr noundef [[P:%.*]], i32 noundef [[INDEX:%.*]], i32 noundef [[VAL:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
 // SANITIZE-WITH-ATTR-NEXT:  entry:
 // SANITIZE-WITH-ATTR-NEXT:    [[IDXPROM:%.*]] = sext i32 [[INDEX]] to i64
-// SANITIZE-WITH-ATTR-NEXT:    [[DOTCOUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 8
+// SANITIZE-WITH-ATTR-NEXT:    [[DOTCOUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 8
 // SANITIZE-WITH-ATTR-NEXT:    [[DOTCOUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOTCOUNTED_BY_GEP]], align 4
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP0:%.*]] = zext i32 [[DOTCOUNTED_BY_LOAD]] to i64, !nosanitize [[META2:![0-9]+]]
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP1:%.*]] = icmp ult i64 [[IDXPROM]], [[TMP0]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    br i1 [[TMP1]], label [[CONT3:%.*]], label [[HANDLER_OUT_OF_BOUNDS:%.*]], !prof [[PROF3:![0-9]+]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR:       handler.out_of_bounds:
-// SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB1:[0-9]+]], i64 [[IDXPROM]]) #[[ATTR10:[0-9]+]], !nosanitize [[META2]]
+// SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB1:[0-9]+]], i64 [[IDXPROM]]) #[[ATTR9:[0-9]+]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    unreachable, !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR:       cont3:
 // SANITIZE-WITH-ATTR-NEXT:    [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
@@ -108,13 +108,13 @@ void test1(struct annotated *p, int index, int val) {
 // SANITIZE-WITH-ATTR-LABEL: define dso_local void @test2(
 // SANITIZE-WITH-ATTR-SAME: ptr noundef [[P:%.*]], i64 noundef [[INDEX:%.*]]) local_unnamed_addr #[[ATTR0]] {
 // SANITIZE-WITH-ATTR-NEXT:  entry:
-// SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 8
+// SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 8
 // SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP0:%.*]] = zext i32 [[DOT_COUNTED_BY_LOAD]] to i64, !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    [[TMP1:%.*]] = icmp ult i64 [[INDEX]], [[TMP0]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    br i1 [[TMP1]], label [[CONT3:%.*]], label [[HANDLER_OUT_OF_BOUNDS:%.*]], !prof [[PROF3]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR:       handler.out_of_bounds:
-// SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB3:[0-9]+]], i64 [[INDEX]]) #[[ATTR10]], !nosanitize [[META2]]
+// SANITIZE-WITH-ATTR-NEXT:    tail call void @__ubsan_handle_out_of_bounds_abort(ptr nonnull @[[GLOB3:[0-9]+]], i64 [[INDEX]]) #[[ATTR9]], !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR-NEXT:    unreachable, !nosanitize [[META2]]
 // SANITIZE-WITH-ATTR:       cont3:
 // SANITIZE-WITH-ATTR-NEXT:    [[ARRAY:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 12
@@ -128,7 +128,7 @@ void test1(struct annotated *p, int index, int val) {
 // NO-SANITIZE-WITH-ATTR-LABEL: define dso_local void @test2(
 // NO-SANITIZE-WITH-ATTR-SAME: ptr nocapture noundef [[P:%.*]], i64 noundef [[INDEX:%.*]]) local_unnamed_addr #[[ATTR1:[0-9]+]] {
 // NO-SANITIZE-WITH-ATTR-NEXT:  entry:
-// NO-SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds i8, ptr [[P]], i64 8
+// NO-SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_GEP:%.*]] = getelementptr inbounds nuw i8, ptr [[P]], i64 8
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[DOT_COUNTED_BY_LOAD:%.*]] = load i32, ptr [[DOT_COUNTED_BY_GEP]], align 4
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[TMP0:%.*]] = shl i32 [[DOT_COUNTED_BY_LOAD]], 2
 // NO-SANITIZE-WITH-ATTR-NEXT:    [[DO...
[truncated]

  if (GEP.hasNoUnsignedSignedWrap() && !GEP.hasNoUnsignedWrap() &&
      all_of(GEP.indices(), [&](Value *Idx) {
        return isKnownNonNegative(Idx, SQ.getWithInstruction(&GEP));
      })) {

Is there any alleviation of the compile-time impact if you check only constant indices first, to rule out the trivial case?
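The suggestion above can be sketched outside LLVM as a two-pass check: look at constant indices first, and only fall back to the costly query for the remaining ones. This is a minimal model, not the patch itself. `Index`, `QueryStats`, `constantIndex`, `variableIndex`, `expensiveIsKnownNonNegative`, and `canInferNuw` are hypothetical stand-ins invented for illustration; in the real code the costly query is `isKnownNonNegative`. Whether this ordering actually recovers compile time is not established here.

```cpp
#include <cstdint>
#include <optional>
#include <vector>

// Hypothetical stand-in for a GEP index operand: either a literal constant
// or an opaque value whose sign needs a (costly) analysis query.
struct Index {
  std::optional<int64_t> Constant; // set when the index is a literal
  bool AnalysisSaysNonNeg = false; // what the costly query would answer
};

Index constantIndex(int64_t V) { return Index{V, false}; }
Index variableIndex(bool NonNeg) { return Index{std::nullopt, NonNeg}; }

// Counts how often the costly query is issued, as a proxy for compile time.
struct QueryStats {
  int ExpensiveQueries = 0;
};

// Models the costly value-tracking query (assumption: a stand-in only).
bool expensiveIsKnownNonNegative(const Index &I, QueryStats &S) {
  ++S.ExpensiveQueries;
  return I.AnalysisSaysNonNeg;
}

// Returns true when nuw may be added: the GEP is nusw, not already nuw,
// and every index is known non-negative.
bool canInferNuw(bool HasNusw, bool HasNuw, const std::vector<Index> &Indices,
                 QueryStats &S) {
  if (!HasNusw || HasNuw)
    return false;
  // Pass 1: constants only. A negative literal rejects the GEP for free.
  for (const Index &I : Indices)
    if (I.Constant && *I.Constant < 0)
      return false;
  // Pass 2: costly query, issued only for non-constant indices.
  for (const Index &I : Indices)
    if (!I.Constant && !expensiveIsKnownNonNegative(I, S))
      return false;
  return true;
}
```

In this model, a GEP whose indices are all literals (as in most of the test updates in this diff, such as `i64 8` or `i64 904`) is decided without issuing any costly query.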

@goldsteinn goldsteinn left a comment

LGTM, although there might be a bit more we can do to reduce the compile-time impact.

      all_of(GEP.indices(), [&](Value *Idx) {
        return isKnownNonNegative(Idx, SQ.getWithInstruction(&GEP));
      })) {
    GEP.setNoWrapFlags(GEP.getNoWrapFlags() | GEPNoWrapFlags::noUnsignedWrap());

We should also drop the duplicate logic in other places:

SelectionDAGBuilder::visitGetElementPtr
InstCombinerImpl::visitPtrToInt
llvm/lib/Transforms/Scalar/LICM.cpp:hoistGEP
SeparateConstOffsetFromGEP::reorderGEP
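As the comment implies, the places listed above each rely on the same fact that this fold centralises: nusw plus a non-negative offset gives nuw. A small exhaustive model at 8 bits illustrates why the fact holds. It is a sketch under simplifying assumptions: a single byte offset (element type `i8`, so no index scaling), an 8-bit pointer index type, and my reading of the nusw and nuw rules for the final address addition. `holdsNusw`, `holdsNuw`, and `ruleIsSoundFor8Bit` are hypothetical names; the authoritative proof is the Alive2 link in the PR description.

```cpp
#include <cstdint>

// nusw (as modelled here): unsigned base plus *signed* offset stays
// within the 8-bit unsigned range.
bool holdsNusw(uint8_t Base, int8_t Off) {
  int Sum = int(Base) + int(Off);
  return Sum >= 0 && Sum <= 255;
}

// nuw (as modelled here): unsigned base plus the offset reinterpreted as
// *unsigned* stays within the 8-bit unsigned range.
bool holdsNuw(uint8_t Base, int8_t Off) {
  int Sum = int(Base) + int(uint8_t(Off));
  return Sum <= 255;
}

// Checks every base with every non-negative offset: whenever nusw holds,
// nuw must hold too, because both readings of the offset coincide.
bool ruleIsSoundFor8Bit() {
  for (int B = 0; B < 256; ++B)
    for (int O = 0; O <= 127; ++O)
      if (holdsNusw(uint8_t(B), int8_t(O)) && !holdsNuw(uint8_t(B), int8_t(O)))
        return false;
  return true;
}
```

The non-negativity condition is what makes the rule work: with base 10 and offset -1, `holdsNusw` is true (the address is 9) while `holdsNuw` is false (10 + 255 exceeds the range), so nuw cannot be inferred for a negative offset.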
