[DirectX] add GEP i8 legalization #142475

farzonl · 2025-06-02T20:06:31Z

The i8 legalization code in DXILLegalizePass's fixI8UseChain needs to be updated to check for i8 geps.
It seems like there are i8 GEPs being left around after we remove all the other i8 instructions and this is causing problem on validation.

Since this is cleaning up a missed GEP The approach is to assume the getPointerOperand is to an alloca we further will check if this is an array alloca then do some byte offset arithmetic to figure out the memory index to use. Finally we will emit the new gep and cleanup the old one.

Finally needed to update upcastI8AllocasAndUses to account for loads off of GEPs instead of just loads from the alloca.

The i8 legalization code in DXILLegalizePass's `fixI8UseChain` needs to be updated to check for i8 geps. It seems like there are i8 GEPs being left around after we remove all the other i8 instructions and this is causing problem on validation. Since this is cleaning up a missed GEP The approach is to assume the getPointerOperand is to an alloca we further will check if this is an array alloca then do some byte offset arithmetic to figure out the memory index to use. Finally we will emit the new gep and cleanup the old one. Finally needed to update upcastI8AllocasAndUses to account for loads off of GEPs instead of just loads from the alloca.

llvmbot · 2025-06-02T20:07:06Z

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-hlsl

@llvm/pr-subscribers-backend-directx

Author: Farzon Lotfi (farzonl)

Changes

fixes #140415

The i8 legalization code in DXILLegalizePass's fixI8UseChain needs to be updated to check for i8 geps.
It seems like there are i8 GEPs being left around after we remove all the other i8 instructions and this is causing problem on validation.

Since this is cleaning up a missed GEP The approach is to assume the getPointerOperand is to an alloca we further will check if this is an array alloca then do some byte offset arithmetic to figure out the memory index to use. Finally we will emit the new gep and cleanup the old one.

Finally needed to update upcastI8AllocasAndUses to account for loads off of GEPs instead of just loads from the alloca.

Full diff: https://github.com/llvm/llvm-project/pull/142475.diff

2 Files Affected:

(modified) llvm/lib/Target/DirectX/DXILLegalizePass.cpp (+50-8)
(modified) llvm/test/CodeGen/DirectX/legalize-i8.ll (+54)

diff --git a/llvm/lib/Target/DirectX/DXILLegalizePass.cpp b/llvm/lib/Target/DirectX/DXILLegalizePass.cpp
index 23883c936a20d..f6a80ebfc8435 100644
--- a/llvm/lib/Target/DirectX/DXILLegalizePass.cpp
+++ b/llvm/lib/Target/DirectX/DXILLegalizePass.cpp
@@ -95,6 +95,8 @@ static void fixI8UseChain(Instruction &I,
     Type *ElementType = NewOperands[0]->getType();
     if (auto *AI = dyn_cast<AllocaInst>(NewOperands[0]))
       ElementType = AI->getAllocatedType();
+    if (auto *GEP = dyn_cast<GetElementPtrInst>(NewOperands[0]))
+      ElementType = GEP->getSourceElementType();
     LoadInst *NewLoad = Builder.CreateLoad(ElementType, NewOperands[0]);
     ReplacedValues[Load] = NewLoad;
     ToRemove.push_back(Load);
@@ -164,6 +166,36 @@ static void fixI8UseChain(Instruction &I,
     if (AdjustedCast)
       Cast->replaceAllUsesWith(AdjustedCast);
   }
+  if (auto *GEP = dyn_cast<GetElementPtrInst>(&I)) {
+    if (!GEP->getType()->isPointerTy() ||
+        !GEP->getSourceElementType()->isIntegerTy(8))
+      return;
+
+    Value *BasePtr = GEP->getPointerOperand();
+    if (ReplacedValues.count(BasePtr))
+      BasePtr = ReplacedValues[BasePtr];
+
+    Type *ElementType = BasePtr->getType();
+    if (auto *AI = dyn_cast<AllocaInst>(BasePtr))
+      ElementType = AI->getAllocatedType();
+    if (auto *ArrTy = dyn_cast<ArrayType>(ElementType))
+      ElementType = ArrTy->getArrayElementType();
+
+    ConstantInt *Offset = dyn_cast<ConstantInt>(GEP->getOperand(1));
+    // Note: the only way to convert an i8 offset to an i32 offset without
+    // emitting code Would be to emit code. We sould expect this value to be a
+    // ConstantInt since Offsets are very regulalrly converted.
+    assert(Offset && "Offset is expected to be a ConstantInt");
+    uint32_t ByteOffset = Offset->getZExtValue();
+    uint32_t ElemSize = GEP->getDataLayout().getTypeAllocSize(ElementType);
+    assert(ElemSize > 0 && "ElementSize must be set");
+    uint32_t Index = ByteOffset / ElemSize;
+    Value *NewGEP =
+        Builder.CreateGEP(ElementType, BasePtr, Builder.getInt32(Index),
+                          GEP->getName(), GEP->getNoWrapFlags());
+    ReplacedValues[GEP] = NewGEP;
+    ToRemove.push_back(GEP);
+  }
 }
 
 static void upcastI8AllocasAndUses(Instruction &I,
@@ -175,15 +207,12 @@ static void upcastI8AllocasAndUses(Instruction &I,
 
   Type *SmallestType = nullptr;
 
-  for (User *U : AI->users()) {
-    auto *Load = dyn_cast<LoadInst>(U);
-    if (!Load)
-      continue;
+  auto ProcessLoad = [&](LoadInst *Load) {
     for (User *LU : Load->users()) {
       Type *Ty = nullptr;
-      if (auto *Cast = dyn_cast<CastInst>(LU))
+      if (auto *Cast = dyn_cast<CastInst>(LU)) {
         Ty = Cast->getType();
-      if (CallInst *CI = dyn_cast<CallInst>(LU)) {
+      } else if (auto *CI = dyn_cast<CallInst>(LU)) {
         if (CI->getIntrinsicID() == Intrinsic::memset)
           Ty = Type::getInt32Ty(CI->getContext());
       }
@@ -191,9 +220,22 @@ static void upcastI8AllocasAndUses(Instruction &I,
       if (!Ty)
         continue;
 
-      if (!SmallestType ||
-          Ty->getPrimitiveSizeInBits() < SmallestType->getPrimitiveSizeInBits())
+      if (!SmallestType || Ty->getPrimitiveSizeInBits() <
+                               SmallestType->getPrimitiveSizeInBits()) {
         SmallestType = Ty;
+      }
+    }
+  };
+
+  for (User *U : AI->users()) {
+    if (auto *Load = dyn_cast<LoadInst>(U))
+      ProcessLoad(Load);
+    else if (auto *GEP = dyn_cast<GetElementPtrInst>(U)) {
+      for (User *GU : GEP->users()) {
+        if (auto *Load = dyn_cast<LoadInst>(GU)) {
+          ProcessLoad(Load);
+        }
+      }
     }
   }
 
diff --git a/llvm/test/CodeGen/DirectX/legalize-i8.ll b/llvm/test/CodeGen/DirectX/legalize-i8.ll
index 2602be778cd86..d1d76ccf5c76c 100644
--- a/llvm/test/CodeGen/DirectX/legalize-i8.ll
+++ b/llvm/test/CodeGen/DirectX/legalize-i8.ll
@@ -106,3 +106,57 @@ define i32 @all_imm() {
   %2 = sext i8 %1 to i32
   ret i32 %2
 }
+
+define i32 @scalar_i8_geps() {
+  ; CHECK-LABEL: define i32 @scalar_i8_geps(
+  ; CHECK-NEXT:    [[ALLOCA:%.*]] = alloca i32, align 4
+  ; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds nuw i32, ptr [[ALLOCA]], i32 0
+  ; CHECK-NEXT:    [[LOAD:%.*]] = load i32, ptr [[GEP]], align 4
+  ; CHECK-NEXT:    ret i32 [[LOAD]]
+    %1 = alloca i8, align 4
+    %2 = getelementptr inbounds nuw i8, ptr %1, i32 0
+    %3 = load i8, ptr %2
+    %4 = sext i8 %3 to i32
+    ret i32 %4
+}
+
+define i32 @i8_geps_index0() {
+  ; CHECK-LABEL: define i32 @i8_geps_index0(
+  ; CHECK-NEXT:    [[ALLOCA:%.*]] = alloca [2 x i32], align 8
+  ; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds nuw i32, ptr [[ALLOCA]], i32 0
+  ; CHECK-NEXT:    [[LOAD:%.*]] = load i32, ptr [[GEP]], align 4
+  ; CHECK-NEXT:    ret i32 [[LOAD]]
+  %1 = alloca [2 x i32], align 8
+  %2 = getelementptr inbounds nuw i8, ptr %1, i32 0
+  %3 = load i8, ptr %2
+  %4 = sext i8 %3 to i32
+  ret i32 %4
+}
+
+define i32 @i8_geps_index1() {
+  ; CHECK-LABEL: define i32 @i8_geps_index1(
+  ; CHECK-NEXT:    [[ALLOCA:%.*]] = alloca [2 x i32], align 8
+  ; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds nuw i32, ptr [[ALLOCA]], i32 1
+  ; CHECK-NEXT:    [[LOAD:%.*]] = load i32, ptr [[GEP]]
+  ; CHECK-NEXT:    ret i32 [[LOAD]]
+  %1 = alloca [2 x i32], align 8
+  %2 = getelementptr inbounds nuw i8, ptr %1, i32 4
+  %3 = load i8, ptr %2
+  %4 = sext i8 %3 to i32
+  ret i32 %4
+}
+
+define i32 @i8_gep_store() {
+  ; CHECK-LABEL: define i32 @i8_gep_store(
+  ; CHECK-NEXT:    [[ALLOCA:%.*]] = alloca [2 x i32], align 8
+  ; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds nuw i32, ptr [[ALLOCA]], i32 1
+  ; CHECK-NEXT:    store i32 1, ptr [[GEP]], align 4
+  ; CHECK-NEXT:    [[LOAD:%.*]] = load i32, ptr [[GEP]]
+  ; CHECK-NEXT:    ret i32 [[LOAD]]
+  %1 = alloca [2 x i32], align 8
+  %2 = getelementptr inbounds nuw i8, ptr %1, i32 4
+  store i8 1, ptr %2
+  %3 = load i8, ptr %2
+  %4 = sext i8 %3 to i32
+  ret i32 %4
+}

llvm/lib/Target/DirectX/DXILLegalizePass.cpp

spall · 2025-06-02T20:14:52Z

llvm/lib/Target/DirectX/DXILLegalizePass.cpp

+
+    ConstantInt *Offset = dyn_cast<ConstantInt>(GEP->getOperand(1));
+    // Note: the only way to convert an i8 offset to an i32 offset without
+    // emitting code Would be to emit code. We sould expect this value to be a


llvm/lib/Target/DirectX/DXILLegalizePass.cpp

…in pr comment

Icohedron · 2025-06-03T17:38:36Z

This PR appears to fail to compile this shader: https://godbolt.org/z/9MPc8Wca5

Stack trace:

While deleting: ptr addrspace(3) %
Use still stuck around after Def is destroyed:  %2 = load float, ptr addrspace(3) <badref>, align 4
Uses remain when a value is destroyed!
UNREACHABLE executed at /workspace/llvm-project/llvm/lib/IR/Value.cpp:102!
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: bin/clang-dxc -T cs_6_3 -E CSMain -enable-16bit-types Reproduce.hlsl
1.      <eof> parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module 'Reproduce.hlsl'.
4.      Running pass 'DXIL Legalizer' on function '@CSMain'
 #0 0x00005faf059d7246 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /workspace/llvm-project/llvm/lib/Support/Unix/Signals.inc:804:13
 #1 0x00005faf059d4ff0 llvm::sys::RunSignalHandlers() /workspace/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x00005faf059430bd (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) /workspace/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:74:5
 #3 0x00005faf059430bd CrashRecoverySignalHandler(int) /workspace/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:391:51
 #4 0x000075164ba40a70 __restore_rt (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x40a70)
 #5 0x000075164ba988ec __pthread_kill_implementation (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x988ec)
 #6 0x000075164ba409c6 gsignal (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x409c6)
 #7 0x000075164ba28938 abort (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x28938)
 #8 0x00005faf05949c9f (bin/clang-dxc+0x4201c9f)
 #9 0x00005faf054ffd50 llvm::Value::~Value() /workspace/llvm-project/llvm/lib/IR/Value.cpp:0:5
#10 0x00005faf054ff976 llvm::Value::deleteValue() /workspace/llvm-project/llvm/lib/IR/Value.cpp:0:3
#11 0x00005faf054673d3 llvm::iplist_impl<llvm::simple_ilist<llvm::Instruction, llvm::ilist_iterator_bits<true>, llvm::ilist_parent<llvm::BasicBlock>>, llvm::SymbolTableListTraits<llvm::Instruction, llvm::ilist_iterator_bits<true>, llvm::ilist_parent<llvm::BasicBlock>>>::erase(llvm::ilist_iterator_w_bits<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void, true, llvm::BasicBlock>, false, false>) /workspace/llvm-project/llvm/include/llvm/ADT/ilist.h:206:12
#12 0x00005faf054673d3 llvm::Instruction::eraseFromParent() /workspace/llvm-project/llvm/lib/IR/Instruction.cpp:96:37
#13 0x00005faf0472710a bool std::operator==<llvm::Instruction**>(std::reverse_iterator<llvm::Instruction**> const&, std::reverse_iterator<llvm::Instruction**> const&) /nix/store/gnf3mv68i5g6jmabnbbncsar3kbg13zd-gcc-14-20241116/include/c++/14-20241116/bits/stl_iterator.h:443:25
#14 0x00005faf0472710a bool std::operator!=<llvm::Instruction**>(std::reverse_iterator<llvm::Instruction**> const&, std::reverse_iterator<llvm::Instruction**> const&) /nix/store/gnf3mv68i5g6jmabnbbncsar3kbg13zd-gcc-14-20241116/include/c++/14-20241116/bits/stl_iterator.h:457:20
#15 0x00005faf0472710a (anonymous namespace)::DXILLegalizationPipeline::runLegalizationPipeline(llvm::Function&) /workspace/llvm-project/llvm/lib/Target/DirectX/DXILLegalizePass.cpp:477:21
#16 0x00005faf0472ad22 (anonymous namespace)::DXILLegalizeLegacy::runOnFunction(llvm::Function&) /workspace/llvm-project/llvm/lib/Target/DirectX/DXILLegalizePass.cpp:522:21
#17 0x00005faf05499da9 llvm::FPPassManager::runOnFunction(llvm::Function&) /workspace/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1406:27
#18 0x00005faf054a0c92 llvm::FPPassManager::runOnModule(llvm::Module&) /workspace/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1452:13
#19 0x00005faf0549a4a0 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /workspace/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:0:27
#20 0x00005faf0549a4a0 llvm::legacy::PassManagerImpl::run(llvm::Module&) /workspace/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:539:44
#21 0x00005faf061298ea (anonymous namespace)::EmitAssemblyHelper::RunCodegenPipeline(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>&, std::unique_ptr<llvm::ToolOutputFile, std::default_delete<llvm::ToolOutputFile>>&) /workspace/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1262:9
#22 0x00005faf061298ea (anonymous namespace)::EmitAssemblyHelper::emitAssembly(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) /workspace/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1285:3
#23 0x00005faf061298ea clang::emitBackendOutput(clang::CompilerInstance&, clang::CodeGenOptions&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) /workspace/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1450:13
#24 0x00005faf0613d5ea std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>::~unique_ptr() /nix/store/gnf3mv68i5g6jmabnbbncsar3kbg13zd-gcc-14-20241116/include/c++/14-20241116/bits/unique_ptr.h:398:6
#25 0x00005faf0613d5ea clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) /workspace/llvm-project/clang/lib/CodeGen/CodeGenAction.cpp:316:3
#26 0x00005faf07c789d9 __gnu_cxx::__normal_iterator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>*, std::vector<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>, std::allocator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>>>>::__normal_iterator(std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>* const&) /nix/store/gnf3mv68i5g6jmabnbbncsar3kbg13zd-gcc-14-20241116/include/c++/14-20241116/bits/stl_iterator.h:1068:20
#27 0x00005faf07c789d9 std::vector<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>, std::allocator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>>>::begin() /nix/store/gnf3mv68i5g6jmabnbbncsar3kbg13zd-gcc-14-20241116/include/c++/14-20241116/bits/stl_vector.h:874:16
#28 0x00005faf07c789d9 void clang::finalize<std::vector<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>, std::allocator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>>>>(std::vector<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>, std::allocator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>>>&, clang::Sema const&) /workspace/llvm-project/clang/include/clang/Sema/TemplateInstCallback.h:54:16
#29 0x00005faf07c789d9 clang::ParseAST(clang::Sema&, bool, bool) /workspace/llvm-project/clang/lib/Parse/ParseAST.cpp:190:3
#30 0x00005faf067209f4 clang::FrontendAction::Execute() /workspace/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1225:10
#31 0x00005faf066904ad llvm::Error::getPtr() const /workspace/llvm-project/llvm/include/llvm/Support/Error.h:278:42
#32 0x00005faf066904ad llvm::Error::operator bool() /workspace/llvm-project/llvm/include/llvm/Support/Error.h:241:16
#33 0x00005faf066904ad clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /workspace/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1055:23
#34 0x00005faf0680e5c8 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /workspace/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:300:25
#35 0x00005faf045cd043 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /workspace/llvm-project/clang/tools/driver/cc1_main.cpp:297:15
#36 0x00005faf045c93b5 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) /workspace/llvm-project/clang/tools/driver/driver.cpp:223:12
#37 0x00005faf06500bd9 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_0::operator()() const /workspace/llvm-project/clang/lib/Driver/Job.cpp:436:30
#38 0x00005faf06500bd9 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_0>(long) /workspace/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:46:12
#39 0x00005faf05942cd2 llvm::function_ref<void ()>::operator()() const /workspace/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:0:12
#40 0x00005faf05942cd2 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) /workspace/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:427:3
#41 0x00005faf06500284 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const /workspace/llvm-project/clang/lib/Driver/Job.cpp:436:7
#42 0x00005faf064c1d78 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const /workspace/llvm-project/clang/lib/Driver/Compilation.cpp:196:15
#43 0x00005faf064c1fc7 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const /workspace/llvm-project/clang/lib/Driver/Compilation.cpp:251:13
#44 0x00005faf064de649 llvm::SmallVectorBase<unsigned int>::empty() const /workspace/llvm-project/llvm/include/llvm/ADT/SmallVector.h:82:46
#45 0x00005faf064de649 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) /workspace/llvm-project/clang/lib/Driver/Driver.cpp:2229:23
#46 0x00005faf045c8d14 clang_main(int, char**, llvm::ToolContext const&) /workspace/llvm-project/clang/tools/driver/driver.cpp:406:21
#47 0x00005faf045d9006 main /workspace/llvm-project/build/tools/clang/tools/driver/clang-driver.cpp:17:10
#48 0x000075164ba2a1fc __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fc)
#49 0x000075164ba2a2b9 __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b9)
#50 0x00005faf045c6d25 _start (bin/clang-dxc+0x2e7ed25)
clang-dxc: error: clang frontend command failed with exit code 134 (use -v to see invocation)
clang version 21.0.0git (git@github.com:Icohedron/llvm-project.git 19668a1619396f3ea7447df4644b362af870c982)
Target: dxilv1.3-unknown-shadermodel6.3-compute
Thread model: posix
InstalledDir: /workspace/llvm-project/build/bin
Build config: +assertions
clang-dxc: note: diagnostic msg:

Found a shader where Clang emits an i8 GEP that fails to legalize with this PR

Icohedron · 2025-06-03T18:53:36Z

This PR appears to fail to compile this shader: https://godbolt.org/z/9MPc8Wca5

This shader now compiles, but it currently fails DXIL validation with the message

Explicit gep operator type does not match pointee type of pointer operand
Validation failed.

I think the i8 legalization is producing an incorrect GEP.

I see the pass converting these instructions:

  %2 = getelementptr inbounds nuw i8, ptr addrspace(3) @g, i32 4
  %3 = load float, ptr addrspace(3) %2, align 4

into

  %2 = load float, ptr addrspace(3) getelementptr inbounds nuw (ptr addrspace(3), ptr addrspace(3) @g, i32 1), align 4

but I would have expected this instead:

  %2 = load float, ptr addrspace(3) getelementptr inbounds nuw ([2 x float], ptr addrspace(3) @g, i32 0, i32 1), align 4

because the global is defined as @g = local_unnamed_addr addrspace(3) global [2 x float] zeroinitializer, align 4 and an earlier store operation in the IR looks fine

  %arrayidx.i = getelementptr inbounds nuw [2 x float], ptr addrspace(3) @g, i32 0, i32 %1
  store float 0.000000e+00, ptr addrspace(3) %arrayidx.i, align 4, !tbaa !5

Invalid GEP conversion

farzonl · 2025-06-03T20:14:01Z

the bug I was looking at used allocas, this one uses a globals its an easy fix to add:

if (auto *GV = dyn_cast<GlobalVariable>(BasePtr))
      ElementType = GV->getValueType();

However during my testing I found another gap. we still need to figure out how to support constExpr GEPs.

This PR appears to fail to compile this shader: https://godbolt.org/z/9MPc8Wca5

This shader now compiles, but it currently fails DXIL validation

llvm/test/CodeGen/DirectX/legalize-i8.ll

Icohedron · 2025-06-03T21:10:16Z

the bug I was looking at used allocsa, this one uses a globals its an easy fix to add:
if (auto *GV = dyn_cast<GlobalVariable>(BasePtr))
      ElementType = GV->getValueType();
However during my testing I found another gap. we still need to figure out how to support constExpr GEPs.

This PR appears to fail to compile this shader: https://godbolt.org/z/9MPc8Wca5

This shader now compiles, but it currently fails DXIL validation

Incorporating that change still results in a shader that doesn't pass validation due to the same message: Explicit gep operator type does not match pointee type of pointer operand

The DXIL Legalize pass now emits:

@g = local_unnamed_addr addrspace(3) global [2 x float] zeroinitializer, align 4
...
define void @CSMain() local_unnamed_addr #0 {
entry:
  ...
  %2 = load float, ptr addrspace(3) getelementptr inbounds nuw (float, ptr addrspace(3) @g, i32 1), align 4

I would expect the %2 instruction to be the following to be a valid GEP:

  %2 = load float, ptr addrspace(3) getelementptr inbounds nuw ([2 x float], ptr addrspace(3) @g, i32 0, i32 1), align 4

Since vectors have been scalarized and arrays have been flattened, this should be straightforward to fix.

…with constexpr GEPs

llvm/test/CodeGen/DirectX/legalize-i8.ll

farzonl self-assigned this Jun 2, 2025

llvmbot added the backend:DirectX label Jun 2, 2025

spall reviewed Jun 2, 2025

View reviewed changes

farzonl marked this pull request as draft June 2, 2025 21:00

fix comment up, fix up some style guide reuquired changes called out …

19668a1

…in pr comment

farzonl marked this pull request as ready for review June 2, 2025 21:49

Icohedron previously approved these changes Jun 2, 2025

View reviewed changes

add replaceAllUsesWith back to update old gep to new gep

6f4702a

Icohedron previously approved these changes Jun 3, 2025

View reviewed changes

Icohedron reviewed Jun 3, 2025

View reviewed changes

llvm/test/CodeGen/DirectX/legalize-i8.ll Outdated Show resolved Hide resolved

llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. HLSL HLSL Language Support labels Jun 3, 2025

Add support for GEPs with global ptr operands. Add support for loads …

74b05e0

…with constexpr GEPs

farzonl force-pushed the bugfix/issue-140415 branch from 221c685 to 74b05e0 Compare June 3, 2025 21:35

make geps have array types to make the validator happy

5f1e240

Icohedron reviewed Jun 3, 2025

View reviewed changes

llvm/test/CodeGen/DirectX/legalize-i8.ll Show resolved Hide resolved

Icohedron approved these changes Jun 4, 2025

View reviewed changes

Icohedron mentioned this pull request Jun 4, 2025

[DirectX] Validation error: Access to out-of-bounds memory is disallowed #140417

Open

farzonl merged commit 5411ebd into llvm:main Jun 4, 2025
12 checks passed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

[DirectX] add GEP i8 legalization #142475

[DirectX] add GEP i8 legalization #142475

Uh oh!

farzonl commented Jun 2, 2025

Uh oh!

llvmbot commented Jun 2, 2025 •

edited

Loading

Uh oh!

Uh oh!

spall Jun 2, 2025

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Icohedron commented Jun 3, 2025

Uh oh!

Icohedron commented Jun 3, 2025

Uh oh!

farzonl commented Jun 3, 2025 •

edited

Loading

Uh oh!

Uh oh!

Icohedron commented Jun 3, 2025 •

edited

Loading

Uh oh!

Uh oh!

Uh oh!

Uh oh!

[DirectX] add GEP i8 legalization #142475

[DirectX] add GEP i8 legalization #142475

Uh oh!

Conversation

farzonl commented Jun 2, 2025

Uh oh!

llvmbot commented Jun 2, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

spall Jun 2, 2025

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Icohedron commented Jun 3, 2025

Uh oh!

Icohedron commented Jun 3, 2025

Uh oh!

farzonl commented Jun 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Icohedron commented Jun 3, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

llvmbot commented Jun 2, 2025 •

edited

Loading

farzonl commented Jun 3, 2025 •

edited

Loading

Icohedron commented Jun 3, 2025 •

edited

Loading