Skip to content

[DirectX] add GEP i8 legalization #142475

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Jun 4, 2025
Merged

Conversation

farzonl
Copy link
Member

@farzonl farzonl commented Jun 2, 2025

fixes #140415

The i8 legalization code in DXILLegalizePass's fixI8UseChain needs to be updated to check for i8 geps.
It seems like there are i8 GEPs being left around after we remove all the other i8 instructions and this is causing problem on validation.

Since this is cleaning up a missed GEP The approach is to assume the getPointerOperand is to an alloca we further will check if this is an array alloca then do some byte offset arithmetic to figure out the memory index to use. Finally we will emit the new gep and cleanup the old one.

Finally needed to update upcastI8AllocasAndUses to account for loads off of GEPs instead of just loads from the alloca.

The i8 legalization code in DXILLegalizePass's `fixI8UseChain`
needs to be updated to check for i8 geps.
It seems like there are i8 GEPs being left around after we remove all
the other i8 instructions and this is causing problem on validation.

Since this is cleaning up a missed GEP The approach is to assume the
getPointerOperand is to an alloca we further will check if this is
an array alloca then do some byte offset arithmetic to figure out the
memory index to use. Finally we will emit the new gep and cleanup the
old one.

Finally needed to update upcastI8AllocasAndUses to account for loads off
of GEPs instead of just loads from the alloca.
@farzonl farzonl self-assigned this Jun 2, 2025
@llvmbot
Copy link
Member

llvmbot commented Jun 2, 2025

@llvm/pr-subscribers-clang
@llvm/pr-subscribers-hlsl

@llvm/pr-subscribers-backend-directx

Author: Farzon Lotfi (farzonl)

Changes

fixes #140415

The i8 legalization code in DXILLegalizePass's fixI8UseChain needs to be updated to check for i8 geps.
It seems like there are i8 GEPs being left around after we remove all the other i8 instructions and this is causing problem on validation.

Since this is cleaning up a missed GEP The approach is to assume the getPointerOperand is to an alloca we further will check if this is an array alloca then do some byte offset arithmetic to figure out the memory index to use. Finally we will emit the new gep and cleanup the old one.

Finally needed to update upcastI8AllocasAndUses to account for loads off of GEPs instead of just loads from the alloca.


Full diff: https://github.com/llvm/llvm-project/pull/142475.diff

2 Files Affected:

  • (modified) llvm/lib/Target/DirectX/DXILLegalizePass.cpp (+50-8)
  • (modified) llvm/test/CodeGen/DirectX/legalize-i8.ll (+54)
diff --git a/llvm/lib/Target/DirectX/DXILLegalizePass.cpp b/llvm/lib/Target/DirectX/DXILLegalizePass.cpp
index 23883c936a20d..f6a80ebfc8435 100644
--- a/llvm/lib/Target/DirectX/DXILLegalizePass.cpp
+++ b/llvm/lib/Target/DirectX/DXILLegalizePass.cpp
@@ -95,6 +95,8 @@ static void fixI8UseChain(Instruction &I,
     Type *ElementType = NewOperands[0]->getType();
     if (auto *AI = dyn_cast<AllocaInst>(NewOperands[0]))
       ElementType = AI->getAllocatedType();
+    if (auto *GEP = dyn_cast<GetElementPtrInst>(NewOperands[0]))
+      ElementType = GEP->getSourceElementType();
     LoadInst *NewLoad = Builder.CreateLoad(ElementType, NewOperands[0]);
     ReplacedValues[Load] = NewLoad;
     ToRemove.push_back(Load);
@@ -164,6 +166,36 @@ static void fixI8UseChain(Instruction &I,
     if (AdjustedCast)
       Cast->replaceAllUsesWith(AdjustedCast);
   }
+  if (auto *GEP = dyn_cast<GetElementPtrInst>(&I)) {
+    if (!GEP->getType()->isPointerTy() ||
+        !GEP->getSourceElementType()->isIntegerTy(8))
+      return;
+
+    Value *BasePtr = GEP->getPointerOperand();
+    if (ReplacedValues.count(BasePtr))
+      BasePtr = ReplacedValues[BasePtr];
+
+    Type *ElementType = BasePtr->getType();
+    if (auto *AI = dyn_cast<AllocaInst>(BasePtr))
+      ElementType = AI->getAllocatedType();
+    if (auto *ArrTy = dyn_cast<ArrayType>(ElementType))
+      ElementType = ArrTy->getArrayElementType();
+
+    ConstantInt *Offset = dyn_cast<ConstantInt>(GEP->getOperand(1));
+    // Note: the only way to convert an i8 offset to an i32 offset without
+    // emitting code Would be to emit code. We sould expect this value to be a
+    // ConstantInt since Offsets are very regulalrly converted.
+    assert(Offset && "Offset is expected to be a ConstantInt");
+    uint32_t ByteOffset = Offset->getZExtValue();
+    uint32_t ElemSize = GEP->getDataLayout().getTypeAllocSize(ElementType);
+    assert(ElemSize > 0 && "ElementSize must be set");
+    uint32_t Index = ByteOffset / ElemSize;
+    Value *NewGEP =
+        Builder.CreateGEP(ElementType, BasePtr, Builder.getInt32(Index),
+                          GEP->getName(), GEP->getNoWrapFlags());
+    ReplacedValues[GEP] = NewGEP;
+    ToRemove.push_back(GEP);
+  }
 }
 
 static void upcastI8AllocasAndUses(Instruction &I,
@@ -175,15 +207,12 @@ static void upcastI8AllocasAndUses(Instruction &I,
 
   Type *SmallestType = nullptr;
 
-  for (User *U : AI->users()) {
-    auto *Load = dyn_cast<LoadInst>(U);
-    if (!Load)
-      continue;
+  auto ProcessLoad = [&](LoadInst *Load) {
     for (User *LU : Load->users()) {
       Type *Ty = nullptr;
-      if (auto *Cast = dyn_cast<CastInst>(LU))
+      if (auto *Cast = dyn_cast<CastInst>(LU)) {
         Ty = Cast->getType();
-      if (CallInst *CI = dyn_cast<CallInst>(LU)) {
+      } else if (auto *CI = dyn_cast<CallInst>(LU)) {
         if (CI->getIntrinsicID() == Intrinsic::memset)
           Ty = Type::getInt32Ty(CI->getContext());
       }
@@ -191,9 +220,22 @@ static void upcastI8AllocasAndUses(Instruction &I,
       if (!Ty)
         continue;
 
-      if (!SmallestType ||
-          Ty->getPrimitiveSizeInBits() < SmallestType->getPrimitiveSizeInBits())
+      if (!SmallestType || Ty->getPrimitiveSizeInBits() <
+                               SmallestType->getPrimitiveSizeInBits()) {
         SmallestType = Ty;
+      }
+    }
+  };
+
+  for (User *U : AI->users()) {
+    if (auto *Load = dyn_cast<LoadInst>(U))
+      ProcessLoad(Load);
+    else if (auto *GEP = dyn_cast<GetElementPtrInst>(U)) {
+      for (User *GU : GEP->users()) {
+        if (auto *Load = dyn_cast<LoadInst>(GU)) {
+          ProcessLoad(Load);
+        }
+      }
     }
   }
 
diff --git a/llvm/test/CodeGen/DirectX/legalize-i8.ll b/llvm/test/CodeGen/DirectX/legalize-i8.ll
index 2602be778cd86..d1d76ccf5c76c 100644
--- a/llvm/test/CodeGen/DirectX/legalize-i8.ll
+++ b/llvm/test/CodeGen/DirectX/legalize-i8.ll
@@ -106,3 +106,57 @@ define i32 @all_imm() {
   %2 = sext i8 %1 to i32
   ret i32 %2
 }
+
+define i32 @scalar_i8_geps() {
+  ; CHECK-LABEL: define i32 @scalar_i8_geps(
+  ; CHECK-NEXT:    [[ALLOCA:%.*]] = alloca i32, align 4
+  ; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds nuw i32, ptr [[ALLOCA]], i32 0
+  ; CHECK-NEXT:    [[LOAD:%.*]] = load i32, ptr [[GEP]], align 4
+  ; CHECK-NEXT:    ret i32 [[LOAD]]
+    %1 = alloca i8, align 4
+    %2 = getelementptr inbounds nuw i8, ptr %1, i32 0
+    %3 = load i8, ptr %2
+    %4 = sext i8 %3 to i32
+    ret i32 %4
+}
+
+define i32 @i8_geps_index0() {
+  ; CHECK-LABEL: define i32 @i8_geps_index0(
+  ; CHECK-NEXT:    [[ALLOCA:%.*]] = alloca [2 x i32], align 8
+  ; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds nuw i32, ptr [[ALLOCA]], i32 0
+  ; CHECK-NEXT:    [[LOAD:%.*]] = load i32, ptr [[GEP]], align 4
+  ; CHECK-NEXT:    ret i32 [[LOAD]]
+  %1 = alloca [2 x i32], align 8
+  %2 = getelementptr inbounds nuw i8, ptr %1, i32 0
+  %3 = load i8, ptr %2
+  %4 = sext i8 %3 to i32
+  ret i32 %4
+}
+
+define i32 @i8_geps_index1() {
+  ; CHECK-LABEL: define i32 @i8_geps_index1(
+  ; CHECK-NEXT:    [[ALLOCA:%.*]] = alloca [2 x i32], align 8
+  ; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds nuw i32, ptr [[ALLOCA]], i32 1
+  ; CHECK-NEXT:    [[LOAD:%.*]] = load i32, ptr [[GEP]]
+  ; CHECK-NEXT:    ret i32 [[LOAD]]
+  %1 = alloca [2 x i32], align 8
+  %2 = getelementptr inbounds nuw i8, ptr %1, i32 4
+  %3 = load i8, ptr %2
+  %4 = sext i8 %3 to i32
+  ret i32 %4
+}
+
+define i32 @i8_gep_store() {
+  ; CHECK-LABEL: define i32 @i8_gep_store(
+  ; CHECK-NEXT:    [[ALLOCA:%.*]] = alloca [2 x i32], align 8
+  ; CHECK-NEXT:    [[GEP:%.*]] = getelementptr inbounds nuw i32, ptr [[ALLOCA]], i32 1
+  ; CHECK-NEXT:    store i32 1, ptr [[GEP]], align 4
+  ; CHECK-NEXT:    [[LOAD:%.*]] = load i32, ptr [[GEP]]
+  ; CHECK-NEXT:    ret i32 [[LOAD]]
+  %1 = alloca [2 x i32], align 8
+  %2 = getelementptr inbounds nuw i8, ptr %1, i32 4
+  store i8 1, ptr %2
+  %3 = load i8, ptr %2
+  %4 = sext i8 %3 to i32
+  ret i32 %4
+}


ConstantInt *Offset = dyn_cast<ConstantInt>(GEP->getOperand(1));
// Note: the only way to convert an i8 offset to an i32 offset without
// emitting code Would be to emit code. We sould expect this value to be a
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should

@farzonl farzonl marked this pull request as draft June 2, 2025 21:00
@farzonl farzonl marked this pull request as ready for review June 2, 2025 21:49
Icohedron
Icohedron previously approved these changes Jun 2, 2025
@Icohedron
Copy link
Contributor

This PR appears to fail to compile this shader: https://godbolt.org/z/9MPc8Wca5

Stack trace:

While deleting: ptr addrspace(3) %
Use still stuck around after Def is destroyed:  %2 = load float, ptr addrspace(3) <badref>, align 4
Uses remain when a value is destroyed!
UNREACHABLE executed at /workspace/llvm-project/llvm/lib/IR/Value.cpp:102!
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0.      Program arguments: bin/clang-dxc -T cs_6_3 -E CSMain -enable-16bit-types Reproduce.hlsl
1.      <eof> parser at end of file
2.      Code generation
3.      Running pass 'Function Pass Manager' on module 'Reproduce.hlsl'.
4.      Running pass 'DXIL Legalizer' on function '@CSMain'
 #0 0x00005faf059d7246 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) /workspace/llvm-project/llvm/lib/Support/Unix/Signals.inc:804:13
 #1 0x00005faf059d4ff0 llvm::sys::RunSignalHandlers() /workspace/llvm-project/llvm/lib/Support/Signals.cpp:106:18
 #2 0x00005faf059430bd (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) /workspace/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:74:5
 #3 0x00005faf059430bd CrashRecoverySignalHandler(int) /workspace/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:391:51
 #4 0x000075164ba40a70 __restore_rt (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x40a70)
 #5 0x000075164ba988ec __pthread_kill_implementation (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x988ec)
 #6 0x000075164ba409c6 gsignal (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x409c6)
 #7 0x000075164ba28938 abort (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x28938)
 #8 0x00005faf05949c9f (bin/clang-dxc+0x4201c9f)
 #9 0x00005faf054ffd50 llvm::Value::~Value() /workspace/llvm-project/llvm/lib/IR/Value.cpp:0:5
#10 0x00005faf054ff976 llvm::Value::deleteValue() /workspace/llvm-project/llvm/lib/IR/Value.cpp:0:3
#11 0x00005faf054673d3 llvm::iplist_impl<llvm::simple_ilist<llvm::Instruction, llvm::ilist_iterator_bits<true>, llvm::ilist_parent<llvm::BasicBlock>>, llvm::SymbolTableListTraits<llvm::Instruction, llvm::ilist_iterator_bits<true>, llvm::ilist_parent<llvm::BasicBlock>>>::erase(llvm::ilist_iterator_w_bits<llvm::ilist_detail::node_options<llvm::Instruction, true, false, void, true, llvm::BasicBlock>, false, false>) /workspace/llvm-project/llvm/include/llvm/ADT/ilist.h:206:12
#12 0x00005faf054673d3 llvm::Instruction::eraseFromParent() /workspace/llvm-project/llvm/lib/IR/Instruction.cpp:96:37
#13 0x00005faf0472710a bool std::operator==<llvm::Instruction**>(std::reverse_iterator<llvm::Instruction**> const&, std::reverse_iterator<llvm::Instruction**> const&) /nix/store/gnf3mv68i5g6jmabnbbncsar3kbg13zd-gcc-14-20241116/include/c++/14-20241116/bits/stl_iterator.h:443:25
#14 0x00005faf0472710a bool std::operator!=<llvm::Instruction**>(std::reverse_iterator<llvm::Instruction**> const&, std::reverse_iterator<llvm::Instruction**> const&) /nix/store/gnf3mv68i5g6jmabnbbncsar3kbg13zd-gcc-14-20241116/include/c++/14-20241116/bits/stl_iterator.h:457:20
#15 0x00005faf0472710a (anonymous namespace)::DXILLegalizationPipeline::runLegalizationPipeline(llvm::Function&) /workspace/llvm-project/llvm/lib/Target/DirectX/DXILLegalizePass.cpp:477:21
#16 0x00005faf0472ad22 (anonymous namespace)::DXILLegalizeLegacy::runOnFunction(llvm::Function&) /workspace/llvm-project/llvm/lib/Target/DirectX/DXILLegalizePass.cpp:522:21
#17 0x00005faf05499da9 llvm::FPPassManager::runOnFunction(llvm::Function&) /workspace/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1406:27
#18 0x00005faf054a0c92 llvm::FPPassManager::runOnModule(llvm::Module&) /workspace/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:1452:13
#19 0x00005faf0549a4a0 (anonymous namespace)::MPPassManager::runOnModule(llvm::Module&) /workspace/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:0:27
#20 0x00005faf0549a4a0 llvm::legacy::PassManagerImpl::run(llvm::Module&) /workspace/llvm-project/llvm/lib/IR/LegacyPassManager.cpp:539:44
#21 0x00005faf061298ea (anonymous namespace)::EmitAssemblyHelper::RunCodegenPipeline(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>&, std::unique_ptr<llvm::ToolOutputFile, std::default_delete<llvm::ToolOutputFile>>&) /workspace/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1262:9
#22 0x00005faf061298ea (anonymous namespace)::EmitAssemblyHelper::emitAssembly(clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) /workspace/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1285:3
#23 0x00005faf061298ea clang::emitBackendOutput(clang::CompilerInstance&, clang::CodeGenOptions&, llvm::StringRef, llvm::Module*, clang::BackendAction, llvm::IntrusiveRefCntPtr<llvm::vfs::FileSystem>, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>, clang::BackendConsumer*) /workspace/llvm-project/clang/lib/CodeGen/BackendUtil.cpp:1450:13
#24 0x00005faf0613d5ea std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream>>::~unique_ptr() /nix/store/gnf3mv68i5g6jmabnbbncsar3kbg13zd-gcc-14-20241116/include/c++/14-20241116/bits/unique_ptr.h:398:6
#25 0x00005faf0613d5ea clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) /workspace/llvm-project/clang/lib/CodeGen/CodeGenAction.cpp:316:3
#26 0x00005faf07c789d9 __gnu_cxx::__normal_iterator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>*, std::vector<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>, std::allocator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>>>>::__normal_iterator(std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>* const&) /nix/store/gnf3mv68i5g6jmabnbbncsar3kbg13zd-gcc-14-20241116/include/c++/14-20241116/bits/stl_iterator.h:1068:20
#27 0x00005faf07c789d9 std::vector<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>, std::allocator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>>>::begin() /nix/store/gnf3mv68i5g6jmabnbbncsar3kbg13zd-gcc-14-20241116/include/c++/14-20241116/bits/stl_vector.h:874:16
#28 0x00005faf07c789d9 void clang::finalize<std::vector<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>, std::allocator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>>>>(std::vector<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>, std::allocator<std::unique_ptr<clang::TemplateInstantiationCallback, std::default_delete<clang::TemplateInstantiationCallback>>>>&, clang::Sema const&) /workspace/llvm-project/clang/include/clang/Sema/TemplateInstCallback.h:54:16
#29 0x00005faf07c789d9 clang::ParseAST(clang::Sema&, bool, bool) /workspace/llvm-project/clang/lib/Parse/ParseAST.cpp:190:3
#30 0x00005faf067209f4 clang::FrontendAction::Execute() /workspace/llvm-project/clang/lib/Frontend/FrontendAction.cpp:1225:10
#31 0x00005faf066904ad llvm::Error::getPtr() const /workspace/llvm-project/llvm/include/llvm/Support/Error.h:278:42
#32 0x00005faf066904ad llvm::Error::operator bool() /workspace/llvm-project/llvm/include/llvm/Support/Error.h:241:16
#33 0x00005faf066904ad clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) /workspace/llvm-project/clang/lib/Frontend/CompilerInstance.cpp:1055:23
#34 0x00005faf0680e5c8 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) /workspace/llvm-project/clang/lib/FrontendTool/ExecuteCompilerInvocation.cpp:300:25
#35 0x00005faf045cd043 cc1_main(llvm::ArrayRef<char const*>, char const*, void*) /workspace/llvm-project/clang/tools/driver/cc1_main.cpp:297:15
#36 0x00005faf045c93b5 ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&, llvm::ToolContext const&) /workspace/llvm-project/clang/tools/driver/driver.cpp:223:12
#37 0x00005faf06500bd9 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_0::operator()() const /workspace/llvm-project/clang/lib/Driver/Job.cpp:436:30
#38 0x00005faf06500bd9 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const::$_0>(long) /workspace/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:46:12
#39 0x00005faf05942cd2 llvm::function_ref<void ()>::operator()() const /workspace/llvm-project/llvm/include/llvm/ADT/STLFunctionalExtras.h:0:12
#40 0x00005faf05942cd2 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) /workspace/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp:427:3
#41 0x00005faf06500284 clang::driver::CC1Command::Execute(llvm::ArrayRef<std::optional<llvm::StringRef>>, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char>>*, bool*) const /workspace/llvm-project/clang/lib/Driver/Job.cpp:436:7
#42 0x00005faf064c1d78 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&, bool) const /workspace/llvm-project/clang/lib/Driver/Compilation.cpp:196:15
#43 0x00005faf064c1fc7 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&, bool) const /workspace/llvm-project/clang/lib/Driver/Compilation.cpp:251:13
#44 0x00005faf064de649 llvm::SmallVectorBase<unsigned int>::empty() const /workspace/llvm-project/llvm/include/llvm/ADT/SmallVector.h:82:46
#45 0x00005faf064de649 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*>>&) /workspace/llvm-project/clang/lib/Driver/Driver.cpp:2229:23
#46 0x00005faf045c8d14 clang_main(int, char**, llvm::ToolContext const&) /workspace/llvm-project/clang/tools/driver/driver.cpp:406:21
#47 0x00005faf045d9006 main /workspace/llvm-project/build/tools/clang/tools/driver/clang-driver.cpp:17:10
#48 0x000075164ba2a1fc __libc_start_call_main (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a1fc)
#49 0x000075164ba2a2b9 __libc_start_main@GLIBC_2.2.5 (/nix/store/h7zcxabfxa7v5xdna45y2hplj31ncf8a-glibc-2.40-36/lib/libc.so.6+0x2a2b9)
#50 0x00005faf045c6d25 _start (bin/clang-dxc+0x2e7ed25)
clang-dxc: error: clang frontend command failed with exit code 134 (use -v to see invocation)
clang version 21.0.0git (git@github.com:Icohedron/llvm-project.git 19668a1619396f3ea7447df4644b362af870c982)
Target: dxilv1.3-unknown-shadermodel6.3-compute
Thread model: posix
InstalledDir: /workspace/llvm-project/build/bin
Build config: +assertions
clang-dxc: note: diagnostic msg:

@Icohedron Icohedron dismissed their stale review June 3, 2025 17:42

Found a shader where Clang emits an i8 GEP that fails to legalize with this PR

Icohedron
Icohedron previously approved these changes Jun 3, 2025
@Icohedron
Copy link
Contributor

This PR appears to fail to compile this shader: https://godbolt.org/z/9MPc8Wca5

This shader now compiles, but it currently fails DXIL validation with the message

Explicit gep operator type does not match pointee type of pointer operand
Validation failed.

I think the i8 legalization is producing an incorrect GEP.

I see the pass converting these instructions:

  %2 = getelementptr inbounds nuw i8, ptr addrspace(3) @g, i32 4
  %3 = load float, ptr addrspace(3) %2, align 4

into

  %2 = load float, ptr addrspace(3) getelementptr inbounds nuw (ptr addrspace(3), ptr addrspace(3) @g, i32 1), align 4

but I would have expected this instead:

  %2 = load float, ptr addrspace(3) getelementptr inbounds nuw ([2 x float], ptr addrspace(3) @g, i32 0, i32 1), align 4

because the global is defined as @g = local_unnamed_addr addrspace(3) global [2 x float] zeroinitializer, align 4 and an earlier store operation in the IR looks fine

  %arrayidx.i = getelementptr inbounds nuw [2 x float], ptr addrspace(3) @g, i32 0, i32 %1
  store float 0.000000e+00, ptr addrspace(3) %arrayidx.i, align 4, !tbaa !5

@Icohedron Icohedron dismissed their stale review June 3, 2025 18:54

Invalid GEP conversion

@farzonl
Copy link
Member Author

farzonl commented Jun 3, 2025

the bug I was looking at used allocas, this one uses a globals its an easy fix to add:

if (auto *GV = dyn_cast<GlobalVariable>(BasePtr))
      ElementType = GV->getValueType();

However during my testing I found another gap. we still need to figure out how to support constExpr GEPs.

This PR appears to fail to compile this shader: https://godbolt.org/z/9MPc8Wca5

This shader now compiles, but it currently fails DXIL validation

@Icohedron
Copy link
Contributor

Icohedron commented Jun 3, 2025

the bug I was looking at used allocsa, this one uses a globals its an easy fix to add:

if (auto *GV = dyn_cast<GlobalVariable>(BasePtr))
      ElementType = GV->getValueType();

However during my testing I found another gap. we still need to figure out how to support constExpr GEPs.

This PR appears to fail to compile this shader: https://godbolt.org/z/9MPc8Wca5

This shader now compiles, but it currently fails DXIL validation

Incorporating that change still results in a shader that doesn't pass validation due to the same message: Explicit gep operator type does not match pointee type of pointer operand

The DXIL Legalize pass now emits:

@g = local_unnamed_addr addrspace(3) global [2 x float] zeroinitializer, align 4
...
define void @CSMain() local_unnamed_addr #0 {
entry:
  ...
  %2 = load float, ptr addrspace(3) getelementptr inbounds nuw (float, ptr addrspace(3) @g, i32 1), align 4

I would expect the %2 instruction to be the following to be a valid GEP:

  %2 = load float, ptr addrspace(3) getelementptr inbounds nuw ([2 x float], ptr addrspace(3) @g, i32 0, i32 1), align 4

Since vectors have been scalarized and arrays have been flattened, this should be straightforward to fix.

@llvmbot llvmbot added clang Clang issues not falling into any other category clang:codegen IR generation bugs: mangling, exceptions, etc. HLSL HLSL Language Support labels Jun 3, 2025
@farzonl farzonl force-pushed the bugfix/issue-140415 branch from 221c685 to 74b05e0 Compare June 3, 2025 21:35
@farzonl farzonl merged commit 5411ebd into llvm:main Jun 4, 2025
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend:DirectX clang:codegen IR generation bugs: mangling, exceptions, etc. clang Clang issues not falling into any other category HLSL HLSL Language Support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[DirectX] Validation Failure: Explicit gep type does not match pointee type of pointer operand
4 participants