Skip to content

[DirectX] Bug fix for Data Scalarization crash #118426

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 4 commits into from
Dec 18, 2024

Conversation

farzonl
Copy link
Member

@farzonl farzonl commented Dec 3, 2024

Two bugs here. First calling Inst->getFunction() has undefined behavior if the instruction is not tracked to a function. I suspect the replaceAllUsesWith was leaving the GEPs in a weird ghost parent situation. I switched up the visitor to be able to eraseFromParent as part of visiting and then everything started working.

The second bug was in DXILFlattenArrays.cpp. I was unaware that you can have multidimensional arrays of zeroinitializer, and undef so fixed up the initializer to handle these two cases.

fixes #117273

@farzonl farzonl requested a review from llvm-beanz December 3, 2024 04:44
@farzonl farzonl self-assigned this Dec 3, 2024
@llvmbot
Copy link
Member

llvmbot commented Dec 3, 2024

@llvm/pr-subscribers-backend-directx

Author: Farzon Lotfi (farzonl)

Changes

Two bugs here. First calling Inst->getFunction() has undefined behavior if the instruction is not tracked to a function. I suspect the replaceAllUsesWith was leaving the GEPs in a weird ghost parent situation. I switched up the visitor to be able to eraseFromParent as part of visiting and then everything started working.

The second bug was in DXILFlattenArrays.cpp. I was unaware that you can have multidimensional arrays of zeroinitializer, and undef so fixed up the initializer to handle these two cases.

fixes #117273


Full diff: https://github.com/llvm/llvm-project/pull/118426.diff

4 Files Affected:

  • (modified) llvm/lib/Target/DirectX/DXILDataScalarization.cpp (+8-20)
  • (modified) llvm/lib/Target/DirectX/DXILFlattenArrays.cpp (+12)
  • (added) llvm/test/CodeGen/DirectX/flatten-bug-117273.ll (+25)
  • (added) llvm/test/CodeGen/DirectX/scalar-bug-117273.ll (+32)
diff --git a/llvm/lib/Target/DirectX/DXILDataScalarization.cpp b/llvm/lib/Target/DirectX/DXILDataScalarization.cpp
index 0e6cf59e257508..42e40d146ff5c1 100644
--- a/llvm/lib/Target/DirectX/DXILDataScalarization.cpp
+++ b/llvm/lib/Target/DirectX/DXILDataScalarization.cpp
@@ -67,28 +67,17 @@ class DataScalarizerVisitor : public InstVisitor<DataScalarizerVisitor, bool> {
 private:
   GlobalVariable *lookupReplacementGlobal(Value *CurrOperand);
   DenseMap<GlobalVariable *, GlobalVariable *> GlobalMap;
-  SmallVector<WeakTrackingVH, 32> PotentiallyDeadInstrs;
-  bool finish();
 };
 
 bool DataScalarizerVisitor::visit(Function &F) {
   assert(!GlobalMap.empty());
-  ReversePostOrderTraversal<BasicBlock *> RPOT(&F.getEntryBlock());
-  for (BasicBlock *BB : RPOT) {
-    for (BasicBlock::iterator II = BB->begin(), IE = BB->end(); II != IE;) {
-      Instruction *I = &*II;
-      bool Done = InstVisitor::visit(I);
-      ++II;
-      if (Done && I->getType()->isVoidTy())
-        I->eraseFromParent();
-    }
+  bool MadeChange = false;
+  ReversePostOrderTraversal<Function *> RPOT(&F);
+  for (BasicBlock *BB : make_early_inc_range(RPOT)) {
+    for (Instruction &I : make_early_inc_range(*BB))
+      MadeChange |= InstVisitor::visit(I);
   }
-  return finish();
-}
-
-bool DataScalarizerVisitor::finish() {
-  RecursivelyDeleteTriviallyDeadInstructionsPermissive(PotentiallyDeadInstrs);
-  return true;
+  return MadeChange;
 }
 
 GlobalVariable *
@@ -140,7 +129,7 @@ bool DataScalarizerVisitor::visitGetElementPtrInst(GetElementPtrInst &GEPI) {
         Builder.CreateGEP(NewGlobal->getValueType(), NewGlobal, Indices);
 
     GEPI.replaceAllUsesWith(NewGEP);
-    PotentiallyDeadInstrs.emplace_back(&GEPI);
+    GEPI.eraseFromParent();
   }
   return true;
 }
@@ -252,8 +241,7 @@ static bool findAndReplaceVectors(Module &M) {
                                                 /*RemoveDeadConstants=*/false,
                                                 /*IncludeSelf=*/true);
         }
-        if (isa<Instruction>(U)) {
-          Instruction *Inst = cast<Instruction>(U);
+        if (Instruction *Inst = dyn_cast<Instruction>(U)) {
           Function *F = Inst->getFunction();
           if (F)
             Impl.visit(*F);
diff --git a/llvm/lib/Target/DirectX/DXILFlattenArrays.cpp b/llvm/lib/Target/DirectX/DXILFlattenArrays.cpp
index e4a3bc76eeacd2..fbf0e4bc35d6c6 100644
--- a/llvm/lib/Target/DirectX/DXILFlattenArrays.cpp
+++ b/llvm/lib/Target/DirectX/DXILFlattenArrays.cpp
@@ -321,6 +321,18 @@ static void collectElements(Constant *Init,
     Elements.push_back(Init);
     return;
   }
+  auto *LLVMArrayType = dyn_cast<ArrayType>(Init->getType());
+  if (isa<ConstantAggregateZero>(Init)) {
+    for (unsigned I = 0; I < LLVMArrayType->getNumElements(); ++I)
+      Elements.push_back(
+          Constant::getNullValue(LLVMArrayType->getElementType()));
+    return;
+  }
+  if (isa<UndefValue>(Init)) {
+    for (unsigned I = 0; I < LLVMArrayType->getNumElements(); ++I)
+      Elements.push_back(UndefValue::get(LLVMArrayType->getElementType()));
+    return;
+  }
 
   // Recursive case: Process each element in the array.
   if (auto *ArrayConstant = dyn_cast<ConstantArray>(Init)) {
diff --git a/llvm/test/CodeGen/DirectX/flatten-bug-117273.ll b/llvm/test/CodeGen/DirectX/flatten-bug-117273.ll
new file mode 100644
index 00000000000000..1e0f4fb3d382a1
--- /dev/null
+++ b/llvm/test/CodeGen/DirectX/flatten-bug-117273.ll
@@ -0,0 +1,25 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes='dxil-flatten-arrays,dxil-op-lower' -mtriple=dxil-pc-shadermodel6.3-library %s | FileCheck %s
+
+
+@_ZL7Palette = internal constant [2 x [3 x float]] [[3 x float] zeroinitializer, [3 x float] undef], align 16
+
+; CHECK: @_ZL7Palette.1dim = internal constant [6 x float] [float 0.000000e+00, float 0.000000e+00, float 0.000000e+00, float undef, float undef, float undef], align 16
+
+define internal void @_Z4mainDv3_j(<3 x i32> noundef %DID) {
+; CHECK-LABEL: define internal void @_Z4mainDv3_j(
+; CHECK-SAME: <3 x i32> noundef [[DID:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr [24 x float], ptr @_ZL7Palette.1dim, i32 1
+; CHECK-NEXT:    [[DOTI0:%.*]] = load float, ptr [[TMP0]], align 16
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr [24 x float], ptr @_ZL7Palette.1dim, i32 2
+; CHECK-NEXT:    [[DOTI03:%.*]] = load float, ptr [[TMP1]], align 16
+; CHECK-NEXT:    ret void
+;
+entry:
+  %0 = getelementptr [8 x [3 x float]], ptr @_ZL7Palette, i32 0, i32 1
+  %.i0 = load float, ptr %0, align 16
+  %1 = getelementptr [8 x [3 x float]], ptr @_ZL7Palette, i32 0, i32 2
+  %.i03 = load float, ptr %1, align 16
+  ret void
+}
diff --git a/llvm/test/CodeGen/DirectX/scalar-bug-117273.ll b/llvm/test/CodeGen/DirectX/scalar-bug-117273.ll
new file mode 100644
index 00000000000000..aa09fa401a6492
--- /dev/null
+++ b/llvm/test/CodeGen/DirectX/scalar-bug-117273.ll
@@ -0,0 +1,32 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes='dxil-data-scalarization,dxil-flatten-arrays,function(scalarizer<load-store>),dxil-op-lower' -mtriple=dxil-pc-shadermodel6.3-library %s | FileCheck %s
+
+
+@_ZL7Palette = internal constant [8 x <3 x float>] [<3 x float> zeroinitializer, <3 x float> splat (float 5.000000e-01), <3 x float> <float 1.000000e+00, float 5.000000e-01, float 5.000000e-01>, <3 x float> <float 5.000000e-01, float 1.000000e+00, float 5.000000e-01>, <3 x float> <float 5.000000e-01, float 5.000000e-01, float 1.000000e+00>, <3 x float> <float 5.000000e-01, float 1.000000e+00, float 1.000000e+00>, <3 x float> <float 1.000000e+00, float 5.000000e-01, float 1.000000e+00>, <3 x float> <float 1.000000e+00, float 1.000000e+00, float 5.000000e-01>], align 16
+
+; Function Attrs: alwaysinline convergent mustprogress norecurse nounwind
+define internal void @_Z4mainDv3_j(<3 x i32> noundef %DID) #1 {
+; CHECK-LABEL: define internal void @_Z4mainDv3_j(
+; CHECK-SAME: <3 x i32> noundef [[DID:%.*]]) {
+; CHECK-NEXT:  [[ENTRY:.*:]]
+; CHECK-NEXT:    [[TMP0:%.*]] = getelementptr [24 x float], ptr @_ZL7Palette.scalarized.1dim, i32 1
+; CHECK-NEXT:    [[DOTI0:%.*]] = load float, ptr [[TMP0]], align 16
+; CHECK-NEXT:    [[DOTI1:%.*]] = getelementptr float, ptr [[TMP0]], i32 1
+; CHECK-NEXT:    [[DOTI11:%.*]] = load float, ptr [[DOTI1]], align 4
+; CHECK-NEXT:    [[DOTI2:%.*]] = getelementptr float, ptr [[TMP0]], i32 2
+; CHECK-NEXT:    [[DOTI22:%.*]] = load float, ptr [[DOTI2]], align 8
+; CHECK-NEXT:    [[TMP1:%.*]] = getelementptr [24 x float], ptr @_ZL7Palette.scalarized.1dim, i32 2
+; CHECK-NEXT:    [[DOTI03:%.*]] = load float, ptr [[TMP1]], align 16
+; CHECK-NEXT:    [[DOTI14:%.*]] = getelementptr float, ptr [[TMP1]], i32 1
+; CHECK-NEXT:    [[DOTI15:%.*]] = load float, ptr [[DOTI14]], align 4
+; CHECK-NEXT:    [[DOTI26:%.*]] = getelementptr float, ptr [[TMP1]], i32 2
+; CHECK-NEXT:    [[DOTI27:%.*]] = load float, ptr [[DOTI26]], align 8
+; CHECK-NEXT:    ret void
+;
+entry:
+  %arrayidx = getelementptr inbounds [8 x <3 x float>], ptr @_ZL7Palette, i32 0, i32 1
+  %2 = load <3 x float>, ptr %arrayidx, align 16
+  %arrayidx2 = getelementptr inbounds [8 x <3 x float>], ptr @_ZL7Palette, i32 0, i32 2
+  %3 = load <3 x float>, ptr %arrayidx2, align 16
+  ret void
+}

Copy link
Contributor

@bogner bogner left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few style comments. LGTM.

@farzonl farzonl force-pushed the bugfix/issue-117273 branch from 7a4f555 to 58abdec Compare December 16, 2024 19:28
Copy link

github-actions bot commented Dec 16, 2024

✅ With the latest revision this PR passed the undef deprecator.

2. remove use of convertUsersOfConstantsToInstructions (was leaving use
   chains in unstable state).
3. remove RPOT for DXILDataScalarization.cpp. we were visiting twice.
@farzonl farzonl force-pushed the bugfix/issue-117273 branch from 58abdec to 340e46a Compare December 16, 2024 19:32
@@ -33,18 +32,13 @@ define <4 x i32> @load_array_vec_test() #0 {
; CHECK-NEXT: [[TMP6:%.*]] = load i32, ptr addrspace(3) [[TMP5]], align 4
; CHECK-NEXT: [[TMP7:%.*]] = bitcast ptr addrspace(3) getelementptr (i32, ptr addrspace(3) @arrayofVecData.scalarized.1dim, i32 3) to ptr addrspace(3)
; CHECK-NEXT: [[TMP8:%.*]] = load i32, ptr addrspace(3) [[TMP7]], align 4
; CHECK-NEXT: [[TMP9:%.*]] = bitcast ptr addrspace(3) @arrayofVecData.scalarized.1dim to ptr addrspace(3)
; CHECK-NEXT: [[TMP10:%.*]] = getelementptr [2 x [3 x float]], ptr addrspace(3) [[TMP9]], i32 0, i32 1
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of the bugs from convertUsersOfConstantsToInstructions. It would split the bitcast from the GEP, but then we would never visit the GEP and so the GEP type never got flattened.

@@ -87,7 +81,7 @@ define <4 x i32> @load_vec_test() #0 {
define <4 x i32> @load_static_array_of_vec_test(i32 %index) #0 {
; CHECK-LABEL: define <4 x i32> @load_static_array_of_vec_test(
; CHECK-SAME: i32 [[INDEX:%.*]]) #[[ATTR0]] {
; CHECK-NEXT: [[DOTFLAT:%.*]] = getelementptr [12 x i32], ptr @staticArrayOfVecData.scalarized.1dim, i32 [[INDEX]]
; CHECK-NEXT: [[DOTFLAT:%.*]] = getelementptr inbounds [12 x i32], ptr @staticArrayOfVecData.scalarized.1dim, i32 [[INDEX]]
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

our GEP builders were not preserving inbounds. Now we are.

; RUN: opt -S -passes='dxil-flatten-arrays,dxil-op-lower' -mtriple=dxil-pc-shadermodel6.3-library %s | FileCheck %s


@ZerroInitAndUndefArr = internal constant [2 x [3 x float]] [[3 x float] zeroinitializer, [3 x float] undef], align 16
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The test case in #117273 doesn't have anything that should be producing an undef array, and I'm not sure it should be possible to create an undef array in HLSL. Do you know where this is coming from?

Copy link
Member Author

@farzonl farzonl Dec 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how to get this exact case again. My experiments today just cause one undef not a nested one.

It wasn't your exact example I just was playing around with how to change the values in the array after I saw float3(0,0,0) became zeroinitializer . I then noticed that if I created an uninitialized array and passed it to the Pallette array I could get an undef array.
example with fcgl: https://hlsl.godbolt.org/z/oPxxoKf9f
example without: https://hlsl.godbolt.org/z/rr1TocKnj

@bogner bogner self-requested a review December 18, 2024 17:27
Copy link

github-actions bot commented Dec 18, 2024

✅ With the latest revision this PR passed the C/C++ code formatter.

@farzonl farzonl force-pushed the bugfix/issue-117273 branch from aefade2 to f448ac5 Compare December 18, 2024 19:18
@farzonl farzonl merged commit 6457aee into llvm:main Dec 18, 2024
6 of 8 checks passed
@farzonl farzonl deleted the bugfix/issue-117273 branch January 16, 2025 07:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

[DirectX] DXIL Data Scalarization crash
4 participants