-
Notifications
You must be signed in to change notification settings - Fork 13.6k
[DirectX] Bug fix for Data Scalarization crash #118426
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-backend-directx Author: Farzon Lotfi (farzonl) ChangesTwo bugs here. First calling The second bug was in fixes #117273 Full diff: https://github.com/llvm/llvm-project/pull/118426.diff 4 Files Affected:
diff --git a/llvm/lib/Target/DirectX/DXILDataScalarization.cpp b/llvm/lib/Target/DirectX/DXILDataScalarization.cpp
index 0e6cf59e257508..42e40d146ff5c1 100644
--- a/llvm/lib/Target/DirectX/DXILDataScalarization.cpp
+++ b/llvm/lib/Target/DirectX/DXILDataScalarization.cpp
@@ -67,28 +67,17 @@ class DataScalarizerVisitor : public InstVisitor<DataScalarizerVisitor, bool> {
private:
GlobalVariable *lookupReplacementGlobal(Value *CurrOperand);
DenseMap<GlobalVariable *, GlobalVariable *> GlobalMap;
- SmallVector<WeakTrackingVH, 32> PotentiallyDeadInstrs;
- bool finish();
};
bool DataScalarizerVisitor::visit(Function &F) {
assert(!GlobalMap.empty());
- ReversePostOrderTraversal<BasicBlock *> RPOT(&F.getEntryBlock());
- for (BasicBlock *BB : RPOT) {
- for (BasicBlock::iterator II = BB->begin(), IE = BB->end(); II != IE;) {
- Instruction *I = &*II;
- bool Done = InstVisitor::visit(I);
- ++II;
- if (Done && I->getType()->isVoidTy())
- I->eraseFromParent();
- }
+ bool MadeChange = false;
+ ReversePostOrderTraversal<Function *> RPOT(&F);
+ for (BasicBlock *BB : make_early_inc_range(RPOT)) {
+ for (Instruction &I : make_early_inc_range(*BB))
+ MadeChange |= InstVisitor::visit(I);
}
- return finish();
-}
-
-bool DataScalarizerVisitor::finish() {
- RecursivelyDeleteTriviallyDeadInstructionsPermissive(PotentiallyDeadInstrs);
- return true;
+ return MadeChange;
}
GlobalVariable *
@@ -140,7 +129,7 @@ bool DataScalarizerVisitor::visitGetElementPtrInst(GetElementPtrInst &GEPI) {
Builder.CreateGEP(NewGlobal->getValueType(), NewGlobal, Indices);
GEPI.replaceAllUsesWith(NewGEP);
- PotentiallyDeadInstrs.emplace_back(&GEPI);
+ GEPI.eraseFromParent();
}
return true;
}
@@ -252,8 +241,7 @@ static bool findAndReplaceVectors(Module &M) {
/*RemoveDeadConstants=*/false,
/*IncludeSelf=*/true);
}
- if (isa<Instruction>(U)) {
- Instruction *Inst = cast<Instruction>(U);
+ if (Instruction *Inst = dyn_cast<Instruction>(U)) {
Function *F = Inst->getFunction();
if (F)
Impl.visit(*F);
diff --git a/llvm/lib/Target/DirectX/DXILFlattenArrays.cpp b/llvm/lib/Target/DirectX/DXILFlattenArrays.cpp
index e4a3bc76eeacd2..fbf0e4bc35d6c6 100644
--- a/llvm/lib/Target/DirectX/DXILFlattenArrays.cpp
+++ b/llvm/lib/Target/DirectX/DXILFlattenArrays.cpp
@@ -321,6 +321,18 @@ static void collectElements(Constant *Init,
Elements.push_back(Init);
return;
}
+ auto *LLVMArrayType = dyn_cast<ArrayType>(Init->getType());
+ if (isa<ConstantAggregateZero>(Init)) {
+ for (unsigned I = 0; I < LLVMArrayType->getNumElements(); ++I)
+ Elements.push_back(
+ Constant::getNullValue(LLVMArrayType->getElementType()));
+ return;
+ }
+ if (isa<UndefValue>(Init)) {
+ for (unsigned I = 0; I < LLVMArrayType->getNumElements(); ++I)
+ Elements.push_back(UndefValue::get(LLVMArrayType->getElementType()));
+ return;
+ }
// Recursive case: Process each element in the array.
if (auto *ArrayConstant = dyn_cast<ConstantArray>(Init)) {
diff --git a/llvm/test/CodeGen/DirectX/flatten-bug-117273.ll b/llvm/test/CodeGen/DirectX/flatten-bug-117273.ll
new file mode 100644
index 00000000000000..1e0f4fb3d382a1
--- /dev/null
+++ b/llvm/test/CodeGen/DirectX/flatten-bug-117273.ll
@@ -0,0 +1,25 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes='dxil-flatten-arrays,dxil-op-lower' -mtriple=dxil-pc-shadermodel6.3-library %s | FileCheck %s
+
+
+@_ZL7Palette = internal constant [2 x [3 x float]] [[3 x float] zeroinitializer, [3 x float] undef], align 16
+
+; CHECK: @_ZL7Palette.1dim = internal constant [6 x float] [float 0.000000e+00, float 0.000000e+00, float 0.000000e+00, float undef, float undef, float undef], align 16
+
+define internal void @_Z4mainDv3_j(<3 x i32> noundef %DID) {
+; CHECK-LABEL: define internal void @_Z4mainDv3_j(
+; CHECK-SAME: <3 x i32> noundef [[DID:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[TMP0:%.*]] = getelementptr [24 x float], ptr @_ZL7Palette.1dim, i32 1
+; CHECK-NEXT: [[DOTI0:%.*]] = load float, ptr [[TMP0]], align 16
+; CHECK-NEXT: [[TMP1:%.*]] = getelementptr [24 x float], ptr @_ZL7Palette.1dim, i32 2
+; CHECK-NEXT: [[DOTI03:%.*]] = load float, ptr [[TMP1]], align 16
+; CHECK-NEXT: ret void
+;
+entry:
+ %0 = getelementptr [8 x [3 x float]], ptr @_ZL7Palette, i32 0, i32 1
+ %.i0 = load float, ptr %0, align 16
+ %1 = getelementptr [8 x [3 x float]], ptr @_ZL7Palette, i32 0, i32 2
+ %.i03 = load float, ptr %1, align 16
+ ret void
+}
diff --git a/llvm/test/CodeGen/DirectX/scalar-bug-117273.ll b/llvm/test/CodeGen/DirectX/scalar-bug-117273.ll
new file mode 100644
index 00000000000000..aa09fa401a6492
--- /dev/null
+++ b/llvm/test/CodeGen/DirectX/scalar-bug-117273.ll
@@ -0,0 +1,32 @@
+; NOTE: Assertions have been autogenerated by utils/update_test_checks.py UTC_ARGS: --version 5
+; RUN: opt -S -passes='dxil-data-scalarization,dxil-flatten-arrays,function(scalarizer<load-store>),dxil-op-lower' -mtriple=dxil-pc-shadermodel6.3-library %s | FileCheck %s
+
+
+@_ZL7Palette = internal constant [8 x <3 x float>] [<3 x float> zeroinitializer, <3 x float> splat (float 5.000000e-01), <3 x float> <float 1.000000e+00, float 5.000000e-01, float 5.000000e-01>, <3 x float> <float 5.000000e-01, float 1.000000e+00, float 5.000000e-01>, <3 x float> <float 5.000000e-01, float 5.000000e-01, float 1.000000e+00>, <3 x float> <float 5.000000e-01, float 1.000000e+00, float 1.000000e+00>, <3 x float> <float 1.000000e+00, float 5.000000e-01, float 1.000000e+00>, <3 x float> <float 1.000000e+00, float 1.000000e+00, float 5.000000e-01>], align 16
+
+; Function Attrs: alwaysinline convergent mustprogress norecurse nounwind
+define internal void @_Z4mainDv3_j(<3 x i32> noundef %DID) #1 {
+; CHECK-LABEL: define internal void @_Z4mainDv3_j(
+; CHECK-SAME: <3 x i32> noundef [[DID:%.*]]) {
+; CHECK-NEXT: [[ENTRY:.*:]]
+; CHECK-NEXT: [[TMP0:%.*]] = getelementptr [24 x float], ptr @_ZL7Palette.scalarized.1dim, i32 1
+; CHECK-NEXT: [[DOTI0:%.*]] = load float, ptr [[TMP0]], align 16
+; CHECK-NEXT: [[DOTI1:%.*]] = getelementptr float, ptr [[TMP0]], i32 1
+; CHECK-NEXT: [[DOTI11:%.*]] = load float, ptr [[DOTI1]], align 4
+; CHECK-NEXT: [[DOTI2:%.*]] = getelementptr float, ptr [[TMP0]], i32 2
+; CHECK-NEXT: [[DOTI22:%.*]] = load float, ptr [[DOTI2]], align 8
+; CHECK-NEXT: [[TMP1:%.*]] = getelementptr [24 x float], ptr @_ZL7Palette.scalarized.1dim, i32 2
+; CHECK-NEXT: [[DOTI03:%.*]] = load float, ptr [[TMP1]], align 16
+; CHECK-NEXT: [[DOTI14:%.*]] = getelementptr float, ptr [[TMP1]], i32 1
+; CHECK-NEXT: [[DOTI15:%.*]] = load float, ptr [[DOTI14]], align 4
+; CHECK-NEXT: [[DOTI26:%.*]] = getelementptr float, ptr [[TMP1]], i32 2
+; CHECK-NEXT: [[DOTI27:%.*]] = load float, ptr [[DOTI26]], align 8
+; CHECK-NEXT: ret void
+;
+entry:
+ %arrayidx = getelementptr inbounds [8 x <3 x float>], ptr @_ZL7Palette, i32 0, i32 1
+ %2 = load <3 x float>, ptr %arrayidx, align 16
+ %arrayidx2 = getelementptr inbounds [8 x <3 x float>], ptr @_ZL7Palette, i32 0, i32 2
+ %3 = load <3 x float>, ptr %arrayidx2, align 16
+ ret void
+}
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few style comments. LGTM.
7a4f555
to
58abdec
Compare
✅ With the latest revision this PR passed the undef deprecator. |
2. remove use of convertUsersOfConstantsToInstructions (was leaving use chains in unstable state). 3. remove RPOT for DXILDataScalarization.cpp. we were visiting twice.
58abdec
to
340e46a
Compare
@@ -33,18 +32,13 @@ define <4 x i32> @load_array_vec_test() #0 { | |||
; CHECK-NEXT: [[TMP6:%.*]] = load i32, ptr addrspace(3) [[TMP5]], align 4 | |||
; CHECK-NEXT: [[TMP7:%.*]] = bitcast ptr addrspace(3) getelementptr (i32, ptr addrspace(3) @arrayofVecData.scalarized.1dim, i32 3) to ptr addrspace(3) | |||
; CHECK-NEXT: [[TMP8:%.*]] = load i32, ptr addrspace(3) [[TMP7]], align 4 | |||
; CHECK-NEXT: [[TMP9:%.*]] = bitcast ptr addrspace(3) @arrayofVecData.scalarized.1dim to ptr addrspace(3) | |||
; CHECK-NEXT: [[TMP10:%.*]] = getelementptr [2 x [3 x float]], ptr addrspace(3) [[TMP9]], i32 0, i32 1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is one of the bugs from convertUsersOfConstantsToInstructions
. It would split the bitcast from the GEP, but then we would never visit the GEP and so the GEP type never got flattened.
@@ -87,7 +81,7 @@ define <4 x i32> @load_vec_test() #0 { | |||
define <4 x i32> @load_static_array_of_vec_test(i32 %index) #0 { | |||
; CHECK-LABEL: define <4 x i32> @load_static_array_of_vec_test( | |||
; CHECK-SAME: i32 [[INDEX:%.*]]) #[[ATTR0]] { | |||
; CHECK-NEXT: [[DOTFLAT:%.*]] = getelementptr [12 x i32], ptr @staticArrayOfVecData.scalarized.1dim, i32 [[INDEX]] | |||
; CHECK-NEXT: [[DOTFLAT:%.*]] = getelementptr inbounds [12 x i32], ptr @staticArrayOfVecData.scalarized.1dim, i32 [[INDEX]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
our GEP builders were not preserving inbounds. Now we are.
; RUN: opt -S -passes='dxil-flatten-arrays,dxil-op-lower' -mtriple=dxil-pc-shadermodel6.3-library %s | FileCheck %s | ||
|
||
|
||
@ZerroInitAndUndefArr = internal constant [2 x [3 x float]] [[3 x float] zeroinitializer, [3 x float] undef], align 16 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test case in #117273 doesn't have anything that should be producing an undef array, and I'm not sure it should be possible to create an undef array in HLSL. Do you know where this is coming from?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure how to get this exact case again. My experiments today just cause one undef not a nested one.
It wasn't your exact example I just was playing around with how to change the values in the array after I saw float3(0,0,0) became zeroinitializer
. I then noticed that if I created an uninitialized array and passed it to the Pallette array I could get an undef array.
example with fcgl: https://hlsl.godbolt.org/z/oPxxoKf9f
example without: https://hlsl.godbolt.org/z/rr1TocKnj
✅ With the latest revision this PR passed the C/C++ code formatter. |
aefade2
to
f448ac5
Compare
Two bugs here. First calling
Inst->getFunction()
has undefined behavior if the instruction is not tracked to a function. I suspect thereplaceAllUsesWith
was leaving the GEPs in a weird ghost parent situation. I switched up the visitor to be able toeraseFromParent
as part of visiting and then everything started working.The second bug was in
DXILFlattenArrays.cpp
. I was unaware that you can have multidimensional arrays ofzeroinitializer
, andundef
so fixed up the initializer to handle these two cases.fixes #117273