[SM6.9] Native vector load/store lowering #7292
Previously disallowed along with other global types, this provides a mechanism for indicating which buffers are raw, allowing them to contain long vectors while continuing to produce an error for other resource types, as verified by existing tests.
Add new raw buffer vector load/store intrinsics using the new vector overload types. Include them in the validation associated with similar loads/stores.
When the loaded/stored type is a vector of more than one element, the shader model is 6.9 or higher, and the operation is on a raw buffer, enable the generation of a native vector raw buffer load or store. Incidental removal of an unused parameter in load translation. Add validation and compute shader tests.
Native vector loads and stores are generated for 6.9 targets and above. This includes the 6.x target used when compiling to libraries. This adds a pass, run when linking, that lowers the vector operations to scalar operations for shader models that don't have native vector support. This allows libraries compiled for supporting shader models to be linked to targets without support.
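As a rough illustration of the gating described in the third commit above, the lowering condition amounts to something like the following sketch (IsSM69Plus appears elsewhere in this PR; the function name, header path, and other helpers here are assumptions, not the actual DXC code):

#include "llvm/IR/DerivedTypes.h"
#include "llvm/Support/Casting.h"
#include "dxc/DXIL/DxilShaderModel.h" // assumed header for hlsl::ShaderModel

// Sketch only: emit the native vector op when the value is a real vector
// (more than one element), the target is SM 6.9+, and the resource is a
// raw (byte-address or structured) buffer.
static bool useNativeVectorRawBufferOp(llvm::Type *Ty,
                                       const hlsl::ShaderModel *SM,
                                       bool IsRawBuffer) {
  auto *VT = llvm::dyn_cast<llvm::VectorType>(Ty);
  return VT && VT->getNumElements() > 1 && SM->IsSM69Plus() && IsRawBuffer;
}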
✅ With the latest revision this PR passed the C/C++ code formatter.
REVIEW GUIDANCE! The first four commits were designed to be reviewed individually, which might make the review more pleasant. That said, I don't expect the full review would be too difficult either.
I clang-formatted this to within an inch of its life. No idea why it didn't fail until now. |
A couple of minor things but this LGTM
| return "DXIL scalarize vector load/stores"; | ||
| } | ||
|
|
||
| bool runOnModule(Module &M) override { |
I think it's worthwhile to split this into two classes: the pass DxilScalarizeLoadStores, and a separate (local / anonymous namespace) VectorLoadStoreScalarizer. This separates concerns nicely and avoids the awkward situation of mutating the class member m_DM on every call to runOnModule. (It also means you can avoid the Hungarian-style m_ prefix on the member variable...)
Then the pass just needs:
bool runOnModule(Module &M) override {
  DxilModule &DM = M.GetOrCreateDxilModule();
  if (DM.GetShaderModel()->IsSM69Plus())
    return false;
  return VectorLoadStoreScalarizer(DM).scalarize();
}
The member DxilModule was actually unnecessary and I'd meant to remove it. Most of the simplification of runOnModule you've added here would just push the iteration down to the scalarization function. I don't mind where it goes, but the benefit, small though it is, of iterating in runOnModule is that I only have to iterate through the functions once.
I'm perfectly willing to consider it, but I think I remain unclear on the benefit of having separate classes.
I've split the load and store scalarizations into two non-method static functions.
  for (auto F = M.functions().begin(), E = M.functions().end(); F != E;) {
    Function *Func = &*(F++);
(side comment - no change needed)
It's too bad we don't have upstream llvm's make_early_inc_range available here.
Yeah, and their version of filecheck.
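For reference, a minimal sketch of the manual early-increment idiom the quoted loop uses, next to the upstream helper mentioned above (illustrative only; not code from this PR):

#include "llvm/IR/Module.h"

// Advance the iterator before touching the current function, so erasing
// that function does not invalidate the loop.
static void walkFunctions(llvm::Module &M) {
  for (auto F = M.functions().begin(), E = M.functions().end(); F != E;) {
    llvm::Function *Func = &*(F++); // increment first; Func may be erased below
    (void)Func; // ... rewrite or erase Func here ...
  }
  // With a newer LLVM this could instead be written as:
  //   for (llvm::Function &F : llvm::make_early_inc_range(M.functions()))
  // but that helper is not available in DXC's LLVM base.
}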
  Args.emplace_back(VecLd.get_alignment()); // Alignment @5.

  // Set offset to increment depending on whether the real offset is defined.
  unsigned OffsetIdx = 0;
There's some debate about this, so feel free to use your own preference, but I would leave OffsetIdx uninitialized here since that will give a nice compiler warning if you forget to set it in one of the branches of the conditional below.
Makes sense.
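For illustration, a tiny hypothetical example of the warning being relied on here (the names and values are made up; the exact diagnostic depends on compiler and flags, e.g. clang's -Wsometimes-uninitialized or gcc's -Wmaybe-uninitialized):

// Hypothetical: leaving the variable uninitialized lets the compiler flag a
// branch that forgets to assign it before use.
unsigned pickOffsetIdx(bool HasRealOffset) {
  unsigned OffsetIdx; // intentionally uninitialized
  if (HasRealOffset)
    OffsetIdx = 5; // made-up index
  // A forgotten else branch produces a "may be used uninitialized" warning
  // at the return below; initializing to 0 would silence it.
  return OffsetIdx;
}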
Turned out to be unneeded. This reverts commit cdaca3f.
Use IS_BASIC_OBJECT to allow global raw buffers with long vectors; the rawbuffer bits weren't needed. Remove the unused DxilModule from the vector load/store scalarize pass. Leave the sure-to-be-assigned variable uninitialized. Add a test that long vectors in pre-6.9 StructuredBuffers produce an error. Exclude the final 1.9 DXIL opcode for now.
not a request, but a suggestion for the future:
This PR would be significantly smaller and easier to review if this pass had been added in its own PR with its own tests and no other changes. A subsequent PR could add it to the pipeline and add the additional required changes to integrate it.
When I have a large block of code to commit that is interrelated but separable, and I'm unsure how best to economize reviewer time in the perennial conflict between doing many reviews versus one big one, I try to get the best of both by dividing the separable portions into sensible commits and putting them in one PR. That's what I did here, as I explained in this comment: #7292 (comment)
I've seen evidence that these comments are often missed though. I wonder if putting an inline comment at the top of the change would be more effective.
The individual PR you describe is essentially this commit: 27f0e73
A lot of this PR is tests; this change unblocked a lot of testing, and it was always going to open the floodgates of tests for the combinations of elements it enabled.
There are solutions for doing stacked PRs (https://github.com/modular/stack-pr, https://graphite.dev/blog/stacked-prs, and others). The GitHub UI for "commits" is not a good tool for reviewing completely separable changes, and relying on someone seeing a comment in the UI is lossy at best.
New passes are always separable because by LLVM's design they can be run with IR inputs and outputs and be completely independent of other changes. They make a great boundary for separating pull requests.
Have you tried it? I'd be interested to know what happens when the PR that is depended on changes. Faced with this situation again, I was thinking I'd just create a PR against the topic branch of the first PR.
That's not really the problem I meant to solve by making this one PR with logically separated commits, though. I wanted this to go in together, both to economize reviewer time (one change to look at) and because of the interdependencies. Several things could have gone in before others, but they would have been incompletely tested.
I'm not saying I made the right decision here, but I see other problems with opposing logically separated commits just to avoid having to read the comments of a PR before reviewing.
  // Collect the information required to break this into scalar ops from args.
  DxilInst_RawBufferVectorLoad VecLd(CI);
  OP::OpCode OpCode = OP::OpCode::RawBufferLoad;
  llvm::Constant *opArg = Builder.getInt32((unsigned)OpCode);
Suggested change:
- llvm::Constant *opArg = Builder.getInt32((unsigned)OpCode);
+ llvm::Constant *OpArg = Builder.getInt32((unsigned)OpCode);
New code should follow LLVM coding standards.
Missed this one.
// CHECK: %dx.types.ResRet.[[VTY:v[0-9]*[a-z][0-9][0-9]]] = type { <[[NUM:[0-9]*]] x [[TYPE:[a-z_0-9]*]]>, i32 }

  ByteAddressBuffer RoByBuf : register(t1);
take or leave: the leading whitespace bothers me greatly... I know we don't really have a policy on formatting HLSL, but this reminds me too much of the Objective-C tendency to align colons, which I've always found grating.
The intention here was to keep the corresponding portions aligned so that, at a glance, it's clear that these are both ByteAddressBuffers and one is RW though I wasn't consistent with that elsewhere. I don't mind removing it.
RWByteAddressBuffer RwByBuf : register(u1);

StructuredBuffer< vector<TYPE, NUM> > RoStBuf : register(t2);
RWStructuredBuffer< vector<TYPE, NUM> > RwStBuf : register(u2);
also take or leave: I also find some of the double-spaces odd... I know we have C++03-ish templates so you need a space, but extra spaces seem to just make lines longer.
Not sure if it's the unintended double space after the > that bothers you, or the leading space before vector, which is unnecessary but, I think, helpful, though inconsistently applied.
@@ -0,0 +1,708 @@
// RUN: %dxc -HV 2018 -T cs_6_9 -DTYPE=float -DNUM=2 %s | FileCheck %s --check-prefixes=CHECK,NODBL,NOINT
Should we also be testing vector<T,1> here? Or potentially elsewhere?
It's a good suggestion, as they should be treated like scalars. Given how different the output will be, I'd prefer to do it elsewhere. I had intended to do the same thing to the operators-vec1s test as this one; originally I'd hoped they could remain part of the same tests, but the reorganization just got too involved, and I've been critiqued for how complicated I make my CHECK lines in such cases.
  bool Changed = false;

  hlsl::OP *HlslOP = DM.GetOP();
  for (auto F = M.functions().begin(), E = M.functions().end(); F != E;) {
You can iterate the overloads for specific ops you want instead with HlslOP::GetOpFuncList. For instance:
  for (auto it : hlslOP->GetOpFuncList(DXIL::OpCode::AnnotateHandle)) {
This is what I did originally. I ran into a lot of problems with it because GetOpFuncList was returning a list with duplicate entries for the same function leading to double deletion. I have a note to investigate that further, but in the meantime, rather than create a list of functions to delete that I iterate through again at the end to just delete them all, I opted to use the same iteration method that the cousin pass TranslateRawBuffer uses. I don't feel strongly about it though. I could probably do as the code you cite does and rely on later passes to eliminate the unused functions for me. I don't have an issue with that.
Oh, for one thing, erasing a function during iteration of this list will erase the entry from the list. Since this is a SmallMapVector, the iterator points into the vector, which will be invalidated by any modification. That doesn't seem like it should result specifically in duplicate entries and double-delete, but it is in UB territory.
If the original list contains duplicate entries, that is a cause for concern. We shouldn't allow that.
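For what it's worth, a rough sketch of the collect-then-erase alternative mentioned earlier, which avoids modifying the overload map while it is being iterated (the overload-type-to-Function* shape of GetOpFuncList's result is an assumption based on the SmallMapVector comment above; this is not the code in this PR):

llvm::SmallVector<llvm::Function *, 4> DeadFns;
for (auto &Entry : HlslOP->GetOpFuncList(DXIL::OpCode::RawBufferVectorLoad)) {
  llvm::Function *F = Entry.second; // assumed: overload type -> Function*
  if (!F)
    continue;
  // ... rewrite every call to F into scalar RawBufferLoad ops ...
  if (F->user_empty())
    DeadFns.push_back(F); // defer erasure until iteration is done
}
for (llvm::Function *F : DeadFns)
  F->eraseFromParent();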
    if (OpClass == DXIL::OpCodeClass::RawBufferVectorLoad) {
      for (auto U = Func->user_begin(), UE = Func->user_end(); U != UE;) {
        CallInst *CI = cast<CallInst>(*(U++));
        scalarizeVectorLoad(HlslOP, M.getDataLayout(), CI);
It's always good form, when iterating the users of a DXIL op class overload function, to check the opcode (even though you know that in this case it should be the only one for this OpCodeClass):
Suggested change:
-        scalarizeVectorLoad(HlslOP, M.getDataLayout(), CI);
+        if (HlslOP->GetDxilOpFuncCallInst(CI) !=
+            hlsl::OP::OpCode::RawBufferVectorLoad)
+          continue;
+        scalarizeVectorLoad(HlslOP, M.getDataLayout(), CI);
Presumably this wouldn't be required if I use the GetOpFuncList approach?
It turns out that you do technically have to check, even with GetOpFuncList. That's because this returns the list of overload functions for the OpCodeClass corresponding to the OpCode. Basically, that's all it could do, since different DXIL operations use the same function declaration when they share the same OpCodeClass.
I think the best way is to get the OpCode using: hlsl::OP::GetDxilOpFuncCallInst(CI) == hlsl::OP::OpCode::RawBufferVectorLoad (this avoids a redundant IsDxilOpFuncCallInst check on NDEBUG at least).
I looked at other code using this function and basically nobody is checking the OpCode of the CallInst users. I'm considering filing an issue about this, but I don't think there are any real bugs caused by this at the moment. Basically, you're safe if there's a 1:1 mapping between OpCode and OpCodeClass, which happens to be true in all current use cases. If not, since this always lists all the overloads for the OpCodeClass that corresponds to the OpCode passed in, you could be iterating CallInst users that aren't calling with the same OpCode you're looking for. I found one use of GetOpFuncList that looks like it could potentially have this problem, in OP::IsDxilOpUsed, but it is only called with OpCodes that are safe, so I don't think it causes any real bugs in practice (yet).
I think it would probably be better to make GetOpFuncList take the OpCodeClass instead of the OpCode. Then you're forced to HlslOp->GetOpFuncList(hlsl::OP::GetOpCodeClass(MyOpCode)) and it's more obvious that you need to check the OpCode for each CallInst user.
Regarding using hlsl::OP::GetDxilOpFuncCallInst, I guess hlsl::OP::getOpCode is just fine actually.
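Putting that together, a sketch of what the per-call check might look like (again assuming GetOpFuncList yields overload-type/Function* pairs and that hlsl::OP::getOpCode can be called on the CallInst as discussed; illustrative only, not the code in this PR):

for (auto &Entry : HlslOP->GetOpFuncList(DXIL::OpCode::RawBufferVectorLoad)) {
  llvm::Function *F = Entry.second; // assumed overload type -> Function*
  if (!F)
    continue;
  for (auto U = F->user_begin(), UE = F->user_end(); U != UE;) {
    auto *CI = llvm::dyn_cast<llvm::CallInst>(*(U++));
    // Skip users that call this declaration with a different opcode sharing
    // the same OpCodeClass.
    if (!CI || hlsl::OP::getOpCode(CI) != DXIL::OpCode::RawBufferVectorLoad)
      continue;
    scalarizeVectorLoad(HlslOP, M.getDataLayout(), CI);
  }
}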
Looking good. One correctness issue with the alignment for the load op, and a few minor comments.
Thanks for all the feedback! I'll respond to most of it in the next commit. The vec1 test will have its own commit as it'll take just a bit longer.
Separate the load/store lowering functions from the pass class in the vector scalarization pass. Calculate load size on the scalar element rather than the resret element type to get correct alignment for scalars instead of for the whole vector. Adjust various tests accordingly that depended on or used code generated with the incorrect alignment, which always just hit the max of 8. Change iteration through matching intrinsics in the link scalarization pass. Capitalize variables according to coding standards. Include and remove whitespace in tests.
Add cs test for vec1s. Reformat other operator tests to be more consistent.
Validate native vector loads and stores for properly defined parameters of the correct type. Add tests for both vector load/stores and the original scalar load/stores, since they share a lot of validation code.
    }
    LLVM_FALLTHROUGH;
  case DXIL::OpCode::RawBufferVectorStore: {
    Value *handle =
Suggested change:
-    Value *handle =
+    Value *Handle =
This file is extremely consistent: apart from a few exceptions, variables are not capitalized. I think the golden rule applies.
Really?
- m_
- Member PascalCase not m_
- Double Capital and camelCase
- On the next line camelCase and PascalCase
That's just up to line 101.
    if (resClass != DXIL::ResourceClass::UAV)
      ValCtx.EmitInstrError(CI, ValidationRule::InstrResourceClassForUAVStore);

    unsigned alignIdx = DXIL::OperandIndex::kRawBufferStoreAlignmentOpIdx;
Suggested change:
-    unsigned alignIdx = DXIL::OperandIndex::kRawBufferStoreAlignmentOpIdx;
+    unsigned AlignIdx = DXIL::OperandIndex::kRawBufferStoreAlignmentOpIdx;
    unsigned alignIdx = DXIL::OperandIndex::kRawBufferStoreAlignmentOpIdx;
    if (DXIL::OpCode::RawBufferVectorStore == opcode) {
      alignIdx = DXIL::OperandIndex::kRawBufferVectorStoreAlignmentOpIdx;
      unsigned valueIx = DXIL::OperandIndex::kRawBufferVectorStoreValOpIdx;
Suggested change:
-      unsigned valueIx = DXIL::OperandIndex::kRawBufferVectorStoreValOpIdx;
+      unsigned ValueIx = DXIL::OperandIndex::kRawBufferVectorStoreValOpIdx;
    DxilInst_RawBufferLoad bufLd(CI);
    LLVM_FALLTHROUGH;
  case DXIL::OpCode::RawBufferVectorLoad: {
    Value *handle =
Suggested change:
-    Value *handle =
+    Value *Handle =
    } else {
      alignSize = bufLd.get_alignment_val();
    }
    unsigned alignIdx = DXIL::OperandIndex::kRawBufferLoadAlignmentOpIdx;
Suggested change:
-    unsigned alignIdx = DXIL::OperandIndex::kRawBufferLoadAlignmentOpIdx;
+    unsigned AlignIdx = DXIL::OperandIndex::kRawBufferLoadAlignmentOpIdx;
    if (!isa<ConstantInt>(CI->getOperand(alignIdx)))
      ValCtx.EmitInstrError(CI, ValidationRule::InstrConstAlignForRawBuf);

    Value *offset =
Suggested change:
-    Value *offset =
+    Value *Offset =
    if (!isa<ConstantInt>(CI->getOperand(alignIdx)))
      ValCtx.EmitInstrError(CI, ValidationRule::InstrConstAlignForRawBuf);

    Value *offset =
Suggested change:
-    Value *offset =
+    Value *Offset =
I've taken the shortest path, and because I really don't want to go over this again for this same file at least, I've done this: #7307
Looks good!
Just a minor comment about RUN lines in HLSL files under DXILValidation, where these are not run as tests. I know other files have similar issues, but it's been misleading. Sometimes other contributors will place tests there, thinking that it's the right location for validation shell tests, and they are simply ignored.
Well, I wanted to see if resolving conflicts online would still require a new review, and it does.
Some of these didn't come out quite right from either the manual or the automatic merge.
It seems like this may have broken our ability to roll the shader compiler into Dawn for Chromium (WebGPU). Attempting to update to ToT with this change included starts producing compiler errors similar to the following for us: Given that it can't resolve
The fix was on our end: we had to add the new .cpp file to our BUILD.gn file.
Enables the declaration of long vector types for raw buffers and the lowering of those
and traditional vectors in loads and stores, maintaining the native types with
new DXIL ops, along with validation and testing support for the same.
Allow declaring long vector rawbuffer resources.
Previously disallowed along with other global types, this provides a
mechanism for indicating which buffers are raw, allowing them to
contain long vectors while continuing to produce an error for other
resource types, as verified by existing tests.
Introduce native vector DXIL load/store intrinsics.
Add new raw buffer vector load/store intrinsics using the new vector overload types.
Include them in the validation associated with similar loads/stores.
Lower native vector raw buffers load/stores into new ops.
When the loaded/stored type is a vector of more than 1 element, the
shader model is 6.9 or higher, and the operation is on a raw buffer,
enable the generation of a native vector raw buffer load or store.
Incidental removal of an unused parameter in load translation and some refactoring
of the lowering to flow better with the new resret types.
Add validation and compute shader tests.
Vector to scalar raw buffer load lowering pass
Native vector loads and stores are generated for 6.9 targets and above.
This includes the 6.x target used when compiling to libraries. This adds
a pass run when linking that will lower the vector operations to scalar
operations for shader models that don't have native vector support. This
allows libraries compiled for supporting shader models to be linked to
targets without support.