@anupamachandra
Collaborator

Implements
HLSL:
__builtin_MatVecMul
__builtin_MatVecMulAdd
__builtin_OuterProductAccumulate
__builtin_VectorAccumulate

Lowered to
DXIL:
@dx.op.matVecMul
@dx.op.matVecMulAdd
@dx.op.outerProductAccumulate
@dx.op.vectorAccumulate

@github-actions
Contributor

github-actions bot commented Apr 1, 2025

✅ With the latest revision this PR passed the Python code formatter.

@anupamachandra anupamachandra marked this pull request as ready for review April 1, 2025 19:26
@anupamachandra anupamachandra requested a review from a team as a code owner April 1, 2025 19:26
@damyanp
Member

damyanp commented Apr 2, 2025

NOTE: this is a general issue with long vectors tracked by #7297. I'll keep this comment here since it has an interesting case we might want to test in it.

This applies to all the builtins I've tried so far, but the VectorAccumulate example is quite minimal. Given this code:

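```hlsl
// Global buffer; its type is taken from the generated IR below (RWByteAddressBuffer).
RWByteAddressBuffer RWBuf;

export void TruncatedVector(vector<half, 254> Input254, vector<half, 255> Input255) {
  __builtin_VectorAccumulate(Input254, RWBuf, 0);
  __builtin_VectorAccumulate(Input255, RWBuf, 0);
}
```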

This generates:

```llvm
; Function Attrs: nounwind
define void @"\01?TruncatedVector@@YAXV?$vector@$halff@$0PO@@@V?$vector@$halff@$0PP@@@@Z"(<254 x float> %Input254, <255 x float> %Input255) #0 {
  %1 = load %dx.types.Handle, %dx.types.Handle* @"\01?RWBuf@@3URWByteAddressBuffer@@A", align 4
  %2 = call %dx.types.Handle @dx.op.createHandleForLib.dx.types.Handle(i32 160, %dx.types.Handle %1)  ; CreateHandleForLib(Resource)
  %3 = call %dx.types.Handle @dx.op.annotateHandle(i32 216, %dx.types.Handle %2, %dx.types.ResourceProperties { i32 4107, i32 0 })  ; AnnotateHandle(res,props)  resource: RWByteAddressBuffer
  call void @dx.op.vectorAccumulate.v254f32(i32 308, <254 x float> %Input254, %dx.types.Handle %3, i32 0)  ; VectorAccumulate(inputVector,arrayBuffer,arrayOffset)
  %4 = shufflevector <255 x float> %Input255, <255 x float> undef, <1 x i32> zeroinitializer
  %5 = call %dx.types.Handle @dx.op.createHandleForLib.dx.types.Handle(i32 160, %dx.types.Handle %1)  ; CreateHandleForLib(Resource)
  %6 = call %dx.types.Handle @dx.op.annotateHandle(i32 216, %dx.types.Handle %5, %dx.types.ResourceProperties { i32 4107, i32 0 })  ; AnnotateHandle(res,props)  resource: RWByteAddressBuffer
  call void @dx.op.vectorAccumulate.v1f32(i32 308, <1 x float> %4, %dx.types.Handle %6, i32 0)  ; VectorAccumulate(inputVector,arrayBuffer,arrayOffset)
  ret void
}
```

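Note how Input255 is explicitly truncated to `<1 x float>` (the `shufflevector` that produces `%4`) before vectorAccumulate is called, so the call uses the `v1f32` overload. Input254 is not truncated and is passed straight to the `v254f32` overload.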
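One way to turn this case into a check is to scan the emitted IR for a `vectorAccumulate` call whose operand comes from a narrowing `shufflevector`. The sketch below is only an illustration and not part of the DXC test suite: the helper name `find_truncated_accumulates` and both regular expressions are assumptions written against the exact IR text quoted above.

```python
import re

# Assumed pattern: overload suffix (e.g. ".v254f32") and the first vector operand
# of a vectorAccumulate call, as printed in the IR dump above.
CALL_RE = re.compile(
    r"@dx\.op\.vectorAccumulate\.v(\d+)\w+\(i32 \d+, <(\d+) x \w+> (%[\w.]+)"
)
# Assumed pattern: a shufflevector with its result name, source width,
# source value, and mask (result) width.
SHUFFLE_RE = re.compile(
    r"(%[\w.]+) = shufflevector <(\d+) x \w+> (%[\w.]+), .*?, <(\d+) x i32>"
)


def find_truncated_accumulates(ir_text):
    """Return (source_value, source_width, call_width) for every
    vectorAccumulate call fed by a shufflevector that narrows its source."""
    narrowed = {}
    for m in SHUFFLE_RE.finditer(ir_text):
        result, src_width = m.group(1), int(m.group(2))
        src, mask_width = m.group(3), int(m.group(4))
        if mask_width < src_width:
            narrowed[result] = (src, src_width)

    hits = []
    for m in CALL_RE.finditer(ir_text):
        call_width, operand = int(m.group(1)), m.group(3)
        if operand in narrowed:
            src, src_width = narrowed[operand]
            hits.append((src, src_width, call_width))
    return hits


# The relevant lines from the IR quoted above.
SAMPLE_IR = r"""
  call void @dx.op.vectorAccumulate.v254f32(i32 308, <254 x float> %Input254, %dx.types.Handle %3, i32 0)
  %4 = shufflevector <255 x float> %Input255, <255 x float> undef, <1 x i32> zeroinitializer
  call void @dx.op.vectorAccumulate.v1f32(i32 308, <1 x float> %4, %dx.types.Handle %6, i32 0)
"""

if __name__ == "__main__":
    for src, src_width, call_width in find_truncated_accumulates(SAMPLE_IR):
        print(f"{src}: {src_width} elements truncated to {call_width}")
```

On the quoted IR this flags only the `Input255` call. A real regression test would more likely be a FileCheck test in the compiler's own test suite; this script is just a quick way to spot the pattern in a dump.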

@github-actions
Contributor

github-actions bot commented Apr 2, 2025

✅ With the latest revision this PR passed the C/C++ code formatter.

anupamachandra and others added 2 commits April 2, 2025 09:50
Co-authored-by: Damyan Pepper <damyanp@microsoft.com>
Contributor

@tex3d tex3d left a comment

Besides the generated content, I think this looks good. Just one small nit regarding DXIL Op descriptions.

I believe the generated content is out of date/incorrect, since I noticed some deleted operations and a missing .json file update. In any case, generated files will need to be updated before the final PR is ready for merging.

@tex3d
Contributor

tex3d commented Apr 18, 2025

Just an FYI:
Our gcc pipelines started failing today because of a Docker image update that upgrades Ubuntu (ultimately to Ubuntu 24.04 by 5/9), which will require us to bump our gcc version by a few releases. A fix is in the works, but in the meantime I think we should override this failure and merge once the other pipelines complete, provided there are no other failures.

@damyanp damyanp merged commit 1db8c5b into microsoft:staging-sm6.9 Apr 18, 2025
9 of 12 checks passed
@github-project-automation github-project-automation bot moved this from New to Done in HLSL Roadmap Apr 18, 2025
@damyanp damyanp moved this from Needs Review to Closed in HLSL Support Apr 22, 2025
tex3d added a commit to tex3d/DirectXShaderCompiler that referenced this pull request Apr 25, 2025
…ics (microsoft#7290)

Implements
HLSL:
__builtin_MatVecMul
__builtin_MatVecMulAdd
__builtin_OuterProductAccumulate
__builtin_VectorAccumulate

Lowered to
DXIL:
@dx.op.matVecMul
@dx.op.matVecMulAdd
@dx.op.outerProductAccumulate
@dx.op.vectorAccumulate

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Damyan Pepper <damyanp@microsoft.com>
Co-authored-by: Simon Moll <smoll@nvidia.com>
Co-authored-by: Tex Riddell <texr@microsoft.com>
Co-authored-by: Chris B <beanz@abolishcrlf.org>
(cherry picked from commit 1db8c5b)
bob80905 added a commit that referenced this pull request Apr 28, 2025
This PR introduces the linear algebra header file, and places it in a
location that is by default included in all HLSL compilation.
The builtins in the API aren't yet defined, and depend on the #7290 PR
merging first.
The tests that have been added have temporary diagnostic messages while
7290 is in progress. They will need to be updated.
Open to feedback on better / suggested error messages, or whether there
shouldn't be any sema-level validation for these errors.

Fixes
[#7304](#7304)

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
tex3d added a commit that referenced this pull request Apr 28, 2025
…ics (#7290) (#7381)

Authored-by: Anupama Chandrasekhar <anupamac@nvidia.com>

Implements
HLSL:
__builtin_MatVecMul
__builtin_MatVecMulAdd
__builtin_OuterProductAccumulate
__builtin_VectorAccumulate

Lowered to
DXIL:
@dx.op.matVecMul
@dx.op.matVecMulAdd
@dx.op.outerProductAccumulate
@dx.op.vectorAccumulate

---------

Co-authored-by: Anupama Chandrasekhar <anupamac@nvidia.com>
Co-authored-by: Simon Moll <smoll@nvidia.com>
(cherry picked from commit 1db8c5b)
bob80905 added a commit to bob80905/DirectXShaderCompiler that referenced this pull request Apr 28, 2025
This PR introduces the linear algebra header file, and places it in a
location that is by default included in all HLSL compilation.
The builtins in the API aren't yet defined, and depend on the microsoft#7290 PR
merging first.
The tests that have been added have temporary diagnostic messages while
7290 is in progress. They will need to be updated.
Open to feedback on better / suggested error messages, or whether there
shouldn't be any sema-level validation for these errors.

Fixes
[microsoft#7304](microsoft#7304)

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
bob80905 added a commit that referenced this pull request Apr 29, 2025
This PR introduces the linear algebra header file, and places it in a
location that is by default included in all HLSL compilation. The
builtins in the API aren't yet defined, and depend on the #7290 PR
merging first.
The tests that have been added have temporary diagnostic messages while
7290 is in progress. They will need to be updated. Open to feedback on
better / suggested error messages, or whether there shouldn't be any
sema-level validation for these errors.

Fixes
[#7304](#7304)

Cherrypick of
#7350

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
@damyanp damyanp removed this from HLSL Support Jun 25, 2025
6 participants