[Matrix] Enable joint_matrix_fill for joint_matrix feature #4994

Merged
merged 5 commits into from
Dec 24, 2021
6 changes: 6 additions & 0 deletions sycl/include/CL/__spirv/spirv_ops.hpp
@@ -86,6 +86,12 @@ __spirv_JointMatrixSUMadINTEL(
__spv::__spirv_JointMatrixINTEL<T3, M, N, LC, S> *C,
__spv::Scope::Flag Sc = __spv::Scope::Flag::Subgroup);

template <typename T, std::size_t R, std::size_t C,
__spv::MatrixLayout L = __spv::MatrixLayout::RowMajor,
__spv::Scope::Flag S = __spv::Scope::Flag::Subgroup>
extern SYCL_EXTERNAL __spv::__spirv_JointMatrixINTEL<T, R, C, L, S> *
__spirv_CompositeConstruct(const T v);

template <typename T, std::size_t R, std::size_t C, __spv::MatrixLayout U,
__spv::Scope::Flag S = __spv::Scope::Flag::Subgroup>
extern SYCL_EXTERNAL size_t __spirv_JointMatrixWorkItemLengthINTEL(
17 changes: 17 additions & 0 deletions sycl/include/sycl/ext/oneapi/matrix/matrix-jit.hpp
@@ -202,6 +202,23 @@ joint_matrix_mad(Group sg, joint_matrix<T1, M, K, LayoutA, Group> &mA,
#endif // __SYCL_DEVICE_ONLY__
}

template <typename Group, typename T, size_t NumRows, size_t NumCols,
matrix_layout Layout>
inline __SYCL_ALWAYS_INLINE void
joint_matrix_fill(Group sg,
joint_matrix<T, NumRows, NumCols, Layout, Group> &res,
const T v) {
Comment on lines +208 to +210

Suggested change
-joint_matrix_fill(Group sg,
-joint_matrix<T, NumRows, NumCols, Layout, Group> &res,
-const T v) {
+joint_matrix_fill(joint_matrix<T, NumRows, NumCols, Layout, Group> &res,
+const T v) {

sg was unused

Because we reuse the existing __spirv_CompositeConstruct instruction and do not use a scope on it, this argument will remain unused.
We still need it on the DPC++ function so that the signature matches the other DPC++ functions.

In this case, please cast it to void; otherwise the compiler will emit a diagnostic.

// We kept the unused "sg" in joint_matrix_fill to match the other DPC++
// functions
(void)sg;

@MrSidims MrSidims Dec 15, 2021

nit: it might be worth adding a comment here explaining why we kept the 'sg' parameter (to stay aligned with the other AMX APIs? or so that, if we turn out not to be satisfied with __spirv_CompositeConstruct, we can replace it with some other instruction without breaking the API/ABI by bringing 'sg' back?)

#ifdef __SYCL_DEVICE_ONLY__
res.spvm = __spirv_CompositeConstruct<T, NumRows, NumCols>(v);
#else
(void)res;
(void)v;
#endif // __SYCL_DEVICE_ONLY__
}

template <typename T, size_t NumRows, size_t NumCols,
matrix_layout Layout = matrix_layout::row_major,
typename Group = sycl::sub_group>
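
The pattern discussed in the review thread above (keep an unused group argument for signature consistency, and cast it to void so the compiler stays quiet) can be sketched on the host in plain C++. This is only an illustration: `group_tag`, `tile`, `tile_fill`, and `fill_demo` are hypothetical names, not part of DPC++, and the real `joint_matrix` wraps an opaque SPIR-V handle rather than an array.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a sub-group handle; not a SYCL type.
struct group_tag {};

// Hypothetical host-side tile used only for this sketch.
template <typename T, std::size_t Rows, std::size_t Cols>
struct tile {
  std::array<T, Rows * Cols> data{};
};

// Same shape as joint_matrix_fill: the group argument is accepted so the
// signature lines up with the other functions, but it is not used, so it is
// cast to void to avoid an unused-parameter warning.
template <typename Group, typename T, std::size_t Rows, std::size_t Cols>
void tile_fill(Group g, tile<T, Rows, Cols> &res, const T v) {
  (void)g;
  res.data.fill(v); // every element of the tile takes the same value
}

// Small self-check: after the fill, every element equals the fill value.
inline void fill_demo() {
  tile<int, 2, 3> t;
  tile_fill(group_tag{}, t, 7);
  for (int x : t.data)
    assert(x == 7);
}
```

Note that `T` is deduced from both the tile and the fill value, so the two must agree (an `int` tile needs an `int` literal), which is also how the call `joint_matrix_fill(sg, sub_c, 0)` in the test below reads.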
10 changes: 4 additions & 6 deletions sycl/test/matrix/matrix-int8-test.cpp
@@ -1,4 +1,5 @@
// RUN: %clangxx -fsycl -O2 %s -o %t.out
+// XFAIL: *
#include <CL/sycl.hpp>
#if (SYCL_EXT_ONEAPI_MATRIX == 2)
#include <iostream>
@@ -68,10 +69,7 @@ void matrix_multiply(big_matrix<T1, NUM_ROWS_C, NUM_COLS_C> &C, big_matrix<T2, N

// AMX: 8 register tiles : 1k byte size, SMmaxxSKmax =16x64
// strideX = X's cols, so strideC = N, strideA = K, strideB = N*4
-joint_matrix_load(sg, sub_c,
-accC.get_pointer() + (sg_startx * TM) * N +
-sg_starty / SG_SZ * TN,
-N, matrix_layout::row_major);
+joint_matrix_fill(sg, sub_c, 0);
for (int k = 0; k < K / TK; k += 1) {
joint_matrix_load(
sg, sub_a, accA.get_pointer() + (sg_startx * TM) * K + k * TK,
@@ -129,8 +127,8 @@ int main() {
}
for (int i = 0; i < MATRIX_M; i++) {
for (int j = 0; j < MATRIX_N; j++) {
-C[i][j] = 1;
-D[i][j] = 1;
+C[i][j] = 0;
+D[i][j] = 0;
}
}
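
A plausible reading of the two test hunks together: the accumulator tile used to be loaded from C (initialised to 1) and is now filled with 0, so the host-side matrices must start from 0 as well for the comparison to hold. The sketch below illustrates that reasoning on the host only. The sizes and the names `tiled_mad`, `reference`, and `check_zero_init` are assumptions made for this example and do not appear in the test.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical sizes for illustration only; the real test uses its own
// matrix and tile dimensions.
constexpr std::size_t M = 2, N = 2, K = 4, TK = 2;
static_assert(K % TK == 0, "K must be a multiple of the tile depth");

using MatA = std::array<std::array<int, K>, M>;
using MatB = std::array<std::array<int, N>, K>;
using MatC = std::array<std::array<int, N>, M>;

// Device-side analogue: start the accumulator at `init` (the role played by
// joint_matrix_fill), then add one TK-deep slab of products per k step.
inline MatC tiled_mad(const MatA &a, const MatB &b, int init) {
  MatC acc{};
  for (auto &row : acc)
    row.fill(init);
  for (std::size_t k0 = 0; k0 < K; k0 += TK)
    for (std::size_t i = 0; i < M; ++i)
      for (std::size_t j = 0; j < N; ++j)
        for (std::size_t k = k0; k < k0 + TK; ++k)
          acc[i][j] += a[i][k] * b[k][j];
  return acc;
}

// Host-side reference: D starts at `d_init` and accumulates the full product.
inline MatC reference(const MatA &a, const MatB &b, int d_init) {
  MatC d{};
  for (auto &row : d)
    row.fill(d_init);
  for (std::size_t i = 0; i < M; ++i)
    for (std::size_t j = 0; j < N; ++j)
      for (std::size_t k = 0; k < K; ++k)
        d[i][j] += a[i][k] * b[k][j];
  return d;
}

// Self-check: a zero-filled accumulator matches a zero-initialised reference,
// and does not match a reference that still starts from 1.
inline void check_zero_init() {
  MatA a{};
  MatB b{};
  for (std::size_t i = 0; i < M; ++i)
    for (std::size_t k = 0; k < K; ++k)
      a[i][k] = static_cast<int>(i + k);
  for (std::size_t k = 0; k < K; ++k)
    for (std::size_t j = 0; j < N; ++j)
      b[k][j] = static_cast<int>(2 * k + j);
  assert(tiled_mad(a, b, 0) == reference(a, b, 0));
  assert(tiled_mad(a, b, 0) != reference(a, b, 1));
}
```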
