-
Notifications
You must be signed in to change notification settings - Fork 792
[SYCL][CUDA] Implementation of matrix ext using new "unified" interface #7077
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
This is a move towards the future looking joint_matrix, joint_matrix_load, joint_matrix_store APIs. Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Also updated the impl functions used in the CUDA backend (Some of these functions may be also used in the HIP AMD case when that is implemented, since the interfaces will match). Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
This is for illustrative purposes: to show the advantage of the proposed change in the joint_matrix_mad interface. Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <chezjakirk@gmail.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Used consistant naming convention in impl. Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
/verify with intel/llvm-test-suite#1183 |
Updated all tests to use new "unified" interfaces from intel/llvm#7077. The old legacy interface implementation is deprecated but still tested via the _legacy files. Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Tests require intel/llvm#7077 Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Xfail tests that are not supported yet. Tests require intel/llvm#7077 Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
@bader Can this be merged now that it has the two approvals? AMD failures are unrelated. Thanks |
/verify with intel/llvm-test-suite#1334 |
GitHub says we need one more approval from @intel/llvm-reviewers-runtime team. |
OK. It would be great if this can get a review quite quickly. I will be on holiday after tomorrow and we wanted to have this before the 2023.1 code freeze also. It would mean that we could publish the joint_matrix optimizations for SYCL-DNN and SYCL-BLAS for the 2023.1 release too. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks okay from a high-level perspective and @dkhaldi and @yubingex007-a11y have been thorough. 👍
Thanks for the review! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
I just had to merge the sycl branch to resolve the conflict with c103a6a where the now unnecessary c++17 checks were removed. |
/verify with intel/llvm-test-suite#1334 |
/verify with intel/llvm-test-suite#1334 |
std::ignore = sg; | ||
return wi_data(jm); | ||
#else | ||
// TODO add Intel impl. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AerialMantis @JackAKirk @dkhaldi
since we can't provide wi_data in both cuda&intel's header, i will make wi_data unified again and provide wi_data of host version, so the return type should be "decltype(auto)".
layout Layout> | ||
struct joint_matrix { | ||
|
||
#if defined(__SYCL_DEVICE_ONLY__) && defined(__SPIR__) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AerialMantis @JackAKirk
sorry, i remember previously it is:
#if defined(__SYCL_DEVICE_ONLY__)
#if defined(__NVPTX__)
sycl::ext::oneapi::detail::joint_matrix_cuda<T, Use, Rows, Cols, Layout>
cuda_impl;
#else
__spv::__spirv_JointMatrixINTEL<
T, Rows, Cols, spv_matrix_layout_traits<Layout>::value,
spv_scope_traits<Group>::value, spv_matrix_use_traits<Use>::value> *spvm;
#endif // defined(__SYCL_DEVICE_ONLY__)
#endif
in intel side, we can't let host compilation use sycl::ext::oneapi::detail::joint_matrix_cuda. so i go back to the previous code and i can still get passed in cuda's testcases.
…suite#1183) Updated all tests to use new "unified" interfaces from intel#7077. The old legacy interface implementation is deprecated but still tested via the _legacy files. Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
…vm-test-suite#1331) Xfail tests that are not supported yet. Tests require intel#7077 Signed-off-by: JackAKirk <jack.kirk@codeplay.com>
CUDA backend implementation using the "unified" matrix extension interface. The same interface will be used for a future Intel backend implementation of the matrix extension.
joint_matrix_load
,joint_matrix_store
,joint_matrix_mad
andjoint_matrix
interfaces match the new spec from [SYCL][Spec] Update the matrix spec based on new use argument #6662joint_matrix_*
functions into new header matrix-unified.hpp: Intel backend implementations can be called from the same functions in the future.