gpu: nvidia: Refactor to native parameters for matmul #2111

ShanoToni · 2024-09-23T17:07:19Z

Description

This refactor addresses a thread-safety issue in the NVIDIA backend matmul. With runtime dimensions, the native-specific member objects that depend on the dimensions (tensor descriptors, matrix layouts) can only be initialised once those dimensions are known. Currently this is handled by cleaning up the members and re-initialising them on every execution, which is not thread safe.

The refactor moves these members into separate parameter structs. Without runtime dimensions, the struct is initialised together with the primitive descriptor and passed to the primitive when the primitive is created. In this case the parameters are cleaned up when the primitive is destroyed.

With runtime dimensions, the parameters are initially populated with only the non-native-specific members. Inside the execute call, a copy of the params struct is created, and only then are the native-specific structs initialised on that copy. In this case the params are cleaned up at the end of the host_task that runs the matmul.
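The two cases above can be sketched in plain C++ as follows. This is only an illustration of the ownership pattern, not the code from this PR: `native_layout`, `matmul_params`, `init_native` and `execute` are hypothetical names, the lambda stands in for the host_task, and no CUDA or SYCL calls are made.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>

// Hypothetical stand-in for dimension-specific native state
// (e.g. a tensor descriptor or a matrix layout).
struct native_layout {
    std::int64_t m, n, k;
};

// Hypothetical parameter struct, not the cublas_params type from the PR.
struct matmul_params {
    bool has_runtime_dims = false;          // non-native member
    std::shared_ptr<native_layout> layout;  // native state, empty until init_native()

    void init_native(std::int64_t m, std::int64_t n, std::int64_t k) {
        layout = std::make_shared<native_layout>(native_layout {m, n, k});
    }
};

// Sketch of the execute path. pd_params is the struct created with the
// primitive descriptor; it is never modified here.
std::int64_t execute(const std::shared_ptr<matmul_params> &pd_params,
        std::int64_t m, std::int64_t n, std::int64_t k) {
    std::shared_ptr<matmul_params> params = pd_params;
    if (pd_params->has_runtime_dims) {
        // Runtime dims: copy the params and build native state on the copy only,
        // so separate execute calls never share mutable native state.
        params = std::make_shared<matmul_params>(*pd_params);
        params->init_native(m, n, k);
    }
    assert(params->layout && "native state must exist before the matmul runs");

    // Stand-in for the host_task: it captures the params by value, so a
    // per-call copy is released once the task object is destroyed.
    auto task = [params]() {
        return params->layout->m * params->layout->n * params->layout->k;
    };
    return task();
}
```

In this sketch the shared struct is only read during execution, which is what removes the clean-up-and-re-initialise step from the shared primitive state.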

Checklist

General

[x] Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
[x] Have you formatted the code using clang-format?

densamoilov · 2024-09-27T18:11:27Z

src/gpu/nvidia/cudnn_matmul.hpp

@@ -116,24 +122,25 @@ struct cudnn_matmul_t : cudnn_matmul_base_t {
            }
            return true;
        }
+
+        std::shared_ptr<cublas_params> params_;


Is there any reason why we want to have shared ownership over the params?

The params need to be created in the primitive descriptor and are then used by the implementation. Passing a unique_ptr back and forth between impl and primitive might not be desirable.
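A minimal sketch of the shared-ownership arrangement described in this reply, under stated assumptions: `params_sketch`, `pd_sketch` and `primitive_sketch` are hypothetical stand-ins for the params struct, the primitive descriptor and the primitive, and `ldc` is an arbitrary example field.

```cpp
#include <cassert>
#include <memory>

// Hypothetical stand-in for the params struct.
struct params_sketch {
    int ldc = 0;
};

// Hypothetical primitive descriptor: creates and keeps the params.
struct pd_sketch {
    std::shared_ptr<params_sketch> params_;

    void init(int ldc) {
        params_ = std::make_shared<params_sketch>();
        params_->ldc = ldc;
    }
};

// Hypothetical primitive: takes a second reference to the same params,
// so the descriptor never has to give up or later reclaim ownership.
struct primitive_sketch {
    std::shared_ptr<params_sketch> params_;

    explicit primitive_sketch(const pd_sketch &pd) : params_(pd.params_) {
        assert(params_ && "the descriptor must be initialised first");
    }
};
```

With a unique_ptr, the descriptor would have to move the params into the primitive and could no longer use them itself, which is the back-and-forth the reply wants to avoid.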

src/gpu/nvidia/cudnn_matmul_executor.hpp

densamoilov

The refactoring looks very nice to me, a long-awaited one to be honest. Thanks!

ShanoToni requested a review from a team as a code owner September 23, 2024 17:07

ShanoToni requested review from kala855, t4c1, densamoilov, mgouicem, dzarukin, sgeor255 and dylanangus September 23, 2024 17:07

github-actions bot added the platform:gpu-nvidia Codeowner: @oneapi-src/onednn-gpu-nvidia label Sep 23, 2024

ShanoToni force-pushed the cublas_matmul_runtime_refactor branch from 6da1758 to c190dbf Compare September 24, 2024 10:37

ShanoToni changed the title ~~Refactor to native parameters for matmul~~ gpu: nvidia: Refactor to native parameters for matmul Sep 24, 2024

ShanoToni force-pushed the cublas_matmul_runtime_refactor branch from c190dbf to 2341c03 Compare September 24, 2024 15:57

densamoilov reviewed Sep 27, 2024

src/gpu/nvidia/cudnn_matmul_executor.hpp (outdated, resolved)

densamoilov reviewed Sep 27, 2024

src/gpu/nvidia/cudnn_matmul_executor.hpp (outdated, resolved)

densamoilov reviewed Sep 27, 2024

ShanoToni force-pushed the cublas_matmul_runtime_refactor branch from 2341c03 to a950e3e Compare October 2, 2024 16:51

gpu: nvidia: Refactor to native parameters for matmul

b209e92

ShanoToni force-pushed the cublas_matmul_runtime_refactor branch from a950e3e to b209e92 Compare October 2, 2024 16:57

densamoilov approved these changes Oct 2, 2024

spalicki approved these changes Oct 3, 2024

spalicki merged commit 42be8d5 into oneapi-src:main Oct 3, 2024
24 checks passed

vpirogov added this to the v3.7 milestone Dec 11, 2024
