Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gpu: nvidia: matmul: fix issues with scaling #2564

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

sgeor255
Copy link
Contributor

Description

Currently tests that use the cublasLt implementation of matmul with src, weight or dst scaling fail. This PR fixes several issues:

  • For dst scaling, the binary operation was created with the incorrect datatype for the scales tensor. Additionally, the memory descriptor for the scales tensor was not initialised correctly and was triggering an assertion due to inner_idxs not being initialised to sensible values. Additionally, a shared_ptr containing the binary_pd was freed incorrectly at the end of the create_scaling_format_desc function, resulting in a segmentation fault.
  • src & weight scaling currently cannot be supported with this implementation. The current approach was to use the binary primitive to scale the src/weights before performing matmul. However, since the cublasLt implementation only works with integer types, the output of the src/weight scaling binary was being converted to an integer type. The scaling factors are floating point numbers, therefore the scaled src/weights were being truncated and then passed to the matmul, outputting incorrect results. I tested using the alpha parameter of the gemm parameter to perform this scaling, however since the compute type is set to an integer type, the scaling is performed in integer type too, also resulting in incorrect output. This PR therefore removes the code related to src/weight scaling as it wasn't correct and adds a return in the init of the primitive for cases that have src/wei scales.

Checklist

General

  • Do all unit and benchdnn tests (make test and make test_benchdnn_*) pass locally for each commit?
  • Have you formatted the code using clang-format?

@sgeor255 sgeor255 requested review from a team as code owners January 30, 2025 11:11
@github-actions github-actions bot added the platform:gpu-nvidia Codeowner: @oneapi-src/onednn-gpu-nvidia label Jan 30, 2025
@dzarukin
Copy link
Contributor

Hi @sgeor255, I did some clean up in #2270 and merged it. Could you, please, rebase and check if all changes are still needed?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
platform:gpu-nvidia Codeowner: @oneapi-src/onednn-gpu-nvidia
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants