Skip to content

[SYCL]fix ggml_sycl_mul_mat_id() to match the change of api #7436

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
May 28, 2024

Conversation

arthw
Copy link
Collaborator

@arthw arthw commented May 21, 2024

Fix the ggml_sycl_mul_mat_id() which is impacted by the api parameters changed by #6505.
Now, only 1 mul_mat_id UT case is fault: type_a=iq4_nl. It's known issue.

@github-actions github-actions bot added the SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language label May 21, 2024
@arthw arthw requested a review from airMeng May 21, 2024 13:59
@mofosyne mofosyne added the Review Complexity : High Generally require indepth knowledge of LLMs or GPUs label May 21, 2024
Copy link
Contributor

@AidanBeltonS AidanBeltonS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some minor things which I think should be changed.

I have also tested this branch on an Arc A770 and A100 GPU and can confirm the MUL_MAT_ID implementation is working

@AidanBeltonS
Copy link
Contributor

@arthw could you rebase this PR? I believe it will resolve the failing CI checks.

@arthw arthw force-pushed the fix_mul_mat_id branch from 32ba8d3 to e0a686b Compare May 28, 2024 02:51
@NeoZhangJianyu
Copy link
Collaborator

@arthw could you rebase this PR? I believe it will resolve the failing CI checks.

yes, rebase it.

@AidanBeltonS AidanBeltonS merged commit e2b0650 into ggml-org:master May 28, 2024
65 checks passed
@slaren
Copy link
Member

slaren commented Jun 19, 2024

In llm_load_tensors, there is still this code that will prevent using MoE models with SYCL:

https://github.com/ggerganov/llama.cpp/blob/9c77ec1d74874ee22bdef8f110e8e8d41389abf2/llama.cpp#L5283-L5288

It was added in #6505 as a workaround to avoid crashing SYCL with MoE models, but if mul_mat_id was fixed here, it should be removed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Review Complexity : High Generally require indepth knowledge of LLMs or GPUs SYCL https://en.wikipedia.org/wiki/SYCL - GPU programming language
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants