CLBlast: Fix matrix-vector multiplication #3544

Merged: 1 commit, Oct 12, 2023


shibe2
Contributor

@shibe2 shibe2 commented Oct 8, 2023

Fix computation for non-K quantized src0.
Add offsets into previously uploaded data, similar to #3447.
Broadcasting (#3402) works in cases affected by this change.
Tested in isolation and with a few models.

@ggerganov ggerganov requested a review from 0cc4m October 8, 2023 13:37
@0cc4m
Collaborator

0cc4m commented Oct 12, 2023

Thank you for your work. I didn't notice any issues and it ran fine. Can you tell me which case fails with the implementation in master and is solved by this fix?

@shibe2
Contributor Author

shibe2 commented Oct 12, 2023

For f16, q4_0, q4_1, q5_0, q5_1, and q8_0 src0: the implementation in master can go beyond the actual data and add extra values to the result.

When a 3D or 4D src0 has been uploaded with ggml_cl_transform_tensor, only its first 2D slice can be accessed.
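
To illustrate both points, here is a minimal sketch in plain C. It is not the actual ggml OpenCL code: `slice_offset` and `mat_vec_slice` are hypothetical helpers, the data is plain `float` rather than a quantized type, and the tensor is assumed to be contiguous. It shows how an offset selects the 2D slice (i2, i3) inside a previously uploaded buffer, and how a bounds check keeps the matrix-vector product from reading past the actual data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: element offset of the 2D slice (i2, i3) in a
 * contiguous tensor with ne0 columns, ne1 rows and ne2 slices per i3. */
static size_t slice_offset(size_t ne0, size_t ne1, size_t ne2,
                           size_t i2, size_t i3) {
    return (i3 * ne2 + i2) * ne0 * ne1;
}

/* Hypothetical helper: dst[r] = sum over c of src0[offset + r*ncols + c] * src1[c].
 * src0_len is the total number of elements in the src0 buffer. */
static void mat_vec_slice(const float *src0, size_t src0_len,
                          const float *src1, float *dst,
                          size_t ncols, size_t nrows, size_t offset) {
    /* Refuse to read beyond the uploaded data. */
    assert(offset + nrows * ncols <= src0_len);

    for (size_t r = 0; r < nrows; r++) {
        float sum = 0.0f;
        /* Accumulate exactly ncols products per row, no extra values. */
        for (size_t c = 0; c < ncols; c++) {
            sum += src0[offset + r * ncols + c] * src1[c];
        }
        dst[r] = sum;
    }
}
```

For a 2x2x2 tensor, `slice_offset(2, 2, 2, 1, 0)` gives 4, so the second slice starts at element 4. Always passing offset 0 would correspond to the case where only the first 2D slice is reachable.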

@0cc4m
Collaborator

0cc4m commented Oct 12, 2023

Alright, thank you.

@0cc4m 0cc4m merged commit 1e0e873 into ggerganov:master Oct 12, 2023
36 checks passed
joelkuiper added a commit to vortext/llama.cpp that referenced this pull request Oct 13, 2023
* 'master' of github.com:joelkuiper/llama.cpp:
  CLBlast: Fix matrix-vector multiplication (ggerganov#3544)