CLBlast: Fix matrix-vector multiplication #3544

shibe2 · 2023-10-08T12:41:55Z

Fix computation for non-K quantized src0.
Add offsets into previously uploaded data, similar to #3447.
Broadcasting (#3402) works in cases affected by this change.
Tested in isolation and with a few models.
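For readers unfamiliar with broadcasting in this context, here is a minimal sketch of how a slice index of src1 could be mapped back to a slice of src0 when src0 has fewer slices along a dimension. It assumes the usual integer-ratio convention (src1's slice count is a multiple of src0's). The helper name `broadcast_src0_index` and its parameters are hypothetical and are not taken from this PR or from ggml.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not part of ggml): map a slice index of src1 to the
 * src0 slice it reads from when src0 is broadcast along that dimension.
 * Assumption: ne_src1 is a positive multiple of ne_src0. */
static int64_t broadcast_src0_index(int64_t i_src1, int64_t ne_src0, int64_t ne_src1) {
    assert(ne_src0 > 0 && ne_src1 % ne_src0 == 0);
    assert(i_src1 >= 0 && i_src1 < ne_src1);
    const int64_t r = ne_src1 / ne_src0; /* src1 slices that share one src0 slice */
    return i_src1 / r;
}
```

With 2 slices in src0 and 8 in src1, src1 slices 0..3 map to src0 slice 0 and slices 4..7 map to src0 slice 1.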

0cc4m · 2023-10-12T18:23:03Z

Thank you for your work. I didn't notice any issues and it ran fine. Can you tell me which case fails with the implementation in master and is solved by this fix?

shibe2 · 2023-10-12T19:01:59Z

For f16, q4_0, q4_1, q5_0, q5_1, q8_0: the implementation in master can read beyond the actual data and add extra values to the result.

When a 3D or 4D src0 has been uploaded with ggml_cl_transform_tensor, only its first 2D slice can be accessed.
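To illustrate the second point, here is a rough sketch of the byte offset needed to reach a given 2D slice inside a contiguously uploaded 3D or 4D tensor. It is not the code from this PR: the helper `slice_offset_bytes` and its parameters are hypothetical, and it assumes the slices are stored back to back with no padding. Block-quantized types are handled by passing the number of elements per block and the bytes per block (for example 32 and 18 for q4_0, or 1 and 2 for f16).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of ggml): byte offset of 2D slice (i02, i03)
 * in a tensor of shape ne0 x ne1 x ne2 x ne3 stored contiguously.
 * block_size = elements per quantization block, type_size = bytes per block.
 * Assumption: ne0 is a multiple of block_size and there is no padding. */
static size_t slice_offset_bytes(int64_t ne0, int64_t ne1, int64_t ne2,
                                 int64_t i02, int64_t i03,
                                 size_t type_size, int64_t block_size) {
    assert(block_size > 0 && ne0 % block_size == 0);
    assert(i02 >= 0 && i02 < ne2 && i03 >= 0);
    const size_t row_bytes   = (size_t)(ne0 / block_size) * type_size;
    const size_t slice_bytes = row_bytes * (size_t)ne1;
    /* Slices are ordered with i02 varying fastest, then i03. */
    return ((size_t)i03 * (size_t)ne2 + (size_t)i02) * slice_bytes;
}
```

For a q4_0-like tensor with ne0 = 64, ne1 = 4, ne2 = 3, each row is 36 bytes and each slice is 144 bytes, so slice (i02 = 2, i03 = 1) starts at byte 720. Always using offset 0 corresponds to the "only the first 2D slice" behaviour described above.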

0cc4m · 2023-10-12T19:59:32Z

Alright, thank you.

CLBlast: Fix matrix-vector multiplication

1693fcb

ggerganov approved these changes Oct 8, 2023

ggerganov requested a review from 0cc4m October 8, 2023 13:37

0cc4m merged commit 1e0e873 into ggerganov:master Oct 12, 2023
36 checks passed

joelkuiper added a commit to vortext/llama.cpp that referenced this pull request Oct 13, 2023

Merge branch 'master' of github.com:joelkuiper/llama.cpp

06ee3a2

* 'master' of github.com:joelkuiper/llama.cpp: CLBlast: Fix matrix-vector multiplication (ggerganov#3544)
