The following commands fail to generate coherent text:
LLAMA_QKK_64=1 make -j && ./main -m tmp/mnt/models/open-llama/3B-v2/ggml-model-q4_k.gguf -p "I believe the meaning of life is" -t 8 -ngl 1
LLAMA_QKK_64=1 make -j && ./main -m tmp/mnt/models/open-llama/3B-v2/ggml-model-q3_k.gguf -p "I believe the meaning of life is" -t 8 -ngl 1
It works on the CPU (Arm and x86).
It also works with the following patch:
So it seems the issue is in the kernel_mul_mat_q4_K_f32 kernel, in the QK_K == 64 branch: llama.cpp/ggml-metal.metal, lines 1576 to 1663 in a40f2b6
This might have been broken by #2615, but I haven't tested that yet.