Description
The following command fails:
```
./bin/perplexity -m ../models/llama-7b-v2/ggml-model-q4_0.gguf -f ../build/wikitext-2-raw/wiki.test.raw -ngl 0 -t 4 -b 1
llama_new_context_with_model: kv self size = 256.00 MB
llama_new_context_with_model: compute buffer total size = 1.61 MB
ggml_allocr_alloc: not enough space in the buffer (needed 88064, largest block available 32)
GGML_ASSERT: /Users/ggerganov/development/github/llama.cpp/ggml-alloc.c:174: !"not enough space in the buffer"
Abort trap: 6
```
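For context, the assert fires when the graph allocator cannot find a free block in the measured compute buffer that is large enough for a tensor it needs during evaluation. A rough sketch of the failing check (not the actual ggml-alloc.c code; the sizes are just the ones from the log above):

```c
#include <stdio.h>
#include <stdlib.h>

// Rough illustration of the failing check: the allocator looks for a free
// block that can hold the requested tensor, and aborts when even the largest
// available block is too small.
int main(void) {
    size_t needed        = 88064; // bytes requested for a tensor during the perplexity run
    size_t largest_block = 32;    // largest free block left in the 1.61 MB compute buffer

    if (largest_block < needed) {
        fprintf(stderr,
                "ggml_allocr_alloc: not enough space in the buffer "
                "(needed %zu, largest block available %zu)\n",
                needed, largest_block);
        abort(); // corresponds to the GGML_ASSERT in ggml-alloc.c
    }
    return 0;
}
```

Here the whole compute buffer was measured at only 1.61 MB, so the 88064-byte request cannot fit at all.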
I think this used to work until recently.