ggml-alloc : potential regression with n_batch == 1 #3058

Closed
@ggerganov

The following command fails:

```
./bin/perplexity -m ../models/llama-7b-v2/ggml-model-q4_0.gguf -f ../build/wikitext-2-raw/wiki.test.raw -ngl 0 -t 4 -b 1
```

```
llama_new_context_with_model: kv self size  =  256.00 MB
llama_new_context_with_model: compute buffer total size =    1.61 MB
ggml_allocr_alloc: not enough space in the buffer (needed 88064, largest block available 32)
GGML_ASSERT: /Users/ggerganov/development/github/llama.cpp/ggml-alloc.c:174: !"not enough space in the buffer"
Abort trap: 6
```

I think this used to work until recently.
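
For reading the allocator message in the log: it reports the size of the failed request and the largest contiguous free block left in the compute buffer. The following is only a toy sketch of that condition, not ggml-alloc's actual code. `ToyAllocator`, `alloc`, and `demo` are hypothetical names, and the buffer size is chosen only to mirror the numbers in the log.

```python
class ToyAllocator:
    """Toy fixed-buffer allocator (hypothetical, not ggml-alloc's implementation)."""

    def __init__(self, size):
        # Free list of (offset, size) blocks; initially one block spans the buffer.
        self.free_blocks = [(0, size)]

    def alloc(self, needed):
        # First fit: take the first free block that is large enough.
        for i, (offset, size) in enumerate(self.free_blocks):
            if size >= needed:
                if size == needed:
                    del self.free_blocks[i]
                else:
                    # Shrink the block by carving the request off its front.
                    self.free_blocks[i] = (offset + needed, size - needed)
                return offset
        # No block fits: report the request and the largest remaining block.
        largest = max((size for _, size in self.free_blocks), default=0)
        raise MemoryError(
            "not enough space in the buffer "
            f"(needed {needed}, largest block available {largest})"
        )


def demo():
    # Assumed sizes: a buffer that fits one 88064-byte request plus 32 spare bytes.
    allocator = ToyAllocator(88064 + 32)
    allocator.alloc(88064)          # first request fits
    try:
        allocator.alloc(88064)      # second request finds only 32 bytes free
    except MemoryError as err:
        return str(err)
    return "ok"
```

In this sketch the failure occurs because the buffer was sized for less than what is later requested. Whether the same mismatch explains the `-b 1` case here is not established by the log alone.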

Labels: bug
