Eval bug: ASSERT failed

### Name and Version

8133
Vulkan pre-build binary distributed for Linux x86.

### Operating systems

Linux

### GGML backends

Vulkan

### Hardware

Strix Halo

### Models

https://huggingface.co/ggml-org/gpt-oss-120b-GGUF


### Problem description & steps to reproduce

./llama-b8133_vk/llama-server -m gpt-oss-120b-mxfp4-00001-of-00003.gguf -c 1310720 --host 0.0.0.0 --parallel 10 -kvu

3 request with context length 10000 => OK
4 request with context length 10000 => crash
/home/runner/work/llama.cpp/llama.cpp/ggml/src/ggml-backend.cpp:306: GGML_ASSERT(tensor->data != NULL && "tensor not allocated") failed

(request sent by llama-benchy)

### First Bad Commit

_No response_

### Relevant log output

srv  params_from_: Chat format: GPT-OSS
srv  params_from_: Chat format: GPT-OSS
srv  params_from_: Chat format: GPT-OSS
slot get_availabl: id  1 | task -1 | selected slot by LRU, t_last = -1
slot launch_slot_: id  1 | task -1 | sampler chain: logits -> ?penalties -> ?dry -> ?top-n-sigma -> top-k -> ?typical -> top-p -> min-p -> ?xtc -> temp-ext -> dist
slot launch_slot_: id  1 | task 428 | processing task, is_child = 0
slot get_availabl: id  0 | task -1 | selected slot by LRU, t_last = -1
slot launch_slot_: id  0 | task -1 | sampler chain: logits -> ?penalties -> ?dry -> ?top-n-sigma -> top-k -> ?typical -> top-p -> min-p -> ?xtc -> temp-ext -> dist
slot launch_slot_: id  0 | task 429 | processing task, is_child = 0
slot get_availabl: id  9 | task -1 | selected slot by LRU, t_last = 19337946462
srv  get_availabl: updating prompt cache
srv   prompt_save:  - saving prompt with length 95, total state size = 6.683 MiB
/home/runner/work/llama.cpp/llama.cpp/ggml/src/ggml-backend.cpp:306: GGML_ASSERT(tensor->data != NULL && "tensor not allocated") failed
/home/user/chat/llama-b8133_vk/libggml-base.so.0(+0x1848b) [0x7f00bca5d48b]
/home/user/chat/llama-b8133_vk/libggml-base.so.0(ggml_print_backtrace+0x21f) [0x7f00bca5d8ef]
/home/user/chat/llama-b8133_vk/libggml-base.so.0(ggml_abort+0x152) [0x7f00bca5dac2]
/home/user/chat/llama-b8133_vk/libggml-base.so.0(ggml_backend_tensor_get+0x109) [0x7f00bca74d59]
/home/user/chat/llama-b8133_vk/libllama.so.0(_ZN21llama_io_write_buffer12write_tensorEPK11ggml_tensormm+0x31) [0x7f00bc0c7dd1]
/home/user/chat/llama-b8133_vk/libllama.so.0(_ZNK14llama_kv_cache16state_write_dataER16llama_io_write_iRKNS_13cell_ranges_tE+0x156) [0x7f00bc0fdae6]
/home/user/chat/llama-b8133_vk/libllama.so.0(_ZNK14llama_kv_cache11state_writeER16llama_io_write_iij+0x295) [0x7f00bc0fdfe5]
/home/user/chat/llama-b8133_vk/libllama.so.0(_ZNK19llama_kv_cache_iswa11state_writeER16llama_io_write_iij+0x2c) [0x7f00bc10fe6c]
/home/user/chat/llama-b8133_vk/libllama.so.0(_ZN13llama_context20state_seq_write_dataER16llama_io_write_iij+0x1a) [0x7f00bc0ba62a]
/home/user/chat/llama-b8133_vk/libllama.so.0(_ZN13llama_context18state_seq_get_dataEiPhmj+0x4d) [0x7f00bc0ba6fd]
./llama-b8133_vk/llama-server(+0x130d70) [0x5606a2f48d70]
./llama-b8133_vk/llama-server(+0x13eb0b) [0x5606a2f56b0b]
./llama-b8133_vk/llama-server(+0x1819ee) [0x5606a2f999ee]
./llama-b8133_vk/llama-server(+0xa177e) [0x5606a2eb977e]
/usr/lib/x86_64-linux-gnu/libc.so.6(+0x29f75) [0x7f00bba33f75]
/usr/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x87) [0x7f00bba34027]
./llama-b8133_vk/llama-server(+0xa52d5) [0x5606a2ebd2d5]

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Eval bug: ASSERT failed #19839

Name and Version

Operating systems

GGML backends

Hardware

Models

Problem description & steps to reproduce

First Bad Commit

Relevant log output

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Eval bug: ASSERT failed #19839

Description

Name and Version

Operating systems

GGML backends

Hardware

Models

Problem description & steps to reproduce

First Bad Commit

Relevant log output

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions