Bug: Can't make imatrix quant of Q4_0_X_X #9190

@bartowski1182

What happened?

When quantizing to Q4_0_4_4 (and the other Q4_0_X_X types) with an imatrix, llama-quantize aborts with: GGML_ASSERT(result == nrows * row_size) failed

Name and Version

b3615 ubuntu 22.04

What operating system are you seeing the problem on?

Linux

Relevant log output

[   1/ 543]                    token_embd.weight - [ 7168, 64000,     1,     1], type =    f16,
====== llama_model_quantize_internal: did not find weights for token_embd.weight
converting to q4_0 .. size =   875.00 MiB ->   246.09 MiB
[   2/ 543]               blk.0.attn_norm.weight - [ 7168,     1,     1,     1], type =    f32, size =    0.027 MB
[   3/ 543]                blk.0.ffn_down.weight - [20480,  7168,     1,     1], type =    f16, converting to q4_0_4x4 .. ggml/src/ggml.c:20598: GGML_ASSERT(result == nrows * row_size) failed
ggml/src/ggml.c:20598: GGML_ASSERT(result == nrows * row_size) failed
ggml/src/ggml.c:20598: GGML_ASSERT(result == nrows * row_size) failed
ggml/src/ggml.c:20598: GGML_ASSERT(result == nrows * row_size) failed
ggml/src/ggml.c:20598: GGML_ASSERT(result == nrows * row_size) failed
ggml/src/ggml.c:20598: GGML_ASSERT(result == nrows * row_size) failed
ggml/src/ggml.c:20598: GGML_ASSERT(result == nrows * row_size) failed
ggml/src/ggml.c:20598: GGML_ASSERT(result == nrows * row_size) failed
./llama-quantize(+0x4f73b)[0x5b6f3927773b]
ggml/src/ggml.c:20598: GGML_ASSERT(result == nrows * row_size) failed
ggml/src/ggml.c:20598: ggml/src/ggml.c:20598: GGML_ASSERT(result == nrows * row_size) failed
GGML_ASSERT(result == nrows * row_size) failed
./llama-quantize(+0x51357)[0x5b6f39279357]
./llama-quantize(+0x8f129)[0x5b6f392b7129]
./llama-quantize(+0xeaa28)[0x5b6f39312a28]
/lib/x86_64-linux-gnu/libstdc++.so.6(+0xdc253)[0x7cdd2e8ad253]
/lib/x86_64-linux-gnu/libc.so.6(+0x94ac3)[0x7cdd2e4ebac3]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x44)[0x7cdd2e57ca04]
Aborted (core dumped)
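
For anyone reading the assert: it compares the byte count returned by the quantizer against nrows * row_size. Below is a rough sketch of that arithmetic in Python, not code from llama.cpp. It assumes the usual Q4_0 layout (blocks of 32 weights, each a 2-byte fp16 scale plus 16 bytes of packed 4-bit values) and assumes the Q4_0_4_4 variants only interleave blocks without changing the bytes per row. The helper names are made up for illustration. It reproduces the 246.09 MiB reported for token_embd.weight and shows what size the failing ffn_down tensor would be expected to have; it does not diagnose why the returned size differs.

```python
# Assumed Q4_0 layout: 32 weights per block, stored as a 2-byte fp16 scale
# plus 16 bytes of packed 4-bit values (18 bytes per block).
QK4_0 = 32
BLOCK_BYTES = 2 + QK4_0 // 2

MIB = 1024 * 1024


def q4_0_row_size(n_per_row: int) -> int:
    """Bytes needed for one quantized row of n_per_row weights (hypothetical helper)."""
    if n_per_row % QK4_0 != 0:
        raise ValueError("row length must be a multiple of the block size")
    return n_per_row // QK4_0 * BLOCK_BYTES


def expected_bytes(n_per_row: int, nrows: int) -> int:
    """Right-hand side of the assertion: nrows * row_size."""
    return nrows * q4_0_row_size(n_per_row)


# token_embd.weight from the log: [7168, 64000], reported as 246.09 MiB.
token_embd = expected_bytes(7168, 64000)
print(f"token_embd q4_0: {token_embd / MIB:.2f} MiB")

# blk.0.ffn_down.weight from the log: [20480, 7168], the tensor that aborts.
ffn_down = expected_bytes(20480, 7168)
print(f"ffn_down expected: {ffn_down} bytes ({ffn_down / MIB:.2f} MiB)")
```

If the quantization routine returns anything other than that second number for ffn_down (for example 0 from a path that does not handle the imatrix case, which is only a guess), the assertion would fire once per worker thread, which would match the repeated lines in the log.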

Labels

bug-unconfirmed, low severity
