Bug: QWEN2 quantization GGML_ASSERT #7805

Closed
@bartowski1182

Description

What happened?

When attempting to quantize Qwen2 7B Instruct to IQ2_XS, I get the following assertion failure:

GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0

Is there anything I can provide to help debug? I'm uploading the f32 file and imatrix now so the failure can be reproduced.

I'm attempting IQ2_S now and will update if it fails in the same way.

Update: IQ2_S fails in the same way, on the same block.

Name and Version

Version b3086, Ubuntu 22.04

What operating system are you seeing the problem on?

Linux

Relevant log output

[ 327/ 339]              blk.27.attn_norm.weight - [ 3584,     1,     1,     1], type =    f32, size =    0.014 MB
[ 328/ 339]               blk.27.ffn_down.weight - [18944,  3584,     1,     1], type =    f32, converting to iq2_xs .. GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
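
For triage, a log like the one above can be condensed to the tensor being converted and the number of times each assertion fired. This is a minimal sketch, assuming the quantize output was saved as text in the format shown; `summarize_quantize_log` is a hypothetical helper name, not part of llama.cpp.

```python
import re
from collections import Counter

# Tensor progress lines, e.g.
# [ 328/ 339]   blk.27.ffn_down.weight - [18944,  3584, 1, 1], type = f32, ...
TENSOR_RE = re.compile(r"^\[\s*(\d+)/\s*(\d+)\]\s+(\S+)\s+-\s+\[")
# Assertion messages, e.g.
# GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0
# The lookahead stops a match early if two messages share one line.
ASSERT_RE = re.compile(r"GGML_ASSERT: (\S+?):(\d+): (.+?)(?=GGML_ASSERT:|$)")


def summarize_quantize_log(text):
    """Return (last tensor seen, Counter of assertion messages).

    Hypothetical helper: it assumes the log format shown in this issue.
    """
    last_tensor = None
    asserts = Counter()
    for line in text.splitlines():
        m = TENSOR_RE.match(line.strip())
        if m:
            # (index, total, tensor name) of the most recent progress line
            last_tensor = (int(m.group(1)), int(m.group(2)), m.group(3))
        for src, lineno, cond in ASSERT_RE.findall(line):
            asserts[(src, int(lineno), cond.strip())] += 1
    return last_tensor, asserts


if __name__ == "__main__":
    sample = (
        "[ 327/ 339]  blk.27.attn_norm.weight - [ 3584, 1, 1, 1], "
        "type = f32, size = 0.014 MB\n"
        "[ 328/ 339]  blk.27.ffn_down.weight - [18944, 3584, 1, 1], "
        "type = f32, converting to iq2_xs .. "
        "GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0\n"
        "GGML_ASSERT: ggml-quants.c:12083: grid_index >= 0\n"
    )
    tensor, asserts = summarize_quantize_log(sample)
    print("last tensor:", tensor)
    for (src, lineno, cond), count in asserts.items():
        print(f"{src}:{lineno}: {cond} (x{count})")
```

The repeated assertion lines most likely come from several quantization worker threads hitting the same check, which is why the sketch counts them instead of listing each one; that reading is an assumption, not something the log states.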

(PS: is this a high severity or medium/low?)

Metadata

Labels

bug-unconfirmed, high severity, stale
