Possible typo in comment #680

antirez · 2024-01-05T11:36:12Z

Hi, in the Q2_K structure there is a comment stating that each weight uses 2.5625 bits:

// 2-bit quantization
// weight is represented as x = a * q + b
// 16 blocks of 16 elements each
// Effectively 2.5625 bits per weight
typedef struct {
    uint8_t scales[QK_K/16]; // scales and mins, quantized with 4 bits
    uint8_t qs[QK_K/4];      // quants
    ggml_fp16_t d;           // super-block scale for quantized scales
    ggml_fp16_t dmin;        // super-block scale for quantized mins
} block_q2_K;

But if I do the math, I obtain:

block size = 16 + 64 + 4 = 84 bytes, that is 672 bits
bits per weight = 672/256 = 2.625

Cheers
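
The arithmetic above can be checked mechanically. The following is a minimal sketch, not the project's own code: it assumes `QK_K` is 256 (16 blocks of 16 elements, as the comment describes) and uses a `uint16_t` typedef as a stand-in for `ggml_fp16_t`, which is taken here to be a 16-bit type.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define QK_K 256               // assumption: super-block of 16 blocks x 16 weights
typedef uint16_t ggml_fp16_t;  // assumption: stand-in for a 16-bit half-float type

typedef struct {
    uint8_t scales[QK_K/16]; // scales and mins, quantized with 4 bits
    uint8_t qs[QK_K/4];      // quants
    ggml_fp16_t d;           // super-block scale for quantized scales
    ggml_fp16_t dmin;        // super-block scale for quantized mins
} block_q2_K;

// Fail at compile time if the compiler inserted padding into the struct.
static_assert(sizeof(block_q2_K) ==
              QK_K/16 + QK_K/4 + 2*sizeof(ggml_fp16_t),
              "unexpected padding in block_q2_K");

// Bits of storage per weight: total bits in one block / weights per block.
double q2_k_bits_per_weight(void) {
    return (double)(sizeof(block_q2_K) * 8) / QK_K;
}
```

Under these assumptions the struct occupies 16 + 64 + 2 + 2 = 84 bytes, so `q2_k_bits_per_weight()` returns 672/256 = 2.625, matching the figure computed by hand.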

ggerganov · 2024-01-05T13:37:02Z

Yup, it's a typo - fixed

antirez · 2024-01-05T14:08:02Z

Thanks!

ggerganov added a commit that referenced this issue Jan 5, 2024

ggml : fix q2_k bpw in comments (#680)

2ece373

ggerganov closed this as completed Jan 5, 2024
