Hi, in the Q2_K structure there is a comment stating that each weight uses 2.5625 bits:
// 2-bit quantization
// weight is represented as x = a * q + b
// 16 blocks of 16 elements each
// Effectively 2.5625 bits per weight
typedef struct {
    uint8_t scales[QK_K/16]; // scales and mins, quantized with 4 bits
    uint8_t qs[QK_K/4];      // quants
    ggml_fp16_t d;           // super-block scale for quantized scales
    ggml_fp16_t dmin;        // super-block scale for quantized mins
} block_q2_K;
But if I do the math, I obtain:
block size = 16 + 64 + 4 = 84 bytes, that is 672 bits
bits per weight = 672/256 = 2.625
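The same arithmetic can be checked by the compiler. The following is only a minimal sketch under stated assumptions: `QK_K` is taken to be 256 (16 blocks of 16 elements), `ggml_fp16_t` is replaced by a 2-byte stand-in type, and `bits_per_weight` is a hypothetical helper that is not part of the library.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

#define QK_K 256               /* assumption: 16 blocks of 16 weights per super-block */
typedef uint16_t ggml_fp16_t;  /* stand-in: any 2-byte type gives the same struct size */

typedef struct {
    uint8_t scales[QK_K/16];   /* 16 bytes: scales and mins, 4 bits each */
    uint8_t qs[QK_K/4];        /* 64 bytes: 2-bit quants */
    ggml_fp16_t d;             /*  2 bytes: super-block scale for quantized scales */
    ggml_fp16_t dmin;          /*  2 bytes: super-block scale for quantized mins */
} block_q2_K;

/* Fails to compile if the compiler inserted padding into the struct. */
static_assert(sizeof(block_q2_K) == QK_K/16 + QK_K/4 + 2*sizeof(ggml_fp16_t),
              "unexpected padding in block_q2_K");

/* Hypothetical helper: storage bits spent per weight for a block of block_bytes bytes. */
double bits_per_weight(size_t block_bytes, size_t n_weights) {
    return 8.0 * (double)block_bytes / (double)n_weights;
}
```

With these assumptions, `bits_per_weight(sizeof(block_q2_K), QK_K)` evaluates to 2.625. For comparison, 2.5625 bits per weight would correspond to an 82-byte block, which is 2 bytes less than the layout shown.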
Cheers