Possible typo in comment #680

Closed
antirez opened this issue Jan 5, 2024 · 2 comments

Comments

@antirez

antirez commented Jan 5, 2024

Hi, in the Q2_K structure there is a comment stating that each weight uses 2.5625 bits:

// 2-bit quantization
// weight is represented as x = a * q + b
// 16 blocks of 16 elements each
// Effectively 2.5625 bits per weight
typedef struct {
    uint8_t scales[QK_K/16]; // scales and mins, quantized with 4 bits
    uint8_t qs[QK_K/4];      // quants
    ggml_fp16_t d;           // super-block scale for quantized scales
    ggml_fp16_t dmin;        // super-block scale for quantized mins
} block_q2_K;

But if I do the math, I obtain:

block size = 16 + 64 + 4 = 84 bytes, that is 672 bits
bits per weight = 672/256 = 2.625
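The arithmetic above can be checked mechanically with a minimal sketch. It assumes `QK_K` is 256 and that `ggml_fp16_t` is a 16-bit integer typedef; both are restated locally here as assumptions rather than taken from the real headers, and `q2_k_bits_per_weight` is a hypothetical helper name, not part of the library.

```c
#include <assert.h>
#include <stdint.h>

/* Assumptions for this sketch (not pulled from the real headers): */
#define QK_K 256                 /* weights per super-block */
typedef uint16_t ggml_fp16_t;    /* half-precision value stored in 16 bits */

/* Same field layout as the struct quoted in this issue. */
typedef struct {
    uint8_t scales[QK_K/16]; /* 16 bytes: scales and mins, 4 bits each */
    uint8_t qs[QK_K/4];      /* 64 bytes: 2-bit quants */
    ggml_fp16_t d;           /*  2 bytes: super-block scale for scales */
    ggml_fp16_t dmin;        /*  2 bytes: super-block scale for mins */
} block_q2_K;

/* Compile-time check that the layout really is 16 + 64 + 2 + 2 bytes. */
static_assert(sizeof(block_q2_K) == 84, "unexpected block_q2_K size");

/* Hypothetical helper: storage bits spent per weight in one super-block. */
double q2_k_bits_per_weight(void) {
    return (double)(sizeof(block_q2_K) * 8) / QK_K;
}
```

Under these assumptions the helper returns 672/256 = 2.625. For comparison, 2.5625 bits per weight would correspond to 656 bits, i.e. 82 bytes, which is 2 bytes short of the struct size.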

Cheers

@ggerganov
Owner

Yup, it's a typo - fixed

@antirez
Author

antirez commented Jan 5, 2024

Thanks!
