b3010

github-actions released this 25 May 12:15
Implement Q8_0 quantization fully in PyTorch.

This is equivalent to gguf.quantize_q8_0 but doesn't round-trip through
NumPy.
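
For readers unfamiliar with the format, here is a minimal sketch of Q8_0 block quantization written only with PyTorch ops. It is not the code from this release. The helper names `quantize_q8_0_sketch` and `dequantize_q8_0_sketch` are hypothetical, and the sketch assumes the usual Q8_0 scheme: blocks of 32 values, one float16 scale per block equal to max(|x|) / 127, and int8 quants equal to round(x / scale). It returns the scales and quants as separate tensors instead of packing them into the on-disk block layout.

```python
import torch

QK8_0 = 32  # number of values per Q8_0 block


def quantize_q8_0_sketch(x: torch.Tensor):
    """Hypothetical helper: quantize a tensor into Q8_0-style (scales, quants).

    Returns a float16 tensor of shape (n_blocks, 1) and an int8 tensor of
    shape (n_blocks, 32). The tensor never leaves PyTorch.
    """
    if x.numel() % QK8_0 != 0:
        raise ValueError("number of elements must be a multiple of 32")

    blocks = x.to(torch.float32).reshape(-1, QK8_0)

    # One scale per block, chosen so the largest magnitude maps to +/-127.
    amax = blocks.abs().amax(dim=1, keepdim=True)
    d = amax / 127.0

    # Avoid dividing by zero for all-zero blocks.
    inv_d = torch.where(d == 0, torch.zeros_like(d), 1.0 / d)

    # torch.round rounds halves to even; other implementations may round
    # halves differently, so exact ties can differ by one step.
    q = torch.round(blocks * inv_d).to(torch.int8)

    return d.to(torch.float16), q


def dequantize_q8_0_sketch(d: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    """Hypothetical helper: reconstruct approximate float32 values."""
    return d.to(torch.float32) * q.to(torch.float32)
```

A quick usage example: `d, q = quantize_q8_0_sketch(torch.randn(4, 64))` yields 8 blocks, and `dequantize_q8_0_sketch(d, q).reshape(4, 64)` recovers the input to within roughly half a quantization step per element.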