b3010

Latest

github-actions released this 25 May 12:15

739648f

Implement Q8_0 quantization fully in PyTorch.

This is equivalent to gguf.quantize_q8_0 but doesn't round-trip through
NumPy.
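
For readers unfamiliar with the format, the following is a minimal sketch of Q8_0-style block quantization in PyTorch, not the code from this commit. It assumes the usual Q8_0 layout of 32-element blocks, each with one fp16 scale and 32 int8 values. The function names, the QK8_0 constant name, and the choice to return scales and quants as two separate tensors (rather than packed bytes) are illustrative assumptions.

```python
import torch

QK8_0 = 32  # elements per Q8_0 block (assumed block size)


def quantize_q8_0_sketch(x: torch.Tensor):
    """Hypothetical helper: returns (scales, quants).

    scales: float16 tensor of shape [n_blocks, 1]
    quants: int8 tensor of shape [n_blocks, 32]
    """
    if x.numel() % QK8_0 != 0:
        raise ValueError("number of elements must be a multiple of 32")

    blocks = x.to(torch.float32).reshape(-1, QK8_0)

    # One scale per block: the largest magnitude maps to +/-127.
    d = blocks.abs().amax(dim=1, keepdim=True) / 127.0

    # Guard against all-zero blocks, where 1/d would be infinite.
    inv_d = torch.where(d == 0, torch.zeros_like(d), 1.0 / d)

    # Note: torch.round rounds halves to even; a reference implementation
    # may break ties differently, so results can differ on exact ties.
    qs = torch.round(blocks * inv_d).clamp(-127, 127).to(torch.int8)

    return d.to(torch.float16), qs


def dequantize_q8_0_sketch(d: torch.Tensor, qs: torch.Tensor) -> torch.Tensor:
    """Hypothetical inverse: reconstruct float32 values block by block."""
    return d.to(torch.float32) * qs.to(torch.float32)


if __name__ == "__main__":
    x = torch.linspace(-1.0, 1.0, 64)
    d, qs = quantize_q8_0_sketch(x)
    x_hat = dequantize_q8_0_sketch(d, qs).reshape(-1)
    print(d.shape, qs.shape, (x - x_hat).abs().max().item())
```

Keeping every step as a tensor operation means the data can stay on whatever device the tensor already lives on, which is the point of avoiding a NumPy round trip.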

Assets 21
