GPTQModel v1.0.7

Qubitium released this 08 Oct 14:19

What's Changed

Fixed two bugs: the faster marlin kernel was not auto-selected for some eligible models, and saving autoround-quantized models threw JSON errors.

[FIX] marlin_inference_linear not correctly auto selected for eligible models by @ZX-ModelCloud in #413
[FIX] remove "scale" and "zp" Tensor from layer_config by @ZX-ModelCloud in #414
[FIX] Failed unit test by @ZX-ModelCloud in #420
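
The JSON save error most likely comes from tensor objects, which Python's `json` module cannot serialize. The sketch below illustrates that failure mode and the kind of fix the #414 title describes. It is not the actual GPTQModel code: the `layer_config` shape, the layer name, the `FakeTensor` stand-in, and the `strip_tensor_fields` helper are all assumptions made for illustration.

```python
import json


class FakeTensor:
    """Hypothetical stand-in for a tensor; like a real tensor, json cannot serialize it."""

    def __init__(self, values):
        self.values = values


def strip_tensor_fields(layer_config, keys=("scale", "zp")):
    """Return a copy of layer_config with the named per-layer keys removed.

    Assumes layer_config maps layer names to per-layer dicts (illustrative shape).
    """
    return {
        name: {k: v for k, v in cfg.items() if k not in keys}
        for name, cfg in layer_config.items()
    }


# Illustrative config: two plain values plus two tensor-valued entries.
layer_config = {
    "model.layers.0.self_attn.q_proj": {
        "bits": 4,
        "group_size": 128,
        "scale": FakeTensor([0.1, 0.2]),
        "zp": FakeTensor([8, 8]),
    }
}

# Serializing the raw config fails because of the tensor-like values.
try:
    json.dumps(layer_config)
    raw_serializable = True
except TypeError:
    raw_serializable = False

# After dropping "scale" and "zp", the config serializes cleanly.
cleaned = strip_tensor_fields(layer_config)
serialized = json.dumps(cleaned, sort_keys=True)
```

The helper returns a new dict rather than mutating its input, so any code that still needs the tensors at quantization time can keep using the original config.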

Full Changelog: v1.0.6...v1.0.7

Contributors

ZX-ModelCloud
