GPTQModel v0.9.6
What's Changed
Intel/AutoRound QUANT_METHOD support added for potentially higher-quality quantization, along with lm_head module quantization support for even more VRAM reduction. Models export to FORMAT.GPTQ for maximum inference compatibility.
- 🚀 [CORE] Add AutoRound as Quantizer option by @LRL-ModelCloud in #166
- 👾 [FIX] [CI] Update test by @CSY-ModelCloud in #177
- 👾 Cleanup Triton by @Qubitium in #178
Full Changelog: v0.9.5...v0.9.6