GPTQModel v0.9.6

@Qubitium Qubitium released this 08 Jul 02:59
· 276 commits to main since this release
4fade4c

What's Changed

Added Intel/AutoRound as a supported QUANT_METHOD, which can potentially produce higher-quality quantization. AutoRound also supports quantizing the lm_head module for further VRAM reduction. Quantized models are exported to FORMAT.GPTQ for maximum inference compatibility.
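
To give a rough sense of why lm_head quantization matters for VRAM, here is a back-of-the-envelope sketch. It does not use the GPTQModel API. The helper `lm_head_bytes`, the model dimensions (a 128256-token vocabulary and a hidden size of 4096), and the per-group overhead model (one fp16 scale plus one packed zero point per group of 128 weights) are all illustrative assumptions, not values taken from this release.

```python
def lm_head_bytes(vocab_size, hidden_size, bits, group_size=None):
    """Approximate storage for an lm_head weight of shape (vocab_size, hidden_size).

    Hypothetical helper for illustration only; not part of GPTQModel.
    """
    params = vocab_size * hidden_size
    weight_bytes = params * bits / 8
    if group_size is None:
        # Unquantized case: no per-group metadata.
        return weight_bytes
    # Assumed overhead: one fp16 scale (2 bytes) and one packed
    # zero point (bits / 8 bytes) for every group of weights.
    groups = params / group_size
    return weight_bytes + groups * (2 + bits / 8)


# Assumed, illustrative model dimensions.
vocab_size, hidden_size = 128256, 4096

fp16 = lm_head_bytes(vocab_size, hidden_size, bits=16)
q4 = lm_head_bytes(vocab_size, hidden_size, bits=4, group_size=128)
saved_gib = (fp16 - q4) / 2**30

print(f"fp16 lm_head:  {fp16 / 2**30:.2f} GiB")
print(f"4-bit lm_head: {q4 / 2**30:.2f} GiB")
print(f"saved:         {saved_gib:.2f} GiB")
```

Under these assumptions the saving is on the order of several hundred MiB, and it grows with vocabulary size. Actual figures depend on the model and on the real packing layout.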

Full Changelog: v0.9.5...v0.9.6