GPTQModel v0.9.9

@Qubitium Qubitium released this 24 Jul 16:42
· 182 commits to main since this release
519fbe3

What's Changed

Added Llama-3.1 support, Gemma2 27B quantized inference support via vLLM, and automatic pad_token normalization. Fixed auto-round quantization compatibility with vLLM/SGLang.

Full Changelog: v0.9.8...v0.9.9