GPTQModel v0.9.5
What's Changed
Another large update, adding support for Intel QBits quantization/inference on CPU. CUDA kernels have been fully deprecated in favor of the better-performing Exllama (v1/v2), Marlin, and Triton kernels.
- 🚀🚀 [KERNEL] Added Intel QBits support with [2, 3, 4, 8] bits quantization/inference on CPU by @CSY-ModelCloud in #137
- ✨ [CORE] BaseQuantLinear add SUPPORTED_DEVICES by @ZX-ModelCloud in #174
- ✨ [DEPRECATION] Remove Backend.CUDA and Backend.CUDA_OLD by @ZX-ModelCloud in #165
- 👾 [CI] FIX test perplexity by @ZYC-ModelCloud in #160
Full Changelog: v0.9.4...v0.9.5