diff --git a/README.md b/README.md
index 1652b7cc..65af7067 100644
--- a/README.md
+++ b/README.md
@@ -45,6 +45,8 @@ We will backport bug fixes to AutoGPTQ on a case-by-case basis.
 * 🚀 Model weights sharding support
 * 🚀 Security: hash check of model weights on load
 * ✨ Alert users of sub-optimal calibration data. Most new users get this part horribly wrong.
+* ✨ Increased compatibility with the newest models via auto-padding of in/out-features for [ Exllama, Exllama V2, Marlin ] backends.
+* 👾 Fixed OPT quantization. Original OPT model code resulted in unusable quantized models.
 * 👾 Removed non-working, partially working, or fully deprecated features: Peft, ROCM, AWQ Gemm inference, Triton v1 (replaced by v2), Fused Attention (Replaced by Marlin/Exllama).
 * 👾 Fixed packing Performance regression on high core-count systems. Backported to AutoGPTQ
 * 👾 Fixed crash on H100. Backported to AutoGPTQ
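
For context, the auto-padding mentioned in the first added bullet amounts to rounding a linear layer's in/out feature counts up to the alignment each kernel backend requires. The snippet below is a minimal sketch of that idea only; the per-backend multiples and the helper names (`BACKEND_MULTIPLES`, `pad_to_multiple`, `pad_linear_weight`) are illustrative assumptions, not the actual GPTQModel/AutoGPTQ implementation.

```python
# Minimal sketch of feature auto-padding for aligned quantized kernels.
# The per-backend multiples below are assumptions for illustration,
# not values taken from the GPTQModel source.

import torch
import torch.nn.functional as F

BACKEND_MULTIPLES = {          # hypothetical (in_features, out_features) alignment
    "exllama": (32, 32),
    "exllama_v2": (32, 32),
    "marlin": (128, 256),
}

def pad_to_multiple(x: int, multiple: int) -> int:
    """Round x up to the nearest multiple (no-op if already aligned)."""
    return ((x + multiple - 1) // multiple) * multiple

def pad_linear_weight(weight: torch.Tensor, backend: str) -> torch.Tensor:
    """Zero-pad a (out_features, in_features) weight so both dims meet the
    backend's alignment. Zero rows/columns leave the layer's outputs for the
    original feature range unchanged."""
    in_mult, out_mult = BACKEND_MULTIPLES[backend]
    out_f, in_f = weight.shape
    pad_in = pad_to_multiple(in_f, in_mult) - in_f
    pad_out = pad_to_multiple(out_f, out_mult) - out_f
    # F.pad pads the last dimension first: (in left, in right, out top, out bottom)
    return F.pad(weight, (0, pad_in, 0, pad_out))

# Example: an out_features of 50257 would be padded up to the next multiple
# of 256 before handing the layer to a Marlin-style kernel.
```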