1 parent c8ba661 commit 38658b1
CHANGELOG.md
@@ -1,4 +1,14 @@
# CHANGELOG
+# [Version v1.7.1](https://github.com/intel/xFasterTransformer/releases/tag/v1.7.1)
+v1.7.1 - Continuous batching feature supports ChatGLM2/3.
+
+## Functionality
+- Add continuous batching support for ChatGLM2/3 models.
+- Qwen2Convert supports Qwen2 models quantized by GPTQ, such as GPTQ-Int8 and GPTQ-Int4, via the param `from_quantized_model="gptq"`.
+
+## BUG fix
+- Fixed the segmentation fault error when running with more than 2 ranks in vllm-xft serving.
+
# [Version v1.7.0](https://github.com/intel/xFasterTransformer/releases/tag/v1.7.0)
v1.7.0 - Continuous batching feature supported.
VERSION
@@ -1 +1 @@
-1.7.0
+1.7.1
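For reference, the `from_quantized_model` param called out in the changelog above would be used roughly as follows. This is a minimal sketch, assuming `Qwen2Convert` exposes the same `convert()` entry point as xFasterTransformer's other `*Convert` classes; the placeholder paths and the positional input/output arguments are assumptions, not taken from this commit.

```python
# Minimal sketch (assumptions noted above): convert a GPTQ-quantized
# Qwen2 checkpoint into xFasterTransformer's weight format.
import xfastertransformer as xft

xft.Qwen2Convert().convert(
    "/path/to/Qwen2-7B-GPTQ-Int4",  # placeholder: HF GPTQ checkpoint dir
    "/path/to/Qwen2-7B-xft",        # placeholder: output dir for converted weights
    from_quantized_model="gptq",    # param named in this release's changelog
)
```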