Commit

[Version] v1.7.1. (#445)
Duyi-Wang authored Jun 12, 2024
1 parent c8ba661 commit 38658b1
Showing 2 changed files with 11 additions and 1 deletion.
10 changes: 10 additions & 0 deletions CHANGELOG.md
@@ -1,4 +1,14 @@
# CHANGELOG
# [Version v1.7.1](https://github.com/intel/xFasterTransformer/releases/tag/v1.7.1)
v1.7.1 - Continuous batching feature supports ChatGLM2/3.

## Functionality
- Add continuous batching support of ChatGLM2/3 models.
- Qwen2Convert supports Qwen2 models quantized with GPTQ, such as GPTQ-Int8 and GPTQ-Int4, via the parameter `from_quantized_model="gptq"`.
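A minimal sketch of how the new `from_quantized_model="gptq"` parameter might be passed to `Qwen2Convert`. Only the parameter name and value come from this release note; the import path, method signature, and helper function below are assumptions for illustration, not the project's documented API.

```python
def build_convert_kwargs(quantization=None):
    """Assemble keyword arguments for a Qwen2Convert.convert() call.

    Per the v1.7.1 release note, GPTQ-quantized checkpoints
    (GPTQ-Int8 / GPTQ-Int4) need from_quantized_model="gptq".
    """
    kwargs = {}
    if quantization == "gptq":
        kwargs["from_quantized_model"] = "gptq"
    return kwargs


def convert_qwen2(input_dir, output_dir, quantization=None):
    """Convert a Hugging Face Qwen2 checkpoint to xFT format (sketch).

    The import path and convert() signature are assumptions; consult the
    xFasterTransformer tools for the real entry point.
    """
    from xfastertransformer.tools import Qwen2Convert  # assumed import path
    Qwen2Convert().convert(input_dir, output_dir,
                           **build_convert_kwargs(quantization))
```

For a non-quantized checkpoint the helper returns no extra arguments, so the same call path handles both cases.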

## BUG fix
- Fixed a segmentation fault when running with more than 2 ranks in vllm-xft serving.

# [Version v1.7.0](https://github.com/intel/xFasterTransformer/releases/tag/v1.7.0)
v1.7.0 - Continuous batching feature supported.

2 changes: 1 addition & 1 deletion VERSION
@@ -1 +1 @@
1.7.0
1.7.1
