v1.7.2 - Continuous batching feature supports Qwen 1.0 & hybrid data types.

@Duyi-Wang Duyi-Wang released this 18 Jun 05:07
· 43 commits to main since this release
da2a7fa

Functionality

  • Add continuous batching support for Qwen 1.0 models.
  • Enable hybrid data types for the continuous batching feature, including BF16_FP16, BF16_INT8, BF16_W8A8, BF16_INT4, BF16_NF4, W8A8_INT8, W8A8_INT4, W8A8_NF4.

Bug fix

  • Fixed the conversion fault in Baichuan1 models.

What's Changed

Generated release notes

Full Changelog: v1.7.1...v1.7.2