Rebuild docs for speed benchmark (#1045)
* add qwen2.5 perf report

* update readme

* rebuild docs and fix format issue

* remove fuzzy in speed_benchmark.po

* fix issue

* recover function_call.po

* update

* remove unused code in speed_benchmark.po
wangxingjun778 authored Nov 6, 2024
1 parent 0f0ecfb commit a912d23
Showing 3 changed files with 2,534 additions and 1,523 deletions.
6 changes: 3 additions & 3 deletions README.md
@@ -22,7 +22,7 @@ To learn more about Qwen2.5, feel free to read our documentation \[[EN](https://
- Quantization: the practice of quantizing LLMs with GPTQ, AWQ, as well as the guidance for how to make high-quality quantized GGUF files;
- Training: the instructions for post-training, including SFT and RLHF (TODO) with frameworks like Axolotl, LLaMA-Factory, etc.
- Framework: the usage of Qwen with frameworks for application, e.g., RAG, Agent, etc.
- Benchmark: the statistics about inference speed and memory footprint (to be updated for Qwen2.5).
- Benchmark: the statistics about inference speed and memory footprint (available for Qwen2.5).

## Introduction

@@ -37,7 +37,7 @@ In the past three months since Qwen2's release, numerous developers have built n

## News

- 2024.09.19: We released the Qwen2.5 series. This time there are 3 extra model sizes: 3B, 14B, and 32B for more possibilities. Check our [blog](https://qwenlm.github.io/blog/qwen2.5) for more!
- 2024.06.06: We released the Qwen2 series. Check our [blog](https://qwenlm.github.io/blog/qwen2/)!
- 2024.03.28: We released the first MoE model of Qwen: Qwen1.5-MoE-A2.7B! Temporarily, only HF transformers and vLLM support the model. We will soon add the support of llama.cpp, mlx-lm, etc. Check our [blog](https://qwenlm.github.io/blog/qwen-moe/) for more information!
- 2024.02.05: We released the Qwen1.5 series.
@@ -46,7 +46,7 @@ In the past three months since Qwen2's release, numerous developers have built n

Detailed evaluation results are reported in this <a href="https://qwenlm.github.io/blog/qwen2.5/"> 📑 blog</a>.

For requirements on GPU memory and the respective throughput, see results [here](https://qwen.readthedocs.io/en/latest/benchmark/speed_benchmark.html) (to be updated for Qwen2.5).
For requirements on GPU memory and the respective throughput, see results [here](https://qwen.readthedocs.io/en/latest/benchmark/speed_benchmark.html).

## Quickstart

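The commit above ships a speed benchmark report for Qwen2.5, covering inference throughput and GPU memory. As a rough illustration of what such a benchmark measures, here is a minimal, hypothetical sketch of computing decode throughput in tokens per second; `generate_fn` and `num_new_tokens` are illustrative placeholders for any backend call (for example, a HF `transformers` `model.generate` invocation), not the actual benchmark script used in the docs:

```python
import time


def measure_throughput(generate_fn, num_new_tokens: int) -> float:
    """Return decoded tokens per second for a single generation call.

    `generate_fn` is a hypothetical zero-argument callable that performs
    the generation; `num_new_tokens` is the number of tokens it produces.
    This is a sketch of the metric, not the Qwen benchmark itself.
    """
    start = time.perf_counter()
    generate_fn()  # run the (placeholder) generation step
    elapsed = time.perf_counter() - start
    return num_new_tokens / elapsed


# Example with a dummy "generation" that just sleeps for 10 ms:
tps = measure_throughput(lambda: time.sleep(0.01), 100)
```

Real benchmarks typically repeat the call, discard warm-up runs, and report memory alongside throughput; this sketch only shows the core tokens-per-second arithmetic.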
