quant_horizon

quant_horizon is a benchmarking framework designed to evaluate the performance of different GPU kernels.

Prerequisites

To run the benchmark, you need to have the following installed:

PyTorch (with CUDA support)
CUDA Toolkit
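
Before running anything, it can help to confirm that both prerequisites are visible from Python. The sketch below is not part of quant_horizon; `check_prerequisites` is a hypothetical helper name, and looking for `nvcc` on the PATH is only an assumed proxy for "CUDA Toolkit is installed".

```python
import importlib.util
import shutil


def check_prerequisites():
    """Report whether PyTorch, a CUDA device, and nvcc appear to be present.

    Hypothetical helper for illustration; not part of quant_horizon.
    """
    report = {
        "torch_installed": False,
        "torch_version": None,
        "torch_cuda_build": None,  # CUDA version PyTorch was built against
        "cuda_available": False,   # True only if a usable GPU is detected
        "nvcc_path": shutil.which("nvcc"),  # assumed proxy for the CUDA Toolkit
    }
    # Only import torch if it is installed, so the check never raises ImportError.
    if importlib.util.find_spec("torch") is not None:
        import torch

        report["torch_installed"] = True
        report["torch_version"] = torch.__version__
        report["torch_cuda_build"] = torch.version.cuda
        report["cuda_available"] = bool(torch.cuda.is_available())
    return report


if __name__ == "__main__":
    for key, value in check_prerequisites().items():
        print(f"{key}: {value}")
```

If `torch_cuda_build` is `None` while `torch_installed` is `True`, the installed PyTorch is likely a CPU-only build.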

We also provide basic Docker images:

# docker-hub python3.11 torch2.5.1 cuda124
docker pull llmcompression/llmc:pure-24112502-cu124
# docker-hub python3.11 torch2.5.1 cuda121
docker pull llmcompression/llmc:pure-24112502-cu121
# aliyun-hub python3.11 torch2.5.1 cuda124
docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-24112502-cu124
# aliyun-hub python3.11 torch2.5.1 cuda121
docker pull registry.cn-hangzhou.aliyuncs.com/yongyang/llmcompression:pure-24112502-cu121

# Then create a container
docker run --gpus all -itd --ipc=host --name [name] -v [path]:[path] --entrypoint /bin/bash [image_id]

Install quant_horizon and its dependencies in editable mode from the repository root:

cd quant_horizon
pip install -v -e .

Usage

Benchmark a single shape

cd examples
python bench_single_shape.py

Benchmark all shapes in the transformer model

cd examples
# Only the model's config.json needs to be present in the model_path folder.
python bench_model_shape.py --model [model_path] --tp 1 --bs 1 --seqlen 2048
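
To give a sense of how a config.json together with `--tp`, `--bs`, and `--seqlen` could determine the matrix shapes to benchmark, here is a minimal sketch. It is not the logic of `bench_model_shape.py`; it assumes a Llama-style config with the fields `hidden_size`, `intermediate_size`, `num_attention_heads`, and optionally `num_key_value_heads`, and the names `load_config` and `linear_shapes` are hypothetical.

```python
import json
from pathlib import Path


def load_config(model_path):
    """Read config.json from the model folder (only this file is needed)."""
    with open(Path(model_path) / "config.json", "r", encoding="utf-8") as f:
        return json.load(f)


def linear_shapes(config, tp=1, bs=1, seqlen=2048):
    """Return assumed per-GPU GEMM shapes (M, N, K) for one transformer layer.

    Assumption: Llama-style layer with fused QKV and fused gate/up projections,
    column-parallel for qkv/gate_up and row-parallel for o/down.
    """
    hidden = config["hidden_size"]
    inter = config["intermediate_size"]
    heads = config["num_attention_heads"]
    kv_heads = config.get("num_key_value_heads", heads)
    head_dim = hidden // heads

    m = bs * seqlen  # number of tokens processed at once
    qkv_out = (heads + 2 * kv_heads) * head_dim
    return {
        "qkv_proj": (m, qkv_out // tp, hidden),
        "o_proj": (m, hidden, heads * head_dim // tp),
        "gate_up_proj": (m, 2 * inter // tp, hidden),
        "down_proj": (m, hidden, inter // tp),
    }


if __name__ == "__main__":
    # Example values matching a 7B Llama-style model; used here for illustration.
    example = {
        "hidden_size": 4096,
        "intermediate_size": 11008,
        "num_attention_heads": 32,
        "num_key_value_heads": 32,
    }
    for name, shape in linear_shapes(example, tp=1, bs=1, seqlen=2048).items():
        print(name, shape)
```

Under these assumptions, increasing `--tp` shrinks the N dimension of the column-parallel layers and the K dimension of the row-parallel layers, while `--bs` and `--seqlen` only affect M.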
