[EMNLP 2024 Industry Track] This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".
benchmark
deployment
tool
evaluation
pruning
quantization
post-training-quantization
awq
large-language-models
llm
vllm
smoothquant
mixtral
internlm2
lvlm
llama3
omniquant
quarot
lightllm
spinquant
Updated Nov 18, 2024 - Python