221118 SmoothQuant.md

https://arxiv.org/abs/2211.10438

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models (Guangxuan Xiao, Ji Lin, Mickael Seznec, Julien Demouth, Song Han)

Quantization for LLMs. Unlike LLM.int8(), it reportedly preserves model accuracy while also improving inference speed.
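The note above does not describe the mechanism, so here is a rough sketch of the smoothing idea as I understand it from the paper: per-input-channel scales move activation outliers into the weights, leaving the layer output mathematically unchanged while making activations easier to quantize. The function names (`matmul`, `smooth_scales`, `apply_smoothing`), the toy matrices, and the `eps` guard are my own illustrative assumptions, not the authors' code; `alpha=0.5` is the migration strength I recall the paper using as a default.

```python
# Toy sketch of SmoothQuant-style smoothing in pure Python.
# All names and values here are illustrative assumptions, not the paper's code.

def matmul(a, b):
    """Plain nested-list matrix multiply: (m x k) @ (k x n)."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]

def smooth_scales(x, w, alpha=0.5, eps=1e-5):
    """Per-input-channel scale s_j = max|X_j|^alpha / max|W_j|^(1 - alpha).

    x: tokens x in_channels, w: in_channels x out_channels.
    eps (an assumption) guards against all-zero channels.
    """
    scales = []
    for j in range(len(w)):
        act_max = max(max(abs(row[j]) for row in x), eps)
        w_max = max(max(abs(v) for v in w[j]), eps)
        scales.append(act_max ** alpha / w_max ** (1 - alpha))
    return scales

def apply_smoothing(x, w, scales):
    """Divide activation channel j by s_j and multiply weight row j by s_j."""
    x_s = [[v / scales[j] for j, v in enumerate(row)] for row in x]
    w_s = [[v * scales[j] for v in w[j]] for j in range(len(w))]
    return x_s, w_s

# Channel 1 of the activations carries a large outlier.
x = [[0.1, 20.0], [-0.2, -30.0]]
w = [[0.5, -0.4], [0.3, 0.2]]

s = smooth_scales(x, w)
x_s, w_s = apply_smoothing(x, w, s)

y_ref = matmul(x, w)      # original layer output
y_s = matmul(x_s, w_s)    # output after smoothing, equal up to rounding
```

After smoothing, the outlier channel of `x_s` has a much smaller range than in `x`, while `y_s` matches `y_ref` up to floating-point rounding. The actual INT8 quantization of `x_s` and `w_s` is left out of this sketch.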

#llm #quantization