Pruning and Quantization Recipe for Small Language Models
Official codebase for a one-shot depth pruning + GPTQ post-training quantization (PTQ) framework for small language models (SLMs)
- We quantitatively report the results of applying one-shot depth pruning and post-training quantization to SLMs (a minimal sketch of the pipeline is shown after this list).
- We plan to expand this repository with results for additional models and compression techniques.
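
Below is a minimal sketch of the kind of pipeline this repository covers, not the repository's own scripts: one-shot depth pruning of a Llama-style SLM by dropping a block of decoder layers, followed by GPTQ post-training quantization through Hugging Face `transformers` (with `optimum` and `auto-gptq` installed). The model name, dropped layer indices, and calibration dataset are illustrative assumptions.

```python
# Sketch only: one-shot depth pruning + GPTQ PTQ for a small language model.
import torch
from torch import nn
from transformers import AutoModelForCausalLM, AutoTokenizer, GPTQConfig

model_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"  # hypothetical target SLM
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float16)

# One-shot depth pruning: remove a contiguous block of decoder layers in a
# single pass, with no retraining. The indices here are placeholders; a real
# recipe would rank layers by an importance score before choosing which to drop.
drop = set(range(18, 22))
kept = nn.ModuleList(
    layer for i, layer in enumerate(model.model.layers) if i not in drop
)
model.model.layers = kept
model.config.num_hidden_layers = len(kept)

# Save the pruned model so it can be reloaded with a consistent config.
model.save_pretrained("pruned_slm")
tokenizer.save_pretrained("pruned_slm")

# Post-training quantization with GPTQ (4-bit), using a built-in calibration
# dataset; transformers runs the GPTQ algorithm while loading the checkpoint.
gptq_config = GPTQConfig(bits=4, dataset="c4", tokenizer=tokenizer)
quantized = AutoModelForCausalLM.from_pretrained(
    "pruned_slm", device_map="auto", quantization_config=gptq_config
)
quantized.save_pretrained("pruned_slm_gptq")
```

Pruning and quantization are kept as two separate steps on purpose: the pruned checkpoint is written to disk first so that the GPTQ calibration pass runs against a model whose config matches its remaining layers.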