How to Apply Different Quantization Settings Per Layer in ExecuTorch? #6846
Labels
module: quantization
triaged
This issue has been looked at by a team member, and triaged and prioritized into an appropriate module
Dear @kimishpatel @jerryzh168 @shewu-quic
I want to split a model (e.g., Llama-3.2-3B) into multiple layers and apply different quantization settings (qnn_8a8w, qnn_16a4w, ...) to each layer.
Has such a method been tested in ExecuTorch?
If not, could you suggest how this can be achieved?
Thank you
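For context, here is a minimal sketch of the general "different quantization settings per module" idea using PyTorch's FX graph mode quantization and `QConfigMapping.set_module_name`. This is an illustration only, not the ExecuTorch QNN flow; the PT2E quantizers used by ExecuTorch backends (e.g., `XNNPACKQuantizer`) expose a similar per-module-name hook, and whether the QNN quantizer supports mixing qnn_8a8w/qnn_16a4w this way is exactly what this issue asks. The model and module names below are made up for the example:

```python
import torch
from torch.ao.quantization import QConfigMapping, get_default_qconfig
from torch.ao.quantization.quantize_fx import prepare_fx, convert_fx

# Toy stand-in for a multi-layer model; real Llama layers would be
# addressed by their qualified module names instead.
class TwoLayer(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = torch.nn.Linear(8, 8)
        self.fc2 = torch.nn.Linear(8, 8)

    def forward(self, x):
        return self.fc2(self.fc1(x))

model = TwoLayer().eval()
example_inputs = (torch.randn(1, 8),)

# Per-module configuration: quantize everything with the global qconfig,
# but override "fc2" (here: leave it in float by mapping it to None).
qconfig_mapping = (
    QConfigMapping()
    .set_global(get_default_qconfig("fbgemm"))
    .set_module_name("fc2", None)
)

prepared = prepare_fx(model, qconfig_mapping, example_inputs)
prepared(*example_inputs)  # calibration pass
quantized = convert_fx(prepared)
```

After `convert_fx`, `fc1` becomes a quantized Linear while `fc2` stays a float `nn.Linear`, showing how settings can differ per layer; the open question is how to do the equivalent with the QNN backend's quantization configs.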