The Squeeze/Unsqueeze OPs are more like the Permute OP: there is an easy way to modify the QuantTensor so that these OPs stay affine-quantization invariant. For Squeeze/Unsqueeze, all we need to do is squeeze/unsqueeze the scale and zero-point tensors accordingly.
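A minimal sketch of that idea in plain PyTorch (the `value`/`scale`/`zero_point` triple below is just a stand-in for the fields of a QuantTensor, not the actual Brevitas API):

```python
import torch

# Stand-in for a per-channel affine QuantTensor: value is the integer
# representation, scale/zero_point broadcast against it.
value = torch.randint(-128, 128, (1, 4, 8, 8)).float()
scale = torch.rand(1, 4, 1, 1)
zero_point = torch.zeros(1, 4, 1, 1)

dim = 0  # dimension being squeezed

# Squeeze the value and apply the same squeeze to the quantization
# parameters, so dequantization yields the same numbers as before.
squeezed_value = value.squeeze(dim)
squeezed_scale = scale.squeeze(dim)
squeezed_zero_point = zero_point.squeeze(dim)

dequant_before = ((value - zero_point) * scale).squeeze(dim)
dequant_after = (squeezed_value - squeezed_zero_point) * squeezed_scale
assert torch.allclose(dequant_before, dequant_after)
```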
However, the OPs mentioned in #728 (reshape, flatten) are non-trivial: there is no straightforward way to modify the QuantTensor so that they remain affine-quantization invariant, so recalculating the scale and zero point is unavoidable. We may need to dequantize --> reshape/flatten --> requantize to work around this, at the cost of some precision loss.
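A rough sketch of that fallback (a hypothetical helper, not something Brevitas provides; int8 asymmetric per-tensor requantization is an arbitrary choice here):

```python
import torch

def flatten_quant(value, scale, zero_point, start_dim=1, end_dim=-1):
    """Dequantize -> flatten -> requantize (per-tensor, int8) as a fallback."""
    # Dequantize with the original affine parameters.
    dequant = (value - zero_point) * scale
    flat = torch.flatten(dequant, start_dim, end_dim)

    # Requantize per-tensor; per-channel parameters cannot survive a flatten
    # that mixes channels, so some precision is lost here.
    qmin, qmax = -128, 127
    new_scale = (flat.max() - flat.min()).clamp(min=1e-8) / (qmax - qmin)
    new_zero_point = (qmin - torch.round(flat.min() / new_scale)).clamp(qmin, qmax)
    new_value = torch.clamp(torch.round(flat / new_scale) + new_zero_point, qmin, qmax)
    return new_value, new_scale, new_zero_point

# Example: per-channel quantized (1, 4, 8, 8) tensor flattened to (1, 256).
value = torch.randint(-128, 128, (1, 4, 8, 8)).float()
scale = torch.rand(1, 4, 1, 1)
zero_point = torch.zeros(1, 4, 1, 1)
q, s, zp = flatten_quant(value, scale, zero_point)
```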
It looks like PyTorch doesn't solve this problem either: it doesn't offer a quantized version of flatten() and simply uses torch.flatten() (see the Quantization API Reference).
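For what it's worth, torch.flatten appears to work directly on a per-tensor quantized tensor, since a single scale/zero-point pair is trivially invariant under any reshape; a small check (assuming a PyTorch build with eager-mode quantization):

```python
import torch

x = torch.randn(2, 3, 4)
qx = torch.quantize_per_tensor(x, scale=0.1, zero_point=0, dtype=torch.qint8)

# With per-tensor quantization the same scale/zero_point apply to every
# element, so reshaping the integer data leaves the affine mapping untouched.
flat = torch.flatten(qx, start_dim=1)
print(flat.shape, flat.q_scale(), flat.q_zero_point())
```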