Export ONNX QOperator #882
Comments
Would you be able to provide the full script used to generate the ONNX model? I know it's just a few more lines beyond what you have already posted here, but it would make sure we replicate exactly what you see. Many thanks!
Hi @Giuseppe5, here is a minimal reproducible example:
Thanks for sharing! I hope the following explains the behaviour.
The regular ONNX ReLU supports float and integer values: https://github.com/onnx/onnx/blob/main/docs/Operators.md#relu |
In general, we are working to deprecate support of QOp in favor of QCDQ (#834), so we probably won't change this behavior. Sorry for any inconvenience. |
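For reference, a minimal sketch of switching to the QCDQ export path instead of QOp might look like the following; `export_onnx_qcdq` as the entry point and the `args`/`export_path` keyword names are assumptions based on brevitas 0.10.x and may differ in other versions:

```python
# Sketch only: assumes brevitas 0.10.x, where export_onnx_qcdq is the QCDQ export entry point.
import torch
from brevitas.export import export_onnx_qcdq

model = model.eval()
# Keyword names (args, export_path) may vary between brevitas versions.
export_onnx_qcdq(model, args=torch.randn(1, 3, 32, 32), export_path='toy_qcdq.onnx')
```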
Hi Team Brevitas,
I am trying a simple toy model to check what the exported ONNX model with QOps looks like. As per ONNX_export_tutorial.ipynb, you can pass quantized input to a layer by preceding it with a QuantIdentity layer with return_quant_tensor = True, or alternatively by setting input_quant = Uint8ActPerTensorFloat. I have the following toy model (see the sketch below):
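(The model code itself is omitted here; the sketch below is an illustrative reconstruction, with layer widths and kernel sizes assumed rather than taken from the original.)

```python
# Illustrative reconstruction only, not the original code: layer widths, kernel
# sizes, and the choice of input quantizer are assumptions.
import torch.nn as nn
import brevitas.nn as qnn
from brevitas.quant import Uint8ActPerTensorFloat

class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        # Quantize the input once at the start of the network
        self.inp_quant = qnn.QuantIdentity(act_quant=Uint8ActPerTensorFloat,
                                           return_quant_tensor=True)
        self.conv1 = qnn.QuantConv2d(3, 8, kernel_size=3, return_quant_tensor=True)
        self.relu1 = qnn.QuantReLU(return_quant_tensor=True)
        self.conv2 = qnn.QuantConv2d(8, 8, kernel_size=3, return_quant_tensor=True)
        self.relu2 = qnn.QuantReLU(return_quant_tensor=True)
        # Last layer returns a plain torch.Tensor, so dequantization should happen here
        self.conv3 = qnn.QuantConv2d(8, 8, kernel_size=3, return_quant_tensor=False)

    def forward(self, x):
        x = self.inp_quant(x)
        x = self.relu1(self.conv1(x))
        x = self.relu2(self.conv2(x))
        return self.conv3(x)
```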
Ideally, as per the model definition, there should be one QuantizeLinear before the first layer and no dequantization in between; DequantizeLinear should only appear at the end of the network, since return_quant_tensor = False in the last layer.
But the graph visualization with Netron shows a DequantizeLinear before every QuantReLU op, which I find to be weird behaviour, since each QuantReLU receives a quant tensor as input and returns a quant tensor. If I skip the ReLU activations between the convolutions, I get the expected graph, with QuantizeLinear before the first layer and DequantizeLinear at the end of the last layer. (An export sketch follows below.)
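For completeness, a sketch of how such a model could be exported on the QOp path; `export_onnx_qop` as the entry point and the `args`/`export_path` keywords are assumptions based on brevitas 0.10.x:

```python
# Sketch only: assumes brevitas 0.10.x and the hypothetical ToyModel above.
import torch
from brevitas.export import export_onnx_qop

model = ToyModel().eval()
# Keyword names (args, export_path) may vary between brevitas versions.
export_onnx_qop(model, args=torch.randn(1, 3, 32, 32), export_path='toy_qop.onnx')
```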
Dependencies
torch 1.13.0
brevitas 0.10.2
Could someone explain why this happens? Is it intended, or is there something wrong with the way I have defined the model?
Thanks!