TensorRT fails to build engine from pytorch_quantization ONNX #3577
Comments
What is the datatype of `QuantizeLinear_338`'s input x?
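One way to check this with the `onnx` Python package (the model path is a placeholder):

```python
import onnx
from onnx import shape_inference

# Run shape inference so intermediate tensors get type annotations.
model = shape_inference.infer_shapes(onnx.load("model.onnx"))

# Collect the element type of every tensor the graph declares.
types = {vi.name: vi.type.tensor_type.elem_type
         for vi in list(model.graph.value_info)
         + list(model.graph.input)
         + list(model.graph.output)}
# Constant inputs (e.g. weights) live in the initializer list instead.
types.update({init.name: init.data_type for init in model.graph.initializer})

node = next(n for n in model.graph.node if n.name == "QuantizeLinear_338")
elem = types.get(node.input[0], onnx.TensorProto.UNDEFINED)
print(onnx.TensorProto.DataType.Name(elem))  # e.g. FLOAT, INT8, ...
```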
Does the ONNX work with onnxruntime? This can be quickly checked with Polygraphy. Check https://github.com/NVIDIA/TensorRT/tree/main/tools/Polygraphy for more info.
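For example (the model filename is a placeholder):

```
polygraphy run model.onnx --onnxrt
```

This loads the model under onnxruntime and runs it on synthetic input, which is enough to tell whether the ONNX itself is well-formed.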
Yes, it passed.
Could you please share the ONNX here? Thanks!
Would it be possible to send it to you in private?
Please share a private Google Drive link and I'll request access. Thanks!
Great. The ONNX file is available here.
Could you please let me know when you have sent the file access request, so I know it is you and can approve it? I've already got access requests from 2 different users.
I've just requested access.
I can reproduce the issue, but the Identity layer is weird here; it serves as the zero point of the Q/DQ... cc @ttyio for visibility, should I file an internal bug for this?
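A quick way to see what feeds each Q/DQ node's inputs with the `onnx` Python package (the model path is a placeholder):

```python
import onnx

model = onnx.load("model.onnx")

# Map every tensor name to the op type that produces it.
producers = {out: n.op_type for n in model.graph.node for out in n.output}

# A Q/DQ node's inputs are (x, y_scale, y_zero_point); an Identity
# producing the zero point is the anomaly described above.
for n in model.graph.node:
    if n.op_type in ("QuantizeLinear", "DequantizeLinear"):
        print(n.name, [producers.get(i, "initializer/graph-input") for i in n.input])
```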
@zerollzeng could you try the internal nightly build? We should already support this. If not, let's create a bug, thanks!
Filed internal bug 4491468.
Fixed in TRT 10, closed.
@zerollzeng Hi, is the fix currently available anywhere (EA? OSS?)? I am willing to build from source. If not, when/where can we expect it to become available?
Please wait for the TRT 10 release; I guess the EA will come out in March/April.
Has the fix already been released? How can I get it on Jetson Orin?
You have to wait for the JetPack update, but JetPack has its own release schedule, so I'm sorry I cannot help with this.
Description
I created a quantized model in PyTorch using `pytorch_quantization` and exported it to ONNX.
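For context, a minimal sketch of that export flow, assuming the standard `pytorch_quantization` workflow (the model, calibration step, and shapes are placeholders, not my exact script):

```python
import torch
import torchvision
from pytorch_quantization import quant_modules
from pytorch_quantization import nn as quant_nn

# Swap torch.nn layers for quantized counterparts before the model
# is built, so fake-quant (Q/DQ) nodes are inserted automatically.
quant_modules.initialize()
model = torchvision.models.resnet50(pretrained=True).eval()  # placeholder model

# ... calibrate the quantizers on representative data here ...

# Export with TensorQuantizer in fake-quant mode so the ONNX graph
# contains standard QuantizeLinear/DequantizeLinear pairs.
quant_nn.TensorQuantizer.use_fb_fake_quant = True
dummy = torch.randn(1, 3, 224, 224)  # placeholder input shape
torch.onnx.export(model, dummy, "model.onnx", opset_version=13)
```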
Then, I executed `trtexec` on Jetson Orin to build an engine from the model.
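For reference, a typical `trtexec` invocation for this step (the filenames are placeholders, and not necessarily the exact flags used):

```
trtexec --onnx=model.onnx --int8 --saveEngine=model.engine
```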
The build fails on the node `QuantizeLinear_898` with the error `Int8 constant is only allowed before DQ node`.
Looking at the ONNX graph, I can see that there is a node related to `QuantizeLinear_898` that has no input. Any idea what went wrong and how to solve it?
Environment
Model compilation:
TensorRT Version: 8.5.2 (reported as v8502, Jetson Orin)
Model quantization and export to ONNX:
OS: Windows 10
Python Version (if applicable): 3.9.12
PyTorch Version (if applicable): 1.12.1+cu116
pytorch_quantization version: 2.1.3