Error Code 2: Internal Error (Assertion !mValueMapUndo failed. ) failure of TensorRT 10.5 when running speechbrain language detection model on GPU NVIDIA GeForce RTX 3090 #4277

Open
msublee opened this issue Dec 10, 2024 · 9 comments
Labels: Engine Build (issues with engine build); internal-bug-tracked (tracked internally, will be fixed in a future release); triaged (issue has been triaged by maintainers)

msublee commented Dec 10, 2024

Description

I converted the SpeechBrain language detection model to ONNX and tried to build a TensorRT engine from it with trtexec, but the build failed with the error below.

Error[2]: [graphShapeAnalyzer.cpp::eraseFromTensorMaps::1138] Error Code 2: Internal Error (Assertion !mValueMapUndo failed. )

Environment

TensorRT Version: 10.5.0.18 (Container version 24.10)

NVIDIA GPU: NVIDIA GeForce RTX 3090

NVIDIA Driver Version: 550.127.05

CUDA Version: 12.4

Operating System:

Python Version (if applicable): 3.10

PyTorch Version (if applicable): 2.4.1

Steps To Reproduce

Commands or scripts:

trtexec --onnx=/workspace/model.onnx --saveEngine=/workspace/output/model.plan.bsz4 --memPoolSize=workspace:8192 --minShapes=wavforms:1x1,wav_lens:1x1 --optShapes=wavforms:4x320000,wav_lens:4x1 --maxShapes=wavforms:4x320000,wav_lens:4x1 --fp16

Have you tried the latest release?: I tried container version 24.11

@msublee changed the title from "Error Code 2: Internal Error (Assertion !mValueMapUndo failed. ) failure of TensorRT X.Y when running speechbrain language detection model on GPU NVIDIA GeForce RTX 3090" to "Error Code 2: Internal Error (Assertion !mValueMapUndo failed. ) failure of TensorRT 10.5 when running speechbrain language detection model on GPU NVIDIA GeForce RTX 3090" on Dec 11, 2024
@lix19937

Could you add --verbose and attach the build log here?
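
For anyone following along, capturing the verbose build log to a file can be sketched roughly as below. This is only a minimal sketch: the helper names `with_verbose` and `run_and_log` are made up for illustration, and it assumes `trtexec` is on the PATH inside the container.

```python
import subprocess
from pathlib import Path
from typing import List


def with_verbose(cmd: List[str]) -> List[str]:
    """Return a copy of the command with --verbose appended if it is missing."""
    return list(cmd) if "--verbose" in cmd else list(cmd) + ["--verbose"]


def run_and_log(cmd: List[str], log_path: str) -> int:
    """Run a command, write its combined stdout/stderr to log_path, return the exit code."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout so the log keeps one stream
        text=True,
    )
    Path(log_path).write_text(result.stdout)
    return result.returncode
```

With the command from the report this would be called as `run_and_log(with_verbose(["trtexec", "--onnx=/workspace/model.onnx", "--fp16"]), "trtlog.txt")`, plus the shape flags, and the resulting file can be attached to the issue.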

@asfiyab-nvidia asfiyab-nvidia self-assigned this Dec 16, 2024
@asfiyab-nvidia asfiyab-nvidia added Engine Build Issues with engine build triaged Issue has been triaged by maintainers labels Dec 16, 2024
TigerSong commented Dec 18, 2024

I get the same error in both TensorRT 10.5 and TensorRT 10.7.

[12/18/2024-12:20:52] [E] Error[2]: [graphShapeAnalyzer.cpp::eraseFromTensorMaps::1138] Error Code 2: Internal Error (Assertion !mValueMapUndo failed. )
[12/18/2024-12:20:52] [E] Engine could not be created from network
[12/18/2024-12:20:52] [E] Building engine failed
[12/18/2024-12:20:52] [E] Failed to create engine from model or file.
[12/18/2024-12:20:52] [E] Engine set up failed

Adding --verbose gives no additional information.

I have also posted this in the NVIDIA developer forum:
https://forums.developer.nvidia.com/t/trt10-5-10-7-trtexec-convert-onnx-model-failed-error-code-2-internal-error-assertion-mvaluemapundo-failed/317205

@asfiyab-nvidia
Collaborator

@msublee, please provide the ONNX model and the trtexec command used so we can investigate.

Author

msublee commented Dec 19, 2024

build log with --verbose: trtlog.txt

The model is too large to upload. What should I do? @asfiyab-nvidia

@asfiyab-nvidia
Collaborator

Thanks for the log, @msublee. You can upload your model to Google Drive and share a link. That will help us reproduce the issue locally.

@asfiyab-nvidia asfiyab-nvidia added the internal-bug-tracked Tracked internally, will be fixed in a future release. label Dec 20, 2024
Author

msublee commented Dec 20, 2024

ONNX model link: https://drive.google.com/drive/folders/1feKnT5egNIdVr2xheURHCWq9R2Q_yYuw?usp=drive_link

The link above contains two model files: "model.onnx", which is a model converted using torch.onnx.export, and "model.sim.onnx", which is a simplified version of "model.onnx" created using onnxsim.

I just tested again. With "model.sim.onnx", trtexec hits Error Code 2 and the build fails completely. With "model.onnx", the build succeeds, but an error (Error Code 9, below) is logged midway through, and when I actually run inference the results are completely wrong.

[12/20/2024-05:38:51] [E] Error[9]: Error Code: 9: Skipping tactic 0x0000000000000000 due to exception [shape.cpp:verify_output_type:1417] Mismatched type for tensor logits', f16 vs. expected type:f32.
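
Since the two models fail with different error codes, it may help to pull the error lines out of a long verbose log before comparing builds. The following is a rough sketch only: the regular expression is inferred from the log lines quoted in this thread, not from any documented trtexec log format, and `extract_errors` is a hypothetical helper name.

```python
import re
from typing import List, NamedTuple


class TrtError(NamedTuple):
    code: int
    message: str


# Assumption: error lines look like "Error[<code>]: <message>", as in the logs quoted above.
_ERROR_RE = re.compile(r"Error\[(\d+)\]:\s*(.*)")


def extract_errors(log_text: str) -> List[TrtError]:
    """Collect (code, message) pairs from every line containing 'Error[N]:'."""
    errors = []
    for line in log_text.splitlines():
        match = _ERROR_RE.search(line)
        if match:
            errors.append(TrtError(int(match.group(1)), match.group(2).strip()))
    return errors


# Sample lines copied from the comments in this issue.
SAMPLE_LOG = """\
[12/18/2024-12:20:52] [E] Error[2]: [graphShapeAnalyzer.cpp::eraseFromTensorMaps::1138] Error Code 2: Internal Error (Assertion !mValueMapUndo failed. )
[12/18/2024-12:20:52] [E] Engine could not be created from network
[12/20/2024-05:38:51] [E] Error[9]: Error Code: 9: Skipping tactic 0x0000000000000000 due to exception [shape.cpp:verify_output_type:1417] Mismatched type for tensor logits', f16 vs. expected type:f32.
"""

for err in extract_errors(SAMPLE_LOG):
    print(err.code, err.message)
```

Lines without an `Error[N]:` tag, such as "Engine could not be created from network", are skipped on purpose so that only the coded errors remain.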

The trtexec command was mentioned above, but I'll write it again for clarity.

trtexec --onnx=<onnx-model-file> --saveEngine=/workspace/output/model.plan.bsz4.fp16 --memPoolSize=workspace:8192 --minShapes=wavforms:1x1,wav_lens:1x1 --optShapes=wavforms:4x320000,wav_lens:4x1 --maxShapes=wavforms:4x320000,wav_lens:4x1 --fp16

@asfiyab-nvidia
Collaborator

Thanks, @msublee. We will get back to you soon.

@lix19937

Try building from a fixed-shape ONNX model, or use the latest TensorRT release.
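
One quick approximation of the fixed-shape test, without re-exporting the model, is to collapse the optimization profile to a single shape by passing identical values for min, opt and max. Note that this is not the same as an ONNX file with static dimensions, and whether it avoids the assertion is untested here. The sketch below just formats the flags; `format_shapes` and `fixed_profile_flags` are hypothetical helper names, and the input names and shapes are taken from the command in the original report.

```python
from typing import Dict, List, Tuple

Shape = Tuple[int, ...]


def format_shapes(shapes: Dict[str, Shape]) -> str:
    """Format {"name": (d0, d1)} as the "name:d0xd1,..." spec used by the trtexec shape flags."""
    return ",".join(
        f"{name}:{'x'.join(str(dim) for dim in dims)}" for name, dims in shapes.items()
    )


def fixed_profile_flags(shapes: Dict[str, Shape]) -> List[str]:
    """Build min/opt/max flags that all point at the same shape."""
    spec = format_shapes(shapes)
    return [f"--minShapes={spec}", f"--optShapes={spec}", f"--maxShapes={spec}"]


# Shapes copied from the opt/max values in the reported command.
flags = fixed_profile_flags({"wavforms": (4, 320000), "wav_lens": (4, 1)})
print(" ".join(flags))
```

The printed flags can replace the three shape flags in the original trtexec command, leaving the other options unchanged.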

@asfiyab-nvidia
Collaborator

This bug is being tracked internally, and the fix should be released in TensorRT 10.9.
