
_M_range_check exception encountered in ICudaEngine::createExecutionContext() #3335

Closed

lhai37 opened this issue Sep 19, 2023 · 11 comments

Labels: triaged (Issue has been triaged by maintainers)

@lhai37 commented Sep 19, 2023

Description

Note: this issue is also posted to the NVIDIA TensorRT Forum

Loading an ONNX model (attached) via the C++ API triggers an exception upon calling ICudaEngine::createExecutionContext():

[E] [TRT] 1: Unexpected exception vector<bool>::_M_range_check: __n (which is 0) >= this->size() (which is 0)

This is also reproducible using the released sampleOnnxMNIST code. I am attaching both the ONNX file and the code file to reproduce it here.

Interestingly, trtexec --onnx=palm.onnx can load the model just fine, so it seems that there’s a way to get this working via the C++ API, but I’m unable to pinpoint what it is.
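
For reference, the repro follows the standard ONNX-load flow from the TensorRT developer guide. A minimal sketch of that call path (illustrative only, not the attached file itself; the function name buildEngineFromOnnx is mine, and error handling and object cleanup are trimmed):

```cpp
#include "NvInfer.h"
#include "NvOnnxParser.h"

// Minimal sketch of the failing call path (TensorRT 8.6 C++ API).
nvinfer1::ICudaEngine* buildEngineFromOnnx(nvinfer1::ILogger& logger, const char* onnxPath)
{
    auto builder = nvinfer1::createInferBuilder(logger);
    auto network = builder->createNetworkV2(
        1U << static_cast<uint32_t>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH));
    auto parser = nvonnxparser::createParser(*network, logger);
    parser->parseFromFile(onnxPath, static_cast<int>(nvinfer1::ILogger::Severity::kWARNING));

    auto config = builder->createBuilderConfig();
    auto serialized = builder->buildSerializedNetwork(*network, *config); // IHostMemory*

    auto runtime = nvinfer1::createInferRuntime(logger);
    return runtime->deserializeCudaEngine(serialized->data(), serialized->size());
}

// The exception is raised on the next call:
//   nvinfer1::IExecutionContext* context = engine->createExecutionContext();
```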

Environment

TensorRT Version: 8.6.1

NVIDIA GPU: NVIDIA GeForce RTX 3090

NVIDIA Driver Version: 535.54.03

CUDA Version: 12.0 (but also reproducible on 11.6)

CUDNN Version: 8.8

Operating System: Ubuntu 20.04

Python Version (if applicable): N/A

Tensorflow Version (if applicable): N/A

PyTorch Version (if applicable): N/A

Baremetal or Container (if so, version):

Relevant Files

Model link: https://forums.developer.nvidia.com/uploads/short-url/lOrlbR6P44UrBKYeCakZ7CZSoKs.onnx
Source code file to repro: https://forums.developer.nvidia.com/uploads/short-url/hMuSyiIpJO9Wz7KkdId6HZK1B1r.cpp

Steps To Reproduce

  • Set up and launch the Docker container build environment for the TensorRT samples per the instructions.
  • Replace sampleOnnxMNIST.cpp with the attached code file.
  • Compile (per the instructions) and run the executable located at /workspace/TensorRT/build/out/sample_onnx_mnist.
  • Exception is raised:
Creating execution context
[09/01/2023-06:58:59] [E] [TRT] 1: Unexpected exception vector<bool>::_M_range_check: __n (which is 0) >= this->size() (which is 0)
Created execution context 
&&&& FAILED TensorRT.sample_onnx_mnist [TensorRT v8601] # ./sample_onnx_mnist

Commands or scripts:

Have you tried the latest release?: Yes

Can this model run on other frameworks? For example, run the ONNX model with ONNX Runtime (polygraphy run <model.onnx> --onnxrt): Yes. It can be loaded with trtexec, but not via the C++ API.

@zerollzeng (Collaborator) commented:
I suspect something is broken in the sampleOnnxMNIST.cpp. May I ask what modifications you made to sampleOnnxMNIST.cpp?

I'll try to reproduce later. It would be great if you could answer the questions above :-)

@zerollzeng self-assigned this Sep 22, 2023
@zerollzeng added the triaged label Sep 22, 2023
@lhai37 (Author) commented Sep 22, 2023

Thank you for looking into this. I attached the modified sampleOnnxMNIST.cpp file in the original post so you can diff it if needed, but the changes are trivial and shouldn't affect how the model is loaded compared to the vanilla sample code.

@zerollzeng (Collaborator) commented:

I can reproduce the issue, but I can't tell whether it's a bug, because:

  1. The original sample works well.
  2. sampleOnnxMNIST was made only for MNIST; changing the ONNX model may break something. I can see your new ONNX model has 3 outputs.
  3. trtexec works well:

&&&& PASSED TensorRT.trtexec [TensorRT v8601] # /usr/src/tensorrt/bin/trtexec --onnx=/workspace/TensorRT/palm.onnx

@TheExDeus commented Oct 3, 2023

I have the same issue: a Mask R-CNN based model that also converts fine and runs fine under trtexec, but fails with this error when loaded in the Triton TensorRT backend. It has 3 outputs as well.

> I can reproduce the issue, but I can't tell whether it's a bug

It is a bug if the solution crashes (without any real backtrace or message) on a model that a different backend runs just fine (the ONNX backend can run it).

@lhai37 (Author) commented Oct 3, 2023

> I can reproduce the issue, but I can't tell whether it's a bug, because:
>   1. The original sample works well.
>   2. sampleOnnxMNIST was made only for MNIST; changing the ONNX model may break something. I can see your new ONNX model has 3 outputs.
>   3. trtexec works well:
> &&&& PASSED TensorRT.trtexec [TensorRT v8601] # /usr/src/tensorrt/bin/trtexec --onnx=/workspace/TensorRT/palm.onnx

Thank you for looking into this. Your comment is exactly why I think this might be a bug: trtexec works well, yet the C++ API doesn't. If you inspect the sample file I created, it just uses the standard API, as described in the TensorRT docs, to load the model. It's unrelated to the MNIST sample; I'm only using that sample to demonstrate that the standard way of loading an ONNX model in C++ fails.

@ASD271 commented Oct 16, 2023

I had a similar problem when loading an engine from file while running SampleOnnxMNIST, mainly because the mInputDims/mOutputDims members were not being set properly. I then set those values manually. Here is my code:

```cpp
bool SampleOnnxMNIST::build()
{
    // Read the serialized engine from disk.
    std::ifstream file("E:\\project\\tensorrt\\minist\\mnist.engine", std::ios::binary);
    std::string str((std::istreambuf_iterator<char>(file)), std::istreambuf_iterator<char>());

    mRuntime = std::shared_ptr<nvinfer1::IRuntime>(
        createInferRuntime(sample::gLogger.getTRTLogger()));
    if (!mRuntime)
    {
        return false;
    }

    mEngine = std::shared_ptr<nvinfer1::ICudaEngine>(
        mRuntime->deserializeCudaEngine(str.data(), str.size()), samplesCommon::InferDeleter());
    if (!mEngine)
    {
        return false;
    }

    // The sample normally fills these while parsing the network; when
    // deserializing an engine directly, set the MNIST shapes manually.
    mInputDims = {4, {1, 1, 28, 28}};
    mOutputDims = {2, {1, 10}};
    return true;
}
```
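
A possibly more robust alternative (a sketch, assuming the TensorRT 8.x binding API, which is deprecated in TRT 10) is to query the dimensions from the deserialized engine instead of hardcoding them:

```cpp
// Sketch (assumes TensorRT 8.x): query binding dimensions from the
// deserialized engine rather than hardcoding them. Binding 0 is assumed
// to be the input and binding 1 the output, as in the MNIST sample.
mInputDims = mEngine->getBindingDimensions(0);
mOutputDims = mEngine->getBindingDimensions(1);
```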

@ttyio (Collaborator) commented Jul 23, 2024

We have fixed the _M_range_check issue in the latest TensorRT version. Could you retry with TRT 10 from https://developer.nvidia.com/tensorrt? Sorry for the delayed response.
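
(For anyone checking which TensorRT version a binary actually links against at runtime, a minimal sketch using getInferLibVersion(), which returns the version encoded as major * 1000 + minor * 100 + patch, e.g. 8601 for 8.6.1, matching the "TensorRT v8601" tag in the logs above:)

```cpp
#include <cstdio>
#include "NvInfer.h" // pulls in the declaration of getInferLibVersion()

int main()
{
    // Prints e.g. 8601 for TensorRT 8.6.1.
    std::printf("Linked TensorRT version: %d\n", getInferLibVersion());
    return 0;
}
```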

@ttyio added the bug label Jul 23, 2024
@akhilg-nv (Collaborator) commented:
Closing this issue due to no response after 3 weeks, as per our policy. If there is still an issue, please feel free to re-open it or create a new issue, @lhai37.

@lix19937 commented Sep 7, 2024

Regarding the _M_range_check topic, I also ran into this problem:

[09/07/2024-23:36:08] [V] [TRT] Adding reformat layer: Reformatted Input Tensor 1 to {ForeignNode[Expand_20441...Slice_20496]} (onnx::Slice_24309) from Half(4,1) to Float(4,1)
[09/07/2024-23:36:08] [V] [TRT] Adding reformat layer: Reformatted Input Tensor 0 to PWN(Clip_20497) (onnx::Clip_24375) from Float(2,1) to Half(2,1)
[09/07/2024-23:36:08] [V] [TRT] Adding reformat layer: Reformatted Input Tensor 3 to {ForeignNode[(Unnamed Layer* 6844) [ElementWise]...Concat_20562]} (onnx::Expand_24379) from Half(2,1) to Float(2,1)
[09/07/2024-23:36:08] [V] [TRT] Formats and tactics selection completed in 404.959 seconds.
[09/07/2024-23:36:08] [V] [TRT] After reformat layers: 369 layers
[09/07/2024-23:36:08] [V] [TRT] Total number of blocks in pre-optimized block assignment: 463
[09/07/2024-23:36:08] [I] [TRT] Detected 1 inputs and 1 output network tensors.
[09/07/2024-23:36:11] [V] [TRT] Deleting timing cache: 2775 entries, served 8732 hits since creation.
[09/07/2024-23:36:11] [E] Error[1]: Unexpected exception vector::_M_range_check: __n (which is 1) >= this->size() (which is 1)
[09/07/2024-23:36:11] [E] Engine could not be created from network
[09/07/2024-23:36:11] [E] Building engine failed
[09/07/2024-23:36:11] [E] Failed to create engine from model or file.
[09/07/2024-23:36:11] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8601] # trtexec --onnx=./codetr_sim.onnx --verbose --fp16

@D3-aavery commented:
@ttyio How do we address this on Jetson, which is still limited to TensorRT 8 even with JetPack 6? Upgrading to TensorRT 10 is not an acceptable solution in this scenario. Could we please re-open this bug to reflect that?

@J-xinyu commented Nov 18, 2024

I also ran into this problem:
vector::_M_range_check: __n (which is 2) >= this->size() (which is 2)
Upgrading to TensorRT 10 is not an acceptable option on Jetson.
