
Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[Unsqueeze_93...Softmax_2088]} #1917

Closed · Xinchengzelin opened this issue Apr 12, 2022 · 23 comments
Labels: triaged (Issue has been triaged by maintainers)

Xinchengzelin commented Apr 12, 2022

I used trtexec (TensorRT 8.2.4.2 GA, CUDA 11.4) to convert my ONNX model to a TRT engine, and it shows the error below. I tried TensorRT 8.4 EA with CUDA 11.5, and the conversion works. I also checked that the supported operators in 8.2 GA include Unsqueeze and Softmax, the same as in 8.4 EA, so I don't know why this error happens.

```
[04/12/2022-11:30:51] [V] [TRT] *************** Autotuning format combination: Bool(450,1), Bool(22500,50,1), Float(202500,450,9,1), Float(900,2,1) -> Float(512,1), Float(3,1) ***************
[04/12/2022-11:30:51] [V] [TRT] --------------- Timing Runner: {ForeignNode[Unsqueeze_93...Softmax_2088]} (Myelin)
[04/12/2022-11:30:51] [W] [TRT] Skipping tactic 0 due to insuficient memory on requested size of 26931712 detected for tactic 0.
[04/12/2022-11:30:51] [V] [TRT] Fastest Tactic: -3360065831133338131 Time: inf
[04/12/2022-11:30:51] [E] Error[10]: [optimizer.cpp::computeCosts::2011] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[Unsqueeze_93...Softmax_2088]}.)
[04/12/2022-11:30:51] [E] Error[2]: [builder.cpp::buildSerializedNetwork::609] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed. )
[04/12/2022-11:30:51] [E] Engine could not be created from network
[04/12/2022-11:30:51] [E] Building engine failed
[04/12/2022-11:30:51] [E] Failed to create engine from model.
[04/12/2022-11:30:51] [E] Engine set up failed
```
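For context, a build invocation of roughly this shape is what triggers the failure (the model path is hypothetical); on TensorRT 8.2 the default workspace is only 16 MiB, which is consistent with the insufficient-memory warning in the log above:

```shell
# Hypothetical reproduction; assumes trtexec from TensorRT 8.2 is on PATH
# and model.onnx is the exported ONNX model. On 8.2, --workspace is in MiB.
trtexec --onnx=model.onnx \
        --saveEngine=model.trt \
        --workspace=4096 \
        --verbose
```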
ttyio (Collaborator) commented Apr 12, 2022

@Xinchengzelin , the ForeignNode[Unsqueeze_93...Softmax_2088] includes several nodes that are handled by our internal Myelin compiler. It seems we fixed a failure between 8.2 and 8.4; could you use 8.4, since it fixes your issue? Thanks.

ttyio added the triaged (Issue has been triaged by maintainers) and Topic: Myelin labels Apr 12, 2022
Xinchengzelin (Author) commented

> @Xinchengzelin , the ForeignNode[Unsqueeze_93...Softmax_2088] includes several nodes that are handled by our internal Myelin compiler. It seems we fixed a failure between 8.2 and 8.4; could you use 8.4, since it fixes your issue? Thanks.

Because the model deployment environment is 8.2, I can't change the version.
Besides, in TensorRT 8.2, when I use trtexec to convert the ONNX model to a .trt engine, adding --workspace=32 lets trtexec generate the .trt engine. I can load the engine and run inference in Python, but it fails in C++. I'm still confused.

Xinchengzelin (Author) commented

> @Xinchengzelin , the ForeignNode[Unsqueeze_93...Softmax_2088] includes several nodes that are handled by our internal Myelin compiler. It seems we fixed a failure between 8.2 and 8.4; could you use 8.4, since it fixes your issue? Thanks.

I found this in the release notes:

> The --workspace flag in trtexec has been deprecated. TensorRT now allocates as much workspace as available GPU memory by default when the --workspace/--memPoolSize flags are not given, instead of the 16 MB default workspace size limit that trtexec had in TensorRT 8.2. To limit the workspace size, use the --memPoolSize=workspace: flag instead.

Unfortunately, my problem seems to be related to this. Is there a solution? The error happens in my C++ code when calling createExecutionContext.
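For anyone hitting the same workspace limit, a sketch of the flag difference between the two releases (model and engine paths are hypothetical):

```shell
# TensorRT 8.2: workspace limit is given in MiB (the default is only 16 MiB)
trtexec --onnx=model.onnx --workspace=2048 --saveEngine=model.trt

# TensorRT 8.4+: --workspace is deprecated; use the memory-pool flag instead,
# or omit it entirely to let TensorRT use all available GPU memory
trtexec --onnx=model.onnx --memPoolSize=workspace:2048MiB --saveEngine=model.trt
```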

handoku commented Apr 13, 2022

```
[04/13/2022-11:48:34] [W] [TRT] Skipping tactic 0 due to Myelin error: Copy operation "concat" has 513 inputs.
[04/13/2022-11:48:34] [E] Error[10]: [optimizer.cpp::computeCosts::2011] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[MPS_VAR_3/strided_slice_1__843:0[Constant]...strided_slice_8__923]}.)
[04/13/2022-11:48:34] [E] Error[2]: [builder.cpp::buildSerializedNetwork::609] Error Code 2: Internal Error (Assertion enginePtr != nullptr failed. )
[04/13/2022-11:48:34] [E] Engine could not be created from network
[04/13/2022-11:48:34] [E] Building engine failed
[04/13/2022-11:48:34] [E] Failed to create engine from model.
[04/13/2022-11:48:34] [E] Engine set up failed
```

Similar problem here. Can it be solved by any method other than using TRT 8.4?

ttyio (Collaborator) commented Apr 14, 2022

> I found this in the release notes:
>
> > The --workspace flag in trtexec has been deprecated. […]
>
> Unfortunately, my problem seems to be related to this. The error happens in my C++ code when calling createExecutionContext.

Yes, we changed the default workspace to the maximum available GPU memory starting from 8.4, so there is no need to set the workspace size in 8.4.
Did you serialize the CUDA engine using trtexec and load it from C++? Since trtexec works, can you check the trtexec source code and fix your own code accordingly? https://github.com/NVIDIA/TensorRT/tree/release/8.2/samples/trtexec

ttyio (Collaborator) commented Apr 14, 2022

> [04/13/2022-11:48:34] [E] Error[10]: [optimizer.cpp::computeCosts::2011] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[MPS_VAR_3/strided_slice_1__843:0[Constant]...strided_slice_8__923]}.)
> […]
>
> Similar problem here. Can it be solved by any method other than using TRT 8.4?

@handoku , sorry, it is better to upgrade to TRT 8.4.

Xinchengzelin (Author) commented

> Yes, we changed the default workspace to the maximum available GPU memory starting from 8.4, so there is no need to set the workspace size in 8.4. Did you serialize the CUDA engine using trtexec and load it from C++? Since trtexec works, can you check the trtexec source code and fix your own code accordingly? https://github.com/NVIDIA/TensorRT/tree/release/8.2/samples/trtexec

@ttyio Yes, I serialized the CUDA engine using trtexec and loaded it from C++.
You mean I should compare my code against the trtexec source? Could you tell me specifically how to fix my code?

ttyio (Collaborator) commented Apr 14, 2022

@Xinchengzelin , trtexec also supports --saveEngine and --loadEngine. Do you also hit the failure when loading the engine with trtexec? If not, you can check the trtexec source to see what's different between your code and trtexec.
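The suggestion above can be sketched as a two-step round trip (file names are hypothetical); if step 2 succeeds but your own C++ application still fails, the bug is in the application's engine-loading code rather than in the engine itself:

```shell
# Step 1: build and serialize the engine with trtexec (TensorRT 8.2, size in MiB)
trtexec --onnx=model.onnx --workspace=2048 --saveEngine=model.trt

# Step 2: deserialize the same engine file with trtexec and run its timing loop
trtexec --loadEngine=model.trt
```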

Xinchengzelin (Author) commented

> @Xinchengzelin , trtexec also supports --saveEngine and --loadEngine. […] You can check the trtexec source to see what's different between your code and trtexec.

@ttyio Thank you very much! I can load the engine with trtexec, so now I can check the difference between my code and trtexec.cpp, right?


handoku commented Apr 20, 2022

> > Similar problem here. Can it be solved by any method other than using TRT 8.4?
>
> @handoku , sorry, it is better to upgrade to TRT 8.4.

@ttyio I have just tried TRT 8.4.0.6 and still get the same error.

handoku commented Apr 20, 2022

Is there any way to prevent Myelin from doing this op fusion?

It seems that Myelin fused several nodes into a single one, but then couldn't find a corresponding implementation to run it.

ttyio (Collaborator) commented Apr 20, 2022

@handoku , we cannot disable Myelin.
Have you tried increasing the workspace size (--workspace if you are using trtexec)? If it still fails, could you share your ONNX model here? Thanks.

handoku commented Apr 20, 2022

@ttyio Hello, I have already set workspace=12 GB on a T4. The model can be found here (model.onnx). Thanks for looking into this.

ttyio (Collaborator) commented Apr 21, 2022

Thanks @handoku , an internal issue has been created to track the failure.

ttyio (Collaborator) commented Apr 26, 2022

@handoku , your issue is fixed in 8.4 GA; please upgrade to 8.4 GA once we release the binary at https://developer.nvidia.com/tensorrt, thanks!

Xinchengzelin (Author) commented

> @handoku , your issue is fixed in 8.4 GA; please upgrade to 8.4 GA once we release the binary at https://developer.nvidia.com/tensorrt, thanks!

Thanks for your advice. I compared my code with the trtexec code and fixed it; it works now. Thank you very much.

handoku commented Apr 26, 2022

@ttyio Thanks. When will the GA version be released?

ttyio (Collaborator) commented Apr 26, 2022

@handoku , GA should be available in early June, thanks!

edric1261234 commented

I had a similar problem, but fixed it by adding /path/to/TensorRT-8.2.4.2/lib to LD_LIBRARY_PATH.
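A sketch of that fix, assuming the stock TensorRT tarball layout (the install path is an example; substitute your own):

```shell
# Prepend the TensorRT library directory so the loader finds libnvinfer
# and the other TensorRT shared libraries at runtime
export LD_LIBRARY_PATH=/path/to/TensorRT-8.2.4.2/lib:$LD_LIBRARY_PATH
```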

a227799770055 commented Jan 17, 2023

Hi, I met a similar problem, and I have tried --workspace=32, but the problem still occurs.
My TensorRT version is 8.5.1.7, and the OS is Ubuntu 20.
Thanks!

```
[01/17/2023-10:14:48] [W] [TRT] Skipping tactic 0x0000000000000000 due to Myelin error: autotuning: CUDA error 2 allocating 4362077693-byte buffer: out of memory
[01/17/2023-10:14:48] [E] Error[10]: [optimizer.cpp::computeCosts::3728] Error Code 10: Internal Error (Could not find any implementation for node {ForeignNode[(Unnamed Layer* 1590) [Shuffle]...Reshape_11153 + Transpose_11154]}.)
[01/17/2023-10:14:48] [E] Error[2]: [builder.cpp::buildSerializedNetwork::751] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
[01/17/2023-10:14:48] [E] Engine could not be created from network
[01/17/2023-10:14:48] [E] Building engine failed
[01/17/2023-10:14:48] [E] Failed to create engine from model or file.
[01/17/2023-10:14:48] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8501] # trtexec --onnx=/home/insign/Doc/insign/Monocular-Depth-Estimation-Toolbox/toTRT/depth2023_sim.onnx --int8
```

proevgenii commented May 2, 2023

Hi, I met a similar problem.
My TensorRT version is 8.5.3.1.

```
[05/02/2023-12:00:12] [E] Error[2]: [builder.cpp::buildSerializedNetwork::751] Error Code 2: Internal Error (Assertion engine != nullptr failed. )
[05/02/2023-12:00:12] [E] Engine could not be created from network
[05/02/2023-12:00:12] [E] Building engine failed
[05/02/2023-12:00:12] [E] Failed to create engine from model or file.
[05/02/2023-12:00:12] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec [TensorRT v8503] # trtexec --onnx=/workspace/vit_bert_model.onnx --minShapes=images:1x3x224x224,input_ids:1x128,attention_mask:1x128 --optShapes=images:4x3x224x224,input_ids:4x128,attention_mask:4x128 --maxShapes=images:8x3x224x224,input_ids:8x128,attention_mask:8x128 --explicitBatch --workspace=8000 --saveEngine=/workspace/vit_bert_model_fp16_dyn_1_4_8.trt --fp16
```

UPD: When I run trtexec without --fp16, the model exports normally.
So, is there a way to make it work with the FP16 precision flag?

lix19937 commented

> CUDA error 2 allocating 4362077693-byte buffer: out of memory

7 participants