
num_io_tensors get error of TensorRT 8.5 when running on GPU 4090 #3803

Open
peter5232 opened this issue Apr 16, 2024 · 7 comments
Assignees
Labels
triaged Issue has been triaged by maintainers

Comments

@peter5232

Description

I have four input tensors [ "kpts0", "kpts1", "desc0", "desc1" ].

torch.onnx.export(
            lightglue,
            (kpts0, kpts1, desc0, desc1),
            lightglue_path,
            input_names=["kpts0", "kpts1", "desc0", "desc1"],
            output_names=["matches0", "mscores0"],
            opset_version=17,
            dynamic_axes={
                "kpts0": {1: "num_keypoints0"},
                "kpts1": {1: "num_keypoints1"},
                "desc0": {1: "num_keypoints0"},
                "desc1": {1: "num_keypoints1"},
                "matches0": {0: "num_matches0"},
                "mscores0": {0: "num_matches0"},
            },
        )
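As a quick sanity check (this snippet is illustrative, not part of the original report), plain Python can verify that every key in dynamic_axes matches a declared input or output name, since torch.onnx.export typically only warns, rather than errors, on mismatched keys:

```python
# Names copied from the torch.onnx.export call above.
input_names = ["kpts0", "kpts1", "desc0", "desc1"]
output_names = ["matches0", "mscores0"]
dynamic_axes = {
    "kpts0": {1: "num_keypoints0"},
    "kpts1": {1: "num_keypoints1"},
    "desc0": {1: "num_keypoints0"},
    "desc1": {1: "num_keypoints1"},
    "matches0": {0: "num_matches0"},
    "mscores0": {0: "num_matches0"},
}

# Every dynamic_axes key must be a declared input or output name.
declared = set(input_names) | set(output_names)
unknown = sorted(set(dynamic_axes) - declared)
print(unknown)  # [] -> all keys are valid
```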

I convert the engine with the following command (ONNX file attached).

trtexec --onnx=superpoint_lightglue.onnx --saveEngine=superpoint_lightglue.engine

But when I use the Python API to list the I/O tensors, I only get desc0, desc1, matches0, mscores0.

import tensorrt as trt

logger = trt.Logger(trt.Logger.WARNING)

with open("superpoint_lightglue.engine", "rb") as f:
        engine = trt.Runtime(logger).deserialize_cuda_engine(f.read())
        tensor_names = [engine.get_tensor_name(i) for i in range(engine.num_io_tensors)]
        print(tensor_names)

I get the following output.

['desc0', 'desc1', 'matches0', 'mscores0']
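One plausible explanation (an assumption, not confirmed in this thread) is that kpts0 and kpts1 are declared as graph inputs but, after export-time tracing and constant folding, no node actually consumes them, so the TensorRT builder drops them from the engine's I/O set. The pruning idea can be sketched with a toy graph representation (plain Python, not the real ONNX or TensorRT API; the node list is hypothetical):

```python
# Toy stand-in for an ONNX graph: declared inputs plus a node list,
# where each node records the tensor names it reads.
declared_inputs = ["kpts0", "kpts1", "desc0", "desc1"]
nodes = [
    {"op": "MatMul", "inputs": ["desc0", "desc1"]},  # hypothetical consumer
    {"op": "TopK", "inputs": ["scores"]},            # reads an internal tensor
]

# An input survives into the engine only if some node reads it.
consumed = {name for node in nodes for name in node["inputs"]}
engine_inputs = [name for name in declared_inputs if name in consumed]
print(engine_inputs)  # ['desc0', 'desc1'] -- kpts0/kpts1 are pruned
```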

Environment

TensorRT Version: v8.5.3 and v8.6.1

NVIDIA GPU: 4090

NVIDIA Driver Version: 535.129.03

CUDA Version: 11.8

CUDNN Version: 8.9.6

Operating System:

Python Version (if applicable): 3.11

Tensorflow Version (if applicable):

PyTorch Version (if applicable): 2.1.0

Baremetal or Container (if so, version):

@lix19937

lix19937 commented Apr 20, 2024

You can try the following:

trtexec --onnx=superpoint_lightglue.onnx  --loadEngine=superpoint_lightglue.engine   --verbose  2>&1 |tee log   

cat log |grep "Using random values for input"   
cat log |grep "Using random values for output"   

These commands will show all inputs and outputs.

@peter5232
Author

I tried this command and got the following output.

[04/21/2024-23:12:14] [I] Using random values for input desc0
[04/21/2024-23:12:14] [I] Using random values for input desc1

So the engine actually has only two inputs, but the ONNX file declares four input tensors.

torch.onnx.export(
            lightglue,
            (kpts0, kpts1, desc0, desc1),
            lightglue_path,
            input_names=["kpts0", "kpts1", "desc0", "desc1"],
            output_names=["matches0", "mscores0"],
            opset_version=17,
            dynamic_axes={
                "kpts0": {1: "num_keypoints0"},
                "kpts1": {1: "num_keypoints1"},
                "desc0": {1: "num_keypoints0"},
                "desc1": {1: "num_keypoints1"},
                "matches0": {0: "num_matches0"},
                "mscores0": {0: "num_matches0"},
            },
        )

@lix19937

@peter5232
can you run the following command

trtexec --onnx=superpoint_lightglue.onnx  --saveEngine=superpoint_lightglue.engine  --verbose 2>&1 | tee  build.log

and then upload the build.log file?

@zerollzeng
Collaborator

What does polygraphy inspect model superpoint_lightglue.onnx output? And how many inputs can you see in Netron?

@zerollzeng zerollzeng self-assigned this Apr 25, 2024
@zerollzeng zerollzeng added the triaged Issue has been triaged by maintainers label Apr 25, 2024
@lix19937

lix19937 commented May 4, 2024

Checking inputs/outputs with Netron is not always reliable; sometimes Netron cannot see hidden inputs/outputs.

@lix19937

lix19937 commented May 6, 2024

@zerollzeng I came across one case where an ONNX file (39 MB) opened in Netron shows nothing, but trtexec can still build it successfully.

[05/06/2024-11:23:47] [I] Engine deserialized in 0.113882 sec.
[05/06/2024-11:23:47] [V] [TRT] Total per-runner device persistent memory is 0
[05/06/2024-11:23:47] [V] [TRT] Total per-runner host persistent memory is 0
[05/06/2024-11:23:47] [V] [TRT] Allocated activation device memory of size 0
[05/06/2024-11:23:47] [I] [TRT] [MemUsageChange] TensorRT-managed allocation in IExecutionContext creation: CPU +0, GPU +0, now: CPU 0, GPU 39 (MiB)
[05/06/2024-11:23:47] [I] Setting persistentCacheLimit to 0 bytes.
[05/06/2024-11:23:47] [V] Using enqueueV3.
[05/06/2024-11:23:47] [I] Using random values for output 82
[05/06/2024-11:23:47] [I] Created output binding for 82 with dimensions 1x256x200x200
[05/06/2024-11:23:47] [I] Starting inference
[05/06/2024-11:23:50] [I] The e2e network timing is not reported since it is inaccurate due to the extra synchronizations when the profiler is enabled.
[05/06/2024-11:23:50] [I] To show e2e network timing report, add --separateProfileRun to profile layer timing in a separate run or remove --dumpProfile to disable the profiler.
[05/06/2024-11:23:50] [I]
[05/06/2024-11:23:50] [I] === Profile (1032 iterations ) ===
[05/06/2024-11:23:50] [I]                                            Layer   Time (ms)   Avg. Time (ms)   Median Time (ms)   Time %
[05/06/2024-11:23:50] [I]  Reformatting CopyNode for Output Tensor 0 to 82      384.10           0.3722             0.3758    100.0
[05/06/2024-11:23:50] [I]                                            Total      384.10           0.3722             0.3758    100.0
[05/06/2024-11:23:50] [I]
&&&& PASSED TensorRT.trtexec [TensorRT v8510] # trtexec --onnx=positional_encoding_poly.onnx --verbose --dumpProfile
