Triton 24.05 crashes on Ubuntu when loading TensorRT RetinaNet model trained with TAO #7397

mar-jas commented Jul 1, 2024

Description
Triton 24.05 crashes without any additional information during model initialization; see the log below.
We use a RetinaNet model trained with the TAO Toolkit (two versions checked: 5.0.0 and 5.3.0), run with the TensorRT backend (platform: "tensorrt_plan"). We used it successfully with the older Triton 23.05.

The model.plan file is converted from the ONNX file using trtexec inside the tritonserver:24.05 container. We checked that trtexec successfully performs inference with this model.plan. We have also run the model on a Jetson AGX Orin with JetPack 6 and Triton 24.05 without any problem, so we assume the issue is specific to Triton on Ubuntu (x86) rather than to the model itself.
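
For reference, a rough sketch of the conversion and standalone check run inside the container (the first command is the one from the repro steps below; the exact flags of the verification run are from memory, and the --shapes value is an assumption that simply matches the Input dims in the config further down):

# build the engine (same command as in the repro steps)
trtexec --onnx=model.onnx --saveEngine=model.plan
# standalone inference check with the freshly built engine
trtexec --loadEngine=model.plan --shapes=Input:1x3x512x608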

Triton Information
tritonserver:24.05

Are you using the Triton container or did you build it yourself?
downloaded: nvcr.io/nvidia/tritonserver:24.05-py3

To Reproduce

  • Train RetinaNet model with TAO Toolkit v5.3.0 (also with v5.0.0)
  • Export to ONNX with “tao model retinanet export …”
  • Run triton container: nvcr.io/nvidia/tritonserver:24.05-py3
  • Convert to plan: “trtexec --onnx=model.onnx --saveEngine=model.plan”
  • Use model.plan in a Triton model repository with the config.pbtxt below

name: "v5.3.0_20240624-internal"
platform: "tens

orrt_plan"
input {
name: "Input"
data_type: TYPE_FP32
dims: [-1, 3, 512, 608]
}
output {
name: "NMS"
data_type: TYPE_FP32
dims: [-1, 1, 200, 7]
}
output {
name: "NMS_1"
data_type: TYPE_FP32
dims: [-1, 1, 1, 1]
}
default_model_filename: "model.plan"
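
For completeness, a sketch of the model repository layout and launch commands we use (mount path and docker flags here are placeholders, not the exact ones from our setup; the tritonserver invocation matches the log below; the repository also contains a second model, v5.3.0_20240624, omitted here):

/models/
  v5.3.0_20240624-internal/
    config.pbtxt
    1/
      model.plan

docker run --gpus all -it --rm -v /path/to/models:/models nvcr.io/nvidia/tritonserver:24.05-py3
tritonserver --model-repository=/models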

log:

root@53c28d76c621:/work# tritonserver --model-repository=/models
I0627 14:08:29.972225 2613 pinned_memory_manager.cc:275] "Pinned memory pool is created at '0x7fdb4c000000' with size 268435456"
I0627 14:08:29.972367 2613 cuda_memory_manager.cc:107] "CUDA memory pool is created on device 0 with size 67108864"
I0627 14:08:29.973512 2613 model_lifecycle.cc:472] "loading: v5.3.0_20240624:1"
I0627 14:08:29.973531 2613 model_lifecycle.cc:472] "loading: v5.3.0_20240624-internal:1"
I0627 14:08:29.982372 2613 tensorrt.cc:65] "TRITONBACKEND_Initialize: tensorrt"
I0627 14:08:29.982382 2613 tensorrt.cc:75] "Triton TRITONBACKEND API version: 1.19"
I0627 14:08:29.982387 2613 tensorrt.cc:81] "'tensorrt' TRITONBACKEND API version: 1.19"
I0627 14:08:29.982390 2613 tensorrt.cc:105] "backend configuration:\n{"cmdline":{"auto-complete-config":"true","backend-directory":"/opt/tritonserver/backends","min-compute-capability":"6.000000","default-max-batch-size":"4"}}"
I0627 14:08:29.982574 2613 tensorrt.cc:231] "TRITONBACKEND_ModelInitialize: v5.3.0_20240624-internal (version 1)"
I0627 14:08:30.050148 2613 logging.cc:46] "Loaded engine size: 91 MiB"
Segmentation fault (core dumped)
root@53c28d76c621:/work# I0627 14:08:30.329108 2637 pb_stub.cc:2121] Non-graceful termination detected.

Expected behavior
The model should be initialised and ready for use, as with the older Triton 23.05.
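
When a model loads correctly, its readiness can be confirmed via Triton's standard HTTP endpoint; a minimal sketch, assuming the default HTTP port 8000 is reachable from the host:

# returns HTTP 200 once the model is loaded and ready
curl -s -o /dev/null -w "%{http_code}\n" localhost:8000/v2/models/v5.3.0_20240624-internal/ready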
