How to convert T5_v1.1_xxl from ONNX to TRT engine? #2167
Comments
@TracelessLe Could you please share how you fixed this? I would really appreciate it!
Hi @nrakltx, I just increased the workspace size in the TRT config (in the ...).

From:

```python
DEFAULT_TRT_WORKSPACE_MB = 3072
self.trt_inference_config = CreateConfig(
    tf32=True,
    fp16=network_metadata.precision.fp16,
    max_workspace_size=result.DEFAULT_TRT_WORKSPACE_MB * 1024 * 1024,
    profiles=profiles,
    obey_precision_constraints=result.use_obey_precision_constraints()
)
```

To:

```python
DEFAULT_TRT_WORKSPACE_MB = 3072
self.trt_inference_config = CreateConfig(
    tf32=True,
    fp16=network_metadata.precision.fp16,
    max_workspace_size=result.DEFAULT_TRT_WORKSPACE_MB * 10 * 1024 * 1024,
    profiles=profiles,
    obey_precision_constraints=result.use_obey_precision_constraints()
)
```
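For clarity, the only change is the extra `* 10` factor, which scales the workspace from 3 GiB to 30 GiB (so the `DEFAULT_TRT_WORKSPACE_MB` name no longer matches the actual value passed). A quick sanity check of the arithmetic:

```python
# Workspace sizes implied by the original and patched CreateConfig calls.
DEFAULT_TRT_WORKSPACE_MB = 3072

original_bytes = DEFAULT_TRT_WORKSPACE_MB * 1024 * 1024        # as in the original config
patched_bytes = DEFAULT_TRT_WORKSPACE_MB * 10 * 1024 * 1024    # with the extra * 10 factor

print(original_bytes / 2**30)  # 3.0  (GiB)
print(patched_bytes / 2**30)   # 30.0 (GiB)
```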
This is with FP32 and not FP16, correct?
Sure. Some NaN errors may occur when using FP16 with the T5 XXL model, as mentioned in: You can have a try. :)
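As a side note on why FP16 tends to produce NaN/Inf with large T5 variants: FP16's largest finite value is 65504, so any activation beyond that overflows to infinity, and subsequent arithmetic on infinities (e.g. inf - inf) yields NaN. A minimal NumPy demonstration of the mechanism (not the T5 code itself):

```python
import numpy as np

# FP16 can represent finite values only up to 65504.
print(np.finfo(np.float16).max)  # 65504.0

# An activation larger than that overflows to inf...
x = np.float16(70000.0)
print(x)  # inf

# ...and arithmetic on infinities easily produces NaN (inf - inf).
print(x - x)  # nan
```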
Cool, so 30 GB of VRAM was enough for the FP32 T5 v1.1 XXL TensorRT engine building process?
Did you use an 80 GB or 40 GB A100? I tried increasing DEFAULT_TRT_WORKSPACE_MB, but it gave an "OutOfMemory" message on both a 32 GB V100 and a 40 GB A100.
80 GB; 40 GB is not enough. My average VRAM usage was ~45 GB during compilation.
Thank you!
Description
I tried to use the sample script in TensorRT/demo/HuggingFace/notebooks/t5.ipynb to convert the google/t5-v1_1-xxl 11B model to ONNX format and then to a TRT engine file. The PyTorch→ONNX step works, but when I try to load the ONNX model and convert it to a TRT engine, it always fails after running for about 2 hours with the error below:
The code I use is below:
P.S. I used the same Jupyter notebook to convert t5-small, t5-large, and t5-3b with no problems; it only fails on t5-v1.1-xxl. :(
Environment
TensorRT Version: 8.2.5.1
NVIDIA GPU: Tested on A100 and 3090Ti
NVIDIA Driver Version: 470.57.02
CUDA Version: 11.4
CUDNN Version: 8.2
Operating System: Ubuntu 18.04
Python Version (if applicable): 3.7
Tensorflow Version (if applicable):
PyTorch Version (if applicable): 1.11.0+cu113
Baremetal or Container (if so, version):
Relevant Files
I found some similar issues, such as #1686, #1937, and #1917, and tried increasing the workspace, but it had no effect.
With TRT verbose logging enabled, the output during the run is below:
t5xxl_trt_log.txt
Steps To Reproduce