DJL-TensorRT-LLM Bug: TypeError: Got unsupported ScalarType BFloat16 #1816
Comments
Hi Riley, thanks for raising the issue. This is most likely an error in the checkpoint conversion script in NVIDIA/TensorRT-LLM: it loads the weights directly and converts them to NumPy on the CPU, and NumPy has no native bfloat16 type. I'd suggest creating a ticket in the TensorRT-LLM repo about this issue. To work around it in the meantime, you could manually convert and save the model in fp32 before loading it.
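A minimal sketch of the fp32 workaround suggested above, assuming the checkpoint loads through Hugging Face `transformers`. The helper names `to_numpy_safe` and `save_as_fp32` are hypothetical and not part of DJL-Serving or TensorRT-LLM; the model ID and output directory are placeholders for you to supply. The `__main__` part shows the failing call: PyTorch refuses to export a bfloat16 tensor to NumPy, while upcasting to float32 first works.

```python
import torch


def to_numpy_safe(t: torch.Tensor):
    """Upcast bfloat16 tensors to float32 before handing them to NumPy."""
    if t.dtype == torch.bfloat16:
        t = t.to(torch.float32)
    return t.detach().cpu().numpy()


def save_as_fp32(model_id: str, out_dir: str) -> None:
    """Reload a Hugging Face checkpoint in float32 and save it to out_dir.

    Hypothetical helper: model_id and out_dir are supplied by the caller.
    """
    # Imported lazily so to_numpy_safe works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.float32)
    model.save_pretrained(out_dir)
    # Save the tokenizer alongside so the directory is a complete checkpoint.
    AutoTokenizer.from_pretrained(model_id).save_pretrained(out_dir)


if __name__ == "__main__":
    w = torch.ones(2, 2, dtype=torch.bfloat16)
    try:
        w.numpy()  # direct export of bfloat16 to NumPy is rejected
    except TypeError as err:
        print(err)
    print(to_numpy_safe(w).dtype)  # dtype after upcasting
```

Calling `save_as_fp32` with your model ID and a local directory, then pointing the container at that directory, should avoid the bfloat16 path during conversion. Note that the fp32 copy takes roughly twice the disk space and host memory of the bf16 checkpoint.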
Hello @ydm-amazon, Thanks for following up. I'll check w/ the TensorRT-LLM repo about the issue. Also wanted to point out that I don't get this issue using the following args in the dockerfile:
That's right - we know that TensorRT-LLM switched to a different way of loading the model from 0.7.1 to 0.8.0, so that may have caused the issue. We're also looking into our trtllm toolkit 0.8.0 to see if there's something there that may also contribute to the issue.
Description
I am building the DJL-Serving TensorRT-LLM LMI inference container from scratch and deploying it on SageMaker Endpoints for the Zephyr-7B model. Unfortunately, I run into an error from the tensorrt_llm_toolkit: TypeError: Got unsupported ScalarType BFloat16
Expected Behavior
I expected the DJL-Serving image built from this Dockerfile (https://github.com/deepjavalibrary/djl-serving/blob/master/serving/docker/tensorrt-llm.Dockerfile) to run successfully on SageMaker Endpoints.
Error Message
(Paste the complete error message, including stack trace.)
How to Reproduce?
(If you developed your own code, please provide a short script that reproduces the error. For existing examples, please provide link.)