
Building from source fails with tensorrt_llm backend #7382

Open
arya-samsung opened this issue Jun 27, 2024 · 2 comments

Comments

@arya-samsung

Description
While building from source, the build fails when the tensorrt_llm backend is chosen.

Triton Information
What version of Triton are you using? r24.04

Are you using the Triton container or did you build it yourself?
Building from source

To Reproduce
Steps to reproduce the behavior:
Check out the r24.04 branch of the server repo, then run:
./build.py -v --backend=python --enable-logging --endpoint=http --enable-tracing --enable-stats --enable-gpu --backend=tensorrtllm

This gives the error:
CMake Error at tensorrt_llm/CMakeLists.txt:107 (message):
The batch manager library is truncated or incomplete. This is usually
caused by using Git LFS (Large File Storage) incorrectly. Please try
running command git lfs install && git lfs pull.

So we tried adding:
self.cmd(f"cd {subdir} && git submodule init && git submodule update --merge && git lfs install && git lfs pull && cd ..", check_exitcode=True,)

after the git clone step in server/build.py (line 325 at commit bf430f8), but this did not help.
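The "truncated or incomplete" CMake check trips when the static library on disk is still a Git LFS pointer stub rather than the real binary. As a quick sanity check (a minimal sketch; the helper name and the idea of scanning the checkout are ours, not part of Triton or TensorRT-LLM), you can test whether a given file is still an un-smudged LFS pointer:

```python
# Git LFS stores not-yet-downloaded files as small text "pointer" stubs
# that always begin with this version line (per the LFS pointer spec).
LFS_POINTER_PREFIX = b"version https://git-lfs.github.com/spec/v1"


def is_lfs_pointer(path: str) -> bool:
    """Return True if `path` looks like an un-smudged Git LFS pointer file."""
    try:
        with open(path, "rb") as f:
            head = f.read(len(LFS_POINTER_PREFIX))
    except OSError:
        # Missing/unreadable file is not a pointer stub.
        return False
    return head == LFS_POINTER_PREFIX
```

If the batch-manager `.a` files inside the TensorRT-LLM checkout report True here, then `git lfs pull` never actually replaced the stubs with the real libraries, which would match the CMake error above.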

Describe the models (framework, inputs, outputs), ideally include the model configuration file (if using an ensemble include the model configuration file for that as well).
NA

Expected behavior
The build should have completed successfully, with no errors, and the Docker image should have been ready.

Additional Details:
Build was attempted using the steps given here: https://github.com/triton-inference-server/tensorrtllm_backend/tree/main#option-1-build-via-the-buildpy-script-in-server-repo

But this failed with the following error:

cp: cannot stat '/tmp/tritonbuild/tensorrtllm/build/triton_tensorrtllm_worker': No such file or directory
error: build failed

@SeibertronSS

This is probably caused by your Batch Manager static files being incomplete.

@arya-samsung
Author

> This is probably caused by your Batch Manager static files being incomplete.

https://github.com/NVIDIA/TensorRT-LLM/tree/main/cpp/tensorrt_llm/batch_manager/aarch64-linux-gnu - is this the right one?

thanks for the lead, will check on this :)
