Your current environment
I am using the following Docker image: vault.habana.ai/gaudi-docker/1.17.0/ubuntu22.04/habanalabs/pytorch-installer-2.3.1:latest.
🐛 Describe the bug
On the main branch of the vllm-fork repository, I attempted to run the "meta-llama/Meta-Llama-3-70B" model using the following code:
```python
import os

# Set HPU environment variables before importing vLLM.
os.environ['PT_HPU_LAZY_MODE'] = '1'

from vllm import LLM, SamplingParams

prompts = [
    "The president of the United States is",
    "The capital of France is",
]
sampling_params = SamplingParams(n=1, temperature=0, max_tokens=30)
llm = LLM(model="meta-llama/Meta-Llama-3-70B", max_num_seqs=32, tensor_parallel_size=8)
outputs = llm.generate(prompts, sampling_params)
```
However, I encountered the following error:
```
(RayWorkerWrapper pid=26165) ERROR 08-16 05:39:57 worker_base.py:382]   File "/usr/local/lib/python3.10/dist-packages/torch/distributed/c10d_logger.py", line 75, in wrapper [repeated 6x across cluster]
(RayWorkerWrapper pid=26165) ERROR 08-16 05:39:57 worker_base.py:382]   File "/usr/local/lib/python3.10/dist-packages/torch/distributed/distributed_c10d.py", line 2220, in all_reduce [repeated 6x across cluster]
(RayWorkerWrapper pid=26165) ERROR 08-16 05:39:57 worker_base.py:382]     work = group.allreduce([tensor], opts) [repeated 6x across cluster]
(RayWorkerWrapper pid=26165) ERROR 08-16 05:39:57 worker_base.py:382] RuntimeError: collective nonSFG is not supported during hpu graph capturing [repeated 6x across cluster]
```
Hi @xinsu626, please set this variable: `PT_HPU_ENABLE_LAZY_COLLECTIVES=true`. It is required to make HPU graphs work with tensor parallelism. Please check: Environment variables
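To illustrate, a minimal sketch of how the suggested fix would change the reproduction script above (this assumes the Ray workers inherit the driver's environment when the variable is set before vLLM is imported; exporting it in the shell before launching the script works as well):

```python
import os

# Required for HPU graphs to work with tensor parallelism (tensor_parallel_size > 1).
os.environ['PT_HPU_ENABLE_LAZY_COLLECTIVES'] = 'true'
os.environ['PT_HPU_LAZY_MODE'] = '1'

from vllm import LLM  # import only after the environment is configured

llm = LLM(model="meta-llama/Meta-Llama-3-70B", max_num_seqs=32, tensor_parallel_size=8)
```

Equivalently, run `export PT_HPU_ENABLE_LAZY_COLLECTIVES=true` in the shell before starting the original script.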