Couldn't detect GPU when generating with Ray and data_parallel_size > 1 #2591

Open
zhaocaibei123 opened this issue Dec 23, 2024 · 1 comment

@zhaocaibei123

I'm trying to use data parallelism for generation; this is my command:

CUDA_VISIBLE_DEVICES="0,1" \
lm_eval --model vllm \
    --model_args pretrained=./downloads/llama3.2-1b-instruct,trust_remote_code=true,tensor_parallel_size=1,dtype=auto,gpu_memory_utilization=0.3,data_parallel_size=2 \
    --tasks truthfulqa_gen \
    --batch_size auto

I got this error:
[screenshot of the error; image not preserved]
And I tried to print the CUDA devices like this:
[screenshot of the check; image not preserved]
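Since that screenshot is not preserved, here is a rough sketch of what such a check could look like (not from the original report; it assumes ray and torch are importable and matches the two-GPU setup above):

import os
import ray
import torch

ray.init()

@ray.remote(num_gpus=1)
def gpu_info():
    # Report which devices this Ray worker was assigned.
    return {
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }

# One task per expected data-parallel replica.
print(ray.get([gpu_info.remote() for _ in range(2)]))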
ray: 2.39.0
vllm: 0.6.4.post1
torch: 2.5.1+cu124
I wonder whether there is something wrong with my environment and how to solve this problem.

@zhaocaibei123 (Author)

I reset my environment and it's ok now.
