Couldn't detect GPU when generating with Ray and data_parallel_size > 1 #2591

Open
zhaocaibei123 opened this issue Dec 23, 2024 · 1 comment

@zhaocaibei123

I'm trying to use data parallelism for generation; this is my command:

CUDA_VISIBLE_DEVICES="0,1" \
lm_eval --model vllm \
    --model_args pretrained=./downloads/llama3.2-1b-instruct,trust_remote_code=true,tensor_parallel_size=1,dtype=auto,gpu_memory_utilization=0.3,data_parallel_size=2 \
    --tasks truthfulqa_gen \
    --batch_size auto

I got this error:
[screenshot of the error; image not preserved]
And I tried to print the CUDA devices like this:
[screenshot of the check; image not preserved]
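Since that screenshot is not preserved, here is a rough sketch of what such a check could look like (not from the original report; it assumes ray and torch are importable and matches the two-GPU setup above):

import os
import ray
import torch

ray.init()

@ray.remote(num_gpus=1)
def gpu_info():
    # Report which devices this Ray worker was assigned.
    return {
        "CUDA_VISIBLE_DEVICES": os.environ.get("CUDA_VISIBLE_DEVICES"),
        "cuda_available": torch.cuda.is_available(),
        "device_count": torch.cuda.device_count(),
    }

# One task per expected data-parallel replica.
print(ray.get([gpu_info.remote() for _ in range(2)]))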
ray: 2.39.0
vllm: 0.6.4.post1
torch: 2.5.1+cu124
I wonder whether there is something wrong with my environment and how to solve this problem.

@zhaocaibei123 (Author)

I reset my environment and it's ok now.
