GPUs are not visible from the container #293

Closed
martindellavecchia opened this issue Sep 26, 2024 · 3 comments
Labels
bug Something isn't working

Comments


martindellavecchia commented Sep 26, 2024

Which OS are you using?

  • OS: Ubuntu 24.04
  • Env: Docker version 27.3.1, build ce12230 (rootful)

I thought this might be an issue with my two-GPU rig running containers, but even after wiping Docker, the NVIDIA drivers, and CUDA and reinstalling, I cannot get it working and I keep getting the error below. It looks like Docker is not able to reach my GPUs.

[+] Running 2/2
✔ Network whisper-webui_default Created 0.1s
✔ Container whisper-webui-app-1 Created 0.1s
Attaching to app-1
app-1 | /Whisper-WebUI/venv/lib/python3.11/site-packages/torch/cuda/__init__.py:619: UserWarning: Can't initialize NVML
app-1 | warnings.warn("Can't initialize NVML")
app-1 | Use "faster-whisper" implementation
app-1 | Device "auto" is detected
app-1 | Running on local URL: http://0.0.0.0:7860
app-1 |
app-1 | To create a public link, set share=True in launch().

Any thoughts or ideas on how I can make my GPUs visible? (They are visible on the host, validated with nvidia-smi.)
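
One way to narrow this down is to check whether the NVIDIA device nodes are present inside the container at all. The sketch below assumes the usual Linux layout, in which a GPU-enabled container sees `/dev/nvidiactl` plus at least one `/dev/nvidiaN` node. The function name `gpu_nodes_visible` and its optional directory argument are illustrative and not part of Whisper-WebUI or Docker.

```shell
# Sketch: report whether NVIDIA device nodes are present under a directory.
# Assumption: a container with GPU access exposes /dev/nvidiactl and at
# least one /dev/nvidiaN node (N is the GPU index).
gpu_nodes_visible() {
  local dev_dir="${1:-/dev}"   # hypothetical override, only to make the check testable
  local node
  for node in "$dev_dir"/nvidia[0-9]*; do
    # -e is false when the glob matched nothing and stayed literal
    if [ -e "$node" ] && [ -e "$dev_dir/nvidiactl" ]; then
      echo "visible"
      return 0
    fi
  done
  echo "missing"
  return 1
}
```

Run inside the container (for example through `docker exec` against `whisper-webui-app-1`), a result of `missing` would suggest that the container was started without GPU access, rather than a problem in PyTorch itself.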

martindellavecchia added the bug label Sep 26, 2024
Owner

jhj0517 commented Sep 27, 2024

Similar to #294, it seems that CUDA (the software that comes with the NVIDIA GPU) is not installed or not detected on your PC.
Can you run

nvcc --version

and see what it prints?
It should report CUDA 12.4, which is this project's recommended version.

+) As far as I know, CUDA should be installed on your base machine, not in the container, since the image uses debian:bookworm-slim instead of pytorch/pytorch:2.4.1-cuda12.4-cudnn9-runtime to keep it lighter.
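
The version check requested above can be scripted when comparing several machines. This is a minimal sketch that assumes `nvcc --version` prints a line containing `release X.Y`, as typical CUDA toolkit output does. `cuda_release` is a hypothetical helper name.

```shell
# Sketch: read `nvcc --version` output on stdin and print only the
# "release X.Y" number from the first line that contains one.
cuda_release() {
  sed -n 's/.*release \([0-9][0-9]*\.[0-9][0-9]*\).*/\1/p' | head -n 1
}
```

On a machine with the toolkit installed, `nvcc --version | cuda_release` prints only the release number, which can then be compared with the 12.4 recommended for this project.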

@martindellavecchia
Author

Yep, correct, on my host it's working perfectly:

$ nvcc --version
nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2024 NVIDIA Corporation
Built on Wed_Aug_14_10:10:22_PDT_2024
Cuda compilation tools, release 12.6, V12.6.68
Build cuda_12.6.r12.6/compiler.34714021_0

nvidia-smi also works on the host:

$ nvidia-smi
Fri Sep 27 10:20:11 2024
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 560.35.03              Driver Version: 560.35.03      CUDA Version: 12.6     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|=========================================+========================+======================|
|   0  NVIDIA GeForce RTX 3060        Off |   00000000:01:00.0 Off |                  N/A |
|  0%   38C    P8             12W /  170W |       4MiB /  12288MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
|   1  NVIDIA GeForce RTX 3070 Ti     Off |   00000000:03:00.0 Off |                  N/A |
|  0%   40C    P8             10W /  310W |    3228MiB /   8192MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+

+-----------------------------------------------------------------------------------------+
| Processes:                                                                              |
|  GPU   GI   CI        PID   Type   Process name                              GPU Memory |
|        ID   ID                                                               Usage      |
|=========================================================================================|
|    1   N/A  N/A     28496      C   python3.10                                   3218MiB |
+-----------------------------------------------------------------------------------------+

@martindellavecchia
Author

It's working on my host, but not from the container. I assume it's my rig.
