forked from mlcommons/mlcube
-
Notifications
You must be signed in to change notification settings - Fork 1
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Updating GPU (accelerator) support in MLCube. (mlcommons#351)
This commit changes how MLCube runtime works with GPUs. Users have two options to provide information on required accelerators: - It can be done in MLCube configuration file in the `platform` section (platform.accelerator_count). This value is optional, and if present, may be an empy / non-set string, or an integer. This values is the number of required accelerators, and semantically, is equivalent to docker's `--gpu=N` CLI argument. - It can be done on a command line using `--gpus` argument, e.g., `mlcube run ... --gpus=4`. This parameter accepts same values as docker's `--gpus` CLI argument does. Concretely: - When not set MLCube will use platform.accelerator_count value if present. - When set, this overrides any value assigned to platform.accelerator_count. Could be empty (`--gpus=`) to disable GPUs, all (`--gpus=all`) to use all available GPUs, GPU count (`--gpus=N`) or list of concrete GPUs (`--gpus="device=0,2"`). In this list GPU indices or UUIDs can be used. Docker runner will pretty much use this value unmodified and will pass it to docker run command, e.g., `--gpus` flag will present. Singularity runner will pass `--nv` command when GPUs are requested. No CUDA_VISIBLE_DEVICES, SINGULARITYENV_CUDA_VISIBLE_DEVICES or any other env variable will be set by MLCube runtime (docker NVIDIA runtime may set NVIDIA_VISIBLE_DEVICES). To debug for possble issues, enable debug mode (e.g., `mlcube --log-level=debug run ...`) and search the output for the log lines that contain `DEBUG Device spec (...) resolved to ...` and `INFO Device params ... resolved to ...`. They will provide additional information on how MLCube runtime determines how GPUs should be used.
- Loading branch information
1 parent
5f79b7a
commit 8833ec0
Showing
3 changed files
with
398 additions
and
32 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.