A Runtime-Based Computational Performance Predictor for Deep Neural Network Training
- Installation
- Building from source
- Usage example
- Development Environment Setup
- Release process
- Release history
- License
- Research paper
- Contributing
DeepView.Predict is a tool that predicts a deep neural network's training iteration execution time on a given GPU. It currently supports PyTorch. To learn more about how DeepView.Predict works, please see our research paper.
To run DeepView.Predict, you need:
- Python 3.8+
- PyTorch 1.13.1+
- A system equipped with an NVIDIA GPU and a properly configured CUDA installation
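The requirements above can be checked from Python before installing. The following is a minimal sketch and not part of DeepView.Predict: the helper names `parse_version` and `check_requirements` are hypothetical, and the script only reports what it finds.

```python
import importlib.util
import re
import sys


def parse_version(text):
    """Turn a version string such as '1.13.1+cu117' into a tuple like (1, 13, 1)."""
    numbers = []
    for piece in text.split("+")[0].split(".")[:3]:
        match = re.match(r"\d+", piece)
        if match is None:
            break
        numbers.append(int(match.group()))
    return tuple(numbers)


def check_requirements(min_python=(3, 8), min_torch=(1, 13, 1)):
    """Return a list of problems found; an empty list means every check passed."""
    problems = []
    if sys.version_info[:2] < min_python:
        problems.append("Python %d.%d+ is required" % min_python)
    if importlib.util.find_spec("torch") is None:
        problems.append("PyTorch is not installed")
        return problems
    import torch  # imported lazily so the script still runs without PyTorch

    if parse_version(torch.__version__) < min_torch:
        problems.append("PyTorch 1.13.1+ is required, found " + torch.__version__)
    if not torch.cuda.is_available():
        problems.append("PyTorch cannot see a CUDA-capable GPU")
    return problems


if __name__ == "__main__":
    for problem in check_requirements():
        print("WARNING:", problem)
```

An empty result does not guarantee that CUPTI or the GPU performance counters are accessible; those are covered by the build steps and the note on permissions.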
Currently, we have predictors for the following NVIDIA GPUs:
GPU | Generation | Memory | Mem. Type | SMs |
---|---|---|---|---|
P4000 | Pascal | 8 GB | GDDR5 | 14 |
P100 | Pascal | 16 GB | HBM2 | 56 |
V100 | Volta | 16 GB | HBM2 | 80 |
RTX 2070 | Turing | 8 GB | GDDR6 | 36 |
RTX 2080Ti | Turing | 11 GB | GDDR6 | 68 |
T4 | Turing | 16 GB | GDDR6 | 40 |
RTX 3090 | Ampere | 24 GB | GDDR6X | 82 |
A100 | Ampere | 40 GB | HBM2 | 108 |
A40 | Ampere | 48 GB | GDDR6 | 84 |
RTX A4000 | Ampere | 16 GB | GDDR6 | 48 |
RTX 4000 | Turing | 8 GB | GDDR6 | 36 |
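For scripting, the table above can be kept as a small lookup structure. This is an illustrative sketch only: the `Gpu` tuple, the `SUPPORTED_GPUS` list, and both helper functions are hypothetical names that mirror the table and are not part of the `habitat` package.

```python
from collections import namedtuple

Gpu = namedtuple("Gpu", ["name", "generation", "memory_gb", "memory_type", "sms"])

# Values copied from the table of supported GPUs.
SUPPORTED_GPUS = [
    Gpu("P4000", "Pascal", 8, "GDDR5", 14),
    Gpu("P100", "Pascal", 16, "HBM2", 56),
    Gpu("V100", "Volta", 16, "HBM2", 80),
    Gpu("RTX 2070", "Turing", 8, "GDDR6", 36),
    Gpu("RTX 2080Ti", "Turing", 11, "GDDR6", 68),
    Gpu("T4", "Turing", 16, "GDDR6", 40),
    Gpu("RTX 3090", "Ampere", 24, "GDDR6X", 82),
    Gpu("A100", "Ampere", 40, "HBM2", 108),
    Gpu("A40", "Ampere", 48, "GDDR6", 84),
    Gpu("RTX A4000", "Ampere", 16, "GDDR6", 48),
    Gpu("RTX 4000", "Turing", 8, "GDDR6", 36),
]


def gpus_by_generation(generation):
    """Names of supported GPUs from one architecture generation, in table order."""
    return [gpu.name for gpu in SUPPORTED_GPUS if gpu.generation == generation]


def gpus_with_at_least(memory_gb):
    """Names of supported GPUs with at least the given amount of memory."""
    return [gpu.name for gpu in SUPPORTED_GPUS if gpu.memory_gb >= memory_gb]


if __name__ == "__main__":
    print("Turing GPUs:", gpus_by_generation("Turing"))
    print("GPUs with 24 GB or more:", gpus_with_at_least(24))
```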
Install via pip with the following command:
pip install deepview-predict
- Install CUPTI
CUPTI is a profiling interface required by DeepView.Predict. Select your version of CUDA here and follow the instructions to add NVIDIA's repository. Then, install CUPTI with:
sudo apt-get install cuda-cupti-xx-x
where xx-x represents the version of CUDA you have installed.
Alternatively, if you do not have root access on your machine, you can use conda to install CUPTI. Select your version of CUDA here and follow the instructions. For example, if you have CUDA 11.6.0, you can install CUPTI with:
conda install -c "nvidia/label/cuda-11.6.0" cuda-cupti
After installing CUPTI, add $CONDA_HOME/extras/CUPTI/lib64/ to LD_LIBRARY_PATH to ensure the library is linked.
- Install CMake 3.17+.
  - CMake 3.24.0 and 3.24.1 have a bug that prevents them from finding the CUPTI directory, which breaks DeepView.Predict. Do not use those versions.
  - Run the following commands to download and install a precompiled version of CMake 3.24.2:
    wget https://github.com/Kitware/CMake/releases/download/v3.24.2/cmake-3.24.2-linux-x86_64.sh
    chmod +x cmake-3.24.2-linux-x86_64.sh
    mkdir /opt/cmake
    sh cmake-3.24.2-linux-x86_64.sh --prefix=/opt/cmake --skip-license
    ln -s /opt/cmake/bin/cmake /usr/local/bin/cmake
  - You can verify the version of CMake you installed with the following command:
    cmake --version
- Install Git Large File Storage.
- Clone the DeepView.Predict package:
  git clone https://github.com/CentML/DeepView.Predict
  cd DeepView.Predict
- Get the pre-trained models used by DeepView.Predict:
  git submodule init && git submodule update
  git lfs pull
- Finally, build DeepView.Predict with the following command:
  ./analyzer/install-dev.sh
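The CMake and CUPTI constraints from the steps above can be checked before running the build. The sketch below is an assumption-laden helper, not part of DeepView.Predict: the function names are hypothetical, and the CUPTI check is a rough heuristic that only looks for a directory containing "CUPTI" in LD_LIBRARY_PATH, which matters mainly for the conda install route.

```python
import os
import re
import shutil
import subprocess

MIN_CMAKE = (3, 17, 0)
BROKEN_CMAKE = {(3, 24, 0), (3, 24, 1)}  # versions reported above as unable to find CUPTI


def parse_cmake_version(output):
    """Extract (major, minor, patch) from the output of `cmake --version`."""
    match = re.search(r"cmake version (\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        return None
    return tuple(int(part) for part in match.groups())


def cmake_is_usable(version):
    """True if the version meets the minimum and is not one of the broken releases."""
    return version is not None and version >= MIN_CMAKE and version not in BROKEN_CMAKE


def cupti_on_library_path(env=None):
    """Heuristic: True if any LD_LIBRARY_PATH entry names a CUPTI directory."""
    env = os.environ if env is None else env
    entries = env.get("LD_LIBRARY_PATH", "").split(":")
    return any("CUPTI" in entry.upper() for entry in entries if entry)


if __name__ == "__main__":
    if shutil.which("cmake") is None:
        print("cmake was not found on PATH")
    else:
        result = subprocess.run(["cmake", "--version"], capture_output=True, text=True)
        version = parse_cmake_version(result.stdout)
        print("cmake version:", version, "usable:", cmake_is_usable(version))
    print("CUPTI directory on LD_LIBRARY_PATH:", cupti_on_library_path())
```

A negative CUPTI result is not necessarily a problem when CUPTI was installed system-wide with apt, since the library may then be found without LD_LIBRARY_PATH.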
DeepView.Predict has been tested to work on the latest version of NVIDIA NGC PyTorch containers.
- To build DeepView.Predict with Docker, first run the NGC container, where XX.XX is the container version:
docker run --gpus all -it --rm nvcr.io/nvidia/pytorch:XX.XX-py3
- Inside the container, clone the repository, then build and install the DeepView.Predict Python package:
git clone --recursive https://github.com/CentML/DeepView.Predict
cd DeepView.Predict
./analyzer/install-dev.sh
Note: DeepView.Predict needs access to your GPU's performance counters, which requires special permissions if you are running with a recent driver (418.43 or later). If you encounter a CUPTI_ERROR_INSUFFICIENT_PRIVILEGES error when running DeepView.Predict, please follow the instructions here and in issue #5.
You can verify your DeepView.Predict installation by running the simple usage example:
# example.py
import habitat
import torch
import torchvision.models as models
# Define model and sample inputs
model = models.resnet50().cuda()
image = torch.rand(8, 3, 224, 224).cuda()
# Measure a single inference
tracker = habitat.OperationTracker(device=habitat.Device.RTX2080Ti)
with tracker.track():
    out = model(image)
trace = tracker.get_tracked_trace()
print("Run time on source:", trace.run_time_ms)
# Perform prediction to a single target device
pred = trace.to_device(habitat.Device.V100)
print("Predicted time on V100:", pred.run_time_ms)
python3 example.py
See experiments/run_experiment.py for other examples of DeepView.Predict usage.
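Once the example has printed a measured and a predicted run time, the two numbers can be compared directly. The sketch below is illustrative arithmetic only: `throughput` and `speedup` are hypothetical helpers, not part of the `habitat` API, and the run times are placeholder values rather than measurements. The batch size of 8 matches the `torch.rand(8, 3, 224, 224)` input in the example.

```python
def throughput(batch_size, run_time_ms):
    """Samples processed per second when one iteration takes run_time_ms."""
    return batch_size / (run_time_ms / 1000.0)


def speedup(source_ms, predicted_ms):
    """How many times faster the target GPU is predicted to be than the source GPU."""
    return source_ms / predicted_ms


if __name__ == "__main__":
    # Placeholder values; substitute trace.run_time_ms and pred.run_time_ms.
    source_ms = 40.0
    predicted_ms = 25.0
    batch_size = 8

    print("Predicted speedup: %.2fx" % speedup(source_ms, predicted_ms))
    print("Source throughput: %.1f samples/s" % throughput(batch_size, source_ms))
    print("Predicted throughput: %.1f samples/s" % throughput(batch_size, predicted_ms))
```

A speedup below 1 means the target GPU is predicted to be slower than the source GPU for this workload.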
See Releases
The code in this repository is licensed under the Apache 2.0 license (see LICENSE and NOTICE), with the exception of the files mentioned below.
This software contains source code provided by NVIDIA Corporation. These files are:
- The code under cpp/external/cupti_profilerhost_util/ (CUPTI sample code)
- cpp/src/cuda/cuda_occupancy.h
The code mentioned above is licensed under the NVIDIA Software Development Kit End User License Agreement.
We include the implementations of several deep neural networks under experiments/ for our evaluation. These implementations are copyrighted by their original authors and carry their original licenses. Please see the corresponding README files and license files inside the subdirectories for more information.
DeepView.Predict began as a research project in the EcoSystem Group at the University of Toronto. The accompanying research paper appeared in the proceedings of USENIX ATC'21. If you are interested, you can read a preprint of the paper here.
If you use DeepView.Predict in your research, please consider citing our paper:
@inproceedings{habitat-yu21,
author = {Yu, Geoffrey X. and Gao, Yubo and Golikov, Pavel and Pekhimenko,
Gennady},
title = {{Habitat: A Runtime-Based Computational Performance Predictor for
Deep Neural Network Training}},
booktitle = {{Proceedings of the 2021 USENIX Annual Technical Conference
(USENIX ATC'21)}},
year = {2021},
}
Check out CONTRIBUTING.md for more information on how to help with DeepView.Predict.