Benchmark the inference speed of CNNs with various quantization methods using TensorRT!

⭐ this repo if it helps you.

Run: `inference_tensorrt.py`
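For a rough idea of how the PytorchRaw latencies below can be measured, here is a minimal sketch using the usual warm-up/synchronize pattern. The model and input size are examples taken from the tables, not the actual contents of `inference_tensorrt.py`:

```python
import time

import torch
import torchvision.models as models

# Minimal latency sketch for the PytorchRaw baseline (assumed setup;
# the real measurement lives in inference_tensorrt.py).
model = models.resnet18(pretrained=True).eval().cuda()
x = torch.randn(1, 3, 256, 256).cuda()

with torch.no_grad():
    # Warm up so one-time CUDA initialization does not skew the timing.
    for _ in range(10):
        model(x)

    torch.cuda.synchronize()
    start = time.time()
    n_runs = 100
    for _ in range(n_runs):
        model(x)
    torch.cuda.synchronize()  # wait for all GPU work before reading the clock

print(f"mean latency: {(time.time() - start) / n_runs * 1000:.1f} ms")
```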
TRT denotes TensorRT-compiled models at the listed precision.
Latency of image inference (1,3,256,256) [ms]

|          | TRT FP32 | TRT FP16 | TRT INT8 |
|----------|----------|----------|----------|
| resnet18 | 26       | 18       |          |
| resnet34 | 48       | 30       |          |
| resnet50 | 79       | 42       |          |
The Jetson Nano does not support INT8, hence the empty INT8 column.
Latency of image inference (1,3,256,256) [ms]

|            | resnet18 | resnet34 | resnet50 |
|------------|----------|----------|----------|
| PytorchRaw | 11       | 12       | 16       |
| TRT FP32   | 3.8      | 5.6      | 9.9      |
| TRT FP16   | 2.1      | 3.3      | 4.4      |
| TRT INT8   | 1.7      | 2.7      | 3.0      |
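The three TRT rows in these tables come from compiling the model at different precisions. A minimal sketch using torch2trt's precision flags (model and input shape are just examples matching the tables; the exact settings in `inference_tensorrt.py` may differ):

```python
import torch
from torch2trt import torch2trt
from torchvision.models import resnet18

model = resnet18(pretrained=True).eval().cuda()
x = torch.randn(1, 3, 256, 256).cuda()

model_trt_fp32 = torch2trt(model, [x])                  # TRT FP32 (default)
model_trt_fp16 = torch2trt(model, [x], fp16_mode=True)  # TRT FP16
# TRT INT8 needs calibration data; if no int8_calib_dataset is passed,
# torch2trt falls back to calibrating on the example input alone.
model_trt_int8 = torch2trt(model, [x], int8_mode=True)
```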
Latency of image inference (1,3,512,512) [ms]

|            | fcn_resnet50 | fcn_resnet101 | deeplabv3_resnet50 | deeplabv3_resnet101 |
|------------|--------------|---------------|--------------------|---------------------|
| PytorchRaw | 200          | 344           | 281                | 426                 |
| TRT FP32   | 173          | 290           | 252                | 366                 |
| TRT FP16   | 36           | 57            | 130                | 151                 |
| TRT INT8   | 21           | 32            | 97                 | 108                 |
Latency of image inference (1,3,256,256) [ms]

|            | fcn_resnet50 |
|------------|--------------|
| PytorchRaw | 6800         |
| TRT FP32   | 767          |
| TRT FP16   | 40           |
| TRT INT8   | NA           |
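One caveat for the segmentation rows: torchvision's segmentation models return an OrderedDict (`{'out': ..., 'aux': ...}`) rather than a plain tensor, which torch2trt cannot convert directly, so the model presumably has to be wrapped first. A minimal sketch (`SegWrapper` is a hypothetical helper, not part of this repo):

```python
import torch
from torchvision.models.segmentation import fcn_resnet50

class SegWrapper(torch.nn.Module):
    """Hypothetical wrapper: unpack the dict output so torch2trt
    sees a plain tensor coming out of forward()."""

    def __init__(self, model):
        super().__init__()
        self.model = model

    def forward(self, x):
        return self.model(x)['out']

model = SegWrapper(fcn_resnet50(pretrained=True)).eval().cuda()
x = torch.randn(1, 3, 512, 512).cuda()
out = model(x)  # plain tensor of shape (1, 21, 512, 512)
```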
The setup on Jetson hardware can be tricky; here is what worked for me.
- Install PyTorch

  https://forums.developer.nvidia.com/t/pytorch-for-jetson-nano-version-1-4-0-now-available/72048

  The stable version for the Jetson Nano seems to be torch==1.1. For the Xavier, torch==1.3 worked fine for me.
- Install torchvision
I followed this instruction and installed torchvision==0.3.0
sudo apt-get install libjpeg-dev zlib1g-dev
git clone -b v0.3.0 https://github.com/pytorch/vision torchvision
cd torchvision
sudo python3 setup.py install
- Install torch2trt

  I followed the README: https://github.com/NVIDIA-AI-IOT/torch2trt

  ```bash
  sudo apt-get install libprotobuf* protobuf-compiler ninja-build
  git clone https://github.com/NVIDIA-AI-IOT/torch2trt
  cd torch2trt
  sudo python3 setup.py install --plugins
  ```
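After installation, a quick smoke test (adapted from the torch2trt README) confirms the conversion works and the outputs match:

```python
import torch
from torch2trt import torch2trt
from torchvision.models import resnet18

# Build a TensorRT engine from the PyTorch model using an example input.
model = resnet18(pretrained=True).eval().cuda()
x = torch.ones(1, 3, 256, 256).cuda()
model_trt = torch2trt(model, [x])

# The TensorRT output should closely match the PyTorch output.
print(torch.max(torch.abs(model(x) - model_trt(x))))
```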