MMDeploy provides useful tools that make it easy to deploy OpenMMLab models to various platforms. You can convert models with our pre-defined pipelines or build a custom conversion pipeline yourself. This guide shows you how to convert a model with MMDeploy and how to integrate MMDeploy's SDK into your application!
First, install MMDeploy following build.md. Note that the build steps differ slightly among the supported backends. Here are brief introductions to these backends:
- ONNXRuntime: ONNX Runtime is a cross-platform inference and training machine-learning accelerator. It has the best support for the ONNX IR.
- TensorRT: NVIDIA® TensorRT™ is an SDK for high-performance deep learning inference. It includes a deep learning inference optimizer and runtime that delivers low latency and high throughput for deep learning inference applications. It is a good choice if you want to deploy your model on NVIDIA devices.
- ncnn: ncnn is a high-performance neural network inference computing framework optimized for mobile platforms. ncnn has considered deployment and usage on mobile phones from the beginning of its design.
- PPLNN: PPLNN, which is short for "PPLNN is a Primitive Library for Neural Network", is a high-performance deep-learning inference engine for efficient AI inferencing. It can run various ONNX models and has enhanced support for models from OpenMMLab.
- OpenVINO: OpenVINO™ is an open-source toolkit for optimizing and deploying AI inference. It integrates seamlessly with Intel AI hardware, the latest neural network accelerator chips, the Intel AI stick, and embedded computers or edge devices.
Choose the backend that meets your needs and install it following the links above.
Once you have installed MMDeploy, you can convert a PyTorch model from the OpenMMLab model zoo to a backend model with one magic spell! For example, if you want to convert Faster R-CNN in MMDetection to TensorRT:
```bash
# Assume you have installed MMDeploy in ${MMDEPLOY_DIR} and MMDetection in ${MMDET_DIR}.
# If you do not know where to find the paths, just type `pip show mmdeploy` and `pip show mmdet` in your console.
python ${MMDEPLOY_DIR}/tools/deploy.py \
    ${MMDEPLOY_DIR}/configs/mmdet/detection/detection_tensorrt_dynamic-320x320-1344x1344.py \
    ${MMDET_DIR}/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    ${CHECKPOINT_DIR}/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    ${INPUT_IMG} \
    --work-dir ${WORK_DIR} \
    --device cuda:0 \
    --dump-info
```
`${MMDEPLOY_DIR}/tools/deploy.py` is a tool that does everything you need to convert a model. Read how_to_convert_model for more details. The converted model and other meta information can be found in `${WORK_DIR}`. Together they make up the MMDeploy SDK Model, which can be fed to the MMDeploy SDK for model inference.
`detection_tensorrt_dynamic-320x320-1344x1344.py` is a config file that contains all the arguments you need to customize the conversion pipeline. Its name follows the pattern `<task name>_<backend>-[backend options]_<dynamic support>.py`, which makes it easy to find the deployment config you need by name. If you want to customize the conversion, you can edit the config file yourself. Here is a tutorial about how to write a config.
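Before running the conversion, you can also open a deployment config programmatically to see what it contains. Below is a minimal sketch, assuming mmcv is installed (as in the setup later in this guide); the checkout path is a placeholder to adjust to your environment:

```python
# A minimal sketch for inspecting a deployment config.
# The path below is an assumption -- replace /path/to/mmdeploy with your checkout.
from mmcv import Config

deploy_cfg = Config.fromfile(
    '/path/to/mmdeploy/configs/mmdet/detection/'
    'detection_tensorrt_dynamic-320x320-1344x1344.py')

# Deployment configs are plain mmcv configs; printing them shows the ONNX
# export settings and the backend-specific options they define.
print(deploy_cfg.pretty_text)
```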
Now you can do model inference with the APIs provided by the backend. But what if you want to test the model instantly? We have some backend wrappers for you.
```python
from mmdeploy.apis import inference_model

result = inference_model(model_cfg, deploy_cfg, backend_models, img=img, device=device)
```
`inference_model` will create a wrapper module and do the inference for you. The result has the same format as the output of the original OpenMMLab repo.
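For instance, a hedged sketch of running the TensorRT Faster R-CNN converted above could look like this. The backend file name (`end2end.engine`) and all paths are assumptions; use the files actually produced in your `${WORK_DIR}`:

```python
# A minimal sketch, assuming the TensorRT Faster R-CNN converted above.
# All paths below are placeholders -- point them at your own files.
from mmdeploy.apis import inference_model

deploy_cfg = '/path/to/mmdeploy/configs/mmdet/detection/detection_tensorrt_dynamic-320x320-1344x1344.py'
model_cfg = '/path/to/mmdetection/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
backend_models = ['/path/to/work_dir/end2end.engine']  # file(s) produced by tools/deploy.py

result = inference_model(model_cfg, deploy_cfg, backend_models,
                         img='/path/to/demo.jpg', device='cuda:0')
print(result)  # same format as the original MMDetection result
```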
You might wonder: does the backend model have the same precision as the original one? How fast can it run? MMDeploy provides tools to test the model. Take the converted TensorRT Faster R-CNN as an example:
```bash
python ${MMDEPLOY_DIR}/tools/test.py \
    ${MMDEPLOY_DIR}/configs/mmdet/detection/detection_tensorrt_dynamic-320x320-1344x1344.py \
    ${MMDET_DIR}/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    --model ${BACKEND_MODEL_FILES} \
    --metrics ${METRICS} \
    --device cuda:0
```
Read how to evaluate a model for more details about how to use `tools/test.py`.
Make sure to turn on `MMDEPLOY_BUILD_SDK` to build and install the SDK by following build.md. After that, the installation folder has the following structure:
```
install
├── example
├── include
│   ├── c
│   └── cpp
└── lib
```
where `include/c` and `include/cpp` correspond to the C and C++ API respectively.
Caution: The C++ API is highly volatile and not recommended at the moment.
In the `example` directory, there are several examples involving classification, object detection, image segmentation and so on. You can refer to these examples to learn how to use MMDeploy SDK's C API and how to link `${MMDeploy_LIBS}` to your application.
Here is an example of how to deploy and run inference with the Faster R-CNN model from MMDetection, starting from scratch.
Please run the following commands in an Anaconda environment to install MMDetection.
```bash
conda create -n openmmlab python=3.7 -y
conda activate openmmlab

conda install pytorch==1.8.0 torchvision==0.9.0 cudatoolkit=10.2 -c pytorch -y

# install the latest mmcv
pip install mmcv-full -f https://download.openmmlab.com/mmcv/dist/cu102/torch1.8.0/index.html

# install mmdetection
git clone https://github.com/open-mmlab/mmdetection.git
cd mmdetection
pip install -r requirements/build.txt
pip install -v -e .
```
Download the checkpoint from this link and put it in `{MMDET_ROOT}/checkpoints`, where `{MMDET_ROOT}` is the root directory of your MMDetection codebase.
Please run the following commands in the Anaconda environment to install MMDeploy.
```bash
conda activate openmmlab

git clone https://github.com/open-mmlab/mmdeploy.git
cd mmdeploy
git submodule update --init --recursive
pip install -e .
```
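As an optional sanity check, you can verify that the editable installs succeeded by importing the packages and printing their versions (a small sketch, assuming all three packages expose `__version__`):

```python
# Optional sanity check: make sure mmcv, mmdet and mmdeploy are importable.
import mmcv
import mmdet
import mmdeploy

print('mmcv:', mmcv.__version__)
print('mmdet:', mmdet.__version__)
print('mmdeploy:', mmdeploy.__version__)
```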
Once we have installed MMDeploy, we should select an inference engine for model inference. Here we take ONNX Runtime as an example. Run the following command to install ONNX Runtime:
```bash
pip install onnxruntime==1.8.1
```
Then download the ONNX Runtime library to build the mmdeploy plugin for ONNX Runtime:
```bash
wget https://github.com/microsoft/onnxruntime/releases/download/v1.8.1/onnxruntime-linux-x64-1.8.1.tgz
tar -zxvf onnxruntime-linux-x64-1.8.1.tgz
cd onnxruntime-linux-x64-1.8.1
export ONNXRUNTIME_DIR=$(pwd)
export LD_LIBRARY_PATH=$ONNXRUNTIME_DIR/lib:$LD_LIBRARY_PATH

cd ${MMDEPLOY_DIR} # To MMDeploy root directory
mkdir -p build && cd build

# build ONNXRuntime custom ops
cmake -DMMDEPLOY_TARGET_BACKENDS=ort -DONNXRUNTIME_DIR=${ONNXRUNTIME_DIR} ..
make -j$(nproc)

# build MMDeploy SDK
cmake -DMMDEPLOY_BUILD_SDK=ON \
      -DCMAKE_CXX_COMPILER=g++-7 \
      -DOpenCV_DIR=/path/to/OpenCV/lib/cmake/OpenCV \
      -Dspdlog_DIR=/path/to/spdlog/lib/cmake/spdlog \
      -DONNXRUNTIME_DIR=${ONNXRUNTIME_DIR} \
      -DMMDEPLOY_TARGET_BACKENDS=ort \
      -DMMDEPLOY_CODEBASES=mmdet ..
make -j$(nproc) && make install
```
Once we have installed MMDetection, MMDeploy, ONNX Runtime and built the plugin for ONNX Runtime, we can convert Faster R-CNN to an `.onnx` model file that ONNX Runtime can consume. Run the following commands to use our deploy tools:
```bash
# Assume you have installed MMDeploy in ${MMDEPLOY_DIR} and MMDetection in ${MMDET_DIR}.
# If you do not know where to find the paths, just type `pip show mmdeploy` and `pip show mmdet` in your console.
python ${MMDEPLOY_DIR}/tools/deploy.py \
    ${MMDEPLOY_DIR}/configs/mmdet/detection/detection_onnxruntime_dynamic.py \
    ${MMDET_DIR}/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py \
    ${MMDET_DIR}/checkpoints/faster_rcnn_r50_fpn_1x_coco_20200130-047c8118.pth \
    ${MMDET_DIR}/demo/demo.jpg \
    --work-dir work_dirs \
    --device cpu \
    --show \
    --dump-info
```
If the script runs successfully, two images will be displayed on the screen one after the other. The first image is the inference result of ONNX Runtime and the second is the result of PyTorch. At the same time, an ONNX model file `end2end.onnx` and three JSON files (SDK config files) will be generated in the working directory `work_dirs`.
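If you prefer to check the exported model programmatically instead of through the pop-up visualization, you can feed it to the `inference_model` helper shown earlier. This is a sketch that assumes you run it from the MMDeploy root directory (where the conversion above was launched) and that MMDetection is cloned alongside MMDeploy:

```python
# A minimal sketch for running the exported ONNX model with the ONNX Runtime
# backend wrapper; the relative paths are assumptions based on this example.
from mmdeploy.apis import inference_model

deploy_cfg = 'configs/mmdet/detection/detection_onnxruntime_dynamic.py'
model_cfg = '../mmdetection/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py'
backend_models = ['work_dirs/end2end.onnx']

result = inference_model(model_cfg, deploy_cfg, backend_models,
                         img='../mmdetection/demo/demo.jpg', device='cpu')
print(result)
```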
After model conversion, the SDK Model is saved in the directory `${work_dir}`. Here is a recipe for building and running the object detection demo.
```bash
cd build/install/example

# path to the onnxruntime libraries
export LD_LIBRARY_PATH=/path/to/onnxruntime/lib

mkdir -p build && cd build
cmake -DOpenCV_DIR=path/to/OpenCV/lib/cmake/OpenCV \
      -DMMDeploy_DIR=${MMDEPLOY_DIR}/build/install/lib/cmake/MMDeploy ..
make object_detection

# suppress verbose logs
export SPDLOG_LEVEL=warn

# run the object detection example
./object_detection cpu ${work_dirs} ${path/to/an/image}
```
If the demo runs successfully, an image named "output_detection.png" showing the detected objects will be generated.
If the models you want to deploy are not yet supported in MMDeploy, you can try to support them yourself. Here are some documents that may help you:
- Read how_to_support_new_models to learn more about the rewriter.
Finally, we welcome your PR!