Deeploy is an ONNX-to-C compiler that generates low-level, optimized C code for multi-cluster, heterogeneous SoCs. Its goal is to enable configurable deployment flows from a bottom-up compiler perspective, modeling target hardware in a fine-grained and modular manner.
Deeploy is developed as part of the PULP project, a joint effort between ETH Zurich and the University of Bologna.
Unless specified otherwise in the respective file headers, all code checked into this repository is made available under a permissive license. All software sources and tool scripts are licensed under Apache 2.0, except for files contained in the scripts directory, which are licensed under the MIT license, and files contained in the DeeployTest/Tests directory, which are licensed under the Creative Commons Attribution-NoDerivatives 4.0 International license (CC BY-ND 4.0).
Installing Deeploy is as simple as running:
pip install -e . --extra-index-url=https://pypi.ngc.nvidia.com
However, to run the code generated by Deeploy on a given target, you need the toolchains and simulators associated with that platform.
We provide a Docker container in which Deeploy works out of the box (i.e., with all the dependencies pre-installed). To pull the Docker image, run:
docker pull ghcr.io/pulp-platform/deeploy:main
Then you can start the container in interactive mode with:
docker run -it ghcr.io/pulp-platform/deeploy:main
From the container, clone Deeploy, its submodules, and install the package with:
git clone https://github.com/pulp-platform/Deeploy.git && cd Deeploy
git submodule update --init --recursive
pip install -e . --extra-index-url=https://pypi.ngc.nvidia.com
Congratulations, you have installed Deeploy and its dependencies! To test your installation, let's run one simple test on each platform with the following commands:
cd DeeployTest && source /app/install/pulp-sdk/configs/siracusa.sh
python testRunner_generic.py -t Tests/Adder
python testRunner_cortexm.py -t Tests/Adder
python testRunner_mempool.py -t Tests/Adder
python testRunner_siracusa.py -t Tests/Adder --cores=8
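If you would rather script these four runs than type them one by one, a minimal Python sketch could look like the following. It only reuses the runner script names and flags shown above; the helper names `build_commands` and `run_all` are hypothetical and not part of Deeploy. It assumes you launch it from the DeeployTest directory, in a shell where the Siracusa configuration has already been sourced, since child processes inherit the parent environment.

```python
import subprocess
import sys

# Platform suffixes taken from the runner script names used above.
PLATFORMS = ["generic", "cortexm", "mempool", "siracusa"]


def build_commands(test_dir="Tests/Adder", cores=8):
    """Build one argv list per platform (hypothetical helper, not a Deeploy API)."""
    commands = []
    for platform in PLATFORMS:
        cmd = [sys.executable, f"testRunner_{platform}.py", "-t", test_dir]
        # Only the Siracusa command above is given a core count.
        if platform == "siracusa":
            cmd.append(f"--cores={cores}")
        commands.append(cmd)
    return commands


def run_all(test_dir="Tests/Adder", cores=8):
    """Run every command in turn and map each script name to its exit code."""
    results = {}
    for cmd in build_commands(test_dir, cores):
        results[cmd[1]] = subprocess.run(cmd).returncode
    return results
```

Calling `run_all()` returns a dictionary of exit codes, where any non-zero value points at the platform whose run failed.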
You can find the ONNX file in DeeployTest/Tests/Adder; to visualize it, you can use Netron. You can also find the generated code for platform X in TEST_X in DeeployTest, and you should notice that the generated code for the Adder test is very simple. However, this gets more complex when you add tiling. Let's generate the code for a single layer, but using tiling this time:
python testRunner_tiled_siracusa.py -t Tests/testMatMul --cores=8 --l1=16000
Now you can open the generated code in DeeployTest/TEST_SIRACUSA/Tests/testMatMul/Network.c and see how we executed a tiled layer.
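To give a rough feel for what tiling means here, the Python sketch below splits a matrix multiplication into output blocks whose operands fit in a fixed byte budget, loosely analogous to the --l1=16000 argument. This is a conceptual illustration only: the function names, the square-tile heuristic, and the one-byte-per-element default are assumptions made for this example and are not taken from Deeploy, whose generated C code will look different.

```python
def pick_tile(m, n, k, l1_bytes, elem_bytes=1):
    """Largest tile edge t such that a (t x k) block of A, a (k x t) block of B
    and a (t x t) block of C fit together in l1_bytes.
    Hypothetical heuristic for illustration, not Deeploy's tiling solver."""
    t = max(m, n)
    while t > 1 and (2 * t * k + t * t) * elem_bytes > l1_bytes:
        t -= 1
    if (2 * t * k + t * t) * elem_bytes > l1_bytes:
        raise ValueError("even a 1x1 output tile does not fit in the budget")
    return t


def tiled_matmul(a, b, tile):
    """Compute C = A @ B one (tile x tile) output block at a time."""
    m, k, n = len(a), len(b), len(b[0])
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, tile):
        for j0 in range(0, n, tile):
            # On a real target, the rows of A and columns of B needed for this
            # block would be copied into fast local memory at this point.
            for i in range(i0, min(i0 + tile, m)):
                for j in range(j0, min(j0 + tile, n)):
                    c[i][j] = sum(a[i][p] * b[p][j] for p in range(k))
    return c
```

Under these assumptions, a 128x128x128 multiplication with a 16000-byte budget gets a tile edge of 51, so the output is computed in a 3x3 grid of blocks rather than in one pass.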
Supported platforms:
- Generic CPU
- CortexM Processors
    - Simulators: QEMU
- MemPool extended with ITA
    - Hardware: MemPool paper, ITA paper
    - Simulators: Banshee
- Siracusa
    - Hardware: Siracusa paper
    - Simulators: GVSOC
To build the documentation, simply run:
make docs
Then open docs/_build/html/index.html for more extensive documentation and getting-started guides.
ESWEEK 2024: Deeploy: Enabling Energy-Efficient Deployment of Small Language Models On Heterogeneous Microcontrollers
@article{schererDeeployEnablingEnergyEfficient2024,
title = {Deeploy: {{Enabling Energy-Efficient Deployment}} of {{Small Language Models}} on {{Heterogeneous Microcontrollers}}},
shorttitle = {Deeploy},
author = {Scherer, Moritz and Macan, Luka and Jung, Victor J. B. and Wiese, Philip and Bompani, Luca and Burrello, Alessio and Conti, Francesco and Benini, Luca},
year = {2024},
month = nov,
journal = {IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems},
volume = {43},
number = {11},
pages = {4009--4020},
issn = {1937-4151},
doi = {10.1109/TCAD.2024.3443718},
}
The preprint version is also available on arXiv: arXiv:2408.04413.