This repository contains CNNBench, a tool to generate and evaluate Convolutional Neural Network (CNN) architectures targeted at machine-learning accelerators. The tool can be used to search over a large space of CNN architectures.
```sh
git clone https://github.com/jha-lab/cnn_design-space.git
cd cnn_design-space
```
- PIP
```sh
virtualenv cnnbench
source cnnbench/bin/activate
pip install -r requirements.txt
```
- CONDA
```sh
conda env create -f environment.yaml
```
A basic run of the tool comprises the following:
- CNNs with modules comprising up to two vertices, each one of the operations in
[MAXPOOL3X3, CONV1X1, CONV3X3].
- Each module is stacked three times. A base stem of 3x3 convolution with 128 output channels is used. The stack of modules is followed by global average pooling and a final dense softmax layer (a minimal sketch of this skeleton is given after this list).
- Training on the CIFAR-10 dataset.
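For concreteness, here is a minimal PyTorch sketch of the skeleton described above. It is an illustration, not the repository's implementation; the module shown (CONV3X3 followed by MAXPOOL3X3) is just one valid two-vertex choice.

```python
import torch
import torch.nn as nn

def make_module(channels):
    # One possible two-vertex module: CONV3X3 followed by MAXPOOL3X3
    # (any operations from [MAXPOOL3X3, CONV1X1, CONV3X3] are allowed).
    return nn.Sequential(
        nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.MaxPool2d(kernel_size=3, stride=1, padding=1),
    )

class BasicCNN(nn.Module):
    def __init__(self, num_classes=10, channels=128):
        super().__init__()
        # Base stem: 3x3 convolution with 128 output channels.
        self.stem = nn.Conv2d(3, channels, kernel_size=3, padding=1)
        # The module is stacked three times.
        self.stack = nn.Sequential(*[make_module(channels) for _ in range(3)])
        self.head = nn.Linear(channels, num_classes)

    def forward(self, x):
        x = self.stack(self.stem(x))
        x = x.mean(dim=(2, 3))                     # global average pooling
        return torch.softmax(self.head(x), dim=1)  # final dense softmax layer

model = BasicCNN()
out = model(torch.randn(1, 3, 32, 32))  # CIFAR-10-sized input
print(out.shape)  # torch.Size([1, 10])
```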
```sh
cd cnnbench
python dataset_downloader.py
```
To use another dataset (among CIFAR-10, CIFAR-100, MNIST, and ImageNet), pass the corresponding input arguments; for details, run `python dataset_downloader.py --help`.
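As a point of reference, the sketch below shows how a CIFAR-10 download could be performed with `torchvision`. It is illustrative only, not necessarily what `dataset_downloader.py` does internally; the `root` path is an assumption.

```python
from torchvision import datasets

# Illustrative only: fetch the CIFAR-10 train/test splits to a local directory.
# The root path is an assumption, not the repository's actual layout.
train_set = datasets.CIFAR10(root='datasets/cifar10', train=True, download=True)
test_set = datasets.CIFAR10(root='datasets/cifar10', train=False, download=True)
print(len(train_set), len(test_set))  # 50000 10000
```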
```sh
python generate_library.py
```
This creates a `.json` file of all graphs at `dataset/dataset.json`, using the SHA-256 hashing algorithm (to identify unique graphs) and three modules per stack.
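To illustrate how a computational graph can be fingerprinted with SHA-256, here is a minimal sketch. The serialization shown is an assumption; the repository's actual scheme may differ, for instance by using an isomorphism-invariant hash.

```python
import hashlib
import json

def graph_hash(matrix, ops):
    """Hash a module's adjacency matrix and operation labels with SHA-256.

    Simplified sketch: a canonical JSON serialization is hashed directly.
    The actual library may use an isomorphism-invariant scheme instead.
    """
    payload = json.dumps({'matrix': matrix, 'ops': ops}, sort_keys=True)
    return hashlib.sha256(payload.encode('utf-8')).hexdigest()

# Example: a two-vertex module (CONV3X3 -> MAXPOOL3X3) with input/output nodes.
matrix = [[0, 1, 0, 0],
          [0, 0, 1, 0],
          [0, 0, 0, 1],
          [0, 0, 0, 0]]
ops = ['input', 'conv3x3', 'maxpool3x3', 'output']
print(graph_hash(matrix, ops))
```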
```sh
python run_boshnas.py
```
All training scripts are written in bash and use SLURM, which must be set up before running the experiments.
Other flags can be used to control the training procedure (check using `python run_boshnas.py --help`). This script uses the SLURM scheduler over multiple compute nodes in a cluster (each node is assumed to have one GPU; this can be changed in the script `job_scripts/job_train.sh`). SLURM can also be used in scenarios where distributed nodes are not available.
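For intuition, the kind of search loop such a script drives can be sketched as follows. This is illustrative only, not the API of `run_boshnas.py`: the names (`run_search`, `train_fn`) are hypothetical, and the random candidate selection stands in for BOSHNAS's surrogate-based acquisition.

```python
import random

def run_search(library, train_fn, budget=100, init_samples=10):
    """Illustrative surrogate-driven NAS loop (not run_boshnas.py's API).

    library: list of candidate architectures (e.g., graph hash strings)
    train_fn: trains an architecture and returns its validation accuracy
    """
    # Seed the search by training a few randomly chosen architectures.
    evaluated = {}
    for arch in random.sample(library, init_samples):
        evaluated[arch] = train_fn(arch)

    for _ in range(budget - init_samples):
        # A real implementation fits a surrogate model here and picks the
        # candidate maximizing an uncertainty-aware acquisition function;
        # this sketch simply picks an unevaluated candidate at random.
        candidates = [a for a in library if a not in evaluated]
        if not candidates:
            break
        arch = random.choice(candidates)
        evaluated[arch] = train_fn(arch)

    # Return the best architecture found within the training budget.
    return max(evaluated, key=evaluated.get)
```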
This tool was developed by Shikhar Tuli. For any questions, comments, or suggestions, please reach me at stuli@princeton.edu.
Cite our work using the following BibTeX entry:
```bibtex
@article{tuli2022codebench,
  author    = {Tuli, Shikhar and Li, Chia-Hao and Sharma, Ritvik and Jha, Niraj K.},
  title     = {{CODEBench}: A Neural Architecture and Hardware Accelerator Co-Design Framework},
  year      = {2022},
  publisher = {Association for Computing Machinery},
  address   = {New York, NY, USA},
  issn      = {1539-9087},
  url       = {https://doi.org/10.1145/3575798},
  doi       = {10.1145/3575798},
  note      = {Just Accepted},
  journal   = {ACM Trans. Embed. Comput. Syst.},
  month     = {dec}
}
```
BSD-3-Clause. Copyright (c) 2022, Shikhar Tuli and Jha Lab. All rights reserved.
See the LICENSE file for more details.