SSD: Single Shot MultiBox Detector On Caltech Pedestrian Dataset

Introduction

In this work we apply the Single Shot MultiBox Detector (SSD) to the Caltech Pedestrian Dataset. In addition to the Caltech dataset, we also train on the ETH pedestrian dataset and the TUDBrussels dataset, and we fine-tune from the SSD512 model trained on 07++12+COCO. We reach miss rates competitive with the state of the art while running in real time (20 to 24 FPS on a GeForce GTX Titan X).

Results are shown below:

Model	Overall miss-rate	Reasonable miss-rate	FPS (Geforce GTX Titan X)	Input resolution
SSD512 (VGG16) (training from scratch + no hyper-parameters optimization)	65.17%	20.32%	22	640 x 480
SSD512 (VGG16)	54.44%	11.89%	24	512 x 512
SSD640 (VGG16)	53.11%	11.85%	20	640 x 480
F-DNN	50.5%	8.65%	6.25	640 x 480

Fixed-Point 16-bit Quantization

We also quantized the model to dynamic 16-bit fixed point using Caffe Ristretto. The script to test the quantized model is available on the ssd-ristretto branch under models/VGGNet/caltech/SSD_512x512_ft_quantized. (Ristretto does not require changing the .caffemodel file for quantization; only the .prototxt file is modified.) The miss rate increased by less than 0.01 percentage points after quantization. We do not report the speed of the quantized model: because there is no hardware support for fixed-point arithmetic on the GPU, Ristretto simulates 16-bit fixed-point arithmetic using floating-point arithmetic. With hardware support, we expect the quantized model to be faster.

Model	Overall miss-rate	Reasonable miss-rate
SSD512 (VGG16) not quantized	54.4362%	11.8868%
SSD512 (VGG16) quantized	54.4374%	11.8937%
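
To make the idea of dynamic fixed point more concrete, the sketch below simulates 16-bit quantization using ordinary floating-point numbers, which is the general approach described above. It is an illustration only and is not Ristretto's code: the function names `choose_fraction_bits` and `quantize` are hypothetical, and the rule used to pick the fractional length (fit the largest magnitude and reserve one bit for the sign) is an assumption.

```python
import math


def choose_fraction_bits(values, bit_width=16):
    """Pick how many of the bit_width bits are fractional (assumed rule)."""
    max_abs = max(abs(v) for v in values)
    if max_abs == 0:
        return bit_width - 1
    # Integer bits needed for the largest magnitude, plus one sign bit.
    int_bits = math.ceil(math.log2(max_abs)) + 1
    return bit_width - int_bits


def quantize(values, bit_width=16, frac_bits=None):
    """Round each value to the fixed-point grid, saturate, return floats."""
    if frac_bits is None:
        frac_bits = choose_fraction_bits(values, bit_width)
    scale = 2.0 ** frac_bits
    q_min = -(2 ** (bit_width - 1))      # most negative integer code
    q_max = 2 ** (bit_width - 1) - 1     # most positive integer code
    result = []
    for v in values:
        code = round(v * scale)               # nearest grid point
        code = max(q_min, min(q_max, code))   # saturate on overflow
        result.append(code / scale)           # back to float ("simulation")
    return result


if __name__ == "__main__":
    weights = [0.5, -0.25, 0.1234567, 1.9]   # made-up example values
    print(choose_fraction_bits(weights))
    print(quantize(weights))
```

Because the fractional length is chosen per group of values, a layer with small weights keeps more fractional precision than a layer with large ones, which is why the rounding error can stay small at 16 bits.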

Citing

Please cite this paper in your publications if it helps your research:

@inproceedings{feasac2017ssdc,
  title = {An FPGA-Accelerated Design for Deep Learning Pedestrian Detection in Self-Driving Vehicles},
  author = {Moussawi, Abdallah and Haddad, Kamal and Chahine, Anthony},
  booktitle = {FEASAC},
  year = {2017}
}

and of course, please cite the great work done by Wei Liu et al.:

@inproceedings{liu2016ssd,
  title = {{SSD}: Single Shot MultiBox Detector},
  author = {Liu, Wei and Anguelov, Dragomir and Erhan, Dumitru and Szegedy, Christian and Reed, Scott and Fu, Cheng-Yang and Berg, Alexander C.},
  booktitle = {ECCV},
  year = {2016}
}

Installation

Get the code. We will call the directory that you cloned Caffe into $CAFFE_ROOT.

git clone https://github.com/amoussawi/caffe.git
cd caffe
git checkout ssd

Build the code. Please follow the Caffe installation instructions to install all necessary packages and build it.

# Modify Makefile.config according to your Caffe installation.
cp Makefile.config.example Makefile.config
make -j8
# Make sure to add $CAFFE_ROOT/python to your PYTHONPATH.
make py
make test -j8
# (Optional)
make runtest -j8
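
After building, it can be useful to confirm that the Python bindings are visible to the interpreter. The following is a minimal sketch, assuming `CAFFE_ROOT` is set as an environment variable; the helper names `add_caffe_to_path` and `caffe_is_importable` are hypothetical and not part of Caffe.

```python
import os
import sys


def add_caffe_to_path(caffe_root):
    """Prepend <caffe_root>/python to sys.path if it is not already there."""
    python_dir = os.path.join(os.path.abspath(caffe_root), "python")
    if python_dir not in sys.path:
        sys.path.insert(0, python_dir)
    return python_dir


def caffe_is_importable():
    """Return True if `import caffe` succeeds in this interpreter."""
    try:
        import caffe  # noqa: F401  (only checking that the import works)
    except ImportError:
        return False
    return True


if __name__ == "__main__":
    # Assumption: CAFFE_ROOT is exported; fall back to the current directory.
    root = os.environ.get("CAFFE_ROOT", ".")
    print("added to path:", add_caffe_to_path(root))
    print("caffe importable:", caffe_is_importable())
```

This only affects the current process; setting PYTHONPATH in your shell profile remains the persistent option.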

Preparation

examples/ssd/ contains Python scripts for training with two initializations: with fine-tuning (script names end with _ft) and without fine-tuning.

Download the fully convolutional reduced (atrous) VGGNet if you want to train from scratch, or SSD512 07++12+COCO if you want to fine-tune. The atrous VGGNet should be stored in $CAFFE_ROOT/models/VGGNet/, and the pretrained SSD512 model in $CAFFE_ROOT/models/VGGNet/VOC0712Plus/SSD_512x512/.
Download the Caltech, ETH, and TUDBrussels pedestrian datasets from Caltech. By default, we assume the data is stored in $HOME/data/caltech_code/.
You need MATLAB to use the Caltech evaluation code, which is available in data/caltech/caltech_code/. We extract 1 frame out of every 5 from the Caltech training set, and all frames of the ETH and TUDBrussels datasets. (We also used the external ETH car dataset, consisting of ~2100 pedestrians; we don't think it made a large difference, so you may not need it.) To extract the datasets, run the extractDatasets.m MATLAB script in data/caltech/caltech_code. This extracts the datasets into ../trainval/ and ../test/ accordingly. To extract more images from the Caltech dataset, set the skip variable of usatrain in dbInfo.m accordingly.
Create the LMDB file.

cd $CAFFE_ROOT
# Create the trainval.txt, test.txt, and test_name_size.txt in data/caltech/
./data/caltech/create_list.sh
# It will create lmdb files for trainval and test with encoded original image:
#   - $CAFFE_ROOT/data/caltech/caltech_trainval_lmdb
#   - $CAFFE_ROOT/data/caltech/caltech_test_lmdb
# and make soft links at examples/caltech/
./data/caltech/create_data.sh
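
Before starting a long training run, you may want to check that the two scripts above produced everything they list. The sketch below simply tests for the files and LMDB directories named in the comments above; the helper name `missing_outputs` is hypothetical, and the layout under `data/caltech/` is assumed to match those comments.

```python
import os

# File and directory names taken from the comments in the commands above.
EXPECTED_FILES = ["trainval.txt", "test.txt", "test_name_size.txt"]
EXPECTED_DIRS = ["caltech_trainval_lmdb", "caltech_test_lmdb"]


def missing_outputs(caffe_root):
    """Return the expected paths under data/caltech/ that do not exist."""
    data_dir = os.path.join(caffe_root, "data", "caltech")
    missing = []
    for name in EXPECTED_FILES:
        path = os.path.join(data_dir, name)
        if not os.path.isfile(path):
            missing.append(path)
    for name in EXPECTED_DIRS:
        path = os.path.join(data_dir, name)
        if not os.path.isdir(path):
            missing.append(path)
    return missing


if __name__ == "__main__":
    # Assumption: CAFFE_ROOT is exported; fall back to the current directory.
    root = os.environ.get("CAFFE_ROOT", ".")
    for path in missing_outputs(root):
        print("missing:", path)
```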

Train/Eval

Train your model and evaluate the model on the fly.

# It will create model definition files and save snapshot models in:
#   - $CAFFE_ROOT/models/VGGNet/caltech/SSD_512x512/
# and job file, log file, and the python script in:
#   - $CAFFE_ROOT/jobs/VGGNet/caltech/SSD_512x512/
# and save temporary evaluation results in:
#   - $CAFFE_ROOT/examples/results/SSD_512x512/
# It should reach roughly 11.8% (reasonable miss-rate) at 20k iterations.
python examples/ssd/ssd_caltech_512_ft.py

Evaluate the most recent snapshot.

# If you would like to test a model you trained, you can do:
python examples/ssd/score_ssd_caltech_ft.py
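
Caffe writes snapshots with names of the form `<prefix>_iter_<N>.caffemodel`, so "most recent" can be read as "highest iteration number". The sketch below shows one way to pick that file from a snapshot directory. It is not taken from the scoring script: the helper names are hypothetical, and the file prefix in the usage example is made up.

```python
import os
import re

# Matches the iteration number in Caffe snapshot names such as
# "<prefix>_iter_20000.caffemodel".
SNAPSHOT_RE = re.compile(r"_iter_(\d+)\.caffemodel$")


def latest_snapshot(filenames):
    """Return (name, iteration) of the highest-iteration .caffemodel file."""
    best_name, best_iter = None, -1
    for name in filenames:
        match = SNAPSHOT_RE.search(name)
        if match and int(match.group(1)) > best_iter:
            best_name, best_iter = name, int(match.group(1))
    return best_name, best_iter


def latest_snapshot_in_dir(snapshot_dir):
    """Same as latest_snapshot, but lists the files of a directory."""
    return latest_snapshot(os.listdir(snapshot_dir))


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    names = [
        "example_iter_5000.caffemodel",
        "example_iter_20000.caffemodel",
        "example_iter_20000.solverstate",
        "deploy.prototxt",
    ]
    print(latest_snapshot(names))
```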

Test your model using a webcam. Note: press Esc to stop.

# If you would like to attach a webcam to a model you trained, you can do:
python examples/ssd/ssd_caltech_webcam_ft.py

Here is a demo video of running an SSD512 model on a video of a car driving in the streets of Beirut.

Models

SSD512 Caltech
