This repository contains an op-for-op PyTorch reimplementation of Very Deep Convolutional Networks for Large-Scale Image Recognition.
It supports MNIST, CIFAR10 & CIFAR100, TinyImageNet_200, MiniImageNet_1K, ImageNet_1K, Caltech101 & Caltech256, and more.
Please refer to `README.md` in the `data` directory for instructions on preparing each dataset.
Both training and testing only require modifying the corresponding configuration file.

To test, modify the `test_config.py` file as follows:
- line 17: `model_arch_name` change to `vgg11`.
- line 31: `model_num_classes` change to `1000`.
- line 24: `mode` change to `./data/ImageNet_1K/ILSVRC2012_img_val`.
- line 37: `model_weights_path` change to `./results/pretrained_models/VGG11-ImageNet_1K-64f6524f.pth.tar`.
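For reference, the edited portion of `test_config.py` would then look roughly like the sketch below, assuming the variable names and values match the steps above (surrounding code and exact line positions are omitted):

```python
# test_config.py (excerpt): a sketch only, values taken from the steps above.
model_arch_name = "vgg11"
model_num_classes = 1000
mode = "./data/ImageNet_1K/ILSVRC2012_img_val"
model_weights_path = "./results/pretrained_models/VGG11-ImageNet_1K-64f6524f.pth.tar"
```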
```bash
python3 test.py
```
To train, modify the `train_config.py` file as follows:
- line 18: `model_arch_name` change to `vgg11`.
- line 19: `model_num_classes` change to `1000`.
- line 25: `mode` change to `./data/ImageNet_1K/ILSVRC2012_img_train`.
- line 26: `mode` change to `./data/ImageNet_1K/ILSVRC2012_img_val`.
- line 37: `pretrained_model_weights_path` change to `./results/pretrained_models/VGG11-ImageNet_1K-64f6524f.pth.tar`.
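Likewise, the edited portion of `train_config.py` would look roughly like the sketch below; `train_image_dir` and `valid_image_dir` are hypothetical names standing in for the two `mode` entries on lines 25 and 26 above:

```python
# train_config.py (excerpt): a sketch only, the actual variable names may differ.
model_arch_name = "vgg11"
model_num_classes = 1000
train_image_dir = "./data/ImageNet_1K/ILSVRC2012_img_train"  # hypothetical name, line 25 above
valid_image_dir = "./data/ImageNet_1K/ILSVRC2012_img_val"    # hypothetical name, line 26 above
pretrained_model_weights_path = "./results/pretrained_models/VGG11-ImageNet_1K-64f6524f.pth.tar"
```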
```bash
python3 train.py
```
To resume training, modify the `train_config.py` file as follows:
- line 18: `model_arch_name` change to `vgg11`.
- line 19: `model_num_classes` change to `1000`.
- line 25: `mode` change to `./data/ImageNet_1K/ILSVRC2012_img_train`.
- line 26: `mode` change to `./data/ImageNet_1K/ILSVRC2012_img_val`.
- line 40: `resume_model_weights_path` change to `./samples/VGG11-ImageNet_1K/epoch_xxx.pth.tar`.
```bash
python3 train.py
```
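Resuming works off the `.pth.tar` checkpoints written during training. As an illustration only (the checkpoint keys used by `train.py` may differ, and torchvision's `vgg11` stands in for the repository's own model builder), restoring such a checkpoint typically looks like this:

```python
import torch
from torchvision import models

# Stand-in model and optimizer; the repository builds its own VGG variants.
model = models.vgg11(num_classes=1000)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)

# Load the saved checkpoint and restore training state.
checkpoint = torch.load("./samples/VGG11-ImageNet_1K/epoch_xxx.pth.tar", map_location="cpu")
model.load_state_dict(checkpoint["state_dict"])      # assumed key for model weights
optimizer.load_state_dict(checkpoint["optimizer"])   # assumed key for optimizer state
start_epoch = checkpoint["epoch"]                    # assumed key for the last completed epoch
```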
Source of original paper results: https://arxiv.org/pdf/1409.1556v6.pdf
In the following table, the top-x error value in parentheses indicates the result reproduced by this project, and `-` indicates no test.
Model | Dataset | Top-1 error (val) | Top-5 error (val) |
---|---|---|---|
VGG11 | ImageNet_1K | 29.6%(30.9%) | 10.4%(11.3%) |
VGG11_BN | ImageNet_1K | -(29.6%) | -(10.2%) |
VGG13 | ImageNet_1K | 28.7%(30.1%) | 9.9%(10.8%) |
VGG13_BN | ImageNet_1K | -(28.4%) | -(9.6%) |
VGG16 | ImageNet_1K | 27.0%(28.4%) | 8.8%(9.6%) |
VGG16_BN | ImageNet_1K | -(26.6%) | -(8.5%) |
VGG19 | ImageNet_1K | 27.3%(27.6%) | 9.0%(9.1%) |
VGG19_BN | ImageNet_1K | -(25.7%) | -(8.1%) |
```bash
# Download `VGG11-ImageNet_1K-64f6524f.pth.tar` weights to `./results/pretrained_models`
# For more detail, see `README.md<Download weights>`
python3 ./inference.py
```
Input:
Output:

```text
Build VGG11 model successfully.
Load VGG11 model weights `/VGG-PyTorch/results/pretrained_models/VGG11-ImageNet_1K-64f6524f.pth.tar` successfully.
tench, Tinca tinca (74.97%)
barracouta, snoek (23.09%)
gar, garfish, garpike, billfish, Lepisosteus osseus (0.81%)
reel (0.45%)
armadillo (0.25%)
```
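For comparison, here is a self-contained sketch that produces the same kind of top-5 listing using torchvision's packaged VGG11 weights rather than this repository's checkpoint; the image path is a placeholder for any RGB image:

```python
import torch
from torchvision.io import read_image
from torchvision.models import vgg11, VGG11_Weights

# Placeholder path: point this at any RGB image.
image = read_image("./figure/example.jpg")

# torchvision's ImageNet-1K VGG11 weights and their matching preprocessing.
weights = VGG11_Weights.IMAGENET1K_V1
model = vgg11(weights=weights).eval()
preprocess = weights.transforms()

with torch.no_grad():
    probs = model(preprocess(image).unsqueeze(0)).softmax(dim=1).squeeze(0)

# Print the five most likely ImageNet classes with their probabilities.
top5 = probs.topk(5)
for p, idx in zip(top5.values, top5.indices):
    print(f"{weights.meta['categories'][int(idx)]} ({p.item():.2%})")
```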
If you find a bug, create a GitHub issue, or even better, submit a pull request. Similarly, if you have questions, simply post them as GitHub issues.
I look forward to seeing what the community does with these models!
*Very Deep Convolutional Networks for Large-Scale Image Recognition*
Karen Simonyan, Andrew Zisserman
In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting. Our main contribution is a thorough evaluation of networks of increasing depth using an architecture with very small (3×3) convolution filters, which shows that a significant improvement on the prior-art configurations can be achieved by pushing the depth to 16–19 weight layers. These findings were the basis of our ImageNet Challenge 2014 submission, where our team secured the first and the second places in the localisation and classification tracks respectively. We also show that our representations generalise well to other datasets, where they achieve state-of-the-art results. We have made our two best-performing ConvNet models publicly available to facilitate further research on the use of deep visual representations in computer vision.
```bibtex
@article{simonyan2014very,
  title={Very deep convolutional networks for large-scale image recognition},
  author={Simonyan, Karen and Zisserman, Andrew},
  journal={arXiv preprint arXiv:1409.1556},
  year={2014}
}
```