Convolutional neural network experiment for 1st gen Pokémon detection, built with TensorFlow.
The model is available on this HuggingFace repository. Here is a Colab Notebook to test it.
The training dataset contains 17,000 pictures divided into 143 classes of 1st gen Pokémon.
The input images are resized to 200x200 and augmented with a random horizontal flip and a 20% zoom range.
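A minimal sketch of this loading and augmentation step, assuming the pictures are organized in one sub-folder per class (the directory path and batch size are illustrative, not the project's actual values):

```python
import tensorflow as tf
from tensorflow.keras import layers

IMG_SIZE = (200, 200)

# Hypothetical directory layout: one sub-folder per Pokémon class.
train_ds = tf.keras.utils.image_dataset_from_directory(
    "dataset/train",
    image_size=IMG_SIZE,   # resize every picture to 200x200
    batch_size=32,
)

# Augmentation pipeline: random horizontal flip + 20% zoom range.
data_augmentation = tf.keras.Sequential([
    layers.RandomFlip("horizontal"),
    layers.RandomZoom(0.2),
])

train_ds = train_ds.map(lambda x, y: (data_augmentation(x, training=True), y))
```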
Every convolutional layer uses a LeakyReLU (alpha = 0.15) activation function to prevent vanishing gradients and dying ReLU issues, with 'same' padding and 'he_normal' kernel initialization. Every convolutional block also contains a Batch Normalization layer, a 2x2 Max Pooling layer and a 20% Dropout to prevent overfitting.
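A sketch of one such block, using the Keras functional API (the filter count and the single-Conv2D-per-block layout are assumptions, not the repository's exact configuration):

```python
from tensorflow.keras import layers

def conv_block(x, filters, dilation_rate=(1, 1)):
    """One convolutional block: Conv2D -> LeakyReLU -> BatchNorm -> MaxPool -> Dropout."""
    x = layers.Conv2D(
        filters,
        kernel_size=(3, 3),
        padding="same",
        dilation_rate=dilation_rate,
        kernel_initializer="he_normal",
    )(x)
    x = layers.LeakyReLU(alpha=0.15)(x)
    x = layers.BatchNormalization()(x)
    x = layers.MaxPooling2D(pool_size=(2, 2))(x)
    x = layers.Dropout(0.2)(x)
    return x
```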
From the 2nd to the 4th convolutional block, the dilation rate increases from 1x1 to 3x3 to enlarge the receptive field of every filter and improve feature detection, following the same principle described in this paper. This solution increases the accuracy on the validation and test sets by 3% (95% accuracy).
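A rough sketch of how the blocks could be stacked with this dilation schedule, reusing the `conv_block` helper from the sketch above (the number of blocks and the filter counts are assumptions based on the description):

```python
from tensorflow.keras import layers

inputs = layers.Input(shape=(200, 200, 3))
x = conv_block(inputs, 32)                    # block 1: default 1x1 dilation
x = conv_block(x, 64, dilation_rate=(1, 1))   # block 2: 1x1 dilation
x = conv_block(x, 128, dilation_rate=(2, 2))  # block 3: 2x2 dilation
x = conv_block(x, 256, dilation_rate=(3, 3))  # block 4: 3x3 dilation
```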
After the convolutional layers, learning and classification are performed by two Dense layers with 512 and 256 units, with a 40% Dropout.
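A sketch of the classification head, continuing from the convolutional stack above; only the 512/256 unit counts, the 40% Dropout rate and the 143-class output follow from the description, while the Flatten layer, the ReLU activations, the optimizer and the loss are assumptions:

```python
import tensorflow as tf
from tensorflow.keras import layers

# x and inputs come from the convolutional stack sketched above.
x = layers.Flatten()(x)
x = layers.Dense(512, activation="relu", kernel_initializer="he_normal")(x)
x = layers.Dropout(0.4)(x)
x = layers.Dense(256, activation="relu", kernel_initializer="he_normal")(x)
x = layers.Dropout(0.4)(x)
outputs = layers.Dense(143, activation="softmax")(x)

model = tf.keras.Model(inputs, outputs)
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```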
Here is a plot of the feature maps extracted from each convolutional block:
If you are curious about the visual differences between the feature maps of different kinds of layers, I made a few plots comparing them using the same 6 filters (initializer = GlorotUniform(seed=5)).
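If you want to reproduce this kind of plot, here is a minimal sketch of feature-map extraction (the layer name, the stand-in input image and the number of plotted channels are illustrative; `model` is assumed to be the Keras model built above):

```python
import matplotlib.pyplot as plt
import tensorflow as tf

# Build a sub-model that outputs the activations of one convolutional layer.
layer_name = "conv2d"  # hypothetical layer name; check model.summary() for the real ones
feature_extractor = tf.keras.Model(
    inputs=model.inputs,
    outputs=model.get_layer(layer_name).output,
)

# Stand-in input: a single 200x200x3 picture with a batch dimension.
image = tf.random.uniform((200, 200, 3))
feature_maps = feature_extractor(tf.expand_dims(image, axis=0))

# Plot the first 6 channels, matching the 6-filter comparison described above.
fig, axes = plt.subplots(1, 6, figsize=(18, 3))
for i, ax in enumerate(axes):
    ax.imshow(feature_maps[0, :, :, i], cmap="viridis")
    ax.axis("off")
plt.show()
```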
This is the list of tested solutions (a few of them are sketched in code after the list):
- Classic Conv2D, 1 layer, 3x3 kernel
- Separable Depthwise Convolution, 1 layer, 3x3 kernel
- Classic Conv2D, 1 layer, 5x5 kernel
- Dilated Convolution, 1 layer, 3x3 kernel, 2x2 dilation rate
- Dilated Convolution, 1 layer, 3x3 kernel, 3x3 dilation rate
- Dilated Convolution, 3 layers, 3x3 kernel, dilation rates 1x1-2x2-3x3
- Classic Conv2D, 3 layers, 3x3 kernel, 2 layers for MaxPooling 2x2
- Classic Conv2D, 3 layers, 3x3 kernel, 2 layers for AveragePooling 2x2
- Classic Conv2D, 3 layers, 3x3 kernel, 1 layer for MaxPooling 2x2, 1 layer for AveragePooling 2x2
- Dilated Convolution, 3 layers, 3x3 kernel, dilation rates 1x1-2x2-3x3, 2 layers for MaxPooling 2x2
- Dilated Convolution, 3 layers, 3x3 kernel, dilation rates 1x1-2x2-3x3, 2 layers for AveragePooling 2x2
- Dilated Convolution, 3 layers, 3x3 kernel, dilation rates 1x1-2x2-3x3, 1 layer for MaxPooling 2x2, 1 layer for AveragePooling 2x2
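To make the differences concrete, here is how a few of these variants could be declared in Keras, all sharing the same 6 filters and GlorotUniform(seed=5) initializer mentioned above (the layer declarations are illustrative, not the exact test code):

```python
from tensorflow.keras import layers
from tensorflow.keras.initializers import GlorotUniform

init = GlorotUniform(seed=5)

# Classic Conv2D, 3x3 kernel
classic_3x3 = layers.Conv2D(6, (3, 3), padding="same", kernel_initializer=init)

# Separable depthwise convolution, 3x3 kernel
separable_3x3 = layers.SeparableConv2D(6, (3, 3), padding="same",
                                       depthwise_initializer=init,
                                       pointwise_initializer=init)

# Classic Conv2D, 5x5 kernel
classic_5x5 = layers.Conv2D(6, (5, 5), padding="same", kernel_initializer=init)

# Dilated convolution, 3x3 kernel, 2x2 dilation rate
dilated_2x2 = layers.Conv2D(6, (3, 3), padding="same",
                            dilation_rate=(2, 2), kernel_initializer=init)
```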