Original paper: submitted on 14 Nov 2018.
DropActivation is a new random activation function which reconciles Dropout and Batch Normalization (cf. *Understanding the Disharmony between Dropout and Batch Normalization by Variance Shift*).
I implemented this new layer in Keras, along with Randomized ReLU.
Note: `import keras` vs `import tensorflow.keras`. I don't use the `keras` python package, I use `tensorflow.keras`! (I had not used Keras for 2 years... it's quite misleading to have Keras within TensorFlow now.) So if you use the raw python package `keras`, please make sure its version matches the version of Keras bundled in `tensorflow`. Today (Nov 2018), `pip install keras` gives you `keras.__version__ == 2.2.0`, while the Keras branch in TensorFlow is at `tensorflow.keras.__version__ == 2.1.6`...
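A quick way to check which Keras you are running (the printed values are the ones reported above, as of Nov 2018):

```python
import keras
import tensorflow as tf

print(keras.__version__)     # 2.2.0 with `pip install keras` (Nov 2018)
print(tf.keras.__version__)  # 2.1.6, the Keras bundled inside TensorFlow
```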
If necessary, I will release a pure Keras version of the new layers.
Implemented features (a short usage sketch follows this list):
- `DropActivation` layer in `drop_activation.py`
- `RandomizedReLU` layer (Empirical Evaluation of Rectified Activations in Convolutional Network) in `randomized_relu.py`
- ResNet-56 on CIFAR-10 with Keras (TF backend), trained with momentum SGD (0.9)
- data augmentation: random crop, horizontal flips and per-sample standardization
- 4 notebooks (same code, just different networks) with seeded initialization (same initial random weights):
  - ReLU
  - Randomized ReLU
  - DropActivation
  - ReLU with Dropout x BatchNorm
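The new layers are meant to be drop-in replacements for the usual `Activation('relu')`. A minimal usage sketch (I assume here that both constructors work without required arguments; check the actual signatures in `drop_activation.py` and `randomized_relu.py`):

```python
from tensorflow.keras import layers, models

from drop_activation import DropActivation      # layer from this repo
from randomized_relu import RandomizedReLU      # layer from this repo

inputs = layers.Input(shape=(32, 32, 3))
x = layers.Conv2D(16, 3, padding="same", use_bias=False)(inputs)
x = layers.BatchNormalization()(x)
x = DropActivation()(x)          # where you would normally put Activation('relu')
# ... rest of the ResNet-56 blocks ...
x = layers.GlobalAveragePooling2D()(x)
outputs = layers.Dense(10, activation="softmax")(x)
model = models.Model(inputs, outputs)
```

`RandomizedReLU()` can be swapped in at the same place to reproduce the other notebook.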
Drop Activation: new activation layer which combines Dropout and ReLU (and it's compatible with BatchNorm!)
DropActivation combines ReLU and Dropout. In training mode, the activation function is random: like ReLU, if the neuron activation is positive, the identity function is used. If the input is negative, then with probability p (p = 0.95) the output is zero (as with ReLU), but with probability 1 - p = 0.05 the identity function is used.
So during training we switch randomly between ReLU (95% of the time) and the identity mapping (5% of the time).
At testing time, we use a deterministic LeakyReLU activation whose negative slope is 1 - p = 0.05, i.e. the expected value of the random training activation for negative inputs!
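A minimal sketch of the idea as a custom `tf.keras` layer (this is my own illustration, not the exact code from `drop_activation.py`):

```python
from tensorflow.keras import backend as K
from tensorflow.keras.layers import Layer

class DropActivationSketch(Layer):
    """Illustrative Drop-Activation layer: random ReLU/identity during training,
    deterministic leaky ReLU with slope 1 - p at test time."""

    def __init__(self, p=0.95, **kwargs):
        super(DropActivationSketch, self).__init__(**kwargs)
        self.p = p  # probability of applying ReLU to a negative input

    def call(self, inputs, training=None):
        def train_phase():
            # 1 -> apply ReLU, 0 -> keep the identity, drawn independently per element
            mask = K.cast(K.random_uniform(K.shape(inputs)) < self.p, inputs.dtype)
            return mask * K.relu(inputs) + (1.0 - mask) * inputs

        def test_phase():
            # Expectation of the training activation: x for x > 0, (1 - p) * x otherwise
            return K.relu(inputs, alpha=1.0 - self.p)

        return K.in_train_phase(train_phase, test_phase, training=training)
```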
Here is a short comparison of ReLU, ReLU with Dropout, Randomized ReLU, and DropActivation.
Model & training configuration:
- ResNet-56 for CIFAR-10 (no bottleneck block)
- L2 regularization only on kernels (conv and final dense layer) with weight 0.0002
- optimization with SGD and momentum (0.9)
- learning rate schedule: 0.1, 0.01, 0.001 and 0.0001 (changes at epochs 91, 136 and 182)
- `batch_size=128` and `epochs=200`
- data augmentation (train only, no test-time augmentation): sample-wise normalization (mean and std), random crop (5 pixels), random horizontal flips (no vertical)
- training vs validation split: 80%/20% of the initial training set (shuffled, stratified split)
- the CIFAR-10 test set is used as the final test set!
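This configuration maps to Keras objects roughly as follows (a sketch: the 0-indexed epoch boundaries and the shift-based approximation of the 5-pixel random crop are my assumptions, not necessarily what the notebooks do):

```python
from tensorflow.keras.callbacks import LearningRateScheduler
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# SGD with momentum 0.9; the L2 weight decay of 2e-4 is applied through
# kernel_regularizer=l2(2e-4) on the conv and final dense layers, not here.
optimizer = SGD(lr=0.1, momentum=0.9)

def lr_schedule(epoch):
    # Piecewise-constant schedule 0.1 / 0.01 / 0.001 / 0.0001,
    # switching at epochs 91, 136 and 182 (epoch is 0-indexed here).
    if epoch < 91:
        return 0.1
    if epoch < 136:
        return 0.01
    if epoch < 182:
        return 0.001
    return 0.0001

callbacks = [LearningRateScheduler(lr_schedule)]

# Train-time augmentation only: per-sample standardization, ~5-pixel shifts
# as a stand-in for the random crop, and horizontal flips (no vertical).
datagen = ImageDataGenerator(
    samplewise_center=True,
    samplewise_std_normalization=True,
    width_shift_range=5.0 / 32,
    height_shift_range=5.0 / 32,
    horizontal_flip=True,
)
```

The resulting validation and test accuracies: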
| Activation function | ACC validation (%) | ACC test (%) (generalization gap) |
|---|---|---|
| ReLU | 92.47 | 92.58 (+0.11) |
| Dropout with ReLU | 90.04 | 89.92 (-0.12) |
| Randomized ReLU | 91.18 | 90.65 (-0.53) |
| Drop Activation | 93.36 | 93.27 (-0.09) |