Using adaptive activation functions in pre-trained artificial neural network models

Implementation of the experiment as published in the paper "Using adaptive activation functions in pre-trained artificial neural network models" by Yevgeniy Bodyanskiy and Serhii Kostiuk.

Goals of the experiment

The experiment:

  • demonstrates the method of activation function replacement in pre-trained models, using the VGG-like KerasNet [1] CNN as the example base model;
  • evaluates the differences in inference results between the base pre-trained model and the same model with replaced activation functions;
  • demonstrates the effectiveness of activation function fine-tuning when all other elements of the model are fixed (frozen);
  • evaluates the performance of the KerasNet variants with different activation functions (adaptive and non-adaptive) trained in different regimes.

Description of the experiment

The experiment consists of the following steps:

  1. Train the base KerasNet network on the CIFAR-10 [2] dataset for 100 epochs using the standard training procedure and the RMSprop optimizer. Four variants of the network are trained: with the ReLU [3], SiLU [4], Tanh, and Sigmoid [5] activation functions.
  2. Save the pre-trained network.
  3. Evaluate the performance of the base pre-trained network on the CIFAR-10 test set.
  4. Load the base pre-trained network and replace all activation functions with the corresponding adaptive alternatives (ReLU, SiLU -> AHAF [6]; Sigmoid, Tanh -> F-Neuron Activation [7]); see the first sketch after this list.
  5. Evaluate the performance of the derived network on the CIFAR-10 test set.
  6. Fine-tune the adaptive activation functions on the CIFAR-10 dataset while all other elements of the model stay frozen; see the second sketch after this list.
  7. Evaluate the network performance after the activation function fine-tuning.
  8. Compare the evaluation results collected in steps 3, 5, and 7.
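
The sketch below illustrates step 4 in PyTorch. It is a minimal sketch under assumptions: the AHAF class (written here as gamma * x * sigmoid(beta * x)) and the helper replace_activations are illustrative stand-ins rather than the exact API of this repository, and the Sigmoid/Tanh -> F-Neuron Activation case is omitted.

import torch
import torch.nn as nn


class AHAF(nn.Module):
    # Illustrative adaptive activation: f(x) = gamma * x * sigmoid(beta * x).
    # With gamma = 1 and beta = 1 this reproduces SiLU; a large beta
    # approximates ReLU, so a pre-trained model keeps its behavior at first.
    def __init__(self, gamma: float = 1.0, beta: float = 1.0):
        super().__init__()
        self.gamma = nn.Parameter(torch.tensor(gamma))
        self.beta = nn.Parameter(torch.tensor(beta))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.gamma * x * torch.sigmoid(self.beta * x)


def replace_activations(module: nn.Module) -> None:
    # Recursively replace fixed activations with adaptive equivalents, in place.
    for name, child in module.named_children():
        if isinstance(child, nn.ReLU):
            setattr(module, name, AHAF(gamma=1.0, beta=10.0))  # ReLU-like start
        elif isinstance(child, nn.SiLU):
            setattr(module, name, AHAF(gamma=1.0, beta=1.0))   # exact SiLU start
        else:
            replace_activations(child)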
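
Step 6 then fine-tunes only the new activation parameters. A minimal sketch, assuming the AHAF module from the previous sketch; RMSprop appears here only because the base model was trained with it, and the learning rate is an arbitrary placeholder:

def freeze_all_but_activations(model: nn.Module) -> torch.optim.Optimizer:
    # Freeze every parameter, then re-enable gradients only for the
    # adaptive activation parameters (gamma and beta of each AHAF).
    for param in model.parameters():
        param.requires_grad = False
    trainable = []
    for submodule in model.modules():
        if isinstance(submodule, AHAF):
            for param in submodule.parameters():
                param.requires_grad = True
                trainable.append(param)
    return torch.optim.RMSprop(trainable, lr=1e-4)  # placeholder learning rate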

Running experiments

  1. An NVIDIA GPU with at least 2 GiB of VRAM is recommended.
  2. Install the requirements from requirements.txt.
  3. Set CUBLAS_WORKSPACE_CONFIG=:4096:8 in the environment variables; PyTorch requires this setting for deterministic (reproducible) cuBLAS operations on CUDA.
  4. Use the root of this repository as the current directory.
  5. Add the current directory to PYTHONPATH so Python can find the repository's modules.

Example:

user@host:~/repo_path$ export CUBLAS_WORKSPACE_CONFIG=:4096:8
user@host:~/repo_path$ export PYTHONPATH=".:$PYTHONPATH"
user@host:~/repo_path$ python3 experiments/train_new_base.py

Or in a single line, keeping the variable assignments local to the command:

user@host:~/repo_path$ CUBLAS_WORKSPACE_CONFIG=:4096:8 PYTHONPATH=".:$PYTHONPATH" python3 experiments/train_new_base.py

References

  1. Chollet, F., et al. (2015). Train a simple deep CNN on the CIFAR10 small images dataset. https://github.com/keras-team/keras/blob/1.2.2/examples/cifar10_cnn.py

  2. Krizhevsky, A. (2009). Learning Multiple Layers of Features from Tiny Images. Technical Report TR-2009, University of Toronto, Toronto.

  3. Agarap, A. F. (2018). Deep Learning using Rectified Linear Units (ReLU). https://doi.org/10.48550/ARXIV.1803.08375

  4. Elfwing, S., Uchibe, E., & Doya, K. (2017). Sigmoid-Weighted Linear Units for Neural Network Function Approximation in Reinforcement Learning. CoRR, abs/1702.03118. http://arxiv.org/abs/1702.03118

  5. Cybenko, G. (1989). Approximation by superpositions of a sigmoidal function. Mathematics of Control, Signals and Systems, 2, 303–314. https://doi.org/10.1007/BF02551274

  6. Bodyanskiy, Y., & Kostiuk, S. (2022). Adaptive hybrid activation function for deep neural networks. System Research and Information Technologies, (1), 87–96. Kyiv Polytechnic Institute. https://doi.org/10.20535/srit.2308-8893.2022.1.07

  7. Bodyanskiy, Y., & Kostiuk, S. (2022). Deep neural network based on F-neurons and its learning. Research Square Platform LLC. https://doi.org/10.21203/rs.3.rs-2032768/v1
