
Optimisation for Machine Learning project, Summer 2020, for the EPFL Optimisation in Machine Learning course


marvande/optimisation_for_ML


Nadam as a more polyvalent Adam with the increase of capacity in recurrent neural networks

Abstract - In current state-of-the-art deep learning implementations, neural networks can have up to millions of parameters. In addition, the loss to be optimized is almost always highly non-convex and therefore challenging to optimize. In the early days of deep learning, second-order methods such as Hessian-free optimization were widely adopted to tackle these challenging problems. Later, first-order methods such as SGD were investigated with the intent of replacing second-order methods by relying on momentum and Nesterov accelerated gradient (NAG). Since then, research has focused on first-order methods for their simplicity and low computational cost. Since SGD, new first-order methods have emerged, the most popular being Adam and RMSprop, with promises of greater performance. Adam, an extension of RMSprop with momentum, certainly performs well, but seems to struggle on certain problems where RMSprop delivers better results. Indeed, momentum may impede the performance of Adam in specific cases, for instance when the curvature differs strongly between dimensions. A similar problem has been described for classical momentum (CM), where the authors propose using NAG to achieve improved performance. In this paper, we investigate whether Nadam, a version of Adam that incorporates NAG, gives more versatile results than Adam. Furthermore, Nadam is compared with RMSprop on problems where Adam does not perform as well as RMSprop. In particular, we examine how Nadam, RMSprop and Adam perform when the number of parameters, and thus the difficulty of the problem, changes.
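The kind of comparison described above can be sketched in a few lines of code. The snippet below is a minimal illustration only, assuming a TensorFlow/Keras setup; the LSTM model, dummy data, hyperparameters and the `hidden_units` capacity knob are illustrative assumptions, not the code or settings used in this repository.

```python
# Illustrative sketch (not the project's actual experiments): train the same
# small recurrent network with Adam, RMSprop and Nadam and compare training loss.
import numpy as np
import tensorflow as tf

def build_model(hidden_units):
    """A small LSTM classifier; hidden_units stands in for model capacity."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(20, 8)),          # sequences of length 20, 8 features
        tf.keras.layers.LSTM(hidden_units),
        tf.keras.layers.Dense(2, activation="softmax"),
    ])

# Dummy data standing in for the real task.
x = np.random.randn(256, 20, 8).astype("float32")
y = np.random.randint(0, 2, size=(256,))

optimizers = {
    "adam": tf.keras.optimizers.Adam(learning_rate=1e-3),
    "rmsprop": tf.keras.optimizers.RMSprop(learning_rate=1e-3),
    "nadam": tf.keras.optimizers.Nadam(learning_rate=1e-3),  # Adam with NAG-style momentum
}

for name, opt in optimizers.items():
    model = build_model(hidden_units=32)
    model.compile(optimizer=opt,
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    history = model.fit(x, y, epochs=5, batch_size=32, verbose=0)
    print(f"{name}: final training loss = {history.history['loss'][-1]:.4f}")
```

In the actual study, `hidden_units` (and hence the number of parameters) would be varied to see how each optimizer copes as the problem becomes harder.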

