Presents a simple module capable of learning arithmetic functions such as add, sub, mult, and div, which generalizes well to unseen data and unseen inference schemes.
DNNs with Non-linearities Struggle to Learn the Identity Function
Train an autoencoder to reconstruct scalar inputs in the range [-5, 5].
All autoencoders share the same parameterization (3 hidden layers of size 8) and differ only in their non-linearities.
Trained with MSE loss.
Tested on [-20, 20], the error increases severely both below and above the range of numbers seen during training.
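A minimal sketch of this experiment, assuming PyTorch. The architecture (3 hidden layers of size 8), MSE loss, and train/test ranges follow the notes above; the optimizer, step count, and batch size are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

def make_mlp(act):
    # Scalar in -> scalar out, 3 hidden layers of size 8.
    return nn.Sequential(
        nn.Linear(1, 8), act(),
        nn.Linear(8, 8), act(),
        nn.Linear(8, 8), act(),
        nn.Linear(8, 1),
    )

model = make_mlp(nn.ReLU)  # swap in nn.Tanh, nn.Sigmoid, ... to compare
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

for step in range(10_000):
    x = torch.empty(64, 1).uniform_(-5, 5)  # training range [-5, 5]
    loss = nn.functional.mse_loss(model(x), x)
    opt.zero_grad()
    loss.backward()
    opt.step()

# Evaluate on [-20, 20]: the error grows sharply outside [-5, 5].
x_test = torch.linspace(-20, 20, 401).unsqueeze(1)
with torch.no_grad():
    err = (model(x_test) - x_test).abs()
print(err.max().item())
```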
The Neural Accumulator (NAC) & Neural Arithmetic Logic Unit (NALU)
NAC: A special case of a linear layer whose weight matrix W is biased towards values in {-1, 0, 1}, defined as:
W = \tanh(\hat{W}) \odot \sigma(\hat{M})
The elements of W are guaranteed to lie in [-1, 1] and are biased towards {-1, 0, 1} during learning, since those values correspond to the saturation points of tanh(.) and σ(.).
Its outputs are therefore additions and subtractions of elements of the input vector, with no bias and no non-linearity.
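A minimal NAC sketch, assuming PyTorch. The parameter names W_hat and M_hat follow the formula above; the Xavier initialization is an arbitrary choice.

```python
import torch
import torch.nn as nn

class NAC(nn.Module):
    def __init__(self, in_dim, out_dim):
        super().__init__()
        self.W_hat = nn.Parameter(torch.empty(out_dim, in_dim))
        self.M_hat = nn.Parameter(torch.empty(out_dim, in_dim))
        nn.init.xavier_uniform_(self.W_hat)
        nn.init.xavier_uniform_(self.M_hat)

    def forward(self, x):
        # W = tanh(W_hat) ⊙ σ(M_hat): each element lies in [-1, 1]
        # and saturates towards {-1, 0, 1}.
        W = torch.tanh(self.W_hat) * torch.sigmoid(self.M_hat)
        # Pure linear map: additions/subtractions of input elements,
        # no bias, no non-linearity.
        return x @ W.t()
```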
NALU: Learns a gated, weighted sum of two sub-cells:
One is the original NAC, capable of learning to add and subtract.
The other operates in log space and is capable of multiplication and division, e.g., log(XY) = log X + log Y; log(X/Y) = log X - log Y; exp(log X) = X.
Altogether, NALU can learn to perform general arithmetic operations.
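A minimal NALU sketch building on the NAC class above, again assuming PyTorch. Following the paper's formulation, the same NAC weights serve both paths: y = g ⊙ a + (1 - g) ⊙ m, with a = NAC(x), m = exp(NAC(log(|x| + ε))), and a learned gate g = σ(Gx); the eps value is an arbitrary small constant.

```python
import torch
import torch.nn as nn

class NALU(nn.Module):
    def __init__(self, in_dim, out_dim, eps=1e-7):
        super().__init__()
        self.nac = NAC(in_dim, out_dim)  # shared weights for both paths
        self.G = nn.Parameter(torch.empty(out_dim, in_dim))
        nn.init.xavier_uniform_(self.G)
        self.eps = eps

    def forward(self, x):
        a = self.nac(x)  # add/subtract path
        # mult/div path: add/subtract in log space, then exponentiate
        m = torch.exp(self.nac(torch.log(x.abs() + self.eps)))
        g = torch.sigmoid(x @ self.G.t())  # learned gate
        return g * a + (1 - g) * m
```

Note that m is always positive because of the exponentiation, which is the source of the negative-target limitation listed below.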
Limitations
A single NALU can handle either add/subtract or mult/div operations, but not a combination of both.
For mult/div operations, it cannot produce negative targets, since the mult/div sub-cell's output is the result of an exponentiation, which always yields positive results.
Power operations are only possible when the exponent is in the range [0, 1].