Fixing `min/max` ops scale propagation (PR #68) had the side effect of breaking MNIST training. Early investigation shows the scale factors diverging after a couple of iterations, similar to an unstable dynamical system.
Todo: additional investigation to understand the dynamics of the issue.
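As a starting point for that investigation, here is a minimal sketch of a divergence check on logged scale factors. It is not taken from the repository: `first_divergent_step`, `toy_scale_history`, and the `max_abs_log2` bound are hypothetical names and values chosen for illustration, and the toy recurrence only mimics an unstable system with a constant per-iteration gain.

```python
import math


def first_divergent_step(scales, max_abs_log2=16.0):
    """Return the first index where a scale is non-finite, non-positive,
    or has |log2(scale)| above the bound; return None if none does.

    `max_abs_log2` is an arbitrary illustrative threshold, not a library value.
    """
    for step, scale in enumerate(scales):
        if not math.isfinite(scale) or scale <= 0.0:
            return step
        if abs(math.log2(scale)) > max_abs_log2:
            return step
    return None


def toy_scale_history(gain, steps, initial_scale=1.0):
    """Toy model (assumption): the scale is multiplied by `gain` each iteration."""
    history = [initial_scale]
    for _ in range(steps):
        history.append(history[-1] * gain)
    return history


if __name__ == "__main__":
    # A gain of 1.0 keeps the scale constant; any other gain drifts geometrically.
    for gain in (1.0, 4.0, 0.25):
        step = first_divergent_step(toy_scale_history(gain, steps=20))
        print(f"gain={gain}: first divergent step = {step}")
```

Logging the per-tensor scales during the first MNIST iterations and passing them to a check like this could help narrow down which op starts the drift.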