-
Notifications
You must be signed in to change notification settings - Fork 193
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added mathematical description to Noisy SGD #857
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks great to me, thanks!
@@ -992,11 +992,27 @@ def noisy_sgd( | |||
gamma: float = 0.55, | |||
seed: int = 0 | |||
) -> base.GradientTransformation: | |||
r"""A variant of SGD with added noise. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First line needs to end with ".", so I would leave the first line as it is, and add your description below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
and the line after should be blank, so something like:
r"""A variant of SGD with added noise.
Noisy SGD is ...
"""
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you clarify if the requirement for the first line to end with "." is driven by stylistic guidelines or technical reasons? The list of optimizers at the beginning of the documention isn't effected and displays the first sentence of the docstring correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its a requirement of the python style: https://peps.python.org/pep-0257/#multi-line-docstrings
Internally we get an error when importing the code if the docstring doesn't follow this convention
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're right though that this would be something worth adding to https://optax.readthedocs.io/en/latest/development.html
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing this!
As you are on it, could you change the description of gamma
lines 1046-1047 of this file by
gamma: A parameter controlling the annealing of noise over time ``t``, the
variance decays according to ``(1+t)**(-gamma)``.
Thank you again!
optax/_src/alias.py
Outdated
The update :math:`u_t` is modified to include this noise as follows: | ||
|
||
.. math:: | ||
u_t \leftarrow -\alpha_t g_t + N(0, \sigma_t^2), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for doing that!
Shouldn't it rather be the following?
u_t \leftarrow -\alpha_t (g_t + N(0, \sigma_t^2))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, you're right. We first add the noise to the gradient and then scale by the learning rate. I will review my more carefully in the future.
thanks @hmludwig ! Having accurate, reliable and helpful documentation is the most important part of a scientific project like this one. Your contribution makes a big difference 🙌🏼 |
I added mathematical description to noisy SGD to improve clarity (see #757)
The results look like this: