Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added mathematical description to Noisy SGD #857

Merged
merged 5 commits into from
Mar 10, 2024

Conversation

hmludwig
Copy link
Contributor

@hmludwig hmludwig commented Mar 8, 2024

I added mathematical description to noisy SGD to improve clarity (see #757)

The results look like this:
Screenshot 2024-03-08 at 12 46 34

Copy link
Member

@fabianp fabianp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks great to me, thanks!

@@ -992,11 +992,27 @@ def noisy_sgd(
gamma: float = 0.55,
seed: int = 0
) -> base.GradientTransformation:
r"""A variant of SGD with added noise.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First line needs to end with ".", so I would leave the first line as it is, and add your description below

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

and the line after should be blank, so something like:

r"""A variant of SGD with added noise.

Noisy SGD is ...
"""

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you clarify if the requirement for the first line to end with "." is driven by stylistic guidelines or technical reasons? The list of optimizers at the beginning of the documention isn't effected and displays the first sentence of the docstring correctly.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its a requirement of the python style: https://peps.python.org/pep-0257/#multi-line-docstrings

Internally we get an error when importing the code if the docstring doesn't follow this convention

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you're right though that this would be something worth adding to https://optax.readthedocs.io/en/latest/development.html

Copy link
Collaborator

@vroulet vroulet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing this!
As you are on it, could you change the description of gamma lines 1046-1047 of this file by

gamma: A parameter controlling the annealing of noise over time ``t``, the
      variance decays according to ``(1+t)**(-gamma)``.

Thank you again!

The update :math:`u_t` is modified to include this noise as follows:

.. math::
u_t \leftarrow -\alpha_t g_t + N(0, \sigma_t^2),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for doing that!
Shouldn't it rather be the following?
u_t \leftarrow -\alpha_t (g_t + N(0, \sigma_t^2))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're right. We first add the noise to the gradient and then scale by the learning rate. I will review my more carefully in the future.

@copybara-service copybara-service bot merged commit c275337 into google-deepmind:main Mar 10, 2024
6 checks passed
@fabianp
Copy link
Member

fabianp commented Mar 11, 2024

thanks @hmludwig ! Having accurate, reliable and helpful documentation is the most important part of a scientific project like this one. Your contribution makes a big difference 🙌🏼

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants