
feat: add normalize_by_update_norm #958

Merged: 5 commits into google-deepmind:main on Jul 9, 2024
Conversation

@SauravMaheshkar (Contributor)

Reference: #594

This PR aims to add a new GradientTransformation that allows scaling updates by the gradient norm.

Request for Review: @fabianp
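For readers following along, the core idea can be sketched in a few lines of plain Python. The real optax transformation operates on JAX pytrees and returns a GradientTransformation; the names below (`normalize`, `scale_factor`) are illustrative only, not the actual optax API:

```python
import math

def normalize(updates, scale_factor=1.0):
    """Rescale a flat list of update values so that their global L2
    norm equals scale_factor; the update direction is preserved."""
    norm = math.sqrt(sum(u * u for u in updates))
    return [scale_factor * u / norm for u in updates]

# A gradient of global norm 5 becomes a unit-norm update.
print(normalize([3.0, 4.0]))  # [0.6, 0.8]
```

Dividing by the norm fixes the magnitude of every step, which is the classical normalized gradient descent update.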

@SauravMaheshkar (Contributor, author)

CI is green again 😄

@SauravMaheshkar (Contributor, author)

Gentle ping: @fabianp

@fabianp (Member) commented May 16, 2024

thanks for the ping, will look into this ASAP

@fabianp (Member) commented May 31, 2024

Hi @SauravMaheshkar ,

a couple more comments after discussing this issue with the dev team:

  • let's promote this to the main package, since normalized GD is a fairly classical method :-)
  • we found the name scale_by_gradient_norm a bit misleading, as it's really scaled by the inverse of the gradient norm. A better name for the method would (IMO) be normalize_by_update_norm.

what do you think about these comments? do they sound reasonable?

@SauravMaheshkar (Contributor, author)

> Hi @SauravMaheshkar,
>
> a couple more comments after discussing this issue with the dev team:
>
> • let's promote this to the main package, since normalized GD is a fairly classical method :-)
> • we found the name scale_by_gradient_norm a bit misleading, as it's really scaled by the inverse of the gradient norm. A better name for the method would (IMO) be normalize_by_update_norm.
>
> what do you think about these comments? do they sound reasonable?

Hey @fabianp glad to know there's interest. That sounds reasonable.

Let me do some refactoring. Since it's going into main, I'll put some more effort in the docstrings as well. Will push soon 😄

@fabianp (Member) commented Jun 1, 2024

amazing, thanks!

@SauravMaheshkar (Contributor, author)

@fabianp I've refactored my branch to add the transform to main rather than contrib. Request for Review 😄

@SauravMaheshkar changed the title from "feat: add scale_by_gradient_norm" to "feat: add normalize_by_update_norm" on Jun 25, 2024
@fabianp (Member) left a comment

we're getting very close!

One last thing: could you add a doctest (i.e., an example in the docstring) for the new function? See the methods in alias.py for inspiration :-)
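For context, a doctest embeds a runnable example directly in a docstring; Python's doctest module executes the `>>>` lines and compares the printed output against the text beneath them. A minimal, hypothetical sketch of the style being requested (`global_norm` is an illustrative helper, not the actual optax docstring):

```python
import doctest

def global_norm(values):
    """Return the global L2 norm of a flat list of floats.

    The example below is a doctest: the line after ``>>>`` is run
    and its result is compared against the expected line.

    >>> global_norm([3.0, 4.0])
    5.0
    """
    return sum(v * v for v in values) ** 0.5

# doctest.testmod() runs every doctest found in this module.
print(doctest.testmod().failed)  # 0
```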

(Three review comment threads on optax/_src/transform.py, all resolved.)
SauravMaheshkar and others added 2 commits June 28, 2024 14:02
Co-authored-by: Fabian Pedregosa <pedregosa@google.com>
@SauravMaheshkar (Contributor, author)

I've added a doctest 👍🏽 @fabianp

@fabianp (Member) commented Jun 28, 2024

looks good to me, but I forgot to ask you one (hopefully) last thing:

please add this new function to the API documentation: https://github.com/google-deepmind/optax/blob/main/docs/api/transformations.rst

thanks! 🙏🏼

@SauravMaheshkar (Contributor, author)

> looks good to me, but I forgot to ask you one (hopefully) last thing:
>
> please add this new function to the API documentation: https://github.com/google-deepmind/optax/blob/main/docs/api/transformations.rst
>
> thanks! 🙏🏼

Added in 5c01f6f

@fabianp (Member) commented Jun 28, 2024

thanks @SauravMaheshkar !

@fabianp fabianp closed this Jul 8, 2024
@fabianp fabianp reopened this Jul 8, 2024
@fabianp (Member) commented Jul 8, 2024

duplicated #1000 since I'm having some issues getting this merged due to internal errors

@SauravMaheshkar (Contributor, author)

> duplicated #1000 since I'm having some issues getting this merged due to internal errors

Typically copybara I assume 😅

@copybara-service bot merged commit 2660a04 into google-deepmind:main on Jul 9, 2024 (13 checks passed)
@fabianp (Member) commented Jul 9, 2024

yup. Finally merged 🎉

@SauravMaheshkar deleted the saurav/scale_by_grad_norm branch on July 9, 2024
@SauravMaheshkar (Contributor, author)

> yup. Finally merged 🎉

Thanks a ton. Feels good to have it merged. Now on to #652
