
How to use an adamw optimizer with gradient clipping? #445

Closed Answered by rosshemsley
aniquetahir asked this question in Q&A

If you'd like to use clipping with adamw, you could do something like the following:

opt = optax.chain(
    optax.clip_by_global_norm(1.0),
    optax.adamw(1e-4),
)

This causes the clipping to be applied to the gradients before they are forwarded to the adamw optimizer.

For the scale(-1.0) question: this effectively flips the sign of the updates, since the updates are applied by adding them to the parameters.

Replies: 2 comments 2 replies

Answer selected by fabianp