why separate apply_updates from update? #155

Answered by mtthss
jeremiecoullon asked this question in Q&A
Hello,

Separating the transformation of updates from their application to the params has several advantages:

  1. It allows you to combine multiple transformations using chain.
    For example, you can create custom optimisers by chaining together existing gradient transformations,
    without having to rewrite the whole thing as a single monolithic optimiser.
    As a trivial example, you might first clip the gradients and then rescale them using Adam, or vice versa,
    but you can also build more sophisticated combinations.
    If you take a look at alias.py, you can see that many popular optimisers are actually built from a relatively small set of primitives;
    by freely combining these you can experiment …

Answer selected by jeremiecoullon