SVRG #226

Answered by mkunesch
ulricharmel asked this question in Q&A
Nov 13, 2021 · 1 comment

Hi! Thanks for the question!

I'll answer for the equations of SVRG as outlined in this blog post, i.e. the update rule:

w_t = w_{t−1} − η_t [∇ψ_{i_t}(w_{t−1}) − ∇ψ_{i_t}(w̃) + ∇P(w̃)],

with variable names as in the blog post.

optax is designed so that users calculate gradients and apply updates themselves (see #155 for the reasons for the latter). The best approach with optax would therefore be to let the user calculate the three gradient terms in the square brackets, transform them with optax gradient transforms if necessary, and then write a custom apply_updates function, similar to optax.apply_updates, that implements the equation above. This function would be a great contribution to u…
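As a rough illustration of the approach described above, here is a minimal sketch of what such a custom function might look like. The name `svrg_apply_updates`, its argument names, the toy least-squares problem, and the fixed learning rate are all assumptions made for this example; none of them are part of optax. For simplicity, the index i_t is cycled deterministically rather than sampled at random, and the gradients are used untransformed.

```python
import jax
import jax.numpy as jnp


def svrg_apply_updates(params, grad_i, grad_i_snapshot, full_grad_snapshot,
                       learning_rate):
    """Hypothetical helper: w_t = w_{t-1} - eta * [g_i(w_{t-1}) - g_i(w~) + grad P(w~)].

    All four tree arguments must be pytrees with the same structure.
    """
    return jax.tree_util.tree_map(
        lambda w, g, g_snap, mu: w - learning_rate * (g - g_snap + mu),
        params, grad_i, grad_i_snapshot, full_grad_snapshot,
    )


# Toy problem (an assumption for this sketch):
# psi_i(w) = 0.5 * (x_i . w - y_i)^2 and P(w) = mean_i psi_i(w).
def psi(w, x, y):
    return 0.5 * (jnp.dot(x, w) - y) ** 2


def full_loss(w, xs, ys):
    return jnp.mean(jax.vmap(psi, in_axes=(None, 0, 0))(w, xs, ys))


xs = jnp.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]])
ys = xs @ jnp.array([2.0, -1.0])

w = jnp.zeros(2)
initial_loss = full_loss(w, xs, ys)

for epoch in range(30):
    # Snapshot w~ and its full gradient grad P(w~), fixed for the inner loop.
    w_snapshot = w
    mu = jax.grad(full_loss)(w_snapshot, xs, ys)
    for i in range(xs.shape[0]):
        # The user computes the three gradient terms themselves.
        g = jax.grad(psi)(w, xs[i], ys[i])
        g_snap = jax.grad(psi)(w_snapshot, xs[i], ys[i])
        w = svrg_apply_updates(w, g, g_snap, mu, learning_rate=0.1)

final_loss = full_loss(w, xs, ys)
```

Because the helper uses `jax.tree_util.tree_map`, it works on nested parameter structures (dicts, tuples, and so on) as well as on plain arrays, in the same spirit as `optax.apply_updates`.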

Answer selected by ulricharmel