Step size should be defined by the optimizee #26

Open · anandtrex opened this issue Apr 5, 2017 · 1 comment

@anandtrex (Contributor)

We want the step size to be a hyper-parameter of the optimizee rather than of the optimizer, since each optimizee has different requirements and may even require a different step size for each dimension.

One solution suggested by @guillaumeBellec is to let the current bounding function also scale individuals, so that all optimizers only ever have to generate individuals between 0 and 1. The optimizee knows the required scaling along with the bounds and returns an appropriately scaled and bounded individual.
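For concreteness, a minimal sketch of what such a function might look like (the name `bound_and_scale` and the example bounds are assumptions for illustration, not the project's actual API):

```python
import numpy as np

# Hypothetical per-dimension bounds declared by the optimizee,
# e.g. a membrane time constant (ms) and a learning rate.
lower = np.array([10.0, 1e-5])
upper = np.array([30.0, 1e-2])

def bound_and_scale(individual):
    """Map an individual generated by the optimizer in [0, 1]^n
    onto the optimizee's native parameter ranges."""
    individual = np.clip(individual, 0.0, 1.0)   # bounding
    return lower + individual * (upper - lower)  # scaling
```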

We have to think about how this would affect the optimizers we have, since it would change the error landscape they see. Specifically, how would the cross-entropy optimizer have to be modified to support this (i.e. how would it compensate for the scaling)?

@maharjun @bohnstingl @franzscherr Any thoughts about this?

Other possible ways to achieve an optimizer-independent step size are also welcome.

anandtrex changed the title from "Step size should be implicitly defined by the optimizee" to "Step size should be defined by the optimizee" on Apr 5, 2017
@guillaumeBellec (Collaborator) commented Apr 7, 2017

The problem is that some parameters are typically around 20 (time constants in ms) while others are around 1e-3 (learning rates), and good values are expected to lie in different ranges (10 to 30 ms for a membrane time constant, 1e-5 to 1e-2 for a learning rate).
Besides that, the learning rate might vary on a logarithmic scale (which is another topic). A step size of 0.01 does not have the same meaning in both cases: probably 0.3 would be a good step size for the time constant and 0.0001 for the learning rate. Crucially, we don't want to waste 1000 iterations of adaptive learning rates figuring out this scaling when it is already clear from the viewpoint of the user who designs the Optimizee.
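To make that concrete, here is a small numerical sketch using the ranges above (mapping the learning rate on a log scale is my assumption of what "varying on a logarithmic scale" would mean in practice):

```python
import numpy as np

# A step of 0.01 in a common scaled space corresponds to very
# different native step sizes for the two parameters.

# Membrane time constant: linear scale over 10..30 ms.
d_tau = 0.01 * (30.0 - 10.0)          # -> 0.2 ms per scaled step

# Learning rate: logarithmic scale over 1e-5..1e-2.
d_log = 0.01 * (np.log10(1e-2) - np.log10(1e-5))
factor = 10 ** d_log                  # -> each step multiplies lr by ~1.07
```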

I see two alternatives; at the very least it would be good to share a common philosophy on how to implement this:

  • What is currently meant to be done, as far as I understood, is to pass all the step sizes as parameters to each optimizer. The drawback is that it takes some time to redefine the hyperparameters of each Optimizer when one wants to try a new optimizer.

  • Minimize the number of optimizer-specific step-size-like arguments: scale all parameters on the optimizee side before communicating with the optimizer, so the optimizer can be programmed by default for scaled parameters (zero mean, with good values expected between -1 and 1).

If we prefer the second philosophy, a fast way of automating this is to implement it once when the Optimizee is defined, in something like the bounding function. This bounding function would probably require some modifications and should then be renamed something like bounding_and_scaling. It might still raise additional issues: for instance, would the result report contain a list of non-meaningful parameters, and should the natural scaling be used in those reports instead? If we have a scaling function between optimizer and optimizee, I think we actually need a reverse_scaling function between optimizee and optimizer as well; would implementing both make things messier? Should we just declare the value ranges and have something in the base class that does the scaling for us?
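As a starting point, a minimal sketch of the base-class idea; the class and method names (`ScaledOptimizee`, `scale`, `reverse_scale`) and the `log_scale` flag are hypothetical, not existing project API:

```python
import numpy as np

class ScaledOptimizee:
    """Hypothetical base class: subclasses declare native bounds
    (and, optionally, which dimensions live on a log scale); the
    scaling in both directions is then derived automatically."""

    # Example declarations a subclass would override.
    lower = np.array([10.0, 1e-5])
    upper = np.array([30.0, 1e-2])
    log_scale = np.array([False, True])

    def _transformed_bounds(self):
        lo, hi = self.lower.copy(), self.upper.copy()
        lo[self.log_scale] = np.log10(lo[self.log_scale])
        hi[self.log_scale] = np.log10(hi[self.log_scale])
        return lo, hi

    def scale(self, individual):
        """Optimizer space [-1, 1]^n -> native optimizee parameters."""
        lo, hi = self._transformed_bounds()
        x = lo + (np.clip(individual, -1.0, 1.0) + 1.0) / 2.0 * (hi - lo)
        x[self.log_scale] = 10.0 ** x[self.log_scale]
        return x

    def reverse_scale(self, params):
        """Native optimizee parameters -> optimizer space [-1, 1]^n."""
        lo, hi = self._transformed_bounds()
        p = np.asarray(params, dtype=float).copy()
        p[self.log_scale] = np.log10(p[self.log_scale])
        return 2.0 * (p - lo) / (hi - lo) - 1.0
```

Since the two methods are exact inverses of each other, result reports could call reverse_scale's counterpart to show natively scaled values while the optimizers only ever see [-1, 1].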

@Tyrannas you probably want to think about this for the Neftci LTL optimizee; keep an eye on this issue to see what solutions people come up with.
