Step size should be defined by the optimizee #26

Open · anandtrex opened this issue Apr 5, 2017 · 1 comment

@anandtrex (Contributor)

We want the step size to be a hyper-parameter of the optimizee rather than of the optimizer, since each optimizee has different requirements and may even require a different step size for each dimension.

One solution suggested by @guillaumeBellec is to let the current bounding function also scale individuals, so that all optimizers only ever have to generate individuals between 0 and 1. The optimizee knows the required scaling along with the bounds and returns an appropriately scaled and bounded individual.
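For concreteness, a minimal sketch of what such a function might look like (the name `bound_and_scale` and the example bounds are assumptions for illustration, not the project's actual API):

```python
import numpy as np

# Hypothetical per-dimension bounds declared by the optimizee,
# e.g. a membrane time constant (ms) and a learning rate.
lower = np.array([10.0, 1e-5])
upper = np.array([30.0, 1e-2])

def bound_and_scale(individual):
    """Map an individual generated by the optimizer in [0, 1]^n
    onto the optimizee's native parameter ranges."""
    individual = np.clip(individual, 0.0, 1.0)   # bounding
    return lower + individual * (upper - lower)  # scaling
```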

We have to think about how this would affect the optimizers we have, since it would change the error landscape they see. Specifically, how would the cross-entropy optimizer have to be modified to support this (i.e. how would it compensate for the scaling)?

@maharjun @bohnstingl @franzscherr Any thoughts about this?

Other possible ways to achieve an optimizer-independent step size are also welcome.

anandtrex changed the title from "Step size should be implicitly defined by the optimizee" to "Step size should be defined by the optimizee" on Apr 5, 2017
@guillaumeBellec (Collaborator) commented Apr 7, 2017

The problem is that some parameters are typically around 20 (time constants in ms) while others are around 1e-3 (learning rates), and good values are expected to lie in different ranges (10 to 30 ms for a membrane time constant, 1e-5 to 1e-2 for a learning rate).
Besides that, the learning rate might vary on a logarithmic scale (which is another topic). A step size of 0.01 does not have the same meaning in both cases: probably 0.3 would be a good step size for the time constant and 0.0001 for the learning rate. Crucially, we don't want to waste 1000 iterations of adaptive learning rates figuring out this scaling when it is already clear from the viewpoint of the user who designs the Optimizee.
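To make that concrete, here is a small numerical sketch using the ranges above (mapping the learning rate on a log scale is my assumption of what "varying on a logarithmic scale" would mean in practice):

```python
import numpy as np

# A step of 0.01 in a common scaled space corresponds to very
# different native step sizes for the two parameters.

# Membrane time constant: linear scale over 10..30 ms.
d_tau = 0.01 * (30.0 - 10.0)          # -> 0.2 ms per scaled step

# Learning rate: logarithmic scale over 1e-5..1e-2.
d_log = 0.01 * (np.log10(1e-2) - np.log10(1e-5))
factor = 10 ** d_log                  # -> each step multiplies lr by ~1.07
```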

I see two alternatives; at the very least it would be good to share a common philosophy on how to implement this:

  • What is currently meant to be done, as far as I understood, is to pass all the step sizes as parameters to each optimizer. The drawback is that it takes some time to redefine the hyperparameters of each Optimizer when one wants to try a new optimizer.

  • Minimize the number of optimizer-specific step-size-like arguments: scale all parameters on the optimizee side before communicating with the optimizer, so the optimizer can be programmed by default for scaled parameters (zero mean, with good values expected between -1 and 1).

If we prefer the second philosophy, a fast way of automating this is to implement it once when the Optimizee is defined, in something like the bounding function. This bounding function would probably require some modifications and should then be renamed something like bounding_and_scaling. It might still raise additional issues: for instance, would the result report contain a list of non-meaningful parameters, and should the natural scaling be used in those reports instead? If we have a scaling function between optimizer and optimizee, I think we actually need a reverse_scaling function between optimizee and optimizer as well; would implementing both make things messier? Should we just declare the value ranges and have something in the base class that does the scaling for us?
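As a starting point, a minimal sketch of the base-class idea; the class and method names (`ScaledOptimizee`, `scale`, `reverse_scale`) and the `log_scale` flag are hypothetical, not existing project API:

```python
import numpy as np

class ScaledOptimizee:
    """Hypothetical base class: subclasses declare native bounds
    (and, optionally, which dimensions live on a log scale); the
    scaling in both directions is then derived automatically."""

    # Example declarations a subclass would override.
    lower = np.array([10.0, 1e-5])
    upper = np.array([30.0, 1e-2])
    log_scale = np.array([False, True])

    def _transformed_bounds(self):
        lo, hi = self.lower.copy(), self.upper.copy()
        lo[self.log_scale] = np.log10(lo[self.log_scale])
        hi[self.log_scale] = np.log10(hi[self.log_scale])
        return lo, hi

    def scale(self, individual):
        """Optimizer space [-1, 1]^n -> native optimizee parameters."""
        lo, hi = self._transformed_bounds()
        x = lo + (np.clip(individual, -1.0, 1.0) + 1.0) / 2.0 * (hi - lo)
        x[self.log_scale] = 10.0 ** x[self.log_scale]
        return x

    def reverse_scale(self, params):
        """Native optimizee parameters -> optimizer space [-1, 1]^n."""
        lo, hi = self._transformed_bounds()
        p = np.asarray(params, dtype=float).copy()
        p[self.log_scale] = np.log10(p[self.log_scale])
        return 2.0 * (p - lo) / (hi - lo) - 1.0
```

Since the two methods are exact inverses of each other, result reports could call reverse_scale's counterpart to show natively scaled values while the optimizers only ever see [-1, 1].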

@Tyrannas you probably want to think about this for the Neftci LTL optimizee; keep an eye on this issue to see what solutions people come up with.
