-
Notifications
You must be signed in to change notification settings - Fork 193
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
For backtracking linesearch: - Add debugging option for backtracking_linesearch. - Add info entry in BacktrackingLinesearchState to potentially help debugging by looking at outputs (could be useful for example in vmap setting, and mimics the setup for the zoom linesearch). - Adding mechanism to prevent the linesearch to make a step if that would end up getting NaNs or infinite values in the function. For zoom_linesearch: - Simplifies a bit the debugging information for the zoom linesearch and added prints of some relevant values for debugging. - Added a note in the zoom linesearch that using curv_tol=inf, would let this method make an efficient alternative to the backtracking linesearch using polynomial interpolation strategies. - Most importantly, added an option to define the initial guess for the linesearch. Looking up Nocedal and Wright, this initial guess should always be one for Newton or quasi-Newton methods. Could be refined for other methods (for now, for such other methods like gradient descent, we may simply keep the previous learning rate). This largely improved the performance in the public notebook. For lbfgs: - Use clipped gradient step for the very first step (when scale_init_precond=True). The scale of the preconditioner for the very first iteration is not detailed anywhere in the literature I've seen. But using such clipped gradient step ensures to capture approximately the right scale. This made for example one of the tests pass without any further modifications of the default hyperparameters of the objective. - Revised the notebook in view fo these changes. Added some tips and an example of benchmark. PiperOrigin-RevId: 694183028
- Loading branch information
Showing
5 changed files
with
369 additions
and
121 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.