
Control over Learning Rate and/or initial step size in linesearch #108

matillda123 opened this issue Jan 8, 2025 · 8 comments

@matillda123

Hi,

Is there a convenient way to influence the step size of the available minimizers/least-squares solvers? (E.g. in optax every optimizer takes a learning_rate argument.)
So far, the only control I have found with the optimistix solvers is mutating solver.search.__dict__["step_init"], which doesn't seem right.

I guess one could just define a custom solver, but it seems odd that optimistix offers so much customizability through its modular approach, while the predefined solvers expose so little user control.

@johannahaffner
Contributor

johannahaffner commented Jan 8, 2025

Hi Matilda,

you could indeed define a custom solver by re-using the searches and descents and tweaking their constants, if you desire smaller steps on average.

A learning rate would not necessarily be a meaningful concept when used with a backtracking line search or a classical trust region. These choose the step length based on optimality criteria, rather than stepping in some direction regardless of whether this actually improves the objective function (by the expected amount).

You could still bias these toward smaller step sizes (e.g. by requiring that BacktrackingArmijo starts with a step size of 0.5, rather than 1.0), but this would not alter their acceptance criteria.
For the trust region radius, you could change the value of the high_constant and low_constant parameters, which are used to multiply the radius depending on whether the search judges the objective to be sufficiently quadratic.

Instead of tweaking solver attributes as above, you can also modify the solvers like this:

```python
import equinox as eqx
import optimistix as optx

solver = optx.SomeSolver(...)
search = optx.BacktrackingArmijo(..., step_init=...)  # use your pick of constants

# Solvers are immutable, so swap the search in out-of-place with tree_at.
get = lambda s: s.search
solver = eqx.tree_at(get, solver, search)
```

I do this when I quickly want to try something out and don't want to bother writing down the attributes and constructor, but if I were actually going to use the solver many times, I'd do it properly :)

@patrick-kidger
Owner

@matillda123 do you have an example of the kind of thing you'd like to do, using a particular solver for clarity? :)

@matillda123
Author

I'm not sure if it's possible to give a short example here. What I'm trying to do is basically fit a high-dimensional function to some data; the function essentially consists of some FFTs.
It's similar to what is described in this paper.

I'm not using one specific solver; rather, I'm trying to get a feeling for which solvers work nicely for this problem.
I have the suspicion that some solvers may sometimes work "significantly" better when the step size is biased towards bigger or smaller values. I noticed something like that in optax when tuning learning_rate with adam (+linesearch) and lbfgs.
So I was looking for a way to test this with the predefined optimistix solvers and didn't find one.

I hope it's clearer now what the issue is.

@johannahaffner
Contributor

I see, tough optimisation problem!

It sounds like you want to do hyperparameter tuning - what do you get if a solver is not working well? Convergence to a local minimum, running out of steps, divergence?

And are you working with complex numbers, or have you split the problem into complex and real parts?

@matillda123
Author

I have real inputs but I use those to make complex numbers.
If a solver doesn't work well, for me that generally means it's stuck in a local minimum or converging really slowly.

@johannahaffner
Contributor

johannahaffner commented Jan 9, 2025

So optimistix only ever sees real numbers?

The functions you are working with can have several local minima, right? In this case, you might want to try multi-starting your optimisation.
If you'd like step sizes larger than one wherever your optimisation landscape looks suitably quadratic, maybe try optx.BFGS with a ClassicalTrustRegion? The trust region radius can get larger than one, and has the widest range of step sizes, since these are updated multiplicatively and can increase as well as decrease. You can also take a look at the combinations here, such as

class BFGSIndirectDampedNewton(optx.AbstractBFGS):

(It sounds like you tried BacktrackingArmijo, which will start with the full Newton step in BFGS and then chop that in half until it's satisfied.)

Those are but two suggestions - I'm afraid I can't say much more with only a basic idea of what your problem is.

@matillda123
Author

Yes, optimistix should only see real numbers.

So far I have only used standard solvers (mainly BFGS, NonlinearCG, GaussNewton and Dogleg). I think Dogleg uses a trust region. But it's tricky, as that is one of the cases which can get stuck in local minima.

Alright, so I guess there is no way around modifying the solvers with different searches and tuning their parameters.

Thanks for the suggestions. :)

@johannahaffner
Contributor

Alright! Happy solving :)
