Design of CIL Algorithms and the base class #1064
Replies: 2 comments
---
This is an example from PyTorch for [Stochastic Gradient Descent](https://pytorch.org/docs/stable/_modules/torch/optim/sgd.html#SGD).
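The relevant part of that pattern can be sketched in plain Python (a sketch of the validation style, not PyTorch's actual source): required values such as the learning rate have no default and are validated eagerly in `__init__`.

```python
# Sketch of the validation pattern used by optimisers such as
# torch.optim.SGD (illustrative, not PyTorch's actual code):
# the learning rate is a required named parameter, checked eagerly.
class SGDLike:
    def __init__(self, params, lr, momentum=0.0, weight_decay=0.0):
        if lr < 0.0:
            raise ValueError(f"Invalid learning rate: {lr}")
        if momentum < 0.0:
            raise ValueError(f"Invalid momentum value: {momentum}")
        self.params = list(params)
        self.lr = lr
        self.momentum = momentum
        self.weight_decay = weight_decay
```

Because `lr` has no default, forgetting it fails immediately instead of silently running with a wrong value.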
---
I believe @gfardell @jakobsj @paskino discussed a similar topic. Thanks for bringing this up, it is a great topic for the developer guidelines. The agreement is that for each class there are essential and non-essential parameters. The non-essential parameters can be further divided into often-configured and advanced parameters:
To create an instance of a class, the creator of the class should require the essential and often-configured parameters as named parameters. It should not accept positional arguments. I'd argue the same holds for all iterative algorithms. Looking at the `Algorithm` base class, the parameters are defined in CIL/Wrappers/Python/cil/optimisation/algorithms/Algorithm.py, lines 39 to 66 in 843b899. Trying to answer all your questions:
In the case of FDK the only essential parameter is the …
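The essential / often-configured / advanced split could look like this (a minimal sketch with hypothetical names, not CIL's actual API):

```python
# Hypothetical sketch: essential and often-configured parameters are
# keyword-only named parameters; advanced ones keep sensible defaults.
class ExampleAlgorithm:
    def __init__(self, *,                       # '*' forbids positional args
                 initial,                        # essential
                 objective_function,             # essential
                 step_size=1.0,                  # often configured
                 max_iteration=100,              # advanced (base class)
                 update_objective_interval=1):   # advanced (base class)
        self.initial = initial
        self.objective_function = objective_function
        self.step_size = step_size
        self.max_iteration = max_iteration
        self.update_objective_interval = update_objective_interval
```

The bare `*` enforces "should not accept positional arguments": `ExampleAlgorithm(x0, f)` raises a `TypeError`.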
---
I think it is time to decide the design style for our algorithms and the base class `Algorithm`. At the moment, we have the following algorithms:

Very soon, more algorithms will be added to CIL, so it is urgent to decide the style that we want for our users.
In order to define our algorithms, we use 2 methods: `__init__` and `set_up`. The `__init__` calls the `set_up` method with the same signature. In most or all of the algorithms, the `__init__` method does not do anything else. In practice, using the `kwargs` in the signature of `__init__`, we have access to two important `kwargs` from the base `Algorithm` class: `max_iteration` and `update_objective_interval`.
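The `__init__`/`set_up` pattern described above can be sketched as follows (illustrative stand-ins, not CIL's actual classes):

```python
# Illustrative stand-ins for the CIL pattern: __init__ forwards the
# base-class kwargs and then calls set_up with the same signature.
class Algorithm:
    def __init__(self, **kwargs):
        self.max_iteration = kwargs.get('max_iteration', 0)
        self.update_objective_interval = kwargs.get('update_objective_interval', 1)

class GD(Algorithm):
    def __init__(self, initial=None, objective_function=None,
                 step_size=None, **kwargs):
        super().__init__(**kwargs)   # picks up the base-class kwargs
        self.set_up(initial=initial, objective_function=objective_function,
                    step_size=step_size)

    def set_up(self, initial, objective_function, step_size):
        self.x = initial
        self.objective_function = objective_function
        self.step_size = step_size
```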
Let's focus on a specific example, e.g., the `GD` algorithm (`GradientDescent`).
In the `GD` class, we have:

To configure the Gradient Descent algorithm, we need 3 things:

- an initial point $x^{0}$,
- a step size $\gamma^{n}$,
- a differentiable function $f$ from the `Function` class,

used in the update $x^{n+1} = x^{n} - \gamma^{n}\nabla f(x^{n})$.
See for example GradientDescent.
In general (non-convex/strongly convex objective), the initial point is very important. Also, the step size is important for the convergence speed, and we certainly need a function that is differentiable. Finally, for an algorithm the number of iterations is also an important parameter.
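Written out, the update rule amounts to the following plain-Python sketch (using $f(x)=x^2$, so $\nabla f(x)=2x$, purely as an example):

```python
# The update x^{n+1} = x^n - gamma * grad f(x^n), iterated n_iter times.
def gradient_descent(x0, grad, step_size, n_iter):
    x = x0
    for _ in range(n_iter):
        x = x - step_size * grad(x)
    return x

# Minimising f(x) = x^2 (gradient 2x) from x0 = 5.0:
x_min = gradient_descent(x0=5.0, grad=lambda x: 2.0 * x,
                         step_size=0.1, n_iter=100)
```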
At the moment, these arguments are by default `None`, which in my opinion is wrong. These should be required parameters.

Let's continue with the `kwargs`. In practice, we have 2 types of `kwargs`:

1. `kwargs` that are used by the corresponding algorithm. For example in `GD`, we have `kwargs` that are used in the `armijo_rule` and also `kwargs` that are used in the `should_stop` method of `GD`, which basically overrides the `should_stop` of the `Algorithm` base class.
2. `kwargs` from the `Algorithm` base class, e.g., `max_iteration`, `update_objective_interval`.
At the moment the UI for `GD` is:

or
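A hypothetical sketch of that current UI (the signature is illustrative, not CIL's actual `GD`): an unchecked `**kwargs` silently swallows a misspelled keyword such as `rate`.

```python
# Illustrative only: 'rate' is accepted but never used, because the
# real parameter is called step_size and **kwargs is not validated.
class GD:
    def __init__(self, initial=None, objective_function=None,
                 step_size=None, **kwargs):
        self.initial = initial
        self.objective_function = objective_function
        self.step_size = step_size
        self.kwargs = kwargs         # 'rate' ends up here, unused

g = GD(initial=0.0, objective_function=lambda x: x * x, rate=0.01)
# g.step_size is still None: the typo goes unnoticed
```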
Note: In the above example, `rate` is passed as a kwarg, but there is no actual `rate` parameter in the `GD` class. Therefore, it is not used; the correct name is `step_size`. We need to be careful about what is passed in kwargs and whether it is used. For example, we need to check for the allowed kwargs, e.g., `atol`, `rtol`, `alpha`, `beta`.
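One possible way to do that check (a sketch, not CIL's implementation): compare the received kwargs against an allow-list and fail loudly on anything unknown.

```python
# Allow-list check for kwargs; unknown names raise instead of being
# silently ignored. The set below is illustrative.
ALLOWED_KWARGS = {'atol', 'rtol', 'alpha', 'beta',
                  'max_iteration', 'update_objective_interval'}

def check_kwargs(**kwargs):
    unknown = set(kwargs) - ALLOWED_KWARGS
    if unknown:
        raise TypeError(f"Unexpected keyword arguments: {sorted(unknown)}")
    return kwargs
```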
Below are some questions:
**Question 1:** What do we consider as required parameters for an algorithm?
**Question 2:** How do we configure required parameters for an algorithm?
**Question 3:** What do we consider as optional parameters (kwargs) for an algorithm?
**Question 4:** What do we consider as optional parameters (kwargs) for the algorithm base class?
**Question 5:** How do we configure kwargs parameters? What do we want for the UI of an algorithm?
Personally, provided that we check for the allowed kwargs (of the algorithm and its base class), I like the following UI:
Another option:
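The two UIs could be contrasted roughly like this (the `GD` and `Quadratic` classes here are hypothetical stand-ins, not CIL code):

```python
# Hypothetical stand-ins to contrast two construction styles.
class GD:
    def __init__(self, *, initial, objective_function, step_size,
                 max_iteration=0, update_objective_interval=1):
        self.x = initial
        self.objective_function = objective_function
        self.step_size = step_size
        self.max_iteration = max_iteration
        self.update_objective_interval = update_objective_interval

    def run(self, iterations=None):
        n = self.max_iteration if iterations is None else iterations
        for _ in range(n):
            self.x -= self.step_size * self.objective_function.gradient(self.x)

class Quadratic:                     # plays the role of a CIL Function
    def __call__(self, x):
        return x * x
    def gradient(self, x):
        return 2.0 * x

# Option 1: everything as named arguments to the constructor
gd = GD(initial=5.0, objective_function=Quadratic(), step_size=0.1,
        max_iteration=100, update_objective_interval=10)
gd.run()

# Option 2: essential parameters at construction, iterations at run time
gd2 = GD(initial=5.0, objective_function=Quadratic(), step_size=0.1)
gd2.run(iterations=100)
```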
I will continue with the discussion adding more examples.