Refactored and integrated Grey Wolf Optimizer #121

hellkite500 · 2024-04-30T22:43:38Z

As noted in #99, a Grey Wolf Optimizer was implemented by @xfeng2021. This original implementation is captured in the history in 92abc92, and a small suite of unit tests was developed to exercise that implementation for its three key semantics: creation, optimization, and "restart" from a non-zero iteration.

A few minor fixes were required on the original code to get these tests to work correctly, and then a refactor and consolidation of the code was applied.

Note that this PR doesn't yet expose the optimizer to search.py or __main__. That functionality is being reviewed from the original implementation, and I can add it to this PR or open a new one if this gets merged.

Additions

ngen.cal.optimizers subpackage
grey_wolf.py implementation of Grey Wolf Optimization
test_gwo.py for testing the grey wolf implementation

Testing

Tested with pytest locally

Notes

Additional functionality required to integrate the optimizer into the search module

Todos

Could consider some different attribute handling/semantics and possibly different serialization for swarm history/cost attributes.

Checklist

Target Environment support

Linux
MacOS

This commit combines two files implementing the grey wolf optimizer as a pyswarm subclass. The code from each, as originally authored by Xia Feng, is included here in the history. These will be refactored and refined to fit into the general ngen_cal architecture and assist in maintaining the implementation in the future. Co-authored-by: hellkite500 <nels.frazier@noaa.gov>

xfeng2021 · 2024-05-01T20:36:46Z

Thanks for the updates.

…t gwo in public `optimizers` This refactor adds a layer of indirection to safeguard the `optimizers` public api. The `_optimizers` subpackage provides a place for _both_ stable and unstable / experimental optimizers to live. Once an optimizer's api and features are fleshed out, it can then be imported by the `optimizers` subpackage in `__init__.py` effectively moving it into the public api.

robertbartel

I have several things I think need at least sanity checks. I got a bit nit-picky, especially on some simple style things, though I tried to offer suggestions inline for those.

aaraney · 2024-05-02T16:20:25Z

python/ngen_cal/src/ngen/cal/optimizers/grey_wolf.py

+    """_summary_
+
+    Args:
+        SwarmOptimizer (_type_): _description_


Nope, this is just @hellkite500's IDE defaults. @hellkite500, can you add docs for this?

A bit of an aside: I originally started my review for grey_wolf.py before it got moved to the private subpackage, but didn't finish before it got moved, and GitHub made me refresh and, seemingly, required I start over in the "other" grey_wolf.py. So some of my comments may appear to be duplicated.

python/ngen_cal/src/ngen/cal/optimizers/grey_wolf.py

robertbartel · 2024-05-02T14:09:24Z

python/ngen_cal/setup.cfg

@@ -36,6 +36,7 @@ install_requires =
    hydrotools.metrics
    hydrotools.nwis_client
    pyarrow
+    pyswarms


Should we add pyswarms to requirements.txt?

Yeah? I am inclined to remove the root repo requirements.txt in favor of a ns package level requirements.txt instead. In either case, we should use pip freeze to generate requirements.txt instead of just listing out all the deps.

Let's open this as a TODO in an issue and cycle back to this.

python/ngen_cal/src/ngen/cal/optimizers/grey_wolf.py

robertbartel · 2024-05-02T15:36:06Z

python/ngen_cal/src/ngen/cal/_optimizers/grey_wolf.py

+            df (DataFrame): dataframe containing swarm history or cost
+            name (str): history or attribute variable to extract
+            iter_range (Optional[List], optional): If provided, only extract
+                hitory data and the best, otherwise extract cost for current iteration.


Suggested change

hitory data and the best, otherwise extract cost for current iteration.

history data and the best, otherwise extract cost for current iteration.

robertbartel · 2024-05-02T15:37:35Z

python/ngen_cal/src/ngen/cal/_optimizers/grey_wolf.py

+
+        pos = [np.array(df[df['iteration']==i].iloc[:,0:self.dimensions]) for i in iter_range]
+        current_pos = pos[len(pos)-1] 
+        return current_pos, pos


I feel like at least something isn't right here. pos seems like it has to be a List[np.ndarray], which would make current_pos an np.ndarray rather than an int. But I'm not sure if the hints are off or you want the function to be doing something slightly different.
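To make the type question concrete, here is a minimal standalone sketch. It assumes a history DataFrame whose first `dimensions` columns hold positions and which has an `iteration` column; the function name `positions_by_iteration` and the sample data are hypothetical and not part of the PR.

```python
from typing import List, Sequence, Tuple

import numpy as np
import pandas as pd


def positions_by_iteration(
    df: pd.DataFrame, dimensions: int, iter_range: Sequence[int]
) -> Tuple[np.ndarray, List[np.ndarray]]:
    """Return (positions at the last requested iteration, positions per iteration)."""
    pos = [
        df[df["iteration"] == i].iloc[:, 0:dimensions].to_numpy()
        for i in iter_range
    ]
    # The last list element is itself an array of shape (n_particles, dimensions).
    return pos[-1], pos


# Hypothetical history: 2 particles, 2 dimensions, 2 iterations.
history = pd.DataFrame(
    {
        "x0": [0.1, 0.2, 0.3, 0.4],
        "x1": [1.0, 2.0, 3.0, 4.0],
        "iteration": [0, 0, 1, 1],
    }
)
current_pos, all_pos = positions_by_iteration(history, dimensions=2, iter_range=range(2))
```

Under these assumptions `current_pos` is a 2-D array and `all_pos` is a list of such arrays, so a return hint of `Tuple[np.ndarray, List[np.ndarray]]` would match what the snippet computes.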

robertbartel · 2024-05-02T15:41:54Z

python/ngen_cal/src/ngen/cal/_optimizers/grey_wolf.py

+        df.to_csv(fname, index=False)
+        return df
+
+    def read_hist_iter_file(self) -> Tuple: 


This function isn't actually returning anything, despite the hint and docstring.

robertbartel · 2024-05-02T15:54:54Z

python/ngen_cal/src/ngen/cal/_optimizers/grey_wolf.py

+        """Update history."""
+        _hist={ a: getattr(self.swarm, a, None) for a in self._cost_attrs+self._pos_attrs}
+        _hist['mean_pbest_cost'] = np.mean(self.swarm.pbest_cost)
+        _hist['mean_leader_cost'] = np.mean(self.swarm.leader_cost)


The leader_cost attribute for self.swarm seems to be getting added dynamically. Just as a sanity check, are we sure update_history cannot be called on an instance for which self.swarm.leader_cost has not been added?
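One way to guard against a missing attribute, sketched in isolation. This is not the PR's code: `mean_leader_cost` is a hypothetical helper, `SimpleNamespace` stands in for the swarm object, and `statistics.fmean` stands in for `np.mean` to keep the example dependency-free.

```python
from statistics import fmean
from types import SimpleNamespace
from typing import Optional


def mean_leader_cost(swarm) -> Optional[float]:
    """Mean of swarm.leader_cost, or None if the attribute was never set."""
    leader_cost = getattr(swarm, "leader_cost", None)
    if leader_cost is None:
        return None
    return fmean(leader_cost)


# Stand-in swarm objects (hypothetical): one before and one after leader_cost is attached.
fresh_swarm = SimpleNamespace(pbest_cost=[3.0, 5.0])
stepped_swarm = SimpleNamespace(pbest_cost=[3.0, 5.0], leader_cost=[1.0, 2.0, 3.0])

print(mean_leader_cost(fresh_swarm))
print(mean_leader_cost(stepped_swarm))
```

Whether returning None is the right behavior, or whether the attribute should instead be initialized when the swarm is created, is a design choice for the authors.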

robertbartel · 2024-05-02T15:59:11Z

python/ngen_cal/src/ngen/cal/optimizers/__init__.py

@@ -0,0 +1 @@
+from .._optimizers.grey_wolf import GreyWolfOptimizer


Wait, why move GreyWolfOptimizer to the private package but import it here?

This refactor adds a layer of indirection to safeguard the optimizers
public api. The _optimizers subpackage provides a place for both
stable and unstable / experimental optimizers to live. Once an
optimizer's api and features are fleshed out, it can then be imported by
the optimizers subpackage in __init__.py effectively moving it into
the public api.

Gotcha. I was thrown off a bit by having in my head that the class was still experimental, since the description for the PR notes it not being exposed yet.

Yeah, it is a little confusing. I also thought we should not expose it to the public api until we have it "hooked up," but @hellkite500 thought otherwise.

aaraney · 2024-05-01T14:56:25Z

python/ngen_cal/src/ngen/cal/optimizers/__init__.py

@@ -0,0 +1,3 @@
+from .grey_wolf import GreyWolfOptimizer
+
+__all__ = [GreyWolfOptimizer]


I don't think __all__ is needed in this case since we aren't importing anything other than GreyWolfOptimizer in this module. __all__ controls what gets imported when you wildcard import a module (e.g. from ngen.cal.optimizers import *). Regardless, it's kind of annoying, but the members of __all__ must be strings. So, if we choose to keep this, it will need to be changed to:

Suggested change

__all__ = [GreyWolfOptimizer]

__all__ = ["GreyWolfOptimizer"]
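The string requirement can be checked with a small self-contained sketch. The module name `demo_optimizers` and the helper `star_import` are hypothetical; the module is built in memory rather than imported from ngen.cal.

```python
import sys
import types


def star_import(make_all):
    """Wildcard-import a throwaway module whose __all__ is built by `make_all`."""
    mod = types.ModuleType("demo_optimizers")  # hypothetical module name
    mod.GreyWolfOptimizer = type("GreyWolfOptimizer", (), {})
    mod.__all__ = make_all(mod)
    sys.modules[mod.__name__] = mod
    namespace = {}
    try:
        exec("from demo_optimizers import *", namespace)
    finally:
        del sys.modules[mod.__name__]
    return namespace


# String entries work: the name is exported by the wildcard import.
ok = star_import(lambda m: ["GreyWolfOptimizer"])

# Non-string entries fail when the wildcard import runs.
try:
    star_import(lambda m: [m.GreyWolfOptimizer])
    error = None
except TypeError as exc:
    error = exc
```

Note that the failure only appears when someone actually performs a wildcard import, which is why a list of class objects in `__all__` can go unnoticed.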

aaraney · 2024-05-01T15:02:37Z

python/ngen_cal/setup.cfg

@@ -36,6 +36,7 @@ install_requires =
    hydrotools.metrics
    hydrotools.nwis_client
    pyarrow
+    pyswarms


We don't have to do this here, but we should move deps like pyswarms to an [options.extras_require] section. This would let people install ngen.cal[particleswarm], for example, so they pull in only what they need.

aaraney · 2024-05-01T15:29:55Z

python/ngen_cal/src/ngen/cal/optimizers/grey_wolf.py

+import time
+from typing import Tuple, List, Optional
+
+def create_swarm(


Suggested change

-def create_swarm(
+def create_swarm(
+    n_particles: int,
+    dimensions: int,
+    bounds: Optional[Tuple[Union[np.ndarray, List], Union[np.ndarray, List]]] = None,
+    center: Optional[Union[np.ndarray, float]] = 1.0,
+    init_pos: Optional[np.ndarray] = None,
+    options: Optional[Dict[Any, Any]] = None,
+):

aaraney · 2024-05-01T15:31:42Z

python/ngen_cal/src/ngen/cal/optimizers/grey_wolf.py

+    bounds=None,
+    center=1.0,
+    init_pos=None,
+    options={}


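This needs to default to None and be replaced with {} inside the function if it is None. Using a mutable type as a default argument is a footgun in Python, because the default object is created once at function definition time and shared across calls.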
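A minimal illustration of the problem and the usual fix. The functions `risky` and `safe` are hypothetical and unrelated to `create_swarm` beyond sharing the `options` parameter name.

```python
from typing import Any, Dict, Optional


def risky(options: Dict[str, Any] = {}) -> Dict[str, Any]:
    """The default dict is created once and shared by every call."""
    options.setdefault("calls", 0)
    options["calls"] += 1
    return options


def safe(options: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """A fresh dict is created on each call that omits `options`."""
    if options is None:
        options = {}
    options.setdefault("calls", 0)
    options["calls"] += 1
    return options


first_risky = risky()["calls"]
second_risky = risky()["calls"]  # state leaked from the first call
first_safe = safe()["calls"]
second_safe = safe()["calls"]  # independent of the first call
```

With the shared default, the second call to `risky()` sees the counter left behind by the first, while each call to `safe()` starts from an empty dict.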

aaraney · 2024-05-01T15:33:58Z

python/ngen_cal/src/ngen/cal/optimizers/grey_wolf.py

+
+
+class GreyWolfOptimizer(SwarmOptimizer):
+    """_summary_


Can you write up a doc string for this or just remove it, please?

aaraney · 2024-05-02T16:28:16Z

python/ngen_cal/src/ngen/cal/_optimizers/grey_wolf.py

+        """
+
+        # Apply verbosity
+        if self.start_iter>0:


Yeah, there are a lot of style things that I would like to improve in this work. I will run something like black to format this before we merge.

aaraney · 2024-05-02T16:28:31Z

python/ngen_cal/src/ngen/cal/_optimizers/grey_wolf.py

+        # Setup Pool of processes for parallel evaluation
+        pool = None if n_processes is None else mp.Pool(n_processes)
+
+        ftol_history = deque(maxlen=self.ftol_iter)


@hellkite500

aaraney · 2024-05-02T16:29:22Z

python/ngen_cal/src/ngen/cal/_optimizers/grey_wolf.py

+        # Close Pool of Processes
+        if n_processes is not None:
+            pool.close()
+        return (final_best_cost, final_best_pos)


Totally agree, this will get reformatted before we merge.
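A later commit in this PR is titled "refactor: use ctx manager". What that change looked like is not shown here; the following is only one possible shape for managing an optional pool with a `with` block. The function `evaluate` is hypothetical, and `multiprocessing.pool.ThreadPool` is used as a stand-in for `mp.Pool` so the sketch runs without pickling concerns.

```python
from contextlib import nullcontext
from multiprocessing.pool import ThreadPool
from typing import Callable, List, Optional, Sequence


def evaluate(
    objective: Callable[[Sequence[float]], float],
    positions: Sequence[Sequence[float]],
    n_workers: Optional[int] = None,
) -> List[float]:
    """Evaluate `objective` on each position, in parallel if n_workers is given."""
    # nullcontext() yields None, so the serial path needs no separate cleanup branch.
    pool_ctx = nullcontext() if n_workers is None else ThreadPool(n_workers)
    with pool_ctx as pool:
        if pool is None:
            return [objective(p) for p in positions]
        return pool.map(objective, positions)


serial = evaluate(sum, [[1.0, 2.0], [3.0, 4.0]])
parallel = evaluate(sum, [[1.0, 2.0], [3.0, 4.0]], n_workers=2)
```

Leaving the `with` block calls `terminate()` on the pool rather than `close()`, so results should be collected inside the block, as the blocking `map` call does here.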

aaraney · 2024-05-02T16:39:10Z

python/ngen_cal/src/ngen/cal/_optimizers/grey_wolf.py

+    See `pyswarms.base.SwarmOptimizer`'s documentation for more information.
+
+    Args:
+        SwarmOptimizer (_type_): _description_


@hellkite500, can you add some docs here? Posting them in this comment thread is fine too if you want me to just push them up and add you as the author.

python/ngen_cal/src/ngen/cal/_optimizers/grey_wolf.py

Co-authored-by: Robert Bartel <37884615+robertbartel@users.noreply.github.com>

hellkite500 and others added 9 commits April 30, 2024 15:22

feat(ngen_cal): create initial optimizers subpackage

7a6fd5c

test: add initial test of grey wolf optimizer

aea9508

test: test grew wolf optimize function

2dde645

test: test grew wolf restart from iteration

79f25e0

fix(grey_wolf): fix original implementation to pass restart test

1d6dad4

refactor(grey_wolf): refactor and condense grew wolf implementation

e52b78f

feat: expose GrewWolfOptimizer from optimizers subpackage

98d23e3

test: update grew wolf test for refactored optmizer

7cccd3a

hellkite500 requested review from robertbartel, aaraney, xfeng2021 and Ben-Choat April 30, 2024 22:43

doc: v0.3 changelog draft, setup dependencies updated

20d716b

aaraney added 3 commits May 2, 2024 10:10

chore: remove erroneous __all__

f964aba

style: sort imports

b567b53

robertbartel requested changes May 2, 2024


aaraney force-pushed the gwo-impl branch from c671146 to d2bab85 Compare May 2, 2024 16:24

aaraney reviewed May 2, 2024


aaraney and others added 5 commits May 2, 2024 12:48

chore: add type hints and improve signatures

3c9f431

refactor: use ctx manager; move code to .

8d03bc4

chore: add return type hint

3ee74f6

Co-authored-by: Robert Bartel <37884615+robertbartel@users.noreply.github.com>

chore: improve doc type hints

1dca37d

Co-authored-by: Robert Bartel <37884615+robertbartel@users.noreply.github.com>

style: use google style doc strings

a12cd96

aaraney force-pushed the gwo-impl branch from 4ae036d to a12cd96 Compare May 2, 2024 16:48
