-
Notifications
You must be signed in to change notification settings - Fork 18
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactored and integrated Grey Wolf Optimizer #121
base: master
Are you sure you want to change the base?
Conversation
This commit combines two files implementing the grey wolf optimizer as a pyswarm sublcass. The code from each as originaly authored by Xia Feng are included here in the history. These will be refactored and refined to fit into the general ngen_cal architecture and assist in maintaining the implementation in the future. Co-authored-by: hellkite500 <nels.frazier@noaa.gov>
Thanks for the updates. |
…t gwo in public `optimizers` This refactor adds a layer of indirection to safeguard the `optimizers` public api. The `_optimizers` subpackage provides a place for _both_ stable and unstable / experimental optimizers to live. Once an optimizer's api and features are fleshed out, it can then be imported by the `optimizers` subpackage in `__init__.py` effectively moving it into the public api.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have several things I think need at least sanity checks. I got a bit nit-picky, especially on some simple style things, though I tried to offer suggestion inline for those things.
"""_summary_ | ||
|
||
Args: | ||
SwarmOptimizer (_type_): _description_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, this is just @hellkite500's IDE defaults. @hellkite500, can you add docs for this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A bit of an aside: I originally started my review for grey_wolf.py
before it got moved to the private subpackage, but didn't finish before it got moved, and Github made me refresh and, seemingly, required I start over in the "other" grey_wolf.py
. So some of my comments may appear to be duplicated.
@@ -36,6 +36,7 @@ install_requires = | |||
hydrotools.metrics | |||
hydrotools.nwis_client | |||
pyarrow | |||
pyswarms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we add pyswarms
to requirements.txt?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah? I am inclined to remove the root repo requirements.txt
in favor of a ns package level requirements.txt
instead. In either case, we should use pip freeze
to generate requirements.txt
instead of just listing out all the deps.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lets open this as a TODO in an issue and cycle back to this.
df (DataFrame): dataframe containing swarm history or cost | ||
name (str): history or attribute variable to extract | ||
iter_range (Optional[List], optional): If provided, only extract | ||
hitory data and the best, otherwise extract cost for current iteration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hitory data and the best, otherwise extract cost for current iteration. | |
history data and the best, otherwise extract cost for current iteration. |
|
||
pos = [np.array(df[df['iteration']==i].iloc[:,0:self.dimensions]) for i in iter_range] | ||
current_pos = pos[len(pos)-1] | ||
return current_pos, pos |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like at least something isn't right here. pos
seems like it has to be a List[np.ndarray]
, which would make current_pos
an np.ndarray
rather than an int
. But I'm not sure if the hints are the off or you want the function to be doing something slightly different.
df.to_csv(fname, index=False) | ||
return df | ||
|
||
def read_hist_iter_file(self) -> Tuple: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function isn't actually returning anything, despite the hint and docstring.
"""Update history.""" | ||
_hist={ a: getattr(self.swarm, a, None) for a in self._cost_attrs+self._pos_attrs} | ||
_hist['mean_pbest_cost'] = np.mean(self.swarm.pbest_cost) | ||
_hist['mean_leader_cost'] = np.mean(self.swarm.leader_cost) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The leader_cost
attribute for self.swarm
seems to be getting added dynamically. Just as a sanity check, are we sure update_history
cannot be called on an instance for which self.swarm.leader_cost
has not been added?
@@ -0,0 +1 @@ | |||
from .._optimizers.grey_wolf import GreyWolfOptimizer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Wait, why move GreyWolfOptimizer to the private package but import it here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This refactor adds a layer of indirection to safeguard the optimizers
public api. The _optimizers
subpackage provides a place for both
stable and unstable / experimental optimizers to live. Once an
optimizer's api and features are fleshed out, it can then be imported by
the optimizers
subpackage in __init__.py
effectively moving it into
the public api.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotcha. I was thrown off a bit by having in my head that the class was still experimental, since the description for the PR notes it not being exposed yet.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it is a little confusing. I also thought we should not expose it to the public api until we have it "hooked up," but @hellkite500 thought otherwise.
@@ -0,0 +1,3 @@ | |||
from .grey_wolf import GreyWolfOptimizer | |||
|
|||
__all__ = [GreyWolfOptimizer] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think __all__
is needed in this case since we aren't importing anything other than GreyWolfOptimizer
in this module. __all__
controls what gets imported when you wildcard import (e.g. from ngen.cal.optimizers import *
) a module. Regardless, its kind of annoying, but the members of __all__
must be strings. So, if we choose to keep this, this will need to be changed to this:
__all__ = [GreyWolfOptimizer] | |
__all__ = ["GreyWolfOptimizer"] |
@@ -36,6 +36,7 @@ install_requires = | |||
hydrotools.metrics | |||
hydrotools.nwis_client | |||
pyarrow | |||
pyswarms |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't have to do this here, but we should move deps like pyswarms
to an [options.extras_require]
. This would let people install ngen.cal[particalswarm]
for example. So, just pull in what you need.
import time | ||
from typing import Tuple, List, Optional | ||
|
||
def create_swarm( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
def create_swarm( | |
def create_swarm( | |
n_particles: int, | |
dimensions: int, | |
bounds: Optional[Tuple[Union[np.ndarray, List], Union[np.ndarray, List]]] = None, | |
center: Optional[Union[np.ndarray, float]] = 1.0, | |
init_pos: Optional[np.ndarray] = None, | |
options: Optional[Dict[Any, Any]] = None, | |
): |
bounds=None, | ||
center=1.0, | ||
init_pos=None, | ||
options={} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs to default to None
and passed as an {}
if it is None
. Using a mutable type as a default argument is a foot gun in python.
|
||
|
||
class GreyWolfOptimizer(SwarmOptimizer): | ||
"""_summary_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you write up a doc string for this or just remove it, please?
""" | ||
|
||
# Apply verbosity | ||
if self.start_iter>0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, there are a lot of style things that I would like to improve in this work. I will run something like black
to format this before we merge.
# Setup Pool of processes for parallel evaluation | ||
pool = None if n_processes is None else mp.Pool(n_processes) | ||
|
||
ftol_history = deque(maxlen=self.ftol_iter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Close Pool of Processes | ||
if n_processes is not None: | ||
pool.close() | ||
return (final_best_cost, final_best_pos) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Totally agree, this will get reformatted before we merge.
See `pyswarms.base.SwarmOptimizer`'s documentation for more information. | ||
|
||
Args: | ||
SwarmOptimizer (_type_): _description_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hellkite500, can you add some docs here. In this comment is fine too if you want me to just push them up and add you as the author.
Co-authored-by: Robert Bartel <37884615+robertbartel@users.noreply.github.com>
Co-authored-by: Robert Bartel <37884615+robertbartel@users.noreply.github.com>
As noted in #99, a Grey Wolf Optimizer was implemented by @xfeng2021. This original implementation is captured in the history in 92abc92 and a small suite of unit tests were developed to exercise that implementation for its three key semantics -- creation, optimization, and "restart" from a non-zero iteration.
A few minor fixes were required on the original code to get these tests to work correctly, and then a refactor an consolidation of the code was applied.
Note that the PR doesn't currently expose this optimizer for use to
search.py
or__main__
just yet. That functionality is being reviewed from the original implementation and I can add those to this PR or open a new one if this gets merged.Additions
ngen.cal.optimizers
subpackagegrey_wolf.py
implementation of Grey Wolf Optmizationtest_gwo.py
for testing the grey wolf implementationTesting
Notes
Todos
Checklist
Target Environment support