-
Notifications
You must be signed in to change notification settings - Fork 238
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add seed fixing option to PySMO's sampling methods to enhance reproducibility #1307
Conversation
- User can explicitly define a distribution for sampling of each variable. Sampling options currently available are random, uniform and Gaussian.
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## main #1307 +/- ##
=======================================
Coverage 77.41% 77.41%
=======================================
Files 390 390
Lines 63631 63649 +18
Branches 11702 11705 +3
=======================================
+ Hits 49258 49273 +15
- Misses 11835 11839 +4
+ Partials 2538 2537 -1 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few suggestions
@@ -1412,6 +1422,12 @@ def __init__( | |||
raise Exception("Invalid tolerance input") | |||
self.eps = tolerance | |||
|
|||
if rand_seed is not None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seeing as this is in __init__
, should this code be moved to the base class to avoid duplicating the code in each subclass?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I'd rather leave it here because it is an argument specific to a subset of the sampling methods.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Summary/Motivation:
This PR adds the ability for users to specify a seed with
rand_seed
when using theLatinHypercubeSampling
,CVTSampling
andCustomSampling
classes in PySMO. Fixingrand_seed
ensures that the sampling methods return the same results every time the methods are run, allowing for reproducibility.Changes proposed in this PR:
rand_seed
argument to CVT, LHS and custom sampling methodsLegal Acknowledgement
By contributing to this software project, I agree to the following terms and conditions for my contribution: