Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Add infrastructure for fixed random seeds in tests #1655

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

betatim
Copy link
Member

@betatim betatim commented Mar 1, 2017

Proposal for fixing random seeds in tests to make them reproducible.

The basic idea is that if you need randomness, you call check_random_state(seed) and that you expose the seed argument to your callers. If they don't care they can not pass anything in, in which case seed=None and we get random numbers that aren't reproducible. If they do care they can either pass in a number or an existing random number generator.

@standage this is what I had in mind. Shamelessly cribbed from http://scikit-learn.org/stable/modules/generated/sklearn.utils.check_random_state.html

We also use this approach in scikit-optimize.

  • Is it mergeable?
  • make test Did it pass the tests?
  • make clean diff-cover If it introduces new functionality in
    scripts/ is it tested?
  • make format diff_pylint_report doc pydocstyle Is it well
    formatted?
  • Did it change the command-line interface? Only backwards-compatible
    additions are allowed without a major version increment. Changing file
    formats also requires a major version number increment.
  • For substantial changes or changes to the command-line interface, is it
    documented in CHANGELOG.md? See keepachangelog
    for more details.
  • Was a spellchecker run on the source code and documentation after
    changes were made?
  • Do the changes respect streaming IO? (Are they
    tested for streaming IO?)

@betatim betatim changed the title Add infrastructure for fixed random seeds in tests [WIP] Add infrastructure for fixed random seeds in tests Mar 1, 2017
@standage
Copy link
Member

standage commented Mar 2, 2017

Got it.

So at some point we need to print out the random seed being used so that when tests fail we can reproduce their behavior, right?

@betatim
Copy link
Member Author

betatim commented Mar 2, 2017

Not really. If we do everything right in writing the tests they always use the same seed on all the platforms. The only time things change is when you reorder/change number of calls to the RNG. Though usually you don't do that for debugging a failure. Alternatively you can go in and fix the seeds for the intermediate steps.

(I've seen it done that you provide different seeds as parameters to a test_blah(seed) method if you wanted to run the same test for different seeds. Though I would argue that the value of the seed shouldn't change the outcome of your test)

@betatim
Copy link
Member Author

betatim commented Mar 2, 2017

If you like it, I will spread the approach further around the tests.

@standage
Copy link
Member

standage commented Mar 2, 2017

If we do everything right in writing the tests they always use the same seed on all the platforms.

Ah, I see. I misunderstood.

In contrast, the GenomeTools library (to which I was previously a frequent contributor) has the CI generate and report a new random seed for each build. This way it's are testing the overall robustness of the randomized procedures, not just for a single seed. It was pretty rare, but every once in a while one of the tests would fail on an edge case, and the seed would enable us to track down exactly what happened.

Of course, this often requires a different approach to evaluating the output, so I don't know how relevant it would be here, especially in the short term.

@codecov-io
Copy link

codecov-io commented Mar 3, 2017

Codecov Report

Merging #1655 into master will increase coverage by 0.01%.
The diff coverage is 88.88%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master    #1655      +/-   ##
==========================================
+ Coverage   69.81%   69.82%   +0.01%     
==========================================
  Files          66       66              
  Lines        8971     8978       +7     
  Branches     3060     3063       +3     
==========================================
+ Hits         6263     6269       +6     
- Misses       1025     1027       +2     
+ Partials     1683     1682       -1
Impacted Files Coverage Δ
khmer/utils.py 98.47% <88.88%> (-0.71%) ⬇️
lib/hashtable.cc 55.18% <0%> (-0.22%) ⬇️
lib/hashgraph.cc 46.49% <0%> (-0.13%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update fbbb3c1...b188902. Read the comment docs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants