Stabilize torch.topk() behavior #290

melihyilmaz · 2024-02-09T00:50:26Z

Addresses #284

To make sure the padding token at index '0' is the top-scoring token for finished beams at each decoding step, I added a small epsilon of 1e-8 to index '0' in finished_mask so that it's not zeroed out like the rest of the values in the same row. These zero rows correspond to finished beams, and we use masking to avoid extending them with new AA tokens.

This seems to resolve the error with minimal or no overhead: I get the same output on both CPU and GPU for the problematic mgf files mentioned in the issue. Unit tests still need to be added (@bittremieux feel free if you get a chance), and I'm open to moving away from torch.topk() if there are suggestions for a more robust or elegant solution.
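The masking idea described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the function name `mask_finished_beams`, the tensor shapes, and the assumption that token scores are non-negative (e.g. softmax probabilities) are all hypothetical. The point is only that a row of all zeros has no unique maximum, so which index `torch.topk()` returns for it is not something to rely on, whereas a tiny positive value at index 0 makes the padding token the unique top-1 choice.

```python
import torch


def mask_finished_beams(
    scores: torch.Tensor, finished: torch.Tensor, eps: float = 1e-8
) -> torch.Tensor:
    """Hypothetical helper: zero the scores of finished beams, except index 0.

    scores:   (n_beams, vocab_size) non-negative token scores (assumed).
    finished: (n_beams,) boolean tensor, True where a beam is finished.
    """
    # 1.0 for active beams, 0.0 for finished beams; broadcast over the vocabulary.
    active = (~finished).to(scores.dtype).unsqueeze(-1)
    masked = scores * active  # new tensor, the input is left untouched
    # Give the padding token (index 0) a tiny positive score on finished rows,
    # so that it is the unique maximum of an otherwise all-zero row.
    masked[finished, 0] = eps
    return masked


# Toy example: beam 0 is still active, beam 1 is finished.
scores = torch.tensor([[0.1, 0.7, 0.2], [0.3, 0.3, 0.4]])
finished = torch.tensor([False, True])
masked = mask_finished_beams(scores, finished)
_, top_idx = torch.topk(masked, k=1, dim=-1)
print(top_idx.tolist())
```

In this toy example the active beam keeps its own best token, while the finished beam selects index 0 because every other entry in its row is exactly zero.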

codecov · 2024-02-09T00:53:55Z

Codecov Report

All modified and coverable lines are covered by tests ✅

Comparison is base (f01c607) 89.74% compared to head (7da1e2b) 89.09%.

❗ Current head 7da1e2b differs from pull request most recent head c58817e. Consider uploading reports for the commit c58817e to get more accurate results

Additional details and impacted files

@@            Coverage Diff             @@
##              dev     #290      +/-   ##
==========================================
- Coverage   89.74%   89.09%   -0.66%     
==========================================
  Files          12       12              
  Lines         917      917              
==========================================
- Hits          823      817       -6     
- Misses         94      100       +6


bittremieux

I still don't find it super elegant, but it might be a practical fix for now. I combined some code to remove a few redundant steps, but the fix is the same.

@melihyilmaz Can you try to come up with a unit test that failed before the fix and passes now? That way we can ensure that this is properly tested and avoid regressions in the future.

melihyilmaz · 2024-02-13T01:01:13Z

I added a unit test for _get_topk_beams() that only passes with the current fix. I also compared results for a small mgf file on CPU vs. GPU, and those are identical now.

lutfia95 · 2024-02-15T21:12:14Z

Hey, how can I run Casanovo from source? I had the same problem on CPU, and since the fix is in the dev branch, I would like to use it before the new release.

wsnoble · 2024-02-15T21:16:08Z

You should be able to install from the dev branch like this:

pip install git+https://github.com/Noble-Lab/casanovo.git@dev

* Remove `train_from_scratch` config option (#275)
  Instead of having to specify `train_from_scratch` in the config file, training will proceed from an existing model weights file if this is given as an argument to `casanovo train`. Fixes #263.
* Stabilize torch.topk() behavior (#290)
* Add epsilon to index zero
* Fix typo
* Use base PyTorch for repeating along the vocabulary size
* Combine masking steps
* Lint with updated black version
* Lint test files
* Add topk unit test
* Fix lint
* Add fixme comment for future
* Update changelog
* Generate new screengrabs with rich-codex
---------
Co-authored-by: Wout Bittremieux <wout@bittremieux.be>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Update changelog
---------
Co-authored-by: Melih Yilmaz <32707537+melihyilmaz@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

bittremieux · 2024-02-16T08:37:40Z

@lutfia95 We've now also released Casanovo v4.1.0, which includes this fix, so you can more conveniently upgrade from PyPI.

* Remove `train_from_scratch` config option (#275)
  Instead of having to specify `train_from_scratch` in the config file, training will proceed from an existing model weights file if this is given as an argument to `casanovo train`. Fixes #263.
* Stabilize torch.topk() behavior (#290)
* Add epsilon to index zero
* Fix typo
* Use base PyTorch for repeating along the vocabulary size
* Combine masking steps
* Lint with updated black version
* Lint test files
* Add topk unit test
* Fix lint
* Add fixme comment for future
* Update changelog
* Generate new screengrabs with rich-codex
---------
Co-authored-by: Wout Bittremieux <wout@bittremieux.be>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* Rename max_iters to cosine_schedule_period_iters (#300)
* Rename max_iters to cosine_schedule_period_iters
* Add deprecated config option unit test
* Fix missed rename
* Proper linting
* Remove unnecessary logging
* Test that checkpoints with deprecated config options can be loaded
* Minor change
* Add test for fine-tuning with deprecated config options
* Remove deprecated hyperparameters during model loading
* Include deprecated hyperparameter warning
* Test whether the warning is issued
* Verify that the deprecated option is removed
* Fix comments
* Avoid defining deprecated options twice
* Remap previous renamed config option `every_n_train_steps`
* Update changelog
---------
Co-authored-by: melihyilmaz <yilmazmelih97@gmail.com>
* Add FAQ entry about antibody sequencing
* Don't crash when multiple beams have identical peptide scores (#306)
* Test different beams with identical scores
* Randomly break ties for beams with identical peptide score
* Update changelog
* Don't remove unit test
* Allow csv to handle all newlines (#316)
* Add 9-species model weights link to FAQ (#303)
* Add model weights link
* Generate new screengrabs with rich-codex
* Clarify that these weights should only be used for benchmarking
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Wout Bittremieux <wout@bittremieux.be>
* Add FAQ entry about antibody sequencing (#304)
* Add FAQ entry about antibody sequencing
* Generate new screengrabs with rich-codex
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Melih Yilmaz <32707537+melihyilmaz@users.noreply.github.com>
* Allow csv to handle all newlines
  The `csv` module tries to handle newlines itself. On Windows, this leads to line endings of `\r\r\n` instead of `\r\n`. Setting `newline=''` produces the intended output on both platforms.
* Update CHANGELOG.md
* Fix linting issue
* Delete docs/images/help.svg
---------
Co-authored-by: Melih Yilmaz <32707537+melihyilmaz@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Wout Bittremieux <wout@bittremieux.be>
Co-authored-by: William Stafford Noble <wnoble@uw.edu>
Co-authored-by: Wout Bittremieux <bittremieux@users.noreply.github.com>
* Don't test on macOS versions with MPS (#327)
* Prepare for release v4.2.0
* Update CHANGELOG.md (#332)
---------
Co-authored-by: Melih Yilmaz <32707537+melihyilmaz@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: melihyilmaz <yilmazmelih97@gmail.com>
Co-authored-by: wsnoble <wnoble@uw.edu>
Co-authored-by: Joshua Klein <mobiusklein@gmail.com>

melihyilmaz added 2 commits February 8, 2024 16:36

Add epsilon to index zero

96aa734

Fix typo

e6b8149

melihyilmaz added the bug Something isn't working label Feb 9, 2024

melihyilmaz requested a review from bittremieux February 9, 2024 00:50

bittremieux added 3 commits February 10, 2024 08:36

Use base PyTorch for repeating along the vocabulary size

0c163af

Combine masking steps

11630ca

Lint with updated black version

5ebcbf6

bittremieux requested changes Feb 10, 2024


bittremieux and others added 3 commits February 10, 2024 08:49

Lint test files

83219f4

Add topk unit test

f53e90b

Fix lint

c040ccd

melihyilmaz marked this pull request as ready for review February 13, 2024 01:01

melihyilmaz requested a review from bittremieux February 13, 2024 01:01

Add fixme comment for future

95435dc

bittremieux approved these changes Feb 14, 2024


bittremieux and others added 2 commits February 14, 2024 14:19

Update changelog

7da1e2b

Generate new screengrabs with rich-codex

c58817e

bittremieux merged commit be066c4 into dev Feb 14, 2024

bittremieux deleted the fix_topk branch February 14, 2024 13:38

bittremieux mentioned this pull request Feb 14, 2024

KeyError: '$' #284

Open
