src/dnadiffusion rewrite #150

Merged
merged 4 commits into from
Jun 10, 2023

Conversation

ssenan
Collaborator

@ssenan ssenan commented Jun 7, 2023

This is a large change that removes the loss spike that appeared after the single-script version of the code was split into separate files.

Resolves #149

Code changes

  • Moves the code back toward built-in PyTorch; the one exception is Hugging Face Accelerate, which assists with distributed training
  • The hydra-zen train loop is still a work in progress, so train_hf.py contains the main train call, which can be linked to the Slurm script for distributed training
  • sample.py loads a checkpoint and generates cell-specific sequences for validation
  • dnadiffusion.py contains all of the code in a single script that can also be used for training
  • The top level of the notebooks directory now holds two notebooks, master_dataset.ipynb and filter_master.ipynb, which show how our original data was collated into the complete table and then filtered down to our current working set
  • The final major change: the diffusion functions have been collected into a class, and this class has been integrated into the main train loop (src/dnadiffusion/utils/train_util.py)
  • Many other small changes accommodate these larger ones
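
To illustrate what "collecting the diffusion functions into a class" can look like, here is a minimal sketch. It is not the code from this PR: the class name `Diffusion`, its methods, the linear beta schedule, and the default values are all assumptions made for illustration, and it uses plain Python lists where the repository uses PyTorch tensors.

```python
import math
import random


class Diffusion:
    """Hypothetical stand-in: keeps the noise schedule, the forward
    noising step, and the training loss together in one object."""

    def __init__(self, timesteps=50, beta_start=1e-4, beta_end=0.02):
        self.timesteps = timesteps
        # Linear beta schedule (an assumed choice for this sketch).
        step = (beta_end - beta_start) / (timesteps - 1)
        self.betas = [beta_start + i * step for i in range(timesteps)]
        # Running product of alpha_t = 1 - beta_t, one entry per timestep.
        self.alphas_cumprod = []
        running = 1.0
        for beta in self.betas:
            running *= 1.0 - beta
            self.alphas_cumprod.append(running)

    def q_sample(self, x_start, t, noise):
        """Forward process: x_t = sqrt(a_bar_t) * x_0 + sqrt(1 - a_bar_t) * noise."""
        a_bar = self.alphas_cumprod[t]
        return [
            math.sqrt(a_bar) * x + math.sqrt(1.0 - a_bar) * n
            for x, n in zip(x_start, noise)
        ]

    def loss(self, predict_noise, x_start, rng):
        """Mean squared error between the sampled noise and the model's
        prediction of it, at one randomly chosen timestep."""
        t = rng.randrange(self.timesteps)
        noise = [rng.gauss(0.0, 1.0) for _ in x_start]
        x_noisy = self.q_sample(x_start, t, noise)
        predicted = predict_noise(x_noisy, t)
        return sum((n - p) ** 2 for n, p in zip(noise, predicted)) / len(noise)
```

With this shape, a train loop only needs to hold one object and call something like `diffusion.loss(model, batch, rng)` per step, rather than importing the schedule and the noising helpers separately.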

The code should now be more readable in the single-script version (dnadiffusion.py) and more extensible in src/dnadiffusion.

@ssenan added the enhancement (New feature or request), codebase, and breaking (Breaking Changes) labels Jun 7, 2023
@ssenan added this to the 0.0.0 milestone Jun 7, 2023
@ssenan self-assigned this Jun 7, 2023
@ssenan added the refactoring (Refactoring) label Jun 7, 2023
@codecov

codecov bot commented Jun 10, 2023

Codecov Report

Merging #150 (2203dc3) into main (b4ba5aa) will increase coverage by 1.04%.
The diff coverage is 0.00%.

@@           Coverage Diff            @@
##            main    #150      +/-   ##
========================================
+ Coverage   1.72%   2.76%   +1.04%     
========================================
  Files         18      12       -6     
  Lines       1278     795     -483     
  Branches     117      88      -29     
========================================
  Hits          22      22              
+ Misses      1256     773     -483     
Impacted Files                           Coverage Δ
src/dnadiffusion/data/dataloader.py      0.00% <0.00%> (ø)
src/dnadiffusion/metrics/metrics.py      0.00% <0.00%> (ø)
src/dnadiffusion/models/diffusion.py     0.00% <0.00%> (ø)
src/dnadiffusion/models/layers.py        0.00% <ø> (ø)
src/dnadiffusion/models/unet.py          0.00% <0.00%> (ø)
src/dnadiffusion/utils/sample_util.py    0.00% <0.00%> (ø)
src/dnadiffusion/utils/train_util.py     0.00% <0.00%> (ø)
src/dnadiffusion/utils/utils.py          0.00% <0.00%> (ø)

@cameronraysmith cameronraysmith self-requested a review June 10, 2023 03:26
@cameronraysmith cameronraysmith merged commit 8776054 into pinellolab:main Jun 10, 2023
3 of 4 checks passed
Labels
breaking (Breaking Changes), codebase, enhancement (New feature or request), refactoring (Refactoring)
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Resolve training loss spike
2 participants