Update references to use multi-region bucket #86

Merged (2 commits) on Oct 1, 2024

jonbrenas
Collaborator

@jonbrenas jonbrenas commented Sep 11, 2024

Addresses #85.

The link to the vobs-funestus project also used the old projects path, which I decided to fix at the same time.

@jonbrenas
Collaborator Author

@ahernank: Am I supposed to re-run the notebooks after modifying them? I am thinking particularly of af1/download.ipynb and ag3/download.ipynb, which show examples of how to download data. They seem to be only half-run in the current version of the VUG (and the part that is not run would raise an error).

@ahernank
Collaborator

ahernank commented Sep 11, 2024

Thanks @jonbrenas -- yes, it would be great if you could re-run them. What we usually do is run them once (as our check that all the code works as expected), then do a second pass that clears all outputs and keeps only the cells we want to show (i.e. clearing anything that's too messy). That's why they appear half-run. When you run them, you only need to make sure the path in the example exists and is correct; there is no need to wait for the full download to finish, as some of those examples involve huge downloads!
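
One way to confirm that a path exists without waiting for the download is an HTTP HEAD request. This is only a minimal sketch, not what the notebooks do: `url_exists` and its `opener` parameter are hypothetical names, and it assumes the server answers HEAD requests.

```python
import urllib.error
import urllib.request


def url_exists(url, opener=urllib.request.urlopen, timeout=10):
    """Return True if a HEAD request to `url` succeeds (hypothetical helper)."""
    # HEAD asks for the headers only, so no file body is transferred.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return 200 <= response.status < 300
    except urllib.error.URLError:
        # HTTPError (e.g. a 404) is a subclass of URLError.
        return False
```

For example, `url_exists("https://1229-vo-gh-dadzie-vmf00095.cog.sanger.ac.uk/VBS24195.vcf.gz")` would check one of the files referenced in af1/download.ipynb. The `opener` argument is there only so the function can be exercised without network access.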

@jonbrenas
Collaborator Author

jonbrenas commented Sep 11, 2024

Thank you @ahernank. That's what I expected, but in af1/download.ipynb, for instance, one cell does:

!wget --no-clobber https://1229-vo-gh-dadzie-vmf00095.cog.sanger.ac.uk/VBS24195.vcf.gz
!wget --no-clobber https://1229-vo-gh-dadzie-vmf00095.cog.sanger.ac.uk/VBS24195.vcf.gz.tbi

and the next does:

!bcftools merge --output-type z --regions 3RL:1-1000000 --output merged.vcf.gz VBS24195.vcf.gz VBS24196.vcf.gz 

which is going to raise an error because VBS24196.vcf.gz was not downloaded. Should I modify the code so that all cells can be run without raising an error?
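
If the cells should run cleanly, one option would be to check the inputs before calling bcftools. A minimal sketch, assuming the files are expected in the working directory; `missing_inputs` is a hypothetical helper, and the file names are the ones from the merge command above:

```python
from pathlib import Path


def missing_inputs(paths, directory="."):
    """Return the names in `paths` that are not files under `directory`."""
    base = Path(directory)
    return [name for name in paths if not (base / name).is_file()]


inputs = ["VBS24195.vcf.gz", "VBS24196.vcf.gz"]
missing = missing_inputs(inputs)
if missing:
    # Report the gap instead of letting bcftools merge fail on it.
    print("Download these before merging:", ", ".join(missing))
```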

@jonbrenas
Collaborator Author

A quick grep seemed to indicate that cloud.ipynb and download.ipynb were the only files that referenced the actual bucket.
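
The same kind of search can be sketched in Python. The helper name is hypothetical and the search string has to be supplied by the caller, since the old bucket name is not quoted in this thread:

```python
from pathlib import Path


def notebooks_mentioning(root, needle):
    """Return sorted relative paths of .ipynb files under `root` that contain `needle`."""
    root = Path(root)
    hits = []
    for path in root.rglob("*.ipynb"):
        # Notebooks are JSON text, so a plain substring search is enough here.
        if needle in path.read_text(encoding="utf-8", errors="ignore"):
            hits.append(path.relative_to(root).as_posix())
    return sorted(hits)
```

Calling it on the repository root with the old bucket name as `needle` should list every notebook that still references it.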

@alimanfoo
Member

Hi @jonbrenas, no need to rerun the bcftools merge command; I think we can just assume that's correct.

@alimanfoo alimanfoo merged commit 56c5f98 into master Oct 1, 2024
1 check passed
@alimanfoo alimanfoo deleted the 85-referencemultiregionbucket branch October 1, 2024 11:52