fix: add sd_locs_vcf_index input to SDtoBAF #644

Open
wants to merge 2 commits into
base: main

simojoe commented Feb 12, 2024

Fixes #643

This adds the index in place, deriving its path from the main file name.

I have not found any other reference to SDtoBAF that needs to be updated.

VJalili (Member) commented Feb 14, 2024

@simojoe, thank you for fixing this! Optional index files can be confusing, and your suggested fix fits perfectly with what we discussed in #578.

I want to suggest an improvement on your approach: can you please add an optional input sd_locs_vcf_index to the WDL? If sd_locs_vcf is given, sd_locs_vcf_index should also be given; if sd_locs_vcf_index is not given, it should be inferred from sd_locs_vcf.
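
For illustration only, the fallback suggested in this comment could be sketched as follows. This is Python rather than WDL, the function name resolve_index is hypothetical, and the ".tbi" suffix is an assumption about how the index is named; it is not the code from this PR.

```python
from typing import Optional


def resolve_index(sd_locs_vcf: str, sd_locs_vcf_index: Optional[str] = None) -> str:
    """Return the explicit index path if given, else infer it from the VCF path."""
    if sd_locs_vcf_index is not None:
        # An explicitly supplied index always wins.
        return sd_locs_vcf_index
    # Assumption: the index sits next to the VCF and uses a ".tbi" suffix.
    return sd_locs_vcf + ".tbi"
```

In a WDL task the same idea would typically be written with an optional File input and a default derived from the main file, but the exact form is left to the PR.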

mwalker174 (Collaborator) left a comment


Thanks @simojoe. I was surprised at first, since we haven't seen this before, but then noticed you are using the gzipped version of the dbsnp vcf. We currently reference the uncompressed version, which is why it didn't require an index. Unfortunately, if we ran your change with the uncompressed version, it would fail because its index does not have a .tbi extension:

> gsutil ls "gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf*"
gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf
gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf.gz
gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf.gz.tbi
gs://gcp-public-data--broad-references/hg38/v0/Homo_sapiens_assembly38.dbsnp138.vcf.idx
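
To make the extension mismatch concrete, here is a minimal Python sketch that picks the index suffix from the VCF name, based only on the four files in the listing above. The helper name infer_index_path is hypothetical, and the rule (".tbi" for ".gz" files, ".idx" otherwise) is an assumption drawn from this one listing, not a general guarantee.

```python
def infer_index_path(vcf_path: str) -> str:
    """Guess the index path that sits next to a VCF, based on its compression."""
    if vcf_path.endswith(".gz"):
        # Compressed VCF in the listing: index is <file>.tbi
        return vcf_path + ".tbi"
    # Uncompressed VCF in the listing: index is <file>.idx
    return vcf_path + ".idx"
```

Blindly appending ".tbi" to the uncompressed path would point at a file that is not in the listing, which is the failure described in this comment.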

I think we should be using the compressed one instead, but this will affect the code base in 2 other places:

Successfully merging this pull request may close these issues.

task SDtoBAF calls SiteDepthtoBAF without passing required index file
3 participants