Integrate ReshardVcf into ResolveComplexVariants #626

Merged on Jun 25, 2024 (1 commit)

mwalker174
Collaborator

Adds a call to ReshardVcf at the end of ResolveComplexVariants. In addition, the contig-sharded BOTHSIDES_PASS and HIGH_SR_BACKGROUND variant tables are each concatenated into a single genome-wide table before CleanVcf annotates records with these flags. This slightly increases the memory footprint of CleanVcf1a but ensures that shuffled records are properly annotated. JSON templates and top-level workflows are also updated.

This branch was successfully tested on ResolveComplexVariants, GenotypeComplexVariants, and CleanVcf.
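
As an illustration of why the variant tables are concatenated, here is a minimal Python sketch, not the workflow's actual code. It assumes each table is simply a list of variant IDs per contig and that a record is flagged when its ID appears in the table. The function names `concat_variant_tables` and `annotate`, the record layout, and the sample IDs are all hypothetical.

```python
from typing import Dict, List, Set


def concat_variant_tables(sharded: Dict[str, List[str]]) -> Set[str]:
    """Merge per-contig lists of variant IDs into one genome-wide set."""
    genome_wide: Set[str] = set()
    for ids in sharded.values():
        genome_wide.update(ids)
    return genome_wide


def annotate(records: List[dict], flagged_ids: Set[str], flag: str) -> List[dict]:
    """Return copies of the records, adding `flag` where the ID is in flagged_ids."""
    out = []
    for rec in records:
        flags = list(rec.get("flags", []))
        if rec["id"] in flagged_ids:
            flags.append(flag)
        out.append({**rec, "flags": flags})
    return out


# Hypothetical tables, sharded by the contig each variant started on.
bothsides_by_contig = {"chr1": ["var_a"], "chr2": ["var_b"]}

# Hypothetical chr2 shard after resharding: var_a was listed under chr1
# but its record now sits in the chr2 shard.
chr2_shard = [
    {"id": "var_a", "contig": "chr2"},
    {"id": "var_b", "contig": "chr2"},
]

# Looking only at the chr2 table misses var_a.
per_contig = annotate(chr2_shard, set(bothsides_by_contig["chr2"]), "BOTHSIDES_PASS")

# Looking at the concatenated table flags both records.
genome_wide = annotate(
    chr2_shard, concat_variant_tables(bothsides_by_contig), "BOTHSIDES_PASS"
)
```

In this toy setup the per-contig lookup leaves `var_a` unflagged, while the genome-wide lookup flags it. The trade-off is that every shard holds the full ID set in memory, which is the small extra cost mentioned for CleanVcf1a.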

Collaborator

@epiercehoffman left a comment

This looks good. Thanks for figuring out how to do the resharding efficiently and getting it integrated into our main workflows. There is just one small update to make, and then go ahead and merge.

File contig_list
String prefix
Boolean? use_ssd
String sv_base_mini_docker
RuntimeAttr? runtime_override_reshard
}

- Array[String] contigs = read_lines(contig_list)
+ Array[String] contigs = transpose(read_tsv(contig_list))[0]
Collaborator

@epiercehoffman, May 28, 2024

Make sure to update the ReshardVcf JSON templates (test and Terra) to use primary_contigs_fai to match this change. It looks like they were initially set up with primary_contigs_list.

Collaborator Author

It runs OK with the contigs list too.
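
A small Python emulation can show why taking the first column of a TSV read accepts both input styles. This is a sketch only: the three helpers are rough stand-ins for the WDL built-ins `read_lines`, `read_tsv`, and `transpose`, and the file contents (including the index numbers) are made-up placeholders.

```python
from typing import List


def read_lines(text: str) -> List[str]:
    """Rough stand-in for WDL read_lines: one string per line."""
    return text.splitlines()


def read_tsv(text: str) -> List[List[str]]:
    """Rough stand-in for WDL read_tsv: split each line on tabs."""
    return [line.split("\t") for line in text.splitlines()]


def transpose(rows: List[List[str]]) -> List[List[str]]:
    """Rough stand-in for WDL transpose on a rectangular array."""
    return [list(col) for col in zip(*rows)]


# Placeholder .fai-style content: contig name first, then other columns.
fai_text = "chr1\t1000\t6\t60\t61\nchr2\t2000\t1030\t60\t61\n"

# Placeholder plain contig list: one name per line, no tabs.
list_text = "chr1\nchr2\n"

contigs_from_fai = transpose(read_tsv(fai_text))[0]
contigs_from_list = transpose(read_tsv(list_text))[0]

# Reading the .fai line by line keeps the extra columns attached.
raw_fai_lines = read_lines(fai_text)
```

Both `contigs_from_fai` and `contigs_from_list` come out as `["chr1", "chr2"]`, since a one-column file is just a TSV whose first column is the whole line. By contrast, `raw_fai_lines` holds full tab-separated lines rather than bare contig names.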

Add json template

Update top-level workflows; concat variant lists

Fix contig list read

Update ref panel outputs; fix CleanVcf template

Fix Terra CleanVcf templates
@mwalker174 merged commit 2295aa8 into main on Jun 25, 2024
6 checks passed
@mwalker174 deleted the mw_vcf_reshard branch on June 25, 2024, 15:08
2 participants