EVA-3457 - Split variant load #195

Merged on Jan 15, 2024 (5 commits)

@tcezard (Member) commented Jan 9, 2024

Split the variant load Nextflow process into:

  1. variant load
  2. annotation run
  3. statistics calculation

Also remove the option to run annotation only.
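
As a rough illustration of the split, the sketch below builds one parameter string per step instead of a single combined one. It is written in Python rather than the Groovy of the Nextflow script. Only `calculate-statistics-job`, `--spring.batch.job.names` and `--input.vcf.aggregation` come from this PR's diff; the other two job names and the `build_step_parameters` helper are hypothetical placeholders.

```python
# Hypothetical sketch: one parameter string per step rather than one combined load.
# Only 'calculate-statistics-job' is taken from this PR; the other names are placeholders.
STEP_JOB_NAMES = {
    'variant_load': 'variant-load-job',        # placeholder name (assumption)
    'annotation': 'annotation-job',            # placeholder name (assumption)
    'statistics': 'calculate-statistics-job',  # from the diff
}


def build_step_parameters(step, aggregation):
    """Return the command-line parameter string for a single pipeline step."""
    pipeline_parameters = ' --spring.batch.job.names=' + STEP_JOB_NAMES[step]
    pipeline_parameters += ' --input.vcf.aggregation=' + str(aggregation).upper()
    return pipeline_parameters


# Each step gets its own parameters, so each can run as its own process.
for step in STEP_JOB_NAMES:
    print(step, build_step_parameters(step, 'none'))
```

Keeping the steps separate like this is what lets each one be retried or skipped independently.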

@tcezard changed the title from "EVA-3457 -" to "EVA-3457 - Split variant load" on Jan 9, 2024
@apriltuesday (Contributor) left a comment

As I mentioned in the EVA-3455 PR, we might be able to combine the two variant load jobs into one, which would change this PR slightly. Otherwise this looks fine to me.


pipeline_parameters += " --spring.batch.job.names=calculate-statistics-job"

pipeline_parameters += " --input.vcf.aggregation=" + aggregation.toString().toUpperCase()

@apriltuesday (Contributor):

If I'm understanding the old job configurations correctly, we should only run statistics calculation for genotyped VCFs.

@tcezard (Member Author):

That's a good point and one I was not aware of.
I'm a bit surprised by this, considering that there are both variant-level and study-level statistics in the MongoDB database. Maybe the study-level statistics are calculated in the Variant Load step.

@apriltuesday (Contributor):

I wonder if this is why this assertion passes in your test, even though the stats step hasn't been run...

@tcezard (Member Author):

Yes, that might be something else to investigate.
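
The suggestion in this thread, running statistics calculation only for genotyped VCFs, could be sketched as a guard around the parameter construction. This is a Python illustration, not the Groovy of the Nextflow script, and it assumes that an aggregation value of `none` identifies a genotyped VCF. That mapping and both helper names are assumptions, not taken from the PR.

```python
# Sketch only: skip the statistics step for aggregated VCFs.
# Assumption: an aggregation value of 'none' identifies a genotyped VCF.
def is_genotyped(aggregation):
    return str(aggregation).lower() == 'none'


def statistics_parameters(aggregation):
    """Return the statistics job parameters, or None when the step should be skipped."""
    if not is_genotyped(aggregation):
        return None
    pipeline_parameters = ' --spring.batch.job.names=calculate-statistics-job'
    pipeline_parameters += ' --input.vcf.aggregation=' + str(aggregation).upper()
    return pipeline_parameters


print(statistics_parameters('none'))   # parameters for a genotyped VCF
print(statistics_parameters('basic'))  # None: the step is skipped
```

Returning `None` for the skipped case leaves the decision about whether to launch the process to the caller.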

Two resolved review threads on eva_submission/nextflow/accession_and_load.nf (one outdated).
@tcezard merged commit e6de0cb into EBIvariation:master on Jan 15, 2024
4 checks passed
2 participants