Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: CLIN-3090 run test in stub mode in github workflow #13

Merged

Conversation

LysianeBouchard
Copy link
Contributor

@LysianeBouchard LysianeBouchard commented Aug 26, 2024

Allow to run the post processing pipeline in stub mode in the github workflow ci.yml.

We adjusted a few things to make this work:

  • fix bugs in test sample sheet and make it cover both WGS an WES sequencing types
  • complete the test profile and add missing required parameters in schema
  • adjusted the github workflow command to add the -stub option

We also take the oppotunity to do a bit of cleanup on the github workflow files:

  • align trigger rules with ferlab git flow (sadly had to disable 2 linter tests to allow this)
  • use the same nextflow version as in production (23.10.1)

Other changes:

  • We realized while testing on juno that the docker image for process writemeta was missing, so we add it in nextflow.config
  • We also realized that the changelog was not displayed correctly in the UI. We fixed that.
  • Fix bug with extra java arguments in process CombineGVCF

Test description

We make sure that github workflow is executed correctly on this PR

Local tests:

  • We run nf-core lint and check that no test is failing and that there are no new warnings caused by the changes here. Actually, we got rid of 3 warnings (TODO keywords found in doc).
  • Check that the following commands work locally:
nextflow run main.nf -stub -profile test,docker
nextflow run main.nf -stub -profile test

Tests on juno:

Was able to run a job successfully:

 nextflow -c /root/nextflow/config/fusion.config  -c tweaks.config run Ferlab-Ste-Justine/Post-processing-Pipeline -r fix/CLIN-3090-run-test-in-stub-mode-in-github-workflow -params-file params.json -work-dir s3://cqdg-prod-file-scratch/nextflow/scratch/lysiane

I started from remi antoine's test dataset. I found 2 bugs while doing that:

  • gvcf file extension .g.vcf.gz not supported
  • For some processes, it is assumed that both extra arguments and extra java options are passed via ext.args, which does not work.

I think that they should be addressed in dedicated PRs. For the first bug, I simply copied my input file into a new file with extension .gvcf. For the second bug, I commited a fix specifically for the process causing problem for my test, but the problem is present in other processes as well.

PR checklist

  • This comment contains a description of changes (with reason).
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added a new tool - have you followed the pipeline conventions in the contribution docs
  • Make sure your code lints (nf-core lint).
  • Ensure the test suite passes (nextflow run . -profile test,docker --outdir <OUTDIR>).
  • Check for unexpected warnings in debug mode (nextflow run . -profile debug,test,docker --outdir <OUTDIR>).
  • Usage Documentation in docs/usage.md is updated.
  • Output Documentation in docs/output.md is updated.
  • CHANGELOG.md is updated.
  • README.md is updated (including new tool citations and authors/contributors).

@LysianeBouchard LysianeBouchard marked this pull request as draft August 26, 2024 16:31
@LysianeBouchard LysianeBouchard force-pushed the fix/CLIN-3090-run-test-in-stub-mode-in-github-workflow branch 4 times, most recently from 282d9ef to d4a67f6 Compare August 26, 2024 18:13
Had to do many adjustments to make it work:
- fix bugs in test sample sheet and make it cover both WGS an WES sequencing types
- complete the test profile and add missing required parameters in schema
- adjusted the github workflow command to add the -stub option

Also took the opportunity to do a bit of clean up:
- align workflow trigger rule with ferlab git flow (had to disable linter tests to allow this)
- run the test with the same nextflow version as in production (23.10.1)
- add missing docker image for process writemeta
- make sure that the changelog renders correctly in the ui
- fix a bug for process genotypeGVCF with extra java arguments
@LysianeBouchard LysianeBouchard force-pushed the fix/CLIN-3090-run-test-in-stub-mode-in-github-workflow branch from d4a67f6 to 764f08e Compare August 26, 2024 18:21
@LysianeBouchard LysianeBouchard marked this pull request as ready for review August 26, 2024 18:31
@@ -120,7 +120,7 @@ process genotypeGVCF {
script:
def familyId = meta.familyId
def args = task.ext.args ?: ''
def argsjava = task.ext.args ?: ''
def argsjava = task.ext.argsjava ?: ''
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug fix. For now, I fixed only this process, but the bug is probably present in many others. I will create a github issue so that we don't forget this.

@@ -120,7 +120,7 @@ process genotypeGVCF {
script:
def familyId = meta.familyId
def args = task.ext.args ?: ''
def argsjava = task.ext.args ?: ''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change needed in Import process too, right ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct, the problem appears for many processes actually, so I will use another PR for the other ones as this one is starting to be a bit fat.
I created a github issue so that we don't forget:
#14

@@ -362,10 +365,10 @@ manifest {
homePage = 'https://github.com/Ferlab-Ste-Justine/Post-processing-Pipeline'
description = """Variant analysis for genome and exome GVCFs"""
mainScript = 'main.nf'
nextflowVersion = '!>=23.04.0'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it possible to write the version on the fly ? Or maybe just get rid of it in the manifest part and put the nextflow version in metadata...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am sorry, I don't understand what you mean exactly. Can you explain? Feel free to call me if it is easier.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For info, this change was actually done automatically by nf-core when using command nf-core bump-version --nextflow 23.10.1.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed, this does not represent the final nextflow version used to run the pipeline. It is more a documentation about which nextflow versions are supposed to be compatible with the code. We could add the nextflow version to the output of process writemeta, if it is not already present. I plan another PR that will involve modifying/refactoring pipeline initialization and pipeline completion logic. We can do it in this PR.

Copy link
Contributor

@FelixAntoineLeSieur FelixAntoineLeSieur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vu que le PR est sur les stub-mode, on pourrait peut-etre laisser les changements de testsamplesheet et test.config dans #10 et se concentrer ici sur le reste

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bonne idée!. C'est vrai que c'est mieux de faire un stub run pour l'action github, je ne pense pas qu'on pourra faire marcher le pipeline automatiquement tant qu'on aura pas les références sauvegardées quelque part en ligne

Copy link
Contributor Author

@LysianeBouchard LysianeBouchard Aug 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oui bien sûr. En ce moment, le coverage est mieux si je met le stub que si je le mets pas. Sinon, au mieux, il plante au premier process (importGVCF). Là au moins, il passe à travers le pipeline. Quand on aura les fichiers de référence là ça va changer. On aura un meilleur coverage sans le stub. Il faudra enlever cette option à ce moment là.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Juste pour être sur, ici quand on met actions_ci à false, on empêche pas cette action de se réaliser? Seulement le test lint, n'est-ce pas?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ça désactive le linter sur les fichiers des github workflows oui. Ça n'empêche pas le workflow d'être exécuté. J'avais pas le choix si je voulais changer les triggers du workflow.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

J'avais fait un changement dans #10 à cet effet, reste à voir si c'est nécessaire de le garder

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Je vais regarder pour voir si j'aurais qqchose à intégrer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Si tu es prêts à merger pour #10, on peut faire ton code review à toi avant. Ça me dérange pas.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

La modification de mon côté va dans le même sens que sur #10. Ce sont les mêmes gvcfs et grosso modo les memes bug fix. Mon sample sheet est légèrement plus complexe parce qu'il simule une famille avec 2 samples et une famille avec un seul sample. C'était pour couvrir avec/sans le combineGVCF.
Je crois que tu pourrais reprendre le mien dans #10, mais bon si jamais il pose problème, tu pourras le modifier.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Dans #10 , j'ai inclu ceci:
`input = "$projectDir/assets/TestSampleSheet.csv"

referenceGenome = "https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/genomics/homo_sapiens/genome/"
referenceGenomeFasta = "genome.fasta"
intervalsFile = "$projectDir/assets/testInterval22.txt"
broad = "$projectDir/data-test/reference/broad"
vepCache = "$projectDir/data-test/reference/annotation/.vep"
hardFilters = [[name: 'QD2', expression: 'QD < 2.0'],
[name: 'QD1', expression: 'QD < 1.0'],
[name: 'QUAL30', expression: 'QUAL < 30.0'],
[name: 'SOR3', expression: 'SOR > 3.0'],
[name: 'FS60', expression: 'FS > 60.0'],
[name: 'MQ40', expression: 'MQ < 40.0'],
[name: 'MQRankSum-12.5', expression: 'MQRankSum < -12.5'],
[name: 'ReadPosRankSum-8', expression: 'ReadPosRankSum < -8.0']]`

Cela implique de télécharger les références de Rémi localement, par contre (les références de Rémi correspondent bien avec les données test de nf-core car il s'agit aussi du chr22). De toute façon, tant qu'on utilise stub-mode les changements ici ne devraient pas avoir beaucoup d'influence

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On pourrait utiliser les memes valeurs que le dataset de Remi tout de suite. Ça ne devrait pas perturber le stub mode effectivement. Les urls vont probablement changer au final, mais on aurait qqchose de plus facile à ajuster.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tel que convenu, je vais les laisser comme ça pour l'instant, mais tu pourras modifier à ta guise sur #10.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

merci, j'avais complètement oublié ça!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

je vois le problème, j'espère qu'on pourra régler ça quand on va transformer ces process en modules

@LysianeBouchard LysianeBouchard merged commit 9d66c21 into main Aug 27, 2024
2 checks passed
@LysianeBouchard LysianeBouchard deleted the fix/CLIN-3090-run-test-in-stub-mode-in-github-workflow branch August 27, 2024 00:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants