diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md index 69065bdb..c923032b 100644 --- a/.github/CONTRIBUTING.md +++ b/.github/CONTRIBUTING.md @@ -16,7 +16,7 @@ If you'd like to write some code for nf-cmgg/germline, the standard workflow is 1. Check that there isn't already an issue about your idea in the [nf-cmgg/germline issues](https://github.com/nf-cmgg/germline/issues) to avoid duplicating work. If there isn't one already, please create one so that others know you're working on this 2. [Fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) the [nf-cmgg/germline repository](https://github.com/nf-cmgg/germline) to your GitHub account 3. Make the necessary changes / additions within your forked repository following [Pipeline conventions](#pipeline-contribution-conventions) -4. Use `nf-core schema build` and add any new parameters to the pipeline JSON schema (requires [nf-core tools](https://github.com/nf-core/tools) >= 1.10). +4. Use `nf-core pipelines schema build` and add any new parameters to the pipeline JSON schema (requires [nf-core tools](https://github.com/nf-core/tools) >= 1.10). 5. Submit a Pull Request against the `dev` branch and wait for the code to be reviewed and merged If you're not used to this workflow with git, you can start with some [docs from GitHub](https://help.github.com/en/github/collaborating-with-issues-and-pull-requests) or even their [excellent `git` resources](https://try.github.io/). @@ -37,7 +37,7 @@ There are typically two types of tests that run: ### Lint tests `nf-core` has a [set of guidelines](https://nf-co.re/developers/guidelines) which all pipelines must adhere to. -To enforce these and ensure that all pipelines stay in sync, we have developed a helper tool which runs checks on the pipeline code. This is in the [nf-core/tools repository](https://github.com/nf-core/tools) and once installed can be run locally with the `nf-core lint ` command. +To enforce these and ensure that all pipelines stay in sync, we have developed a helper tool which runs checks on the pipeline code. This is in the [nf-core/tools repository](https://github.com/nf-core/tools) and once installed can be run locally with the `nf-core pipelines lint ` command. If any failures or warnings are encountered, please follow the listed URL for more documentation. @@ -68,7 +68,7 @@ If you wish to contribute a new step, please use the following coding standards: 2. Write the process block (see below). 3. Define the output channel if needed (see below). 4. Add any new parameters to `nextflow.config` with a default (see below). -5. Add any new parameters to `nextflow_schema.json` with help text (via the `nf-core schema build` tool). +5. Add any new parameters to `nextflow_schema.json` with help text (via the `nf-core pipelines schema build` tool). 6. Add sanity checks and validation for all relevant parameters. 7. Perform local tests to validate that the new code works as expected. 8. If applicable, add a new test command in `.github/workflow/ci.yml`. @@ -79,11 +79,11 @@ If you wish to contribute a new step, please use the following coding standards: Parameters should be initialised / defined with default values in `nextflow.config` under the `params` scope. -Once there, use `nf-core schema build` to add to `nextflow_schema.json`. +Once there, use `nf-core pipelines schema build` to add to `nextflow_schema.json`. 
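For example, a new parameter can be declared like this before running the schema builder (a minimal sketch; `min_callable_coverage` is one of this pipeline's parameters, but the default value shown here is purely illustrative):

```groovy
// nextflow.config -- new parameters get a default under the `params` scope
params {
    // Minimum coverage needed for a region to be classified as callable (illustrative value)
    min_callable_coverage = 5
}
```

`nf-core pipelines schema build` will then detect the new parameter and prompt you to add it, together with its help text, to `nextflow_schema.json`.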
### Default processes resource requirements -Sensible defaults for process resource requirements (CPUs / memory / time) for a process should be defined in `conf/base.config`. These should generally be specified generic with `withLabel:` selectors so they can be shared across multiple processes/steps of the pipeline. A nf-core standard set of labels that should be followed where possible can be seen in the [nf-core pipeline template](https://github.com/nf-core/tools/blob/master/nf_core/pipeline-template/conf/base.config), which has the default process as a single core-process, and then different levels of multi-core configurations for increasingly large memory requirements defined with standardised labels. +Sensible defaults for process resource requirements (CPUs / memory / time) for a process should be defined in `conf/base.config`. These should generally be specified generically with `withLabel:` selectors so they can be shared across multiple processes/steps of the pipeline. An nf-core standard set of labels that should be followed where possible can be seen in the [nf-core pipeline template](https://github.com/nf-core/tools/blob/main/nf_core/pipeline-template/conf/base.config), which has the default process as a single-core process, and then different levels of multi-core configurations for increasingly large memory requirements defined with standardised labels. The process resources can be passed on to the tool dynamically within the process with the `${task.cpus}` and `${task.memory}` variables in the `script:` block. @@ -96,7 +96,7 @@ Please use the following naming schemes, to make it easy to understand what is g ### Nextflow version bumping -If you are using a new feature from core Nextflow, you may bump the minimum required version of nextflow in the pipeline with: `nf-core bump-version --nextflow . [min-nf-version]` +If you are using a new feature from core Nextflow, you may bump the minimum required version of Nextflow in the pipeline with: `nf-core pipelines bump-version --nextflow . [min-nf-version]` ### Images and figures diff --git a/.github/ISSUE_TEMPLATE/bug_report.yml b/.github/ISSUE_TEMPLATE/bug_report.yml index 841367d3..7afb0702 100644 --- a/.github/ISSUE_TEMPLATE/bug_report.yml +++ b/.github/ISSUE_TEMPLATE/bug_report.yml @@ -9,46 +9,34 @@ body: description: A clear and concise description of what the bug is. validations: required: true + - type: textarea id: command_used attributes: label: Command used and terminal output - description: Steps to reproduce the behaviour. Please paste the command you used - to launch the pipeline and the output from your terminal. + description: Steps to reproduce the behaviour. Please paste the command you used to launch the pipeline and the output from your terminal. render: console - placeholder: "$ nextflow run ... - + placeholder: | + $ nextflow run ... Some output where something broke - " - type: textarea id: files attributes: label: Relevant files - description: "Please drag and drop the relevant files here. Create a `.zip` archive - if the extension is not allowed. - - Your verbose log file `.nextflow.log` is often useful _(this is a hidden file - in the directory where you launched the pipeline)_ as well as custom Nextflow - configuration files. + description: | + Please drag and drop the relevant files here. Create a `.zip` archive if the extension is not allowed.
+ Your verbose log file `.nextflow.log` is often useful _(this is a hidden file in the directory where you launched the pipeline)_ as well as custom Nextflow configuration files. - " - type: textarea id: system attributes: label: System information - description: "* Nextflow version _(eg. 23.04.0)_ - + description: | + * Nextflow version _(eg. 23.04.0)_ * Hardware _(eg. HPC, Desktop, Cloud)_ - * Executor _(eg. slurm, local, awsbatch)_ - - * Container engine: _(e.g. Docker, Singularity, Conda, Podman, Shifter, Charliecloud, - or Apptainer)_ - + * Container engine: _(e.g. Docker, Singularity, Conda, Podman, Shifter, Charliecloud, or Apptainer)_ * OS _(eg. CentOS Linux, macOS, Linux Mint)_ - * Version of nf-cmgg/germline _(eg. 1.1, 1.5, 1.8.2)_ - - " diff --git a/.github/PULL_REQUEST_TEMPLATE.md b/.github/PULL_REQUEST_TEMPLATE.md index b8950db4..d1e740ce 100644 --- a/.github/PULL_REQUEST_TEMPLATE.md +++ b/.github/PULL_REQUEST_TEMPLATE.md @@ -16,7 +16,7 @@ Learn more about contributing: [CONTRIBUTING.md](https://github.com/nf-cmgg/germ - [ ] This comment contains a description of changes (with reason). - [ ] If you've fixed a bug or added code that should be tested, add tests! - [ ] If you've added a new tool - have you followed the pipeline conventions in the [contribution docs](https://github.com/nf-cmgg/germline/tree/main/.github/CONTRIBUTING.md) -- [ ] Make sure your code lints (`nf-core lint`). +- [ ] Make sure your code lints (`nf-core pipelines lint`). - [ ] Ensure the test suite passes (`nextflow run . -profile test,docker --outdir `). - [ ] Check for unexpected warnings in debug mode (`nextflow run . -profile debug,test,docker --outdir `). - [ ] Usage Documentation in `docs/usage.md` is updated. diff --git a/.github/workflows/build-docs.yml b/.github/workflows/build-docs.yml index be1ba144..55c5b02b 100644 --- a/.github/workflows/build-docs.yml +++ b/.github/workflows/build-docs.yml @@ -13,29 +13,44 @@ jobs: - uses: actions/checkout@v3 with: fetch-depth: 0 # fetch all commits/branches + - uses: actions/setup-python@v4 with: python-version: 3.x - - run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_ENV - - name: Obtain version from nextflow config + + - name: Fetch current date + id: date + run: echo "cache_id=$(date --utc '+%V')" >> $GITHUB_OUTPUT + + - name: Read pipeline version from .nf-core.yml + uses: nichmor/minimal-read-yaml@v0.0.2 + id: read_yml + with: + config: ${{ github.workspace }}/.nf-core.yml + + - name: Parse version + id: version run: | - version=$(grep "version" nextflow.config | tail -1 | sed -e s'/[^=]*= //' | cut -d "'" -f 2) - [[ $version == *"dev"* ]] && pipeline_version="dev" || pipeline_version=$version - echo "pipeline_version=$pipeline_version" >> $GITHUB_ENV + [[ ${{ steps.read_yml.outputs['template.version'] }} == *"dev"* ]] && pipeline_version="dev" || pipeline_version=${{ steps.read_yml.outputs['template.version'] }} + echo "version=$pipeline_version" >> $GITHUB_OUTPUT + - name: Setup git user run: | git config --global user.name "${{github.actor}}" git config --global user.email "${{github.actor}}@users.noreply.github.com" - uses: actions/cache@v3 with: - key: mkdocs-material-${{ env.cache_id }} + key: mkdocs-material-${{ steps.date.outputs.cache_id }} path: .cache restore-keys: | mkdocs-material- + - name: Install dependencies run: pip install mkdocs-material pymdown-extensions pillow cairosvg mike + - name: Build docs run: | - [[ ${{ env.pipeline_version }} == "dev" ]] && mike deploy --push ${{ env.pipeline_version }} || mike deploy --push 
--update-aliases ${{ env.pipeline_version }} latest + [[ ${{ steps.version.outputs.version }} == "dev" ]] && mike deploy --push ${{ steps.version.outputs.version }} || mike deploy --push --update-aliases ${{ steps.version.outputs.version }} latest + - name: Set default docs run: mike set-default --push latest diff --git a/.github/workflows/ci.yml b/.github/workflows/ci.yml index 37ebe5ce..bb09e1f2 100644 --- a/.github/workflows/ci.yml +++ b/.github/workflows/ci.yml @@ -1,15 +1,15 @@ name: nf-core CI # This workflow runs the pipeline with the minimal test dataset to check that it completes without any syntax errors on: - push: - branches: - - dev pull_request: release: types: [published] + workflow_dispatch: env: NXF_ANSI_LOG: false + NFT_MAX_SHARDS: 5 + SOURCE_BRANCH: ${{ github.base_ref }} concurrency: group: "${{ github.workflow }}-${{ github.event.pull_request.number || github.ref }}" @@ -17,30 +17,20 @@ jobs: test_all: - name: Run nf-test with ${{ matrix.test }}-${{ matrix.NXF_VER }} + name: Run ${{ matrix.filter }} tests | shard ${{ matrix.shard }} (${{ matrix.NXF_VER }}) # Only run on push if this is the nf-core dev branch (merged PRs) if: "${{ github.event_name != 'push' || (github.event_name == 'push' && github.repository == 'nf-cmgg/germline') }}" runs-on: ubuntu-latest strategy: matrix: NXF_VER: - - "24.04.0" + - "24.10.0" - "latest-everything" - test: - - "pipeline_default" - - "pipeline_callers" - - "pipeline_variations" - - "pipeline_variations2" - - "pipeline_gvcfs" - - "cram_call_genotype_gatk4" - - "cram_call_vardictjava" - - "cram_prepare_samtools_bedtools" - - "input_split_bedtools" - - "vcf_annotation" - - "vcf_extract_relate_somalier" - - "vcf_ped_rtgtools" - - "vcf_upd_updio" - - "vcf_validate_small_variants" + filter: + - "process" + - "workflow" + - "pipeline" + shard: [1, 2, 3, 4, 5] steps: - name: Free some space run: | - name: Check out pipeline code uses: actions/checkout@0ad4b8fadaa221de15dcec353f45205ec38ea70b # v4 + with: + fetch-depth: 0 - name: Install Nextflow uses: nf-core/setup-nextflow@v2 with: version: "${{ matrix.NXF_VER }}" - - name: Disk space cleanup - uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be # v1.3.1 + # - name: Disk space cleanup + # uses: jlumbroso/free-disk-space@54081f138730dfa15788a46383842cd2f914a1be # v1.3.1 - name: Install nf-test run: | conda install -c bioconda nf-test - - name: Run pipeline with test data + - name: "Run ${{ matrix.filter }} tests (changed) | ${{ matrix.shard }}/${{ env.NFT_MAX_SHARDS }}" + if: ${{ env.SOURCE_BRANCH != 'main' }} + run: | + $CONDA/bin/nf-test test \ + --ci \ + --changed-since HEAD^ \ + --shard ${{ matrix.shard }}/${{ env.NFT_MAX_SHARDS }} \ + --filter ${{ matrix.filter }} \ + --junitxml=default.xml + + - name: "Run ${{ matrix.filter }} tests (all) | ${{ matrix.shard }}/${{ env.NFT_MAX_SHARDS }}" + if: ${{ env.SOURCE_BRANCH == 'main' }} run: | - $CONDA/bin/nf-test test --tag ${{ matrix.test }} --junitxml=default.xml + $CONDA/bin/nf-test test \ + --ci \ + --shard ${{ matrix.shard }}/${{ env.NFT_MAX_SHARDS }} \ + --filter ${{ matrix.filter }} \ + --junitxml=default.xml - name: Publish Test Report uses: mikepenz/action-junit-report@v3 diff --git a/.github/workflows/download_pipeline.yml b/.github/workflows/download_pipeline.yml index 4a2e3eb4..7cc2c387 100644 --- a/.github/workflows/download_pipeline.yml +++ b/.github/workflows/download_pipeline.yml @@ -1,4 +1,4 @@ -name: Test successful pipeline download with 'nf-core download' +name: Test
successful pipeline download with 'nf-core pipelines download' # Run the workflow when: # - dispatched manually @@ -8,7 +8,7 @@ on: workflow_dispatch: inputs: testbranch: - description: "The specific branch you wish to utilize for the test execution of nf-core download." + description: "The specific branch you wish to utilize for the test execution of nf-core pipelines download." required: true default: "dev" pull_request: @@ -39,9 +39,11 @@ jobs: with: python-version: "3.12" architecture: "x64" - - uses: eWaterCycle/setup-singularity@931d4e31109e875b13309ae1d07c70ca8fbc8537 # v7 + + - name: Setup Apptainer + uses: eWaterCycle/setup-apptainer@4bb22c52d4f63406c49e94c804632975787312b3 # v2.0.0 with: - singularity-version: 3.8.3 + apptainer-version: 1.3.4 - name: Install dependencies run: | @@ -54,33 +56,64 @@ jobs: echo "REPOTITLE_LOWERCASE=$(basename ${GITHUB_REPOSITORY,,})" >> ${GITHUB_ENV} echo "REPO_BRANCH=${{ github.event.inputs.testbranch || 'dev' }}" >> ${GITHUB_ENV} + - name: Make a cache directory for the container images + run: | + mkdir -p ./singularity_container_images + - name: Download the pipeline env: - NXF_SINGULARITY_CACHEDIR: ./ + NXF_SINGULARITY_CACHEDIR: ./singularity_container_images run: | - nf-core download ${{ env.REPO_LOWERCASE }} \ + nf-core pipelines download ${{ env.REPO_LOWERCASE }} \ --revision ${{ env.REPO_BRANCH }} \ --outdir ./${{ env.REPOTITLE_LOWERCASE }} \ --compress "none" \ --container-system 'singularity' \ - --container-library "quay.io" -l "docker.io" -l "ghcr.io" \ + --container-library "quay.io" -l "docker.io" -l "community.wave.seqera.io" \ --container-cache-utilisation 'amend' \ - --download-configuration + --download-configuration 'yes' - name: Inspect download run: tree ./${{ env.REPOTITLE_LOWERCASE }} + - name: Count the downloaded number of container images + id: count_initial + run: | + image_count=$(ls -1 ./singularity_container_images | wc -l | xargs) + echo "Initial container image count: $image_count" + echo "IMAGE_COUNT_INITIAL=$image_count" >> ${GITHUB_ENV} + - name: Run the downloaded pipeline (stub) id: stub_run_pipeline continue-on-error: true env: - NXF_SINGULARITY_CACHEDIR: ./ + NXF_SINGULARITY_CACHEDIR: ./singularity_container_images NXF_SINGULARITY_HOME_MOUNT: true run: nextflow run ./${{ env.REPOTITLE_LOWERCASE }}/$( sed 's/\W/_/g' <<< ${{ env.REPO_BRANCH }}) -stub -profile test,singularity --outdir ./results - name: Run the downloaded pipeline (stub run not supported) id: run_pipeline if: ${{ job.steps.stub_run_pipeline.status == failure() }} env: - NXF_SINGULARITY_CACHEDIR: ./ + NXF_SINGULARITY_CACHEDIR: ./singularity_container_images NXF_SINGULARITY_HOME_MOUNT: true run: nextflow run ./${{ env.REPOTITLE_LOWERCASE }}/$( sed 's/\W/_/g' <<< ${{ env.REPO_BRANCH }}) -profile test,singularity --outdir ./results + + - name: Count the downloaded number of container images + id: count_afterwards + run: | + image_count=$(ls -1 ./singularity_container_images | wc -l | xargs) + echo "Post-pipeline run container image count: $image_count" + echo "IMAGE_COUNT_AFTER=$image_count" >> ${GITHUB_ENV} + + - name: Compare container image counts + run: | + if [ "${{ env.IMAGE_COUNT_INITIAL }}" -ne "${{ env.IMAGE_COUNT_AFTER }}" ]; then + initial_count=${{ env.IMAGE_COUNT_INITIAL }} + final_count=${{ env.IMAGE_COUNT_AFTER }} + difference=$((final_count - initial_count)) + echo "$difference additional container images were downloaded at runtime. The pipeline has no support for offline runs!"
+ tree ./singularity_container_images + exit 1 + else + echo "The pipeline can be downloaded successfully!" + fi diff --git a/.github/workflows/linting.yml b/.github/workflows/linting.yml index 9593436d..6bfe9373 100644 --- a/.github/workflows/linting.yml +++ b/.github/workflows/linting.yml @@ -42,24 +42,32 @@ jobs: architecture: "x64" - name: read .nf-core.yml - uses: pietrobolcato/action-read-yaml@1.0.0 + uses: pietrobolcato/action-read-yaml@1.1.0 id: read_yml with: - config: ${{ github.workspace }}/.nf-core.yaml + config: ${{ github.workspace }}/.nf-core.yml - name: Install dependencies run: | python -m pip install --upgrade pip - pip install git+https://github.com/nf-core/tools@dev - #pip install nf-core==${{ steps.read_yml.outputs['nf_core_version'] }} + pip install nf-core==${{ steps.read_yml.outputs['nf_core_version'] }} - name: Run nf-core pipelines lint + if: ${{ github.base_ref != 'main' }} env: GITHUB_COMMENTS_URL: ${{ github.event.pull_request.comments_url }} GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} GITHUB_PR_COMMIT: ${{ github.event.pull_request.head.sha }} run: nf-core -l lint_log.txt pipelines lint --dir ${GITHUB_WORKSPACE} --markdown lint_results.md + - name: Run nf-core pipelines lint --release + if: ${{ github.base_ref == 'main' }} + env: + GITHUB_COMMENTS_URL: ${{ github.event.pull_request.comments_url }} + GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} + GITHUB_PR_COMMIT: ${{ github.event.pull_request.head.sha }} + run: nf-core -l lint_log.txt pipelines lint --release --dir ${GITHUB_WORKSPACE} --markdown lint_results.md + - name: Save PR number if: ${{ always() }} run: echo ${{ github.event.pull_request.number }} > PR_number.txt diff --git a/.github/workflows/release-announcements.yml b/.github/workflows/release-announcements.yml deleted file mode 100644 index 03ecfcf7..00000000 --- a/.github/workflows/release-announcements.yml +++ /dev/null @@ -1,75 +0,0 @@ -name: release-announcements -# Automatic release toot and tweet anouncements -on: - release: - types: [published] - workflow_dispatch: - -jobs: - toot: - runs-on: ubuntu-latest - steps: - - name: get topics and convert to hashtags - id: get_topics - run: | - echo "topics=$(curl -s https://nf-co.re/pipelines.json | jq -r '.remote_workflows[] | select(.full_name == "${{ github.repository }}") | .topics[]' | awk '{print "#"$0}' | tr '\n' ' ')" >> $GITHUB_OUTPUT - - - uses: rzr/fediverse-action@master - with: - access-token: ${{ secrets.MASTODON_ACCESS_TOKEN }} - host: "mstdn.science" # custom host if not "mastodon.social" (default) - # GitHub event payload - # https://docs.github.com/en/developers/webhooks-and-events/webhooks/webhook-events-and-payloads#release - message: | - Pipeline release! ${{ github.repository }} v${{ github.event.release.tag_name }} - ${{ github.event.release.name }}! 
- - Please see the changelog: ${{ github.event.release.html_url }} - - ${{ steps.get_topics.outputs.topics }} #nfcore #openscience #nextflow #bioinformatics - - send-tweet: - runs-on: ubuntu-latest - - steps: - - uses: actions/setup-python@82c7e631bb3cdc910f68e0081d67478d79c6982d # v5 - with: - python-version: "3.10" - - name: Install dependencies - run: pip install tweepy==4.14.0 - - name: Send tweet - shell: python - run: | - import os - import tweepy - - client = tweepy.Client( - access_token=os.getenv("TWITTER_ACCESS_TOKEN"), - access_token_secret=os.getenv("TWITTER_ACCESS_TOKEN_SECRET"), - consumer_key=os.getenv("TWITTER_CONSUMER_KEY"), - consumer_secret=os.getenv("TWITTER_CONSUMER_SECRET"), - ) - tweet = os.getenv("TWEET") - client.create_tweet(text=tweet) - env: - TWEET: | - Pipeline release! ${{ github.repository }} v${{ github.event.release.tag_name }} - ${{ github.event.release.name }}! - - Please see the changelog: ${{ github.event.release.html_url }} - TWITTER_CONSUMER_KEY: ${{ secrets.TWITTER_CONSUMER_KEY }} - TWITTER_CONSUMER_SECRET: ${{ secrets.TWITTER_CONSUMER_SECRET }} - TWITTER_ACCESS_TOKEN: ${{ secrets.TWITTER_ACCESS_TOKEN }} - TWITTER_ACCESS_TOKEN_SECRET: ${{ secrets.TWITTER_ACCESS_TOKEN_SECRET }} - - bsky-post: - runs-on: ubuntu-latest - steps: - - uses: zentered/bluesky-post-action@80dbe0a7697de18c15ad22f4619919ceb5ccf597 # v0.1.0 - with: - post: | - Pipeline release! ${{ github.repository }} v${{ github.event.release.tag_name }} - ${{ github.event.release.name }}! - - Please see the changelog: ${{ github.event.release.html_url }} - env: - BSKY_IDENTIFIER: ${{ secrets.BSKY_IDENTIFIER }} - BSKY_PASSWORD: ${{ secrets.BSKY_PASSWORD }} - # diff --git a/.gitpod.yml b/.gitpod.yml index 105a1821..46118637 100644 --- a/.gitpod.yml +++ b/.gitpod.yml @@ -4,17 +4,14 @@ tasks: command: | pre-commit install --install-hooks nextflow self-update - - name: unset JAVA_TOOL_OPTIONS - command: | - unset JAVA_TOOL_OPTIONS vscode: extensions: # based on nf-core.nf-core-extensionpack - - esbenp.prettier-vscode # Markdown/CommonMark linting and style checking for Visual Studio Code + #- esbenp.prettier-vscode # Markdown/CommonMark linting and style checking for Visual Studio Code - EditorConfig.EditorConfig # override user/workspace settings with settings found in .editorconfig files - Gruntfuggly.todo-tree # Display TODO and FIXME in a tree view in the activity bar - mechatroner.rainbow-csv # Highlight columns in csv files in different colors - # - nextflow.nextflow # Nextflow syntax highlighting + - nextflow.nextflow # Nextflow syntax highlighting - oderwat.indent-rainbow # Highlight indentation level - streetsidesoftware.code-spell-checker # Spelling checker for source code - charliermarsh.ruff # Code linter Ruff diff --git a/.nf-core.yml b/.nf-core.yml index 7b994ce9..99b4e65b 100644 --- a/.nf-core.yml +++ b/.nf-core.yml @@ -1,39 +1,46 @@ +bump_version: null lint: + actions_ci: false files_exist: - - "CODE_OF_CONDUCT.md" - - "assets/nf-core-germline_logo_light.png" - - "docs/images/nf-core-germline_logo_light.png" - - "docs/images/nf-core-germline_logo_dark.png" - - ".github/ISSUE_TEMPLATE/config.yml" - - ".github/workflows/awstest.yml" - - ".github/workflows/awsfulltest.yml" - - "docs/README.md" + - CODE_OF_CONDUCT.md + - assets/nf-core-germline_logo_light.png + - docs/images/nf-core-germline_logo_light.png + - docs/images/nf-core-germline_logo_dark.png + - .github/ISSUE_TEMPLATE/config.yml + - .github/workflows/awstest.yml + - .github/workflows/awsfulltest.yml + - 
.github/workflows/template_version_comment.yml + - docs/README.md files_unchanged: - - ".github/CONTRIBUTING.md" - - ".github/PULL_REQUEST_TEMPLATE.md" - - ".github/workflows/branch.yml" - - ".github/workflows/linting_comment.yml" - - ".github/workflows/linting.yml" - - "CODE_OF_CONDUCT.md" - - ".github/ISSUE_TEMPLATE/bug_report.yml" - - ".prettierignore" - nextflow_config: - - "custom_config" # TODO Remove this once the new methods are supported - - "manifest.name" - - "manifest.homePage" - - "params.genomes" - - "validation.help.beforeText" - - "validation.help.afterText" - - "validation.summary.beforeText" - - "validation.summary.afterText" + - .github/CONTRIBUTING.md + - .github/PULL_REQUEST_TEMPLATE.md + - .github/workflows/branch.yml + - .github/workflows/linting_comment.yml + - .github/workflows/linting.yml + - CODE_OF_CONDUCT.md + - .github/ISSUE_TEMPLATE/bug_report.yml + - .prettierignore multiqc_config: - - "report_comment" - actions_ci: false # TODO readd this once the linting doesn't act up -nf_core_version: 3.0.0dev + - report_comment + nextflow_config: + - custom_config + - manifest.name + - manifest.homePage + - validation.help.afterText + - validation.summary.afterText + subworkflow_changes: false +nf_core_version: 3.0.2 repository_type: pipeline template: author: nvnieuwk description: A nextflow pipeline for calling and annotating small germline variants from short DNA reads for WES and WGS data + force: false + is_nfcore: false name: germline - prefix: nf-cmgg + org: nf-cmgg + outdir: . + skip_features: + - fastqc + - is_nfcore + version: 1.9.0 diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 4dc0f1dc..9e9f0e1c 100644 --- a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -7,7 +7,7 @@ repos: - prettier@3.2.5 - repo: https://github.com/editorconfig-checker/editorconfig-checker.python - rev: "2.7.3" + rev: "3.0.3" hooks: - id: editorconfig-checker alias: ec diff --git a/CHANGELOG.md b/CHANGELOG.md index 776e85ab..70e12564 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -3,9 +3,37 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/) and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). +## v1.9.0 - Neighborly Nieuwkerke + +### New features + +1. Added the `--min_callable_coverage` parameter to set the lowest coverage required for a region to be classified as callable. +2. Added the [`elprep` caller](https://github.com/ExaScience/elprep) as an alternative to the haplotypecaller. +3. Added full unit tests for all parts that were missing tests. + +### Changes + +1. Added the `--squash-ploidy` argument to the RTG vcfeval process. +2. Updated to nf-core v3.0.1. +3. Completely reworked the output directory structure to a more sensible structure. The pipeline can now be run on the same output directory every time and will incrementally add files to the correct family folder. See the [output documentation](https://nf-cmgg.github.io/germline/latest/output/) for more info. +4. Migrated to the new workflow output definitions. +5. Bumped the minimal Nextflow version to 24.10.0. +6. Added the somalier reports to the MultiQC report. +7. Removed the `--output_suffix` parameter. +8. Added some missing required parameters to the `WES` and `seqplorer` profiles. + +### Fixes + +1. Validation of all samples now uses an intersect of the golden truth BED files with the BED file used to call the variants. This should fix the WES validation which was broken until this point. +2.
A couple of small fixes to the vardict flow. +3. Only use the standard chromosomes for UPDio analysis. +4. Reduced the resources given to some GATK4 modules. +5. VCF2DB now uses a Seqera container to fix some issues when running it in Nomad. +6. Dots in sample and family names are now converted to an underscore automatically. + ## v1.8.2 - Outstanding Oostkamp - [September 30 2024] -## Fixes +### Fixes 1. Fixed some issues where indices were not created 2. Updated the docs diff --git a/README.md b/README.md index bd133931..6bd84f3c 100644 --- a/README.md +++ b/README.md @@ -1,8 +1,10 @@ +# nf-cmgg/germline + [![GitHub Actions CI Status](https://github.com/nf-cmgg/germline/actions/workflows/ci.yml/badge.svg)](https://github.com/nf-cmgg/germline/actions/workflows/ci.yml) [![GitHub Actions Linting Status](https://github.com/nf-cmgg/germline/actions/workflows/linting.yml/badge.svg)](https://github.com/nf-cmgg/germline/actions/workflows/linting.yml) [![nf-test](https://img.shields.io/badge/unit_tests-nf--test-337ab7.svg)](https://www.nf-test.com) -[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A524.04.0-23aa62.svg)](https://www.nextflow.io/) +[![Nextflow](https://img.shields.io/badge/nextflow%20DSL2-%E2%89%A524.10.0-23aa62.svg)](https://www.nextflow.io/) [![run with conda](http://img.shields.io/badge/run%20with-conda-3EB049?labelColor=000000&logo=anaconda)](https://docs.conda.io/en/latest/) [![run with docker](https://img.shields.io/badge/run%20with-docker-0db7ed?labelColor=000000&logo=docker)](https://www.docker.com/) [![run with singularity](https://img.shields.io/badge/run%20with-singularity-1d355c.svg?labelColor=000000)](https://sylabs.io/docs/) diff --git a/assets/samplesheet.csv b/assets/samplesheet.csv index b20e3926..3dc49e4f 100644 --- a/assets/samplesheet.csv +++ b/assets/samplesheet.csv @@ -1,4 +1,4 @@ -sample,family,cram,crai,roi,ped,truth_vcf,truth_tbi,truth_bed,vardict_min_af -NA24143,Proband_12345,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/crams/NA24143.cram,,,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/genome/test.ped,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/vcfs/NA24143.vcf.gz,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/vcfs/NA24143.vcf.gz.tbi,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/regions/roi.bed,0.01 -NA24149,Proband_12345,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/crams/NA24149.cram,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/crams/NA24149.cram.crai,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/regions/roi.bed,,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/vcfs/NA24149.vcf.gz,,, -NA24385,Proband_12345,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/crams/NA24385.cram,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/crams/NA24385.cram.crai,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/regions/roi.bed,,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/vcfs/NA24385.vcf.gz,,, +sample,family,cram,crai,roi,truth_vcf,truth_tbi,truth_bed,vardict_min_af
+NA24143,Proband_12.345,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/crams/NA24143.cram,,,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/vcfs/NA24143.vcf.gz,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/vcfs/NA24143.vcf.gz.tbi,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/regions/roi.bed,0.01 +NA24149,Proband_12.345,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/crams/NA24149.cram,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/crams/NA24149.cram.crai,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/regions/roi.bed,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/vcfs/NA24149.vcf.gz,,, +NA24385,Proband_12.345,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/crams/NA24385.cram,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/crams/NA24385.cram.crai,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/regions/roi.bed,https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/vcfs/NA24385.vcf.gz,,, diff --git a/assets/schema_input.json b/assets/schema_input.json index f1141ae6..1cd7a49f 100644 --- a/assets/schema_input.json +++ b/assets/schema_input.json @@ -9,11 +9,15 @@ "properties": { "sample": { "type": "string", - "meta": ["id", "sample"] + "pattern": "^[a-zA-Z0-9_\\.]+$", + "meta": ["id", "sample"], + "errorMessage": "Sample name should be a string that may contain underscores (_) and dots (.)" }, "family": { "type": "string", - "meta": ["family"] + "pattern": "^[a-zA-Z0-9_\\.]+$", + "meta": ["family"], + "errorMessage": "Family name should be a string that may contain underscores (_) and dots (.)" }, "cram": { "oneOf": [ diff --git a/conf/base.config b/conf/base.config index 35b35265..6ed8a916 100644 --- a/conf/base.config +++ b/conf/base.config @@ -10,9 +10,9 @@ process { - cpus = { 1 * task.attempt } - memory = { 8.GB * task.attempt } - time = { 4.h * task.attempt } + cpus = { 1 * task.attempt } + memory = { 8.GB * task.attempt } + time = { 4.h * task.attempt } errorStrategy = { task.exitStatus in ((130..145) + 104) ? 
'retry' : 'finish' } maxRetries = 1 @@ -20,27 +20,27 @@ process { // Process-specific resource requirements withLabel:process_single { - cpus = { 1 } - memory = { 8.GB * task.attempt } - time = { 4.h * task.attempt } + cpus = { 1 } + memory = { 8.GB * task.attempt } + time = { 4.h * task.attempt } } withLabel:process_low { - cpus = { 2 * task.attempt } - memory = { 16.GB * task.attempt } - time = { 4.h * task.attempt } + cpus = { 2 * task.attempt } + memory = { 16.GB * task.attempt } + time = { 4.h * task.attempt } } withLabel:process_medium { - cpus = { 4 * task.attempt } - memory = { 32.GB * task.attempt } - time = { 8.h * task.attempt } + cpus = { 4 * task.attempt } + memory = { 32.GB * task.attempt } + time = { 8.h * task.attempt } } withLabel:process_high { - cpus = { 8 * task.attempt } - memory = { 64.GB * task.attempt } - time = { 16.h * task.attempt } + cpus = { 8 * task.attempt } + memory = { 64.GB * task.attempt } + time = { 16.h * task.attempt } } withLabel:process_long { - time = { 20.h * task.attempt } + time = { 20.h * task.attempt } } withLabel:process_high_memory { memory = { 200.GB * task.attempt } diff --git a/conf/seqcap.config b/conf/copgt.config similarity index 59% rename from conf/seqcap.config rename to conf/copgt.config index 9d499e5e..d5460667 100644 --- a/conf/seqcap.config +++ b/conf/copgt.config @@ -1,12 +1,11 @@ /* ======================================================================================== - Nextflow config file for SeqCap runs + Nextflow config file for WES runs ======================================================================================== */ params { - callers = "vardict" - filter = true - normalize = true - scatter_count = 14 + callers = "haplotypecaller" + only_call = true + scatter_count = 8 } diff --git a/conf/hypercap.config b/conf/hypercap.config index ac937f98..9ce274d8 100644 --- a/conf/hypercap.config +++ b/conf/hypercap.config @@ -8,6 +8,4 @@ params { callers = "vardict" scatter_count = 5 only_pass = true - - output_suffix = "-vardict-decomposed-annotated" } diff --git a/conf/igenomes_ignored.config b/conf/igenomes_ignored.config new file mode 100644 index 00000000..b4034d82 --- /dev/null +++ b/conf/igenomes_ignored.config @@ -0,0 +1,9 @@ +/* +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Nextflow config file for iGenomes paths +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + Empty genomes dictionary to use when igenomes is ignored. +---------------------------------------------------------------------------------------- +*/ + +params.genomes = [:] diff --git a/conf/modules.config b/conf/modules.config index 7c54ceb0..138ba4f1 100644 --- a/conf/modules.config +++ b/conf/modules.config @@ -10,50 +10,16 @@ ---------------------------------------------------------------------------------------- */ -def enableOutput(state) { - """ - This function checks if the output of the given module should be published to the output directory. - The higher the option is in the list, the higher the priority of being in the output directory - """ - def order = [ - "vcfanno": params.vcfanno && params.annotate, - "annotate": params.annotate, - "add_ped": params.add_ped, - "normalize": params.normalize, - "filter": params.filter, - "original": true - ] - - return order.findIndexOf{it.key == state} == order.findIndexOf{it.value == true} -} - -def date = params.skip_date_project ? 
"" : "${new Date().format("yyyy-MM-dd")}_" -def final_output = { "${params.outdir}/${params.project ? "${date}${params.project}" : "${date}${workflow.runName}"}/${meta.family}" } -def final_output_reports = { "${params.outdir}/${params.project ? "${date}${params.project}" : "${date}${workflow.runName}"}/${meta.family}/reports" } -def individual_output = { "${params.outdir}/${meta.sample}" } -def individual_reports = { "${params.outdir}/${meta.sample}/reports" } -def individual_validation = { "${params.outdir}/${meta.sample}/validation/${meta.caller}" } - -def callers = params.callers.tokenize(",") - -def final_prefix = { params.output_suffix ? "${meta.id}${params.output_suffix}" : "${meta.id}.${meta.caller}" } - process { - publishDir = [ - enabled: false - ] - /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ REFERENCE MODULES ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ - if(params.annotate) { - withName: '^.*ENSEMBLVEP_DOWNLOAD\$' { - ext.args = "--AUTO c --CONVERT --NO_BIOPERL --NO_TEST --NO_UPDATE" - } + withName: '^.*ENSEMBLVEP_DOWNLOAD\$' { + ext.args = "--AUTO c --CONVERT --NO_BIOPERL --NO_TEST --NO_UPDATE" } /* @@ -63,14 +29,7 @@ process { */ withName: "^.*GERMLINE:BCFTOOLS_STATS\$" { - publishDir = [ - overwrite: true, - enabled: true, - path: final_output_reports, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } - ] // SAVE - ext.prefix = final_prefix + ext.prefix = { "${meta.id}.${meta.caller}"} } /* @@ -85,277 +44,194 @@ process { withName: "^.*CRAM_PREPARE_SAMTOOLS_BEDTOOLS:MOSDEPTH\$" { ext.args = [ - "--quantize 0:1:4: --mapq 1 --flag 1804 --no-per-base", + "--quantize 0:1:${params.min_callable_coverage - 1}: --mapq 1 --flag 1804 --no-per-base", params.mosdepth_slow ? "" : "--fast-mode" ].join(" ") - publishDir = [ - overwrite: true, - enabled: true, - mode: params.publish_dir_mode, - path: individual_output, - saveAs: { filename -> - filename.endsWith('.global.dist.txt') || filename.endsWith('.summary.txt') ? "reports/${filename}" : null - } - ] // SAVE } withName: "^.*CRAM_PREPARE_SAMTOOLS_BEDTOOLS:FILTER_BEDS\$" { ext.prefix = { "${meta.id}.filter"} ext.args = "-vE \"LOW_COVERAGE|NO_COVERAGE${params.keep_alt_contigs ? "" : "|alt|random|decoy|Un"}\"" ext.args2 = "-d 150" - publishDir = [ - overwrite: true, - enabled: true, - mode: params.publish_dir_mode, - path: individual_output, - saveAs: { filename -> filename.endsWith(".bed") ? filename.replace(".filter", "") : null } - ] // SAVE } withName: "^.*CRAM_PREPARE_SAMTOOLS_BEDTOOLS:BEDTOOLS_INTERSECT\$" { - ext.prefix = {"${meta.id}_intersect"} + ext.prefix = {"${meta.id}.intersect"} ext.args = "-sorted" - publishDir = [ - overwrite: true, - enabled: true, - mode: params.publish_dir_mode, - path: individual_output, - saveAs: { filename -> filename.endsWith(".bed") ? 
filename.replace("_intersect", "") : null } - ] // SAVE } /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - GATK4_HAPLOTYPCECALLER + GATK4 HAPLOTYPCECALLER ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ - if("haplotypecaller" in callers) { - if(params.dragstr) { - withName: "^.*CRAM_CALL_GENOTYPE_GATK4:CRAM_CALL_GATK4:GATK4_CALIBRATEDRAGSTRMODEL\$" { - ext.args = "--parallel" - } - } + withName: "^.*CRAM_CALL_GATK4:GATK4_CALIBRATEDRAGSTRMODEL\$" { + ext.args = "--parallel" + } - withName: "^.*CRAM_CALL_GENOTYPE_GATK4:CRAM_CALL_GATK4:GATK4_HAPLOTYPECALLER\$" { - time = { 16.h * task.attempt } - ext.prefix = {"${meta.id}.g"} - ext.args = { - [ - '-ERC GVCF -contamination "0"', - '-GQB 10 -GQB 20 -GQB 30 -GQB 40 -GQB 50 -GQB 60 -GQB 70 -GQB 80 -GQB 90', - '-G StandardAnnotation -G StandardHCAnnotation -G AS_StandardAnnotation', - params.dragstr ? '--dragen-mode' : '', - params.hc_phasing ? '' : '--do-not-run-physical-phasing' - ].join(" ") - } + withName: "^.*CRAM_CALL_GATK4:GATK4_HAPLOTYPECALLER\$" { + time = { 16.h * task.attempt } + ext.prefix = {"${meta.id}.g"} + ext.args = { + [ + '-ERC GVCF -contamination "0"', + '-GQB 10 -GQB 20 -GQB 30 -GQB 40 -GQB 50 -GQB 60 -GQB 70 -GQB 80 -GQB 90', + '-G StandardAnnotation -G StandardHCAnnotation -G AS_StandardAnnotation', + params.dragstr ? '--dragen-mode' : '', + params.hc_phasing ? '' : '--do-not-run-physical-phasing' + ].join(" ") } + } - withName: "^.*CRAM_CALL_GATK4:VCF_CONCAT_BCFTOOLS:BCFTOOLS_CONCAT\$" { - publishDir = [ - overwrite: true, - enabled: true, - mode: params.publish_dir_mode, - path: individual_output, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } - ] // SAVE - ext.prefix = { "${meta.id}.${meta.caller}.g" } - ext.args = '--allow-overlaps --output-type z' - } + withName: "^.*CRAM_CALL_GATK4:VCF_CONCAT_BCFTOOLS:BCFTOOLS_CONCAT\$" { + ext.prefix = { "${meta.id}.${meta.caller}.g" } + ext.args = '--allow-overlaps --output-type z' + } - withName: "^.*CRAM_CALL_GATK4:VCF_CONCAT_BCFTOOLS:TABIX_TABIX\$" { - publishDir = [ - overwrite: true, - enabled: true, - path: individual_output, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } - ] // SAVE - ext.args = '' - } + withName: "^.*CRAM_CALL_GATK4:BCFTOOLS_STATS\$" { + ext.prefix = { "${meta.id}.${meta.caller}" } + } + + /* + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + ELPREP + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + */ + + withName: "^.*BAM_CALL_ELPREP:ELPREP_FILTER\$" { + cpus = { 25 * task.attempt } + memory = { 250.GB * task.attempt } + ext.args = "--reference-confidence GVCF" + } + + withName: "^.*BAM_CALL_ELPREP:VCF_CONCAT_BCFTOOLS:BCFTOOLS_CONCAT\$" { + ext.prefix = { "${meta.id}.${meta.caller}.g" } + ext.args = '--allow-overlaps --output-type z' + } + + withName: "^.*BAM_CALL_ELPREP:BCFTOOLS_STATS\$" { + ext.prefix = { "${meta.id}.${meta.caller}" } + } + + /* + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + GVCF JOINT GENOTYPING + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + */ + + withName: "^.*GVCF_JOINT_GENOTYPE_GATK4:BCFTOOLS_QUERY\$" { + ext.args = "--exclude 'QUAL=\".\"' --format '%CHROM\t%POS0\t%END\\n'" + ext.suffix = "bed" + } + + withName: "^.*GVCF_JOINT_GENOTYPE_GATK4:MERGE_BEDS\$" { + ext.args = "-d ${params.merge_distance}" + } + + withName: "^.*GVCF_JOINT_GENOTYPE_GATK4:GAWK\$" { + ext.args2 = '\'BEGIN {FS="\t"}; {print \$1 FS "0" FS \$2}\'' + ext.suffix = "bed" + } - withName: "^.*CRAM_CALL_GATK4:BCFTOOLS_STATS_SINGLE\$" { - publishDir = [ - overwrite: true, - enabled: true, - path: individual_reports, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } - ] // SAVE - ext.prefix = final_prefix + withName: "^.*GVCF_JOINT_GENOTYPE_GATK4:GATK4_GENOMICSDBIMPORT\$" { + label = { meta.family_samples.tokenize(",").size() <= 10 ? "process_medium" : "process_high" } + time = { 16.h * task.attempt } + // Lots of parameters are fetched from https://gatk.broadinstitute.org/hc/en-us/articles/360056138571-GenomicsDBImport-usage-and-performance-guidelines + ext.args = { + [ + meta.family_samples.tokenize(",").size() >= 100 ? "--batch-size 100" : "", + "--overwrite-existing-genomicsdb-workspace", + "--genomicsdb-shared-posixfs-optimizations", + "--merge-input-intervals", + "--bypass-feature-reader", + "--max-num-intervals-to-import-in-parallel ${task.cpus*10}", + "--reader-threads ${task.cpus}", + meta.family_samples.tokenize(",").size() >= 100 ? "--consolidate" : "", + "--merge-contigs-into-num-partitions 25" + ].join(" ") } + ext.prefix = { "genomicsdb_${meta.id}_${meta.caller}" } + } - if(!params.only_call) { - withName: "^.*CRAM_CALL_GENOTYPE_GATK4:GVCF_JOINT_GENOTYPE_GATK4:BCFTOOLS_QUERY\$" { - ext.args = "--exclude 'QUAL=\".\"' --format '%CHROM\t%POS0\t%END\\n'" - ext.suffix = "bed" - } - - withName: "^.*CRAM_CALL_GENOTYPE_GATK4:GVCF_JOINT_GENOTYPE_GATK4:MERGE_BEDS\$" { - ext.args = "-d ${params.merge_distance}" - publishDir = [ - enabled: true, - overwrite: true, - path: final_output, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } - ] // SAVE - } - - withName: "^.*CRAM_CALL_GENOTYPE_GATK4:GVCF_JOINT_GENOTYPE_GATK4:GAWK\$" { - ext.args2 = '\'BEGIN {FS="\t"}; {print \$1 FS "0" FS \$2}\'' - ext.suffix = "bed" - } - - withName: "^.*CRAM_CALL_GENOTYPE_GATK4:GVCF_JOINT_GENOTYPE_GATK4:GATK4_GENOMICSDBIMPORT\$" { - label = { meta.family_samples.tokenize(",").size() <= 10 ? 
"process_medium" : "process_high" } - time = { 16.h * task.attempt } - // Lots of parameters are fetched from https://gatk.broadinstitute.org/hc/en-us/articles/360056138571-GenomicsDBImport-usage-and-performance-guidelines - ext.args = { - [ - meta.family_samples.tokenize(",").size() >= 100 ? "--batch-size 100" : "", - "--overwrite-existing-genomicsdb-workspace", - "--genomicsdb-shared-posixfs-optimizations", - "--merge-input-intervals", - "--bypass-feature-reader", - "--max-num-intervals-to-import-in-parallel ${task.cpus*10}", - "--reader-threads ${task.cpus}", - meta.family_samples.tokenize(",").size() >= 100 ? "--consolidate" : "", - "--merge-contigs-into-num-partitions 25" - ].join(" ") - } - ext.prefix = { "genomicsdb_${meta.id}" } - publishDir = [ - enabled: params.only_merge || params.output_genomicsdb, - overwrite: true, - path: final_output, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } - ] // SAVE - } - - if(!params.only_merge) { - withName: "^.*CRAM_CALL_GENOTYPE_GATK4:GVCF_JOINT_GENOTYPE_GATK4:GATK4_GENOTYPEGVCFS\$" { - time = { 16.h * task.attempt } - ext.args = { - [ - "--allow-old-rms-mapping-quality-annotation-data", - "-G StandardAnnotation -G AS_StandardAnnotation", - "-AX ExcessHet -AX InbreedingCoeff" - ].join(" ") - } - } - - withName: "^.*GVCF_JOINT_GENOTYPE_GATK4:VCF_CONCAT_BCFTOOLS:BCFTOOLS_CONCAT\$" { - ext.prefix = enableOutput("original") ? final_prefix : {"${meta.id}.concat"} - ext.args = "--allow-overlaps --output-type z" - publishDir = [ - enabled: enableOutput("original"), - overwrite: true, - path: final_output, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } - ] // SAVE - } - - if(params.filter){ - withName: "^.*CRAM_CALL_GENOTYPE_GATK4:VCF_FILTER_BCFTOOLS:FILTER_1\$" { - ext.prefix = { "${meta.id}_filtered_snps" } - ext.args = {"--output-type z --soft-filter 'GATKCutoffSNP' -e 'TYPE=\"snp\" && (MQRankSum < -12.5 || ReadPosRankSum < -8.0 || QD < 2.0 || FS > 60.0 || MQ < 30.0)' -m '+'"} - } - - withName: "^.*CRAM_CALL_GENOTYPE_GATK4:VCF_FILTER_BCFTOOLS:FILTER_2\$" { - ext.prefix = enableOutput("filter") ? final_prefix : {"${meta.id}.filtered"} - ext.args = {'--output-type z --soft-filter \'GATKCutoffIndel\' -e \'TYPE="indel" && (ReadPosRankSum < -20.0 || QD < 2.0 || FS > 200.0 || SOR > 10.0 )\' -m \'+\''} - publishDir = [ - enabled: enableOutput("filter"), - overwrite: true, - path: final_output, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } - ] // SAVE - } - } - } + withName: "^.*GVCF_JOINT_GENOTYPE_GATK4:GATK4_GENOTYPEGVCFS\$" { + time = { 16.h * task.attempt } + ext.args = { + [ + "--allow-old-rms-mapping-quality-annotation-data", + "-G StandardAnnotation -G AS_StandardAnnotation", + "-AX ExcessHet -AX InbreedingCoeff" + ].join(" ") } } + withName: "^.*GVCF_JOINT_GENOTYPE_GATK4:VCF_CONCAT_BCFTOOLS:BCFTOOLS_CONCAT\$" { + ext.prefix = { "${meta.id}.concat" } + ext.args = "--allow-overlaps --output-type z" + } + /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ VARDICTJAVA ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ - if("vardict" in callers) { - withName: "^.*CRAM_CALL_VARDICTJAVA:VARDICTJAVA\$" { - time = { 16.h * task.attempt } - ext.prefix = {"${meta.id}"} - ext.args = { - [ - '-c 1 -S 2 -E 3 -g 4 --nosv --deldupvar -Q 10 -F 0x700', - "-f ${meta.vardict_min_af ?: params.vardict_min_af} -N ${meta.sample}" - ].join(" ") - } - ext.args2 = { - [ - "-f ${meta.vardict_min_af ?: params.vardict_min_af} -N ${meta.sample}", - '-A' - ].join(" ") - } + withName: "^.*BAM_CALL_VARDICTJAVA:VARDICTJAVA\$" { + time = { 16.h * task.attempt } + ext.args = { + [ + '-c 1 -S 2 -E 3 -g 4 --nosv --deldupvar -Q 10 -F 0x700', + "-f ${meta.vardict_min_af ?: params.vardict_min_af} -N ${meta.sample}" + ].join(" ") } - - withName: "^.*CRAM_CALL_VARDICTJAVA:VCF_CONCAT_BCFTOOLS:BCFTOOLS_CONCAT\$" { - ext.args = '--allow-overlaps --output-type z' - ext.prefix = enableOutput("original") ? final_prefix : {"${meta.id}.concat"} - publishDir = [ - overwrite: true, - enabled: enableOutput("original"), - mode: params.publish_dir_mode, - path: final_output, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } - ] // SAVE + ext.args2 = { + [ + "-f ${meta.vardict_min_af ?: params.vardict_min_af} -N ${meta.sample}", + '-A' + ].join(" ") } + } - withName: "^.*CRAM_CALL_VARDICTJAVA:TABIX_VCFANNO\$" { - ext.prefix = enableOutput("original") ? final_prefix : {"${meta.id}.vcfanno"} - publishDir = [ - overwrite: true, - enabled: enableOutput("original"), - mode: params.publish_dir_mode, - path: final_output, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } - ] // SAVE - } + withName: "^.*BAM_CALL_VARDICTJAVA:VCF_CONCAT_BCFTOOLS:BCFTOOLS_CONCAT\$" { + ext.args = '--allow-overlaps --output-type z' + ext.prefix = {"${meta.id}.concat"} + } - if(params.filter) { - withName: "^.*CRAM_CALL_VARDICTJAVA:VCF_FILTER_BCFTOOLS:FILTER_1\$" { - ext.args = "-i 'QUAL >= 0${params.only_pass ? " && FILTER=\"PASS\"" : ""}' --output-type z" - } - - withName: "^.*CRAM_CALL_VARDICTJAVA:VCF_FILTER_BCFTOOLS:FILTER_2\$" { - ext.args = "--soft-filter 'LowFreqBias' --mode '+' -e 'FORMAT/AF[0:*] < 0.02 && FORMAT/VD[0] < 30 && INFO/SBF < 0.1 && INFO/NM >= 2.0' --output-type z" - ext.prefix = enableOutput("filter") ? final_prefix : {"${meta.id}.filtered"} - publishDir = [ - overwrite: true, - enabled: enableOutput("filter"), - mode: params.publish_dir_mode, - path: final_output, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } - ] // SAVE - } - } + withName: "^.*BAM_CALL_VARDICTJAVA:TABIX_VCFANNO\$" { + ext.prefix = {"${meta.id}.vcfanno"} + } - withName: "^.*CRAM_CALL_VARDICTJAVA:TABIX_TABIX\$" { - publishDir = [ - overwrite: true, - enabled: enableOutput("filter") || enableOutput("original"), - mode: params.publish_dir_mode, - path: final_output, - saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } - ] // SAVE + /* + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + FILTER + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + */ + + withName: "^.*VCF_FILTER_BCFTOOLS:FILTER_1\$" { + ext.prefix = { "${meta.id}.filtered1" } + ext.args = { + meta.caller == "vardict" ? + "-i 'QUAL >= 0${params.only_pass ? " && FILTER=\"PASS\"" : ""}' --output-type z": + meta.caller == "haplotypecaller" ? + "--output-type z --soft-filter 'GATKCutoffSNP' -e 'TYPE=\"snp\" && (MQRankSum < -12.5 || ReadPosRankSum < -8.0 || QD < 2.0 || FS > 60.0 || MQ < 30.0)' -m '+'": + meta.caller == "elprep" ? + "--output-type z --soft-filter 'GATKCutoffSNP' -e 'TYPE=\"snp\" && (MQRankSum < -12.5 || ReadPosRankSum < -8.0 || QD < 2.0 || FS > 60.0 || MQ < 30.0)' -m '+'": + "" } + } + withName: "^.*VCF_FILTER_BCFTOOLS:FILTER_2\$" { + ext.args = { + meta.caller == "vardict" ? + "--soft-filter 'LowFreqBias' --mode '+' -e 'FORMAT/AF[0:*] < 0.02 && FORMAT/VD[0] < 30 && INFO/SBF < 0.1 && INFO/NM >= 2.0' --output-type z" : + meta.caller == "haplotypecaller" ? + '--output-type z --soft-filter \'GATKCutoffIndel\' -e \'TYPE="indel" && (ReadPosRankSum < -20.0 || QD < 2.0 || FS > 200.0 || SOR > 10.0 )\' -m \'+\'' : + meta.caller == "elprep" ? + '--output-type z --soft-filter \'GATKCutoffIndel\' -e \'TYPE="indel" && (ReadPosRankSum < -20.0 || QD < 2.0 || FS > 200.0 || SOR > 10.0 )\' -m \'+\'' : + "" + } + ext.prefix = {"${meta.id}.filtered"} } /* @@ -364,19 +240,9 @@ process { ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ - if(params.normalize) { - withName: "^.*BCFTOOLS_NORM\$" { - ext.prefix = enableOutput("normalize") ? final_prefix : {"${meta.id}.normalized"} - ext.args = "-m-" - publishDir = [ - overwrite: true, - enabled: enableOutput("normalize"), - mode: params.publish_dir_mode, - path: final_output, - saveAs: { filename -> filename.endsWith('.vcf.gz') ? filename : null } - ] // SAVE - } - + withName: "^.*GERMLINE:BCFTOOLS_NORM\$" { + ext.prefix = {"${meta.id}.normalized"} + ext.args = "-m-" } /* @@ -385,207 +251,120 @@ process { ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ - if(!params.only_call && !params.only_merge) { - - withName: "^.*VCF_EXTRACT_RELATE_SOMALIER:SOMALIER_RELATE\$" { - ext.args = { ped ? "" : "--infer"} - ext.prefix = final_prefix - publishDir = [ - overwrite: true, - enabled: true, - path: final_output, - mode: params.publish_dir_mode, - saveAs: { filename -> - filename ==~ /^.*\.html$/ ? "reports/" + filename.replace(".html", ".somalier.html") : - filename ==~ /^.*\.ped$/ ? filename.replace("_somalier.ped", ".ped") : - null - } - ] // SAVE - } + withName: "^.*VCF_EXTRACT_RELATE_SOMALIER:SOMALIER_RELATE\$" { + ext.args = { ped ? "" : "--infer"} + ext.prefix = { "${meta.id}.${meta.caller}" } + } - if(params.add_ped) { - withName: "^.*VCF_PED_RTGTOOLS:RTGTOOLS_PEDFILTER\$" { - ext.prefix = {"${meta.id}_ped"} - ext.args = "--vcf" - } - - withName: "^.*VCF_PED_RTGTOOLS:BCFTOOLS_ANNOTATE\$" { - ext.prefix = enableOutput("add_ped") ? final_prefix : { "${meta.id}.${meta.caller}_ped_annotated" } - ext.args = "--output-type z" - publishDir = [ - enabled: enableOutput("add_ped"), - overwrite: true, - path: final_output, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } - ] // SAVE - } - } + withName: "^.*VCF_PED_RTGTOOLS:RTGTOOLS_PEDFILTER\$" { + ext.prefix = {"${meta.id}.ped"} + ext.args = "--vcf" + } - /* - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - ANNOTATION - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - */ - - if(params.annotate){ - withName: "^.*VCF_ANNOTATION:VCF_ANNOTATE_ENSEMBLVEP:ENSEMBLVEP_VEP\$" { - memory = { 12.GB * task.attempt } - ext.args = {[ - // Specify the input format - "--format vcf", - // don't contact external db - '--offline', - // increase buffer_size to speed up analysis - "--buffer_size ${params.vep_chunk_size}", - // output format options - '--vcf --compress_output bgzip --force_overwrite', - // annotation options - '--variant_class --sift b --polyphen b --humdiv --allele_number --numbers --total_length --gene_phenotype --ccds --regulatory', - // identifiers - '--hgvs --hgvsg --shift_hgvs 1 --protein --symbol --ccds --uniprot --tsl --appris --canonical --mane --biotype --domains', - // co-located variant info - '--check_existing --clin_sig_allele 1 --af --max_af --af_1kg --af_gnomad --pubmed --var_synonyms', - // plugins - (params.vep_dbnsfp) ? "--plugin dbNSFP,${params.dbnsfp.split('/')[-1]},Ensembl_geneid,Ensembl_transcriptid,LRT_score,LRT_pred,MutationTaster_score,MutationTaster_pred,MutationAssessor_score,MutationAssessor_pred,PROVEAN_score,PROVEAN_pred,MetaSVM_score,MetaSVM_pred,MetaLR_score,MetaLR_pred,MetaRNN_score,MetaRNN_pred,M-CAP_score,M-CAP_pred,REVEL_score,BayesDel_addAF_score,BayesDel_addAF_pred,BayesDel_noAF_score,BayesDel_noAF_pred,CADD_phred,DANN_score,fathmm-MKL_coding_score,fathmm-MKL_coding_pred,GenoCanyon_score,gnomAD_exomes_AC,gnomAD_exomes_AN,gnomAD_exomes_AF,gnomAD_exomes_nhomalt,gnomAD_exomes_POPMAX_AF,gnomAD_genomes_AC,gnomAD_genomes_AN,gnomAD_genomes_AF,gnomAD_genomes_nhomalt,gnomAD_genomes_POPMAX_AF,Interpro_domain" : '', - (params.vep_spliceai) ? "--plugin SpliceAI,snv=${params.spliceai_snv.split('/')[-1]},indel=${params.spliceai_indel.split('/')[-1]}" : '', - (params.vep_spliceregion) ? '--plugin SpliceRegion' : '', - (params.vep_mastermind) ? "--plugin Mastermind,${params.mastermind.split('/')[-1]}" : '', - (params.vep_maxentscan) ? "--plugin MaxEntScan" : '', - (params.vep_alphamissense) ? "--plugin AlphaMissense,file=${params.alphamissense.split('/')[-1]}" : '', - (params.vep_eog) ? "--custom ${params.eog.split('/')[-1]},EOG,vcf,overlap,0,AF" : '', - (params.vep_merged) ? '--merged' : '', - ].join(' ').trim()} - } - - withName: "^.*VCF_ANNOTATION:VCF_ANNOTATE_ENSEMBLVEP:BCFTOOLS_CONCAT\$" { - ext.prefix = {"${meta.id}_concat"} - ext.args = "--allow-overlaps --output-type z" - } - - withName: "^.*VCF_ANNOTATION:VCF_ANNOTATE_ENSEMBLVEP:BCFTOOLS_SORT\$" { - publishDir = [ - enabled: enableOutput("annotate"), - overwrite: true, - path: final_output, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } - ] // SAVE - ext.prefix = enableOutput("annotate") ? final_prefix : {"${meta.id}.sorted"} - } - - if (params.vcfanno){ - withName: "^.*VCF_ANNOTATION:BGZIP_ANNOTATED_VCFS\$" { - publishDir = [ - enabled: enableOutput("vcfanno"), - overwrite: true, - path: final_output, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } - ] // SAVE - ext.prefix = enableOutput("vcfanno") ? 
final_prefix : {"${meta.id}.vcfanno"} - } - } + withName: "^.*VCF_PED_RTGTOOLS:BCFTOOLS_ANNOTATE\$" { + ext.prefix = { "${meta.id}.${meta.caller}.ped.annotated" } + ext.args = "--output-type z" + } - } + /* + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + ANNOTATION + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + */ + withName: "^.*VCF_ANNOTATION:VCF_ANNOTATE_ENSEMBLVEP:ENSEMBLVEP_VEP\$" { + ext.args = {[ + // Specify the input format + "--format vcf", + // don't contact external db + '--offline', + // increase buffer_size to speed up analysis + "--buffer_size ${params.vep_chunk_size}", + // output format options + '--vcf --compress_output bgzip --force_overwrite', + // annotation options + '--variant_class --sift b --polyphen b --humdiv --allele_number --numbers --total_length --gene_phenotype --ccds --regulatory', + // identifiers + '--hgvs --hgvsg --shift_hgvs 1 --protein --symbol --ccds --uniprot --tsl --appris --canonical --mane --biotype --domains', + // co-located variant info + '--check_existing --clin_sig_allele 1 --af --max_af --af_1kg --af_gnomad --pubmed --var_synonyms', + // plugins + (params.vep_dbnsfp) ? "--plugin dbNSFP,${params.dbnsfp.split('/')[-1]},Ensembl_geneid,Ensembl_transcriptid,LRT_score,LRT_pred,MutationTaster_score,MutationTaster_pred,MutationAssessor_score,MutationAssessor_pred,PROVEAN_score,PROVEAN_pred,MetaSVM_score,MetaSVM_pred,MetaLR_score,MetaLR_pred,MetaRNN_score,MetaRNN_pred,M-CAP_score,M-CAP_pred,REVEL_score,BayesDel_addAF_score,BayesDel_addAF_pred,BayesDel_noAF_score,BayesDel_noAF_pred,CADD_phred,DANN_score,fathmm-MKL_coding_score,fathmm-MKL_coding_pred,GenoCanyon_score,gnomAD_exomes_AC,gnomAD_exomes_AN,gnomAD_exomes_AF,gnomAD_exomes_nhomalt,gnomAD_exomes_POPMAX_AF,gnomAD_genomes_AC,gnomAD_genomes_AN,gnomAD_genomes_AF,gnomAD_genomes_nhomalt,gnomAD_genomes_POPMAX_AF,Interpro_domain" : '', + (params.vep_spliceai) ? "--plugin SpliceAI,snv=${params.spliceai_snv.split('/')[-1]},indel=${params.spliceai_indel.split('/')[-1]}" : '', + (params.vep_spliceregion) ? '--plugin SpliceRegion' : '', + (params.vep_mastermind) ? "--plugin Mastermind,${params.mastermind.split('/')[-1]}" : '', + (params.vep_maxentscan) ? "--plugin MaxEntScan" : '', + (params.vep_alphamissense) ? "--plugin AlphaMissense,file=${params.alphamissense.split('/')[-1]}" : '', + (params.vep_eog) ? "--custom ${params.eog.split('/')[-1]},EOG,vcf,overlap,0,AF" : '', + (params.vep_merged) ? '--merged' : '', + ].join(' ').trim()} + } - /* - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - VALIDATION - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - */ - - if (params.validate){ - - withName: "^.*VCF_VALIDATE_SMALL_VARIANTS:RTGTOOLS_VCFEVAL\$" { - publishDir = [ - enabled: true, - overwrite: true, - path: individual_validation, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } - ] // SAVE - ext.args = {"--sample ${meta.sample} --decompose"} - ext.prefix = {"${meta.sample}"} - } - - withName: "^.*VCF_VALIDATE_SMALL_VARIANTS:RTGTOOLS_ROCPLOT\$" { - publishDir = [ - enabled: true, - overwrite: true, - path: individual_validation, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } - ] // SAVE - ext.args = '' - ext.prefix = {"${meta.sample}.${meta.roc_type}"} - } + withName: "^.*VCF_ANNOTATION:VCF_ANNOTATE_ENSEMBLVEP:BCFTOOLS_CONCAT\$" { + ext.prefix = {"${meta.id}_concat"} + ext.args = "--allow-overlaps --output-type z" + } - } + withName: "^.*VCF_ANNOTATION:VCF_ANNOTATE_ENSEMBLVEP:BCFTOOLS_SORT\$" { + ext.prefix = {"${meta.id}.sorted"} + } - /* - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - FINAL PROCESSES - ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ - */ - - if(params.gemini){ - withName: "^.*VCF2DB\$" { - ext.args = "--a-ok gnomAD_AC --a-ok gnomAD_Hom" - ext.prefix = final_prefix - publishDir = [ - overwrite: true, - path: final_output, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } - ] // SAVE - } - } + withName: "^.*VCF_ANNOTATION:BGZIP_ANNOTATED_VCFS\$" { + ext.prefix = {"${meta.id}.vcfanno"} + } - if(params.updio) { - withName: "^.*VCF_UPD_UPDIO:UPDIO\$" { - ext.prefix = {"updio_${meta.caller}"} - ext.args = {[ - "--childID ${meta.child}", - "--momID ${meta.mother}", - "--dadID ${meta.father}", - "--include_MI" - ].join(" ")} - publishDir = [ - overwrite: true, - path: final_output, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : "${filename}/${meta.child}" } - ] // SAVE - } - } - if(params.automap) { - withName: "^.*VCF_ROH_AUTOMAP:AUTOMAP_AUTOMAP\$" { - ext.prefix = {"automap_${meta.caller}"} - ext.args = {[ - meta.family_samples.tokenize(",").size() > 1 ? "--multivcf" : "--id ${meta.family_samples}", - params.automap_panel_name ? "--panelname ${params.automap_panel_name}" : "" - ].findAll { it != "" }.join(" ")} - publishDir = [ - overwrite: true, - path: final_output, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? null : filename } - ] // SAVE - } - } + /* + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + VALIDATION + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + */ - withName: "^.*TABIX_FINAL\$" { - publishDir = [ - enabled: true, - overwrite: true, - path: final_output, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } - ] // SAVE - } + withName: "^.*VCF_VALIDATE_SMALL_VARIANTS:RTGTOOLS_VCFEVAL\$" { + ext.args = {"--sample ${meta.sample} --decompose --squash-ploidy"} + ext.prefix = {"${meta.sample}"} + } + + withName: "^.*VCF_VALIDATE_SMALL_VARIANTS:RTGTOOLS_ROCPLOT\$" { + ext.prefix = {"${meta.sample}.${meta.roc_type}"} + } + + /* + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + FINAL PROCESSES + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + */ + + withName: "^.*:VCF2DB\$" { + ext.prefix = { "${meta.id}.${meta.caller}" } + ext.args = "--a-ok gnomAD_AC --a-ok gnomAD_Hom" + } + + withName: "^.*VCF_UPD_UPDIO:BCFTOOLS_FILTER\$" { + ext.prefix = { "${meta.id}.filtered" } + ext.args = [ + "--output-type z", + "--write-index=tbi", + "-r chr1,chr2,chr3,chr4,chr5,chr6,chr7,chr8,chr9,chr10,chr11,chr12,chr13,chr14,chr15,chr16,chr17,chr18,chr19,chr20,chr21,chr22,chrX,chrY" + ].join(" ") + } + + withName: "^.*VCF_UPD_UPDIO:UPDIO\$" { + ext.prefix = {"updio_${meta.caller}"} + ext.args = {[ + "--childID ${meta.child}", + "--momID ${meta.mother}", + "--dadID ${meta.father}", + "--include_MI" + ].join(" ")} + } + + withName: "^.*VCF_ROH_AUTOMAP:AUTOMAP_AUTOMAP\$" { + ext.prefix = {"automap_${meta.caller}"} + ext.args = {[ + meta.family_samples.tokenize(",").size() > 1 ? "--multivcf" : "--id ${meta.family_samples}", + params.automap_panel_name ? "--panelname ${params.automap_panel_name}" : "" + ].findAll { it != "" }.join(" ")} } /* @@ -596,13 +375,7 @@ process { withName: 'MULTIQC' { ext.args = { params.multiqc_title ? "--title \"$params.multiqc_title\"" : '' } - publishDir = [ - path: { "${params.outdir}/multiqc" }, - mode: params.publish_dir_mode, - saveAs: { filename -> filename.equals('versions.yml') ? 
null : filename } - ] } - } env { diff --git a/conf/seqplorer.config b/conf/seqplorer.config index 2a5c9ed8..ccac085a 100644 --- a/conf/seqplorer.config +++ b/conf/seqplorer.config @@ -15,8 +15,8 @@ params { vep_spliceai = true vep_mastermind = true vep_eog = true - vep_maxentscan = true vep_spliceregion = true - output_suffix = "-gatk4-haplotype-joint-decomposed-annotated" + vcfanno_config = "https://raw.githubusercontent.com/CenterForMedicalGeneticsGhent/nf-cmgg-configs/main/conf/Hsapiens/vcfanno/cmgg_vcfanno.toml" + vcfanno_resources = "${params.genomes_base}/Hsapiens/GRCh38/variation/dbscSNV-1.1/dbscSNV.txt.gz;${params.genomes_base}/Hsapiens/GRCh38/variation/dbscSNV-1.1/dbscSNV.txt.gz.tbi;${params.genomes_base}/Hsapiens/GRCh38/variation/gnomAD/exomes/r2.2.1/gnomad_exomes.vcf.gz;${params.genomes_base}/Hsapiens/GRCh38/variation/gnomAD/exomes/r2.2.1/gnomad_exomes.vcf.gz.csi;${params.genomes_base}/Hsapiens/GRCh38/variation/clinvar/clinvar_20221119.vcf.gz;${params.genomes_base}/Hsapiens/GRCh38/variation/clinvar/clinvar_20221119.vcf.gz.tbi;${params.genomes_base}/Hsapiens/GRCh38/variation/gnomAD/genomes/r3.1.2/gnomad_genomes.vcf.gz;${params.genomes_base}/Hsapiens/GRCh38/variation/gnomAD/genomes/r3.1.2/gnomad_genomes.vcf.gz.csi;${params.genomes_base}/Hsapiens/GRCh38/variation/dbNSFP-4.3/dbNSFP4.3a_grch38.gz;${params.genomes_base}/Hsapiens/GRCh38/variation/dbNSFP-4.3/dbNSFP4.3a_grch38.gz.tbi" } diff --git a/conf/test.config b/conf/test.config index 01e51012..6229b658 100644 --- a/conf/test.config +++ b/conf/test.config @@ -10,6 +10,14 @@ ---------------------------------------------------------------------------------------- */ +process { + resourceLimits = [ + cpus: 4, + memory: '15.GB', + time: '1.h' + ] +} + params { config_profile_name = 'Test profile' config_profile_description = 'Minimal test dataset to check pipeline function' @@ -26,8 +34,10 @@ params { fasta = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/genome/hg38_chr21_22000000_23000000.fasta" fai = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/genome/hg38_chr21_22000000_23000000.fasta.fai" dict = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/genome/hg38_chr21_22000000_23000000.dict" + elfasta = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/genome/hg38_chr21_22000000_23000000.elfasta" sdf = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/genome/hg38_chr21_22000000_23000000_sdf.tar.gz" strtablefile = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/genome/hg38_chr21_22000000_23000000.strtable.zip" + ped = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/genome/test_dots.ped" // Pipeline specific parameters filter = true @@ -37,7 +47,6 @@ params { validate = true add_ped = true vep_chunk_size = 10000 - project = "test" normalize = true updio = true automap = true @@ -48,7 +57,6 @@ params { vcfanno = true vcfanno_config = "${projectDir}/assets/vcfanno.toml" vcfanno_resources = "https://github.com/brentp/vcfanno/raw/master/example/exac.vcf.gz;https://github.com/brentp/vcfanno/raw/master/example/exac.vcf.gz.tbi" - } process { diff --git a/conf/wes.config b/conf/wes.config new file mode 100644 index 00000000..41e693c1 --- /dev/null +++ b/conf/wes.config @@ -0,0 +1,13 @@ +/* +======================================================================================== + Nextflow config file for WES runs 
+========================================================================================
+*/
+
+params {
+    callers = "haplotypecaller"
+    scatter_count = 8
+    roi = "${params.genomes_base}/Hsapiens/GRCh38/regions/CMGG_WES_analysis_ROI_v5.bed"
+    updio = true
+    automap = true
+}
diff --git a/docs/CITATIONS.md b/docs/CITATIONS.md
index 8fa8736d..8dc76574 100644
--- a/docs/CITATIONS.md
+++ b/docs/CITATIONS.md
@@ -40,6 +40,10 @@

 > Quinlan AR and Hall IM, 2010. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 26, 6, pp. 841–842.

+- [elPrep](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0244471)
+
+  > Herzeel C, Costanza P, Decap D, Fostier J, Wuyts R, Verachtert W (2021) Multithreaded variant calling in elPrep 5. PLoS ONE 16(2): e0244471
+
 - [EnsemblVEP](https://pubmed.ncbi.nlm.nih.gov/27268795/)

 > McLaren W, Gil L, Hunt SE, et al.: The Ensembl Variant Effect Predictor. Genome Biol. 2016 Jun 6;17(1):122. doi: 10.1186/s13059-016-0974-4. PubMed PMID: 27268795; PubMed Central PMCID: PMC4893825.
diff --git a/docs/images/germline_metro.png b/docs/images/germline_metro.png
index aca56685..794442bb 100644
Binary files a/docs/images/germline_metro.png and b/docs/images/germline_metro.png differ
diff --git a/docs/images/germline_metro.svg b/docs/images/germline_metro.svg
index faf927a5..34f099cf 100644
--- a/docs/images/germline_metro.svg
+++ b/docs/images/germline_metro.svg
@@ -2,9 +2,9 @@
 [flattened SVG markup omitted — this hunk updates the metro-map figure: node labels (elPrep, bcftools, mosdepth, HaplotypeCaller, GenomicsDBImport, VEP, vcfanno, somalier, updio, automap, ...), renames the "reports" node to "MultiQC Reports", and adds a circle element (id="path3447-6-6-4-7-5-2-7-9-4-7-3-0-2", cx="366.07504", cy="348.40564", r="2.6458333")]
diff --git a/docs/index.md b/docs/index.md
index f6a20bf9..9b635ef5 100644
--- a/docs/index.md
+++ b/docs/index.md
@@ -18,7 +18,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool

 ## Quick Start

-1. Install [`Nextflow`](https://www.nextflow.io/docs/latest/getstarted.html#installation) (`>=23.10.0`)
+1. Install [`Nextflow`](https://www.nextflow.io/docs/latest/getstarted.html#installation) (`>=24.10.0`)
2. Install any of [`Docker`](https://docs.docker.com/engine/installation/), [`Singularity`](https://www.sylabs.io/guides/3.0/user-guide/) (you can follow [this tutorial](https://singularity-tutorial.github.io/01-installation/)), [`Podman`](https://podman.io/), [`Shifter`](https://nersc.gitlab.io/development/shifter/how-to-use/) or [`Charliecloud`](https://hpc.github.io/charliecloud/) for full pipeline reproducibility _(you can use [`Conda`](https://conda.io/miniconda.html) both to install Nextflow itself and also to manage software within pipelines.
Please only use it within pipelines as a last resort; see [docs](https://nf-co.re/usage/configuration#basic-configuration-profiles))_.

```csv title="samplesheet.csv"
@@ -34,6 +34,8 @@ Now, you can run the pipeline using:

nextflow run nf-cmgg/germline --input samplesheet.csv --outdir  --genome GRCh38 -profile 
```

+This pipeline provides many parameters to customize your pipeline run. Please take a look at the [parameters](parameters.md) documentation for an overview.
+
!!!warning

    Please provide pipeline parameters via the CLI or Nextflow `-params-file` option. Custom config files including those provided by the `-c` Nextflow option can be used to provide any configuration _**except for parameters**_;

@@ -43,7 +45,7 @@ nextflow run nf-cmgg/germline --input samplesheet.csv --outdir  --genome 

nf-cmgg/germline was originally written and is maintained by [@nvnieuwk](https://github.com/nvnieuwk).

-Special thanks to [@matthdsm](https://github.com/matthdsm) for the many tips and feedback and to [@mvheetve](https://github.com/mvheetve) for testing the pipeline.
+Special thanks to [@matthdsm](https://github.com/matthdsm) for the many tips and feedback and to [@mvheetve](https://github.com/mvheetve) and [@ToonRosseel](https://github.com/ToonRosseel) for testing the pipeline.

## Contributions and Support

diff --git a/docs/output.md b/docs/output.md
index 34c10e15..facf11d9 100644
--- a/docs/output.md
+++ b/docs/output.md
@@ -1,79 +1,120 @@
 # nf-cmgg/germline: Output

-# nf-cmgg/germline: Output
-
 ## Introduction

 This page describes the output produced by the pipeline.

-The directories listed below will be created in the results directory after the pipeline has finished. All paths are relative to the top-level output directory (specified by `--outdir`). This is an example output when the pipeline has been run for a WGS sample called `SAMPLE_1` and a WES sample called `SAMPLE_2` which form a family called `FAMILY_1`. The output consists of 4 directories: `yyyy-MM-dd_project_name`, `individuals`, `multiqc_reports` and `pipeline_info`. This run has only been run with `haplotypecaller` (`--callers haplotypecaller`)
+The output directory is structured so that the same directory can be reused for every pipeline run. The pipeline adds files to it in a traceable way without overwriting existing files, which makes it easy to store data from multiple sequencing runs in the same root directory.
+
+To explain the structure of the output directory, a simple example run with two families is used. The first family (`family1`) is a trio (son, father and mother); the second family (`family2`) consists of a single sample.

```bash
-results/
-├── YYYY_MM_DD_project_name #(1)!
-│   └── FAMILY_1 #(2)!
-│       ├── FAMILY_1.bed #(3)!
-│       ├── FAMILY_1.haplotypecaller.ped #(4)!
-│       ├── FAMILY_1.haplotypecaller.vcf.gz #(5)!
-│       ├── FAMILY_1.haplotypecaller.vcf.gz.tbi #(6)!
-│       └── reports
-│           ├── FAMILY_1.haplotypecaller.bcftools_stats.txt #(7)!
-│           └── FAMILY_1.haplotypecaller.somalier.html #(8)!
-├── multiqc
-│   ├── multiqc_data/
-│   └── multiqc_report.html #(9)!
-├── SAMPLE_1 #(10)!
-│   ├── SAMPLE_1.bed #(11)!
-│   ├── SAMPLE_1.haplotypecaller.g.vcf.gz #(12)!
-│   ├── SAMPLE_1.haplotypecaller.g.vcf.gz.tbi
-│   └── reports
-│       ├── SAMPLE_1.haplotypecaller.bcftools_stats.txt
-│       ├── SAMPLE_1.mosdepth.global.dist.txt #(13)!
-│       └── SAMPLE_1.mosdepth.summary.txt #(14)!
-├── SAMPLE_2
-│   ├── SAMPLE_2.bed
-│   ├── SAMPLE_2.haplotypecaller.g.vcf.gz
-│   ├── SAMPLE_2.haplotypecaller.g.vcf.gz.tbi
-│   └── reports
-│       ├── SAMPLE_2.haplotypecaller.bcftools_stats.txt
-│       ├── SAMPLE_2.mosdepth.global.dist.txt
-│       └── SAMPLE_2.mosdepth.summary.txt
-├── pipeline_info/ #(15)!
-└── samplesheet.csv #(16)!
+ #(1)!
+├── family1 #(2)!
+│   ├── output__ #(3)!
+│   │   ├── automap #(4)!
+│   │   │   └──  #(5)!
+│   │   │       ├── sample1 #(6)!
+│   │   │       │   ├── sample1.HomRegions..tsv
+│   │   │       │   ├── sample1.HomRegions.pdf
+│   │   │       │   ├── sample1.HomRegions.strict..tsv
+│   │   │       │   └── sample1.HomRegions.tsv
+│   │   │       ├── sample2
+│   │   │       └── sample3
+│   │   ├── family1..bed #(7)!
+│   │   ├── family1..db #(8)!
+│   │   ├── family1..ped #(9)!
+│   │   ├── family1..vcf.gz #(10)!
+│   │   └── family1..vcf.gz.tbi #(11)!
+│   ├── qc__ #(12)!
+│   │   ├── family1..bcftools_stats.txt #(13)!
+│   │   └── family1..html #(14)!
+│   ├── sample1__ #(15)!
+│   │   ├── sample1.bed #(16)!
+│   │   ├── sample1..bcftools_stats.txt #(17)!
+│   │   ├── sample1..g.vcf.gz #(18)!
+│   │   ├── sample1..g.vcf.gz.tbi #(19)!
+│   │   └── validation #(20)!
+│   │       └──  #(21)!
+│   │           ├── ... #(22)!
+│   │           └── sample1.summary.txt #(23)!
+│   ├── sample2__
+│   └── sample3__
+├── family2
+│   ├── output__
+│   ├── qc__
+│   └── sample4__
+└── _ #(24)!
+    ├── execution_report__--.html #(25)!
+    ├── execution_timeline__--.html #(26)!
+    ├── execution_trace__--.html #(27)!
+    ├── multiqc_report.html #(28)!
+    ├── params_2024-11-18_15-41-14.json #(29)!
+    ├── pipeline_dag__--.html #(30)!
+    ├── pipeline_software_mqc_versions.yml #(31)!
+    └── samplesheet. #(32)!
```

-1. This is the name of the main pipeline output. It contains the current date and the mnemonic name of the pipeline run by default. The date can be excluded with the `--skip_date_project` parameter and the name can be customized with the `--project` parameter.
+1. The output directory specified with `--outdir`
+
+2. The first family name specified in the samplesheet in the `family` field
+
+3. This folder contains all major outputs of the current family
+
+4. This folder is only created when the `--automap` parameter has been used. It contains all output files from the AutoMap process
+
+5. A caller-specific folder containing the postprocessing output generated for that caller. This folder will be created for each caller provided to the `--callers` parameter
+
+6. This folder contains the files for the specified sample
+
+7. The BED file used to create the VCF file in this folder using the caller specified in the filename
+
+8. The Gemini DB file generated from the output VCF and the PED file. This file will only be created when `--gemini` has been used
+
+9. The PED file for the current family. When an input PED file is given, this file will contain the correct samples from that file. When none is given, the pipeline will try to infer a PED file automatically. Note that this inference isn't perfect, so supplying a PED file is the recommended way of providing relational data to the pipeline
+
+10. The final VCF file created using the caller specified in the filename. All required postprocessing methods have been applied to this file
+
+11. The index of the final VCF file
+
+12. This folder contains all quality metrics for the family
+
+13. The statistics calculated by `bcftools stats`
+
+14. The relational report created by `somalier relate`
+
+15. The folder containing sample-specific files

-2. This directory contains all files for family `FAMILY_1`.
+16. The BED file used to create the GVCF files for the sample

-3. This is the BED file used to parallelize the joint-genotyping. It contains all regions that have reads mapped to them for WGS and all regions in the regions of interest that have reads mapped to them for WES.
+17. The statistics of the GVCF file, calculated by `bcftools stats`

-4. The PED file detailing the relation between the different members of the family. This file will be inferred when no PED file has been given to this family.
+18. The GVCF file generated by the specified caller

-5. The resulting VCF for this family. All desired post-processing has been applied on this file.
+19. The index of the GVCF file

-6. The index of the resulting VCF.
+20. This folder contains the validation metrics of this specific sample in the final VCF

-7. The statistics created with `bcftools stats` for the resulting VCF.
+21. This folder contains the validation metrics for the final VCF generated using the specified caller

-8. The results of `somalier relate`.
+22. Additional files were removed from this example; they include several VCF files and images for deeper analysis of the validation

-9. The report created with MultiQC. This contains all statistics generated with `bcftools stats`, Ensembl VEP and other tools.
+23. This file contains a summary of the validation metrics

-10. The folder for `SAMPLE_1` containing temporary files that could be useful for re-analysing later.
+24. This folder contains pipeline metrics and other files specific to the pipeline run

-11. This is the BED file used to parallelize the variant calling. It contains all regions that have reads mapped to them for WGS and all regions in the regions of interest that have reads mapped to them for WES.
+25. An HTML file summarizing many metrics of the pipeline run (CPU usage, memory usage, walltime...)

-12. The GVCF file created with `haplotypecaller`. This can used in later runs of the pipeline to skip variant calling for this sample. A major use case for this is to add a new member to a family without having to call all variants of already called members.
+26. An HTML file visualizing the timeline of the pipeline run

-13. The global distribution of the coverage calculated by `mosdepth`.
+27. An HTML file visualizing the trace of the pipeline run

-14. The summary created by `mosdepth`.
+28. The MultiQC report containing all main statistics of the output data and tool versions

-15. The directory containing information of the pipeline run.
+29. A JSON file containing the parameters used for this pipeline run

-16. The samplesheet used for the pipeline run.
+30. An HTML file visualizing the DAG of the pipeline run

-## Pipeline overview
+31. This file contains a list of all tools used in the pipeline and their versions

-[Nextflow](https://www.nextflow.io/docs/latest/tracing.html) provides excellent functionality for generating various reports relevant to the running and execution of the pipeline. This will allow you to troubleshoot errors with the running of the pipeline, and also provide you with other information such as launch commands, run times and resource usage.
+32. The samplesheet used for this pipeline run
diff --git a/docs/parameters.md b/docs/parameters.md
index e2bd97e4..5f80e486 100644
--- a/docs/parameters.md
+++ b/docs/parameters.md
@@ -10,9 +10,9 @@ Define where the pipeline should find input data and save output data.
| ---------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | -------- | ------- | -------- | ------ | | `input` | Path to comma-separated file containing information about the samples in the experiment.
HelpYou will need to create a design file with information about the samples in your experiment before running the pipeline. Use this parameter to specify its location. It has to be a comma-separated file with samples, and a header row. See [usage docs](./usage.md).
| `string` | | True | | | `outdir` | The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure. | `string` | | True | | -| `watchdir` | A folder to watch for the creation of files that start with `watch:` in the samplesheet | `string` | | | | +| `watchdir` | A folder to watch for the creation of files that start with `watch:` in the samplesheet. | `string` | | | | | `email` | Email address for completion summary.
HelpSet this parameter to your e-mail address to get a summary e-mail with details of the run sent to you when the workflow exits. If set in your user config file (`~/.nextflow/config`) then you don't need to specify this on the command line for every run.
| `string` | | | | -| `ped` | Path to a pedigree file for all samples in the run | `string` | | | | +| `ped` | Path to a pedigree file for all samples in the run. All relational data will be fetched from this file. | `string` | | | | ## Reference genome options @@ -20,15 +20,18 @@ Reference genome related files and options required for the workflow. | Parameter | Description | Type | Default | Required | Hidden | | ------------------ | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------- | ------------ | -------- | ------ | -| `genome` | Reference genome build
HelpRequires a Genome Reference Consortium reference ID (e.g. GRCh38)
| `string` | GRCh38 | | | +| `genome` | Reference genome build. Used to fetch the right reference files.
HelpRequires a Genome Reference Consortium reference ID (e.g. GRCh38)
| `string` | GRCh38 | | | | `fasta` | Path to FASTA genome file.
HelpThis parameter is _mandatory_ if `--genome` is not specified. The path to the reference genome fasta.
| `string` | | True | | | `fai` | Path to FASTA genome index file. | `string` | | | | -| `dict` | Path to the sequence dictionary generated from the FASTA reference | `string` | | | | -| `strtablefile` | Path to the STR table file generated from the FASTA reference | `string` | | | | -| `sdf` | Path to the SDF folder generated from the reference FASTA file | `string` | | | | -| `genomes_base` | Directory base for CMGG reference store (used when --genomes_ignore false is specified) | `string` | /references/ | | | +| `dict` | Path to the sequence dictionary generated from the FASTA reference. This is only used when `haplotypecaller` is one of the specified callers. | `string` | | | | +| `strtablefile` | Path to the STR table file generated from the FASTA reference. This is only used when `--dragstr` has been given. | `string` | | | | +| `sdf` | Path to the SDF folder generated from the reference FASTA file. This is only required when using `--validate`. | `string` | | | | +| `elfasta` | Path to the ELFASTA genome file. This is used when `elprep` is part of the callers and will be automatically generated when missing. | `string` | | | | +| `elsites` | Path to the elsites file. This is used when `elprep` is part of the callers. | `string` | | | | +| `genomes` | Object for genomes | `object` | | | True | +| `genomes_base` | Directory base for CMGG reference store (used when `--genomes_ignore false` is specified) | `string` | /references/ | | | | `cmgg_config_base` | The base directory for the local config files | `string` | /conf/ | | True | -| `genomes_ignore` | Do not load the local references from the path specified with --genomes_base | `boolean` | | | True | +| `genomes_ignore` | Do not load the local references from the path specified with `--genomes_base` | `boolean` | | | True | | `igenomes_base` | Directory / URL base for iGenomes references. | `string` | | | True | | `igenomes_ignore` | Do not load the iGenomes reference config.
HelpDo not load `igenomes.config` when running the pipeline. You may choose this option if you observe clashes between custom parameters and those supplied in `igenomes.config`.
| `boolean` | | | True | @@ -36,39 +39,38 @@ Reference genome related files and options required for the workflow. Parameters that define how the pipeline works -| Parameter | Description | Type | Default | Required | Hidden | -| -------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------- | ------------------------------------------------------------------ | -------- | ------ | -| `scatter_count` | The amount of scattering that should happen per sample.
HelpIncrease this number to increase the pipeline run speed, but at the tradeoff of using more IO and disk space. This can differ from the actual scatter count in some cases (especially with smaller files).
This has an effect on HaplotypeCaller, GenomicsDBImport and GenotypeGVCFs.
| `integer` | 40 | | | -| `merge_distance` | The merge distance for genotype BED files
HelpIncrease this parameter if GenomicsDBImport is running slow. This defines the maximum distance between intervals that should be merged. The less intervals GenomicsDBImport actually gets, the faster it will run.
| `integer` | 100000 | | | -| `dragstr` | Create DragSTR models to be used with HaplotypeCaller
HelpThis currently is only able to run single-core per sample. Due to this, the process is very slow with only very small improvements to the analysis.
| `boolean` | | | | -| `validate` | Validate the found variants
HelpThis only validates individual sample GVCFs that have truth VCF supplied to them via the samplesheet (in row `truth_vcf`, with an optional index in the `truth_tbi` row)
| `boolean` | | | | -| `filter` | Filter the found variants | `boolean` | | | | -| `annotate` | Annotate the found variants | `boolean` | | | | -| `add_ped` | Add PED INFO header lines to the final VCFs | `boolean` | | | | -| `gemini` | Create a Gemini databases from the final VCFs | `boolean` | | | | -| `mosdepth_slow` | Don't run mosdepth in fast-mode
HelpThis is advised if you need exact coverage BED files as output
| `boolean` | | | | -| `project` | The name of the project.
HelpThis will be used to specify the final output files folder in the output directory.
| `string` | | | | -| `skip_date_project` | Don't add the current date to the output project folder | `boolean` | | | | -| `roi` | Path to the default ROI (regions of interest) BED file to be used for WES analysis
HelpThis will be used for all samples that do not have a specific ROI file supplied to them through the samplesheet. Don't supply an ROI file to run the analysis as WGS.
| `string` | | | | -| `dbsnp` | Path to the dbSNP VCF file | `string` | | | | -| `dbsnp_tbi` | Path to the index of the dbSNP VCF file | `string` | | | | -| `somalier_sites` | Path to the VCF file with sites for Somalier to use | `string` | https://github.com/brentp/somalier/files/3412456/sites.hg38.vcf.gz | | | -| `only_call` | Only call the variants without doing any post-processing | `boolean` | | | | -| `only_merge` | Only run the pipeline until the creation of the genomicsdbs and output them | `boolean` | | | | -| `output_genomicsdb` | Output the genomicsDB together with the joint-genotyped VCF | `boolean` | | | | -| `callers` | A comma delimited string of the available callers. Current options are: 'haplotypecaller' and 'vardict' | `string` | haplotypecaller | | | -| `vardict_min_af` | The minimum allele frequency for VarDict when no `vardict_min_af` is supplied in the samplesheet | `number` | 0.1 | | | -| `normalize` | Normalize the VCFs | `boolean` | | | | -| `output_suffix` | A custom suffix to add to the basename of the output files | `string` | | | | -| `only_pass` | Filter out all variants that don't have the PASS filter for vardict. This only works when --filter is also given | `boolean` | | | | -| `keep_alt_contigs` | Keep all aditional contigs for calling instead of filtering them out before | `boolean` | | | | -| `updio` | Run UPDio analysis on the resulting VCFs | `boolean` | | | | -| `updio_common_cnvs` | A TSV file containing common CNVs to be used by UPDio | `string` | | | | -| `automap` | Run AutoMap analysis on the resulting VCFs | `boolean` | | | | -| `automap_repeats` | BED file with repeat regions in the genome.
HelpThis file will be automatically generated for hg38/GRCh38 and hg19/GRCh37 when this parameter has not been given.
| `string` | | | | -| `automap_panel` | TXT file with gene panel regions to be used by AutoMap.
HelpBy default the CMGG gene panel list will be used.
| `string` | | | | -| `automap_panel_name` | The panel name of the panel given with --automap_panel. | `string` | cmgg_bio | | | -| `hc_phasing` | Perform phasing with HaplotypeCaller | `boolean` | | | | +| Parameter | Description | Type | Default | Required | Hidden | +| ----------------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------- | ------------------------------------------------------------------ | -------- | ------ | +| `scatter_count` | The amount of scattering that should happen per sample.
HelpIncrease this number to increase the pipeline run speed, but at the tradeoff of using more IO and disk space. This can differ from the actual scatter count in some cases (especially with smaller files).
This has an effect on HaplotypeCaller, GenomicsDBImport and GenotypeGVCFs.
| `integer` | 40 | | | +| `merge_distance` | The merge distance for family BED files
HelpIncrease this parameter if GenomicsDBImport is running slow. This defines the maximum distance between intervals that should be merged. The fewer intervals GenomicsDBImport actually gets, the faster it will run.
| `integer` | 100000 | | | +| `dragstr` | Create DragSTR models to be used with HaplotypeCaller
HelpThis currently can only run single-core per sample. Because of this, the process is very slow while yielding only very small improvements to the analysis.
| `boolean` | | | |
+| `validate` | Validate the found variants. | `boolean` | | | |
+| `filter` | Filter the found variants. | `boolean` | | | |
+| `annotate` | Annotate the found variants using Ensembl VEP. | `boolean` | | | |
+| `add_ped` | Add PED INFO header lines to the final VCFs. | `boolean` | | | |
+| `gemini` | Create Gemini databases from the final VCFs. | `boolean` | | | |
+| `mosdepth_slow` | Don't run mosdepth in fast-mode. 
HelpThis is advised if you need exact coverage BED files as output.
| `boolean` | | | | +| `roi` | Path to the default ROI (regions of interest) BED file to be used for WES analysis.
HelpThis will be used for all samples that do not have a specific ROI file supplied to them through the samplesheet. Don't supply an ROI file to run the analysis as WGS.
| `string` | | | |
+| `dbsnp` | Path to the dbSNP VCF file. This will be used to set the variant IDs. | `string` | | | |
+| `dbsnp_tbi` | Path to the index of the dbSNP VCF file. | `string` | | | |
+| `somalier_sites` | Path to the VCF file with sites for Somalier to use. | `string` | https://github.com/brentp/somalier/files/3412456/sites.hg38.vcf.gz | | |
+| `only_call` | Only call the variants without doing any post-processing. | `boolean` | | | |
+| `only_merge` | Only run the pipeline until the creation of the GenomicsDB databases and output them. | `boolean` | | | |
+| `output_genomicsdb` | Output the genomicsDB together with the joint-genotyped VCF. | `boolean` | | | |
+| `callers` | A comma-delimited string of the available callers. Current options are: `haplotypecaller`, `vardict` and `elprep`. | `string` | haplotypecaller | | |
+| `vardict_min_af` | The minimum allele frequency for VarDict when no `vardict_min_af` is supplied in the samplesheet. | `number` | 0.1 | | |
+| `normalize` | Normalize the variants in the final VCFs. | `boolean` | | | |
+| `only_pass` | Filter out all variants that don't have the PASS filter for vardict. This only works when `--filter` is also given. | `boolean` | | | |
+| `keep_alt_contigs` | Keep all additional contigs for calling instead of filtering them out before. | `boolean` | | | |
+| `updio` | Run UPDio analysis on the final VCFs. | `boolean` | | | |
+| `updio_common_cnvs` | A TSV file containing common CNVs to be used by UPDio. | `string` | | | |
+| `automap` | Run AutoMap analysis on the final VCFs. | `boolean` | | | |
+| `automap_repeats` | BED file with repeat regions in the genome. 
HelpThis file will be automatically generated for hg38/GRCh38 and hg19/GRCh37 when this parameter has not been given.
| `string` | | | | +| `automap_panel` | TXT file with gene panel regions to be used by AutoMap.
HelpBy default the CMGG gene panel list will be used.
| `string` | | | |
+| `automap_panel_name` | The name of the panel given with `--automap_panel`. | `string` | cmgg_bio | | |
+| `hc_phasing` | Perform phasing with HaplotypeCaller. | `boolean` | | | |
+| `min_callable_coverage` | The lowest callable coverage to determine callable regions. | `integer` | 5 | | |
+| `unique_out` | Don't change this value. | `string` | | | True |

## Institutional config options

@@ -89,7 +91,6 @@ Less common options for the pipeline, typically set in a config file.

| Parameter | Description | Type | Default | Required | Hidden |
| ------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | --------- | -------------------------------------------------------- | -------- | ------ |
-| `help` | Display help text. | `boolean` | | | |
| `version` | Display version and exit. | `boolean` | | | |
| `publish_dir_mode` | Method used to save pipeline results to output directory. 
HelpThe Nextflow `publishDir` option specifies which intermediate files should be saved to the output directory. This option tells the pipeline what method should be used to move these files. See [Nextflow docs](https://www.nextflow.io/docs/latest/process.html#publishdir) for details.
| `string` | copy | | | | `email_on_fail` | Email address for completion summary, only when pipeline fails.
HelpAn email address to send a summary email to when the pipeline is completed - ONLY sent if the pipeline does not exit successfully.
| `string` | | | True | @@ -108,34 +109,34 @@ Less common options for the pipeline, typically set in a config file. Parameters to configure Ensembl VEP and VCFanno -| Parameter | Description | Type | Default | Required | Hidden | -| -------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | --------- | ------------ | -------- | ------ | -| `vep_chunk_size` | The amount of sites per split VCF as input to VEP | `integer` | 50000 | | | -| `species` | The species of the samples
HelpMust be lower case and have underscores as spaces
| `string` | homo_sapiens | | | -| `vep_merged` | Specify if the VEP cache is a merged cache | `boolean` | True | | | -| `vep_cache` | The path to the VEP cache | `string` | | | | -| `vep_dbnsfp` | Use the dbNSFP plugin with Ensembl VEP
HelpThe '--dbnsfp' and '--dbnsfp_tbi' parameters need to be specified when using this parameter.
| `boolean` | | | | -| `vep_spliceai` | Use the SpliceAI plugin with Ensembl VEP
HelpThe '--spliceai_indel', '--spliceai_indel_tbi', '--spliceai_snv' and '--spliceai_snv_tbi' parameters need to be specified when using this parameter.
| `boolean` | | | | -| `vep_spliceregion` | Use the SpliceRegion plugin with Ensembl VEP | `boolean` | | | | -| `vep_mastermind` | Use the Mastermind plugin with Ensembl VEP
HelpThe '--mastermind' and '--mastermind_tbi' parameters need to be specified when using this parameter.
| `boolean` | | | | -| `vep_maxentscan` | Use the MaxEntScan plugin with Ensembl VEP
HelpThe '--maxentscan' parameter need to be specified when using this parameter.
| `boolean` | | | | -| `vep_eog` | Use the custom EOG annotation with Ensembl VEP
HelpThe '--eog' and '--eog_tbi' parameters need to be specified when using this parameter.
| `boolean` | | | | -| `vep_alphamissense` | Use the AlphaMissense plugin with Ensembl VEP
HelpThe '--alphamissense' and '--alphamissense_tbi' parameters need to be specified when using this parameter.
| `boolean` | | | | -| `vep_version` | The version of the VEP tool to be used | `number` | 105.0 | | | -| `vep_cache_version` | The version of the VEP cache to be used | `integer` | 105 | | | -| `dbnsfp` | Path to the dbSNFP file | `string` | | | | -| `dbnsfp_tbi` | Path to the index of the dbSNFP file | `string` | | | | -| `spliceai_indel` | Path to the VCF containing indels for spliceAI | `string` | | | | -| `spliceai_indel_tbi` | Path to the index of the VCF containing indels for spliceAI | `string` | | | | -| `spliceai_snv` | Path to the VCF containing SNVs for spliceAI | `string` | | | | -| `spliceai_snv_tbi` | Path to the index of the VCF containing SNVs for spliceAI | `string` | | | | -| `mastermind` | Path to the VCF for Mastermind | `string` | | | | -| `mastermind_tbi` | Path to the index of the VCF for Mastermind | `string` | | | | -| `alphamissense` | Path to the TSV for AlphaMissense | `string` | | | | -| `alphamissense_tbi` | Path to the index of the TSV for AlphaMissense | `string` | | | | -| `eog` | Path to the VCF containing EOG annotations | `string` | | | | -| `eog_tbi` | Path to the index of the VCF containing EOG annotations | `string` | | | | -| `vcfanno` | Run annotations with vcfanno | `boolean` | | | | -| `vcfanno_config` | The path to the VCFanno config TOML | `string` | | | | -| `vcfanno_lua` | The path to a Lua script to be used in VCFanno | `string` | | | | -| `vcfanno_resources` | A semicolon-seperated list of resource files for VCFanno, please also supply their indices using this parameter | `string` | | | | +| Parameter | Description | Type | Default | Required | Hidden | +| -------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | --------- | ------------ | -------- | ------ | +| `vep_chunk_size` | The amount of sites per split VCF as input to VEP. | `integer` | 50000 | | | +| `species` | The species of the samples.
HelpMust be lower case and have underscores as spaces.
| `string` | homo_sapiens | | | +| `vep_merged` | Specify if the VEP cache is a merged cache. | `boolean` | True | | | +| `vep_cache` | The path to the VEP cache. | `string` | | | | +| `vep_dbnsfp` | Use the dbNSFP plugin with Ensembl VEP.
HelpThe '--dbnsfp' and '--dbnsfp_tbi' parameters need to be specified when using this parameter.
| `boolean` | | | | +| `vep_spliceai` | Use the SpliceAI plugin with Ensembl VEP.
HelpThe '--spliceai_indel', '--spliceai_indel_tbi', '--spliceai_snv' and '--spliceai_snv_tbi' parameters need to be specified when using this parameter.
| `boolean` | | | | +| `vep_spliceregion` | Use the SpliceRegion plugin with Ensembl VEP. | `boolean` | | | | +| `vep_mastermind` | Use the Mastermind plugin with Ensembl VEP.
HelpThe '--mastermind' and '--mastermind_tbi' parameters need to be specified when using this parameter.
| `boolean` | | | | +| `vep_maxentscan` | Use the MaxEntScan plugin with Ensembl VEP.
HelpThe '--maxentscan' parameter needs to be specified when using this parameter.
| `boolean` | | | | +| `vep_eog` | Use the custom EOG annotation with Ensembl VEP.
HelpThe '--eog' and '--eog_tbi' parameters need to be specified when using this parameter.
| `boolean` | | | | +| `vep_alphamissense` | Use the AlphaMissense plugin with Ensembl VEP.
HelpThe '--alphamissense' and '--alphamissense_tbi' parameters need to be specified when using this parameter.
| `boolean` | | | |
+| `vep_version` | The version of the VEP tool to be used. | `number` | 105.0 | | |
+| `vep_cache_version` | The version of the VEP cache to be used. | `integer` | 105 | | |
+| `dbnsfp` | Path to the dbNSFP file. | `string` | | | |
+| `dbnsfp_tbi` | Path to the index of the dbNSFP file. | `string` | | | |
+| `spliceai_indel` | Path to the VCF containing indels for SpliceAI. | `string` | | | |
+| `spliceai_indel_tbi` | Path to the index of the VCF containing indels for SpliceAI. | `string` | | | |
+| `spliceai_snv` | Path to the VCF containing SNVs for SpliceAI. | `string` | | | |
+| `spliceai_snv_tbi` | Path to the index of the VCF containing SNVs for SpliceAI. | `string` | | | |
+| `mastermind` | Path to the VCF for Mastermind. | `string` | | | |
+| `mastermind_tbi` | Path to the index of the VCF for Mastermind. | `string` | | | |
+| `alphamissense` | Path to the TSV for AlphaMissense. | `string` | | | |
+| `alphamissense_tbi` | Path to the index of the TSV for AlphaMissense. | `string` | | | |
+| `eog` | Path to the VCF containing EOG annotations. | `string` | | | |
+| `eog_tbi` | Path to the index of the VCF containing EOG annotations. | `string` | | | |
+| `vcfanno` | Run annotations with vcfanno. | `boolean` | | | |
+| `vcfanno_config` | The path to the VCFanno config TOML. | `string` | | | |
+| `vcfanno_lua` | The path to a Lua script to be used in VCFanno. | `string` | | | |
+| `vcfanno_resources` | A semicolon-separated list of resource files for VCFanno, please also supply their indices using this parameter. | `string` | | | |
diff --git a/docs/usage.md b/docs/usage.md
index 4866cd53..4af36894 100644
--- a/docs/usage.md
+++ b/docs/usage.md
@@ -1,7 +1,5 @@
 # nf-cmgg/germline: Usage

-# nf-cmgg/germline: Usage
-
 > _Documentation of pipeline parameters can be found in the [parameters documentation](./parameters.md)_

 ## Samplesheet input

@@ -21,7 +19,7 @@ sample,cram,crai

SAMPLE_1,watch:INPUT.cram,watch:INPUT.cram.crai
```

-The files `INPUT.cram` and `INPUT.cram.crai` will now watched for recursively in the watch directory.
+The files `INPUT.cram` and `INPUT.cram.crai` will now be watched for recursively in the watch directory.

### Example of the samplesheet

@@ -100,12 +98,13 @@ The samplesheet can have following columns:

| `ped` | OPTIONAL - Full path to PED file containing the relational information between samples in the same family. File has to have the extension `.ped`. |
| `truth_vcf` | OPTIONAL - Full path to the VCF containing all the truth variants of the current sample. The validation subworkflow will be run when this file is supplied and the `--validate true` flag has been given. File has to have the extension `.vcf.gz` |
| `truth_tbi` | OPTIONAL - Full path to the index of the truth VCF. This file can either be supplied by the user or generated by the pipeline. File has to have the extensions `.tbi` |
+| `truth_bed` | OPTIONAL - Full path to the BED file containing the golden truth regions in the `truth_vcf` file. File has to have the extension `.bed` |
| `roi` | OPTIONAL - Full path to a BED file containing the regions of interest for the current sample to call on. When this file is given, the pipeline will run this sample in WES mode. (The flag `--roi` can also be given to run WES mode for all samples using the file specified by the flag) File has to have the extension `.bed` or `.bed.gz`. |
| `vardict_min_af` | OPTIONAL - The minimum AF value to use for the vardict variant caller (`--callers vardict`). This can be set in the samplesheet when it differs between samples. A default can be set using the `--vardict_min_af` parameter (which defaults to 0.1) |

!!!note

-    The `sample` identifiers have to be the same when you have re-sequenced the same sample more than once e.g. to increase sequencing depth. Either the `ped` or `family` field can be used to specify the family name. The pipeline automatically extracts the family id from the `ped` file if the `family` field is empty. The `family` is used to specify on which samples the joint-genotyping should be performed. If neither the `ped` or `family` fields are used, the pipeline will default to a single-sample family with the sample name as its ID.
+    The `sample` field has to contain the same value when you have re-sequenced the same sample more than once e.g. to increase sequencing depth. Either the `ped` or `family` field can be used to specify the family name. The pipeline automatically extracts the family ID from the `ped` file if the `family` field is empty. The `family` is used to specify on which samples the joint-genotyping should be performed. If neither the `ped` nor `family` fields are used, the pipeline will default to a single-sample family with the sample name as its ID.

This is an example of a working samplesheet used to test this pipeline:

@@ -128,13 +127,13 @@ Note that the pipeline will create the following files in your working directory

```bash
work #(1)!
results #(2)!
-.nextflow_log #(3)!
+.nextflow.log #(3)!
... #(4)!
```

1. Directory containing the nextflow working files

-2. Finished results in specified location (defined with --outdir)
+2. Finished results in specified location (defined with --outdir). See [output](./output.md) documentation for more on this.

3. Log file from Nextflow

@@ -154,7 +153,7 @@ The above pipeline run specified with a params file in yaml format:

nextflow run nf-cmgg/germline -profile docker -params-file params.yaml
```

-with `params.yaml` containing:
+with:

```yaml title="params.yaml"
input: './samplesheet.csv'
@@ -168,15 +167,13 @@ genome: 'GRCh38'

When you run the above command, Nextflow automatically pulls the pipeline code from GitHub and stores it as a cached version. When running the pipeline after this, it will always use the cached version if available - even if the pipeline has been updated since. To make sure that you're running the latest version of the pipeline, make sure that you regularly update the cached version of the pipeline. You can also add the `-latest` argument to your run command to automatically fetch the latest version on every run:

```bash
-nextflow pull nf-cmgg/germline
-nextflow pull nf-cmgg/germline
+nextflow pull nf-cmgg/germline -r 
```

### Reproducibility

It is a good idea to specify a pipeline version when running the pipeline on your data. This ensures that a specific version of the pipeline code and software are used when you run your pipeline. If you keep using the same tag, you'll be running the same version of the pipeline, even if there have been changes to the code since.

-First, go to the [nf-cmgg/germline releases page](https://github.com/nf-cmgg/germline/releases) and find the latest pipeline version - numeric only (eg. `1.3.1`). Then specify this when running the pipeline with `-r` (one hyphen) - eg. `-r 1.3.1`. Of course, you can switch to another version by changing the number after the `-r` flag.
First, go to the [nf-cmgg/germline releases page](https://github.com/nf-cmgg/germline/releases) and find the latest pipeline version - numeric only (eg. 
`1.3.1`). Then specify this when running the pipeline with `-r` (one hyphen) - eg. `-r 1.3.1`. Of course, you can switch to another version by changing the number after the `-r` flag. This version number will be logged in reports when you run the pipeline, so that you'll know what you used when you look back in the future. For example, at the bottom of the MultiQC reports. @@ -239,21 +236,14 @@ You can also supply a run name to resume a specific run: `-resume [run-name]`. U ### `-c` -Specify the path to a specific config file (this is a core Nextflow command). See the [nf-core website documentation](https://nf-co.re/usage/configuration) for more information. +Specify the path to a specific config file. See the [nf-core website documentation](https://nf-co.re/usage/configuration) for more information. ## Custom configuration ### Resource requests -Whilst the default requirements set within the pipeline will hopefully work for most people and with most input data, you may find that you want to customise the compute resources that the pipeline requests. Each step in the pipeline has a default set of requirements for number of CPUs, memory and time. For most of the steps in the pipeline, if the job exits with any of the error codes specified [here](https://github.com/nf-core/rnaseq/blob/4c27ef5610c87db00c3c5a3eed10b1d161abf575/conf/base.config#L18) it will automatically be resubmitted with higher requests (2 x original, then 3 x original). If it still fails after the third attempt then the pipeline execution is stopped. - -To change the resource requests, please see the [max resources](https://nf-co.re/docs/usage/configuration#max-resources) and [tuning workflow resources](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources) section of the nf-core website. - -### Custom Containers +Whilst the default requirements set within the pipeline will hopefully work for most people and with most input data, you may find that you want to customise the compute resources that the pipeline requests. Each step in the pipeline has a default set of requirements for number of CPUs, memory and time. For most of the steps in the pipeline, if the job exits with any of the error codes specified [here](https://github.com/nf-cmgg/germline/blob/b637c64c2e1eeb1527d481a377f60950c9a114b8/conf/base.config#L17) it will automatically be resubmitted with higher requests (2 x original, then 3 x original). If it still fails after the third attempt then the pipeline execution is stopped. -In some cases you may wish to change which container or conda environment a step of the pipeline uses for a particular tool. By default nf-core pipelines use containers and software from the [biocontainers](https://biocontainers.pro/) or [bioconda](https://bioconda.github.io/) projects. However in some cases the pipeline specified version maybe out of date. - -To use a different container from the default container or conda environment specified in a pipeline, please see the [updating tool versions](https://nf-co.re/docs/usage/configuration#updating-tool-versions) section of the nf-core website. To change the resource requests, please see the [max resources](https://nf-co.re/docs/usage/configuration#max-resources) and [tuning workflow resources](https://nf-co.re/docs/usage/configuration#tuning-workflow-resources) section of the nf-core website. 
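If only one or two processes need more resources, a custom config file supplied with `-c` is usually enough. Below is a minimal sketch of such an override: the `ENSEMBLVEP_VEP` process name is taken from this pipeline's module configuration earlier in this changeset, while the file name and the resource values are placeholders to adapt to your own data.

```groovy title="custom_resources.config"
// Minimal sketch of a per-process resource override (example values only).
// Pass this file to the run with `-c custom_resources.config`.
process {
    withName: 'ENSEMBLVEP_VEP' {
        cpus   = 4
        memory = 24.GB
        time   = 8.h
    }
}
```
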
### Custom Containers @@ -268,12 +258,6 @@ A pipeline might not always support every possible argument or option of a parti To learn how to provide additional arguments to a particular tool of the pipeline, please see the [customising tool arguments](https://nf-co.re/docs/usage/configuration#customising-tool-arguments) section of the nf-core website. -### Custom Tool Arguments - -A pipeline might not always support every possible argument or option of a particular tool used in pipeline. Fortunately, nf-core pipelines provide some freedom to users to insert additional parameters that the pipeline does not include by default. - -To learn how to provide additional arguments to a particular tool of the pipeline, please see the [customising tool arguments](https://nf-co.re/docs/usage/configuration#customising-tool-arguments) section of the nf-core website. - ### nf-core/configs In most cases, you will only need to create a custom config as a one-off, but if you and others within your organisation are likely to run nf-core pipelines regularly and need the same settings each time, it may be a good idea to request that your custom config file is uploaded to the `nf-core/configs` git repository. Before you do this, please test that the config file works with your pipeline of choice using the `-c` parameter. You can then create a pull request to the `nf-core/configs` repository with the addition of your config file, an associated documentation file (see examples in [`nf-core/configs/docs`](https://github.com/nf-core/configs/tree/master/docs)), and an amendment to [`nfcore_custom.config`](https://github.com/nf-core/configs/blob/master/nfcore_custom.config) that includes your custom profile. @@ -282,14 +266,6 @@ See the main [Nextflow documentation](https://www.nextflow.io/docs/latest/config If you have any questions or issues, please send us a message on [Slack](https://nf-co.re/join/slack) on the [`#configs` channel](https://nfcore.slack.com/channels/configs). -## Azure Resource Requests - -To be used with the `azurebatch` profile by specifying the `-profile azurebatch`. -We recommend providing a compute `params.vm_type` of `Standard_D16_v3` VMs by default but these options can be changed if required. - -Note that the choice of VM size depends on your quota and the overall workload during the analysis. -For a thorough list, please refer the [Azure Sizes for virtual machines in Azure](https://docs.microsoft.com/en-us/azure/virtual-machines/sizes). - ## Running in the background Nextflow handles job submissions and supervises the running jobs. The Nextflow process must run until the pipeline is finished.
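One straightforward option is Nextflow's core `-bg` flag, which launches the run as a background job; the command below simply reuses the earlier example invocation for illustration.

```bash
# Launch the pipeline detached from the terminal; progress is written to .nextflow.log
nextflow run nf-cmgg/germline -profile docker -params-file params.yaml -bg
```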
diff --git a/lib/GlobalVariables.groovy b/lib/GlobalVariables.groovy index 72d5754b..9d868d23 100644 --- a/lib/GlobalVariables.groovy +++ b/lib/GlobalVariables.groovy @@ -4,9 +4,11 @@ import java.nio.file.Path class GlobalVariables { // The available callers - public static List availableCallers = ["haplotypecaller", "vardict"] + public static List availableCallers = ["haplotypecaller", "vardict", "elprep"] - public static List gvcfCallers = ["haplotypecaller"] + public static List gvcfCallers = ["haplotypecaller", "elprep"] + + public static List bamCallers = ["elprep", "vardict"] public static Map pedFiles = [:] diff --git a/lib/Pedigree.groovy b/lib/Pedigree.groovy index 8b1a0bbe..676bdee3 100644 --- a/lib/Pedigree.groovy +++ b/lib/Pedigree.groovy @@ -84,7 +84,7 @@ class PedigreeEntry { // Family ID id = lineSplit[0] if (id ==~ idRegex) { - familyId = id + familyId = id.replace(".", "_") // Replace dots with underscores to prevent breaking the multiqc report } else { exceptions.add("Invalid family ID (${id}). It should only contain these characters: a-z, A-Z, 0-9, _ and ." as String) } @@ -92,7 +92,7 @@ class PedigreeEntry { // Individual ID id = lineSplit[1] if (id ==~ idRegex) { - individualId = id + individualId = id.replace(".", "_") // Replace dots with underscores to prevent breaking the multiqc report } else { exceptions.add("Invalid individual ID (${id}). It should only contain these characters: a-z, A-Z, 0-9, _ and ." as String) } @@ -102,7 +102,7 @@ class PedigreeEntry { // Paternal ID id = lineSplit[2] if (id ==~ idRegex) { - paternalId = id + paternalId = id.replace(".", "_") // Replace dots with underscores to prevent breaking the multiqc report } else if (!validMissingIDs.contains(id)) { exceptions.add("Invalid paternal ID (${id}). It should only contain these characters: a-z, A-Z, 0-9, _ and .; Use 0 if the paternal ID is missing" as String) } @@ -110,7 +110,7 @@ class PedigreeEntry { // Maternal ID id = lineSplit[3] if (id ==~ idRegex) { - maternalId = id + maternalId = id.replace(".", "_") // Replace dots with underscores to prevent breaking the multiqc report } else if (!validMissingIDs.contains(id)) { exceptions.add("Invalid maternal ID (${id}). It should only contain these characters: a-z, A-Z, 0-9, _ and .; Use 0 if the maternal ID is missing" as String) } diff --git a/main.nf b/main.nf index 285e1280..7869d843 100644 --- a/main.nf +++ b/main.nf @@ -7,6 +7,8 @@ ---------------------------------------------------------------------------------------- */ +nextflow.preview.output = true + /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ IMPORT FUNCTIONS / MODULES / SUBWORKFLOWS / WORKFLOWS @@ -15,9 +17,11 @@ include { getGenomeAttribute } from './subworkflows/local/utils_cmgg_germline_pipeline' +// Take another look at this later! 
params.fasta = getGenomeAttribute('fasta', params.genomes, params.genome) params.fai = getGenomeAttribute('fai', params.genomes, params.genome) params.dict = getGenomeAttribute('dict', params.genomes, params.genome) +params.elfasta = getGenomeAttribute('elfasta', params.genomes, params.genome) params.strtablefile = getGenomeAttribute('strtablefile', params.genomes, params.genome) params.sdf = getGenomeAttribute('sdf', params.genomes, params.genome) params.dbsnp = getGenomeAttribute('dbsnp', params.genomes, params.genome) @@ -47,94 +51,7 @@ params.vcfanno_config = getGenomeAttribute('vcfanno_config', params.genome include { GERMLINE } from './workflows/germline' include { PIPELINE_INITIALISATION } from './subworkflows/local/utils_cmgg_germline_pipeline' include { PIPELINE_COMPLETION } from './subworkflows/local/utils_cmgg_germline_pipeline' - -// -// WORKFLOW: Run main analysis pipeline depending on type of input -// - -workflow NFCMGG_GERMLINE { - - take: - samplesheet // channel: samplesheet read in from --input - pipeline_params // the parameters used for this pipeline - multiqc_logo // string: the path to the multiqc logo - - main: - - // - // WORKFLOW: Run pipeline - // - GERMLINE ( - // Input channels - samplesheet, - - // File inputs - pipeline_params.fasta, - pipeline_params.fai, - pipeline_params.dict, - pipeline_params.strtablefile, - pipeline_params.sdf, - pipeline_params.dbsnp, - pipeline_params.dbsnp_tbi, - pipeline_params.vep_cache, - pipeline_params.dbnsfp, - pipeline_params.dbnsfp_tbi, - pipeline_params.spliceai_indel, - pipeline_params.spliceai_indel_tbi, - pipeline_params.spliceai_snv, - pipeline_params.spliceai_snv_tbi, - pipeline_params.mastermind, - pipeline_params.mastermind_tbi, - pipeline_params.eog, - pipeline_params.eog_tbi, - pipeline_params.alphamissense, - pipeline_params.alphamissense_tbi, - pipeline_params.vcfanno_resources, - pipeline_params.vcfanno_config, - pipeline_params.multiqc_config, - multiqc_logo, - pipeline_params.multiqc_methods_description, - pipeline_params.roi, - pipeline_params.somalier_sites, - pipeline_params.vcfanno_lua, - pipeline_params.updio_common_cnvs, - pipeline_params.automap_repeats, - pipeline_params.automap_panel, - pipeline_params.outdir, - GlobalVariables.pedFiles, - - // Boolean inputs - pipeline_params.dragstr, - pipeline_params.annotate, - pipeline_params.vcfanno, - pipeline_params.only_call, - pipeline_params.only_merge, - pipeline_params.filter, - pipeline_params.normalize, - pipeline_params.add_ped, - pipeline_params.gemini, - pipeline_params.validate, - pipeline_params.updio, - pipeline_params.automap, - pipeline_params.vep_dbnsfp, - pipeline_params.vep_spliceai, - pipeline_params.vep_mastermind, - pipeline_params.vep_eog, - pipeline_params.vep_alphamissense, - - // Value inputs - pipeline_params.genome, - pipeline_params.species, - pipeline_params.vep_cache_version, - pipeline_params.vep_chunk_size, - pipeline_params.scatter_count, - pipeline_params.callers.tokenize(",") - ) - - emit: - multiqc_report = GERMLINE.out.multiqc_report // channel: /path/to/multiqc_report.html - -} +include { getWorkflowVersion } from './subworkflows/nf-core/utils_nfcore_pipeline' /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -199,9 +116,7 @@ workflow { // PIPELINE_INITIALISATION ( params.version, - params.help, params.validate_params, - params.monochrome_logs, args, params.outdir, params.input, @@ -214,10 +129,74 @@ workflow { // // WORKFLOW: Run main workflow // - NFCMGG_GERMLINE ( + + GERMLINE 
( + // Input channels PIPELINE_INITIALISATION.out.samplesheet, - params, - multiqc_logo + + // File inputs + params.fasta, + params.fai, + params.dict, + params.elfasta, + params.strtablefile, + params.sdf, + params.dbsnp, + params.dbsnp_tbi, + params.vep_cache, + params.dbnsfp, + params.dbnsfp_tbi, + params.spliceai_indel, + params.spliceai_indel_tbi, + params.spliceai_snv, + params.spliceai_snv_tbi, + params.mastermind, + params.mastermind_tbi, + params.eog, + params.eog_tbi, + params.alphamissense, + params.alphamissense_tbi, + params.vcfanno_resources, + params.vcfanno_config, + params.multiqc_config, + multiqc_logo, + params.multiqc_methods_description, + params.roi, + params.somalier_sites, + params.vcfanno_lua, + params.updio_common_cnvs, + params.automap_repeats, + params.automap_panel, + params.outdir, + GlobalVariables.pedFiles, + params.elsites, + + // Boolean inputs + params.dragstr, + params.annotate, + params.vcfanno, + params.only_call, + params.only_merge, + params.filter, + params.normalize, + params.add_ped, + params.gemini, + params.validate, + params.updio, + params.automap, + params.vep_dbnsfp, + params.vep_spliceai, + params.vep_mastermind, + params.vep_eog, + params.vep_alphamissense, + + // Value inputs + params.genome, + params.species, + params.vep_cache_version, + params.vep_chunk_size, + params.scatter_count, + params.callers.tokenize(",") ) // @@ -230,8 +209,86 @@ workflow { params.outdir, params.monochrome_logs, params.hook_url, - NFCMGG_GERMLINE.out.multiqc_report + GERMLINE.out.multiqc_report ) + + // Filtering out input GVCFs from the output publishing fixes an issue in the current implementation of + // the workflow output definitions: https://github.com/nextflow-io/nextflow/issues/5480 + def ch_gvcfs_out = GERMLINE.out.gvcfs.filter { _meta, gvcf, _tbi -> gvcf.startsWith(workflow.workDir) } + + publish: + ch_gvcfs_out >> 'gvcfs' + GERMLINE.out.single_beds >> 'single_beds' + GERMLINE.out.validation >> 'validation' + GERMLINE.out.gvcf_reports >> 'gvcf_reports' + GERMLINE.out.genomicsdb >> 'genomicsdb' + GERMLINE.out.vcfs >> 'vcfs' + GERMLINE.out.gemini >> 'gemini' + GERMLINE.out.peds >> 'peds' + GERMLINE.out.joint_beds >> 'joint_beds' + GERMLINE.out.final_reports >> 'final_reports' + GERMLINE.out.automap >> 'automap' + GERMLINE.out.updio >> 'updio' + GERMLINE.out.multiqc_report >> 'multiqc' + GERMLINE.out.multiqc_data >> 'multiqc_data' +} + +output { + 'gvcfs' { + path { meta, gvcf, _tbi -> { file -> + if(file == gvcf.name) { + return "${meta.family}/${meta.id}_${params.unique_out}/${meta.id}.${meta.caller}.g.vcf.gz" + } + return "${meta.family}/${meta.id}_${params.unique_out}/${meta.id}.${meta.caller}.g.vcf.gz.tbi" + } } + } + 'single_beds' { + path { meta, _bed -> { _file -> "${meta.family}/${meta.id}_${params.unique_out}/${meta.id}.bed" } } + } + 'validation' { + path { meta, _report -> { file -> "${meta.family}/${meta.id}_${params.unique_out}/validation/${meta.caller}/${file}" } } + } + 'gvcf_reports' { + path { meta, _report -> { _file -> "${meta.family}/${meta.id}_${params.unique_out}/${meta.id}.${meta.caller}.bcftools_stats.txt" }} + } + 'genomicsdb' { + enabled (params.output_genomicsdb || params.only_merge) + path { meta, _genomicsdb -> + { _file -> "${meta.family}/output_${params.unique_out}/${meta.id}_${meta.caller}_genomicsdb"} + } + } + 'vcfs' { + path { meta, vcf, _tbi -> { file -> + if(file == vcf.name) { + return "${meta.family}/output_${params.unique_out}/${meta.id}.${meta.caller}.vcf.gz" + } + return 
"${meta.family}/output_${params.unique_out}/${meta.id}.${meta.caller}.vcf.gz.tbi" + } } + } + 'gemini' { + path { meta, _db -> { _file -> "${meta.family}/output_${params.unique_out}/${meta.id}.${meta.caller}.db"}} + } + 'peds' { + path { meta, _ped -> { _file -> "${meta.family}/output_${params.unique_out}/${meta.id}.${meta.caller}.ped"}} + } + 'joint_beds' { + path { meta, _bed -> { _file -> "${meta.family}/output_${params.unique_out}/${meta.id}.${meta.caller}.bed"}} + } + 'final_reports' { + path { meta, _report -> { file -> "${meta.family}/qc_${params.unique_out}/${file}"}} + } + 'automap' { + path { meta, _automap -> { _file -> "${meta.family}/output_${params.unique_out}/automap/${meta.caller}"}} + } + 'updio' { + path { meta, _updio -> { _file -> "${meta.family}/output_${params.unique_out}/updio/${meta.caller}"}} + } + 'multiqc' { + path { _report -> { _file -> "${params.unique_out}/multiqc_report.html"}} + } + 'multiqc_data' { + path { _folder -> { _file -> "${params.unique_out}/multiqc_data"}} + } } /* diff --git a/modules.json b/modules.json index ac9d8d29..36b21493 100644 --- a/modules.json +++ b/modules.json @@ -7,212 +7,225 @@ "nf-core": { "bcftools/annotate": { "branch": "master", - "git_sha": "88ae68490e8f2478a1e1bbeedac970fd7cc73022", + "git_sha": "cb08035150685b11d890d90c9534d4f16869eaec", "installed_by": ["modules"], "patch": "modules/nf-core/bcftools/annotate/bcftools-annotate.diff" }, "bcftools/concat": { "branch": "master", "git_sha": "d1e0ec7670fa77905a378627232566ce54c3c26d", - "installed_by": ["vcf_annotate_ensemblvep_snpeff"] + "installed_by": ["modules"] }, "bcftools/filter": { "branch": "master", - "git_sha": "33ef773a7ea36e88323902f63662aa53c9b88988", + "git_sha": "f85dbddd7a335fc0f5ac331e8d22ca94123b654b", "installed_by": ["modules"] }, "bcftools/norm": { "branch": "master", - "git_sha": "f6cc00f107826cfaf1c933297b10ed1757b41479", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "bcftools/pluginscatter": { "branch": "master", - "git_sha": "33ef773a7ea36e88323902f63662aa53c9b88988", - "installed_by": ["vcf_annotate_ensemblvep_snpeff"] + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"] }, "bcftools/query": { "branch": "master", - "git_sha": "34ac993e081b32d2170ab790d0386b74122f9d36", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "bcftools/reheader": { "branch": "master", - "git_sha": "c32611ac6813055b9321d2827678e2f8aebcb394", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "bcftools/sort": { "branch": "master", - "git_sha": "cdf83b18471db290a28fe98c2a0852cb05864890", - "installed_by": ["vcf_annotate_ensemblvep_snpeff"] + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"] }, "bcftools/stats": { "branch": "master", - "git_sha": "a5ba4d59c2b248c0379b0f8aeb4e7e754566cd1f", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "bedtools/intersect": { "branch": "master", - "git_sha": "06c8865e36741e05ad32ef70ab3fac127486af48", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "bedtools/merge": { "branch": "master", - "git_sha": "a3d614e4a7b8691a259bcfe33ad80903217d6215", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "bedtools/split": { "branch": "master", - "git_sha": "6dc8a32e32158bb4d3f9af92c802233b5d4f8e4d", + "git_sha": "cb08035150685b11d890d90c9534d4f16869eaec", 
"installed_by": ["modules"] }, + "elprep/fastatoelfasta": { + "branch": "master", + "git_sha": "74ac5351a11a184171489dee73652e8b69ba9d22", + "installed_by": ["modules"] + }, + "elprep/filter": { + "branch": "master", + "git_sha": "909c4dcdbb1e751214e2bb155e8c0a59633ed12a", + "installed_by": ["modules"], + "patch": "modules/nf-core/elprep/filter/elprep-filter.diff" + }, "ensemblvep/download": { "branch": "master", - "git_sha": "54c183cba37cac58860d9967feaae54acf9cc3e0", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"], "patch": "modules/nf-core/ensemblvep/download/ensemblvep-download.diff" }, "ensemblvep/vep": { "branch": "master", - "git_sha": "54c183cba37cac58860d9967feaae54acf9cc3e0", - "installed_by": ["vcf_annotate_ensemblvep_snpeff"], + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"], "patch": "modules/nf-core/ensemblvep/vep/ensemblvep-vep.diff" }, "gatk4/calibratedragstrmodel": { "branch": "master", - "git_sha": "3f42e07a1133064c569b0dbe182979527bca9e59", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/composestrtablefile": { "branch": "master", - "git_sha": "926e2f394d01c71d3abbdbca9c588630bfe51abf", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "gatk4/createsequencedictionary": { "branch": "master", - "git_sha": "e6fe277739f5894711405af3e717b2470bd956b5", - "installed_by": ["modules"] + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"], + "patch": "modules/nf-core/gatk4/createsequencedictionary/gatk4-createsequencedictionary.diff" }, "gatk4/genomicsdbimport": { "branch": "master", - "git_sha": "4e5f4687318f24ba944a13609d3ea6ebd890737d", - "installed_by": ["modules"] + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"], + "patch": "modules/nf-core/gatk4/genomicsdbimport/gatk4-genomicsdbimport.diff" }, "gatk4/genotypegvcfs": { "branch": "master", - "git_sha": "8b74c800af3d91e0d7bfbecb902308dc4369071c", - "installed_by": ["modules"] + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"], + "patch": "modules/nf-core/gatk4/genotypegvcfs/gatk4-genotypegvcfs.diff" }, "gatk4/haplotypecaller": { "branch": "master", - "git_sha": "c332ea831f95f750be962c4b5de655f7a1e6e245", - "installed_by": ["modules"] + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"], + "patch": "modules/nf-core/gatk4/haplotypecaller/gatk4-haplotypecaller.diff" }, "gawk": { "branch": "master", - "git_sha": "cf3ed075695639b0a0924eb0901146df1996dc08", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "mosdepth": { "branch": "master", - "git_sha": "9bfc81874554e87740bcb3e5e07acf0a153c9ecb", - "installed_by": ["modules"] + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"], + "patch": "modules/nf-core/mosdepth/mosdepth.diff" }, "multiqc": { "branch": "master", - "git_sha": "06c8865e36741e05ad32ef70ab3fac127486af48", + "git_sha": "b8d36829fa84b6e404364abff787e8b07f6d058c", "installed_by": ["modules"] }, "rtgtools/format": { "branch": "master", - "git_sha": "e743b2dea725bcfc4b76a209981808987332013a", + "git_sha": "167a20a2e267261af397e9ea5bf58426e6345ce7", "installed_by": ["modules"] }, "rtgtools/pedfilter": { "branch": "master", - "git_sha": "c1c2a770cfb0bfbf093a2434a27f091ebbc65987", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": 
["modules"], "patch": "modules/nf-core/rtgtools/pedfilter/rtgtools-pedfilter.diff" }, "rtgtools/rocplot": { "branch": "master", - "git_sha": "64770369d851c45c364e410e052ef9a6c3a7d2bb", + "git_sha": "83e2df1e4ec594beb8a575b4db0b4197900f4ebd", "installed_by": ["modules"] }, "rtgtools/vcfeval": { "branch": "master", - "git_sha": "e743b2dea725bcfc4b76a209981808987332013a", + "git_sha": "83e2df1e4ec594beb8a575b4db0b4197900f4ebd", "installed_by": ["modules"] }, "samtools/convert": { "branch": "master", - "git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519", + "git_sha": "b13f07be4c508d6ff6312d354d09f2493243e208", "installed_by": ["modules"] }, "samtools/faidx": { "branch": "master", - "git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519", + "git_sha": "b13f07be4c508d6ff6312d354d09f2493243e208", "installed_by": ["modules"] }, "samtools/index": { "branch": "master", - "git_sha": "46eca555142d6e597729fcb682adcc791796f514", + "git_sha": "b13f07be4c508d6ff6312d354d09f2493243e208", "installed_by": ["modules"] }, "samtools/merge": { "branch": "master", - "git_sha": "04fbbc7c43cebc0b95d5b126f6d9fe4effa33519", + "git_sha": "b13f07be4c508d6ff6312d354d09f2493243e208", "installed_by": ["modules"] }, - "snpeff/snpeff": { - "branch": "master", - "git_sha": "3ad7292d9b8da881386e9d5b58364d7da489b38b", - "installed_by": ["vcf_annotate_ensemblvep_snpeff"] - }, "somalier/extract": { "branch": "master", - "git_sha": "458c882987320e27fc90723ec96c127a243a5497", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "somalier/relate": { "branch": "master", - "git_sha": "458c882987320e27fc90723ec96c127a243a5497", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"], "patch": "modules/nf-core/somalier/relate/somalier-relate.diff" }, "tabix/bgzip": { "branch": "master", - "git_sha": "b20be35facfc5acdc1259f132ed79339d79e989f", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "tabix/bgziptabix": { "branch": "master", - "git_sha": "0840b387799172e77510393ed09e4d4ec1bc6d7c", + "git_sha": "f448e846bdadd80fc8be31fbbc78d9f5b5131a45", "installed_by": ["modules"] }, "tabix/tabix": { "branch": "master", - "git_sha": "0840b387799172e77510393ed09e4d4ec1bc6d7c", - "installed_by": ["modules", "vcf_annotate_ensemblvep_snpeff"] + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", + "installed_by": ["modules"], + "patch": "modules/nf-core/tabix/tabix/tabix-tabix.diff" }, "untar": { "branch": "master", - "git_sha": "4e5f4687318f24ba944a13609d3ea6ebd890737d", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] }, "vardictjava": { "branch": "master", - "git_sha": "e61e5a13ef49c5595986bd31efb85c3f0709a282", + "git_sha": "f85452fcbebab5dfd77c0752236f6f86e9a03b32", "installed_by": ["modules"] }, "vcf2db": { "branch": "master", - "git_sha": "730f3aee80d5f8d0b5fc532202ac59361414d006", - "installed_by": ["modules"] + "git_sha": "439f05652b54826ff23f5baa505082d5d8587dd7", + "installed_by": ["modules"], + "patch": "modules/nf-core/vcf2db/vcf2db.diff" }, "vcfanno": { "branch": "master", - "git_sha": "9a8bba5910982ae637dedb8664e3121db77e173f", + "git_sha": "666652151335353eef2fcd58880bcef5bc2928e1", "installed_by": ["modules"] } } @@ -221,22 +234,17 @@ "nf-core": { "utils_nextflow_pipeline": { "branch": "master", - "git_sha": "d20fb2a9cc3e2835e9d067d1046a63252eb17352", + "git_sha": "56372688d8979092cafbe0c5c3895b491166ca1c", "installed_by": ["subworkflows"] }, "utils_nfcore_pipeline": { "branch": "master", - 
"git_sha": "2fdce49d30c0254f76bc0f13c55c17455c1251ab", + "git_sha": "1b6b9a3338d011367137808b49b923515080e3ba", "installed_by": ["subworkflows"] }, "utils_nfschema_plugin": { "branch": "master", - "git_sha": "bbd5a41f4535a8defafe6080e00ea74c45f4f96c", - "installed_by": ["subworkflows"] - }, - "vcf_annotate_ensemblvep_snpeff": { - "branch": "master", - "git_sha": "cfd937a668919d948f6fcbf4218e79de50c2f36f", + "git_sha": "2fd2cd6d0e7b273747f32e465fdc6bcc3ae0814e", "installed_by": ["subworkflows"] } } diff --git a/modules/local/automap/automap/main.nf b/modules/local/automap/automap/main.nf index b84535a9..600eb300 100644 --- a/modules/local/automap/automap/main.nf +++ b/modules/local/automap/automap/main.nf @@ -11,12 +11,12 @@ process AUTOMAP_AUTOMAP { val(genome) output: - tuple val(meta), path("${prefix}"), emit: automap - path "versions.yml" , emit: versions + tuple val(meta), path("${task.ext.prefix ?: meta.id}"), emit: automap + path "versions.yml" , emit: versions script: def args = task.ext.args ?: '' - prefix = task.ext.prefix ?: "${meta.id}" + def prefix = task.ext.prefix ?: "${meta.id}" def panel_file = panel ? "--panel $panel" : "--panel /usr/local/lib/automap/Resources/Biomodule_20220808_all_genes_hg38.txt" def hg_genome = genome ?: "hg38" @@ -40,7 +40,7 @@ process AUTOMAP_AUTOMAP { stub: def args = task.ext.args ?: '' def panel_name = args.contains("--panelname") ? args.split("--panelname")[-1].trim().split(" ")[0] : "" - prefix = task.ext.prefix ?: "${meta.id}" + def prefix = task.ext.prefix ?: "${meta.id}" def create_outputs = meta.family_samples.tokenize(",").size() > 1 ? (1..meta.family_samples.tokenize(",").size()).collect { number -> def cmd_prefix = "touch ${prefix}/sample${number}" diff --git a/modules/local/automap/automap/tests/main.nf.test b/modules/local/automap/automap/tests/main.nf.test new file mode 100644 index 00000000..f19fe051 --- /dev/null +++ b/modules/local/automap/automap/tests/main.nf.test @@ -0,0 +1,95 @@ +nextflow_process { + + name "Test Process AUTOMAP_AUTOMAP" + script "../main.nf" + process "AUTOMAP_AUTOMAP" + + tag "modules" + tag "modules_local" + tag "automap" + tag "automap/automap" + + setup { + run("AUTOMAP_REPEATS") { + script "../../repeats/main.nf" + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + "hg38" + ] + """ + } + } + } + + test("homo_sapiens - vcf, tbi, repeats, [], hg38") { + + config "./nextflow.config" + + when { + process { + """ + input[0] = [ + [ id:'test', family_samples:"NA24143,NA24149,NA24385", caller:"haplotypecaller" ], // meta map + file(params.famvcf, checkIfExists: true), + file(params.famtbi, checkIfExists: true) + ] + input[1] = AUTOMAP_REPEATS.out.repeats + input[2] = [[],[]] + input[3] = "hg38" + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.automap.collect { meta, out -> + [ + meta, + path(out).list().collect { + it.list().collect { + "${file(it.toString()).name},variantsMD5:${it.vcf.variantsMD5}" + } + } + ] + }, + process.out.versions + ).match() } + ) + } + + } + + test("homo_sapiens - vcf, tbi, repeats, [], hg38 - stub") { + + options "-stub" + config "./nextflow.config" + + when { + process { + """ + input[0] = [ + [ id:'test', family_samples:"NA24143,NA24149,NA24385", caller:"haplotypecaller" ], // meta map + file(params.famvcf, checkIfExists: true), + file(params.famtbi, checkIfExists: true) + ] + input[1] = AUTOMAP_REPEATS.out.repeats + input[2] = [[],[]] + input[3] = "hg38" + """ + } + } + + then { + assertAll( + { 
assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/local/automap/automap/tests/main.nf.test.snap b/modules/local/automap/automap/tests/main.nf.test.snap new file mode 100644 index 00000000..356704a3 --- /dev/null +++ b/modules/local/automap/automap/tests/main.nf.test.snap @@ -0,0 +1,109 @@ +{ + "homo_sapiens - vcf, tbi, repeats, [], hg38": { + "content": [ + [ + [ + { + "id": "test", + "family_samples": "NA24143,NA24149,NA24385", + "caller": "haplotypecaller" + }, + [ + [ + "NA24143.individual.vcf,variantsMD5:d87c6d5aea196c746312a8a383eede4b" + ], + [ + "NA24149.individual.vcf,variantsMD5:24eacd6958a767b7768b4282caea52cc" + ], + [ + "NA24385.individual.vcf,variantsMD5:f7f27a20139824e3e49cf17a2867489b" + ] + ] + ] + ], + [ + "versions.yml:md5,8d88bdf05fc6f578d81fee3716f8b562" + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-13T17:03:29.44273724" + }, + "homo_sapiens - vcf, tbi, repeats, [], hg38 - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "family_samples": "NA24143,NA24149,NA24385", + "caller": "haplotypecaller" + }, + [ + [ + "sample1.HomRegions.cmgg_bio.tsv:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample1.HomRegions.pdf:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample1.HomRegions.strict.cmgg_bio.tsv:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample1.HomRegions.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ], + [ + "sample2.HomRegions.cmgg_bio.tsv:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.HomRegions.pdf:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.HomRegions.strict.cmgg_bio.tsv:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.HomRegions.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ], + [ + "sample3.HomRegions.cmgg_bio.tsv:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample3.HomRegions.pdf:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample3.HomRegions.strict.cmgg_bio.tsv:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample3.HomRegions.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ] + ], + "1": [ + "versions.yml:md5,8d88bdf05fc6f578d81fee3716f8b562" + ], + "automap": [ + [ + { + "id": "test", + "family_samples": "NA24143,NA24149,NA24385", + "caller": "haplotypecaller" + }, + [ + [ + "sample1.HomRegions.cmgg_bio.tsv:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample1.HomRegions.pdf:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample1.HomRegions.strict.cmgg_bio.tsv:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample1.HomRegions.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ], + [ + "sample2.HomRegions.cmgg_bio.tsv:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.HomRegions.pdf:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.HomRegions.strict.cmgg_bio.tsv:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample2.HomRegions.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ], + [ + "sample3.HomRegions.cmgg_bio.tsv:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample3.HomRegions.pdf:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample3.HomRegions.strict.cmgg_bio.tsv:md5,d41d8cd98f00b204e9800998ecf8427e", + "sample3.HomRegions.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ] + ], + "versions": [ + "versions.yml:md5,8d88bdf05fc6f578d81fee3716f8b562" + ] + } + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-13T17:00:28.503010269" + } +} \ No newline at end of file diff --git a/modules/local/automap/automap/tests/nextflow.config b/modules/local/automap/automap/tests/nextflow.config new file mode 100644 index 00000000..ba4da41d --- /dev/null +++ 
b/modules/local/automap/automap/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: "AUTOMAP_AUTOMAP" { + ext.args = "--panelname cmgg_bio --multivcf" + } +} diff --git a/modules/local/automap/repeats/main.nf b/modules/local/automap/repeats/main.nf index e5848023..25d7cf00 100644 --- a/modules/local/automap/repeats/main.nf +++ b/modules/local/automap/repeats/main.nf @@ -12,7 +12,6 @@ process AUTOMAP_REPEATS { path "versions.yml" , emit: versions script: - def args = task.ext.args ?: '' def prefix = task.ext.prefix ?: "${meta.id}" def VERSION = "1.0.0" diff --git a/modules/local/automap/repeats/tests/main.nf.test b/modules/local/automap/repeats/tests/main.nf.test new file mode 100644 index 00000000..94f1ddb6 --- /dev/null +++ b/modules/local/automap/repeats/tests/main.nf.test @@ -0,0 +1,58 @@ +nextflow_process { + + name "Test Process AUTOMAP_REPEATS" + script "../main.nf" + process "AUTOMAP_REPEATS" + + tag "modules" + tag "modules_local" + tag "automap" + tag "automap/repeats" + + test("homo_sapiens - hg38") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + "hg38" + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("homo_sapiens - hg38 - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + "hg38" + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/local/automap/repeats/tests/main.nf.test.snap b/modules/local/automap/repeats/tests/main.nf.test.snap new file mode 100644 index 00000000..5f114982 --- /dev/null +++ b/modules/local/automap/repeats/tests/main.nf.test.snap @@ -0,0 +1,72 @@ +{ + "homo_sapiens - hg38": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bed:md5,74190e74851226329342a3fcae133cdb" + ] + ], + "1": [ + "versions.yml:md5,b5fc7b6bfce18165907357d8966f09d4" + ], + "repeats": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bed:md5,74190e74851226329342a3fcae133cdb" + ] + ], + "versions": [ + "versions.yml:md5,b5fc7b6bfce18165907357d8966f09d4" + ] + } + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-13T16:31:33.522485317" + }, + "homo_sapiens - hg38 - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bed:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,b5fc7b6bfce18165907357d8966f09d4" + ], + "repeats": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bed:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,b5fc7b6bfce18165907357d8966f09d4" + ] + } + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-13T16:31:43.895708133" + } +} \ No newline at end of file diff --git a/modules/local/filter_beds/tests/main.nf.test b/modules/local/filter_beds/tests/main.nf.test new file mode 100644 index 00000000..f139e316 --- /dev/null +++ b/modules/local/filter_beds/tests/main.nf.test @@ -0,0 +1,60 @@ +nextflow_process { + + name "Test Process FILTER_BEDS" + script "../main.nf" + process "FILTER_BEDS" + + tag "modules" + tag "modules_local" + tag "filter_beds" + + test("homo_sapiens - bed") { + + config "./nextflow.config" + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.bed, 
checkIfExists:true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("homo_sapiens - bed - stub") { + + options "-stub" + config "./nextflow.config" + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.bed, checkIfExists:true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/local/filter_beds/tests/main.nf.test.snap b/modules/local/filter_beds/tests/main.nf.test.snap new file mode 100644 index 00000000..c4e2ba18 --- /dev/null +++ b/modules/local/filter_beds/tests/main.nf.test.snap @@ -0,0 +1,72 @@ +{ + "homo_sapiens - bed": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bed:md5,fa245abf8add7a80650566a1de67ec04" + ] + ], + "1": [ + "versions.yml:md5,38ea76fdc8d681b1e47415b195cccd88" + ], + "bed": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bed:md5,fa245abf8add7a80650566a1de67ec04" + ] + ], + "versions": [ + "versions.yml:md5,38ea76fdc8d681b1e47415b195cccd88" + ] + } + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-13T17:12:06.470648263" + }, + "homo_sapiens - bed - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bed:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,38ea76fdc8d681b1e47415b195cccd88" + ], + "bed": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bed:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,38ea76fdc8d681b1e47415b195cccd88" + ] + } + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-13T17:09:43.531020092" + } +} \ No newline at end of file diff --git a/modules/local/filter_beds/tests/nextflow.config b/modules/local/filter_beds/tests/nextflow.config new file mode 100644 index 00000000..1734cd70 --- /dev/null +++ b/modules/local/filter_beds/tests/nextflow.config @@ -0,0 +1,4 @@ +process { + ext.args = "-vE \"LOW_COVERAGE|NO_COVERAGE\"" + ext.args2 = "-d 150" +} diff --git a/modules/local/merge_beds/tests/main.nf.test b/modules/local/merge_beds/tests/main.nf.test new file mode 100644 index 00000000..ea07e119 --- /dev/null +++ b/modules/local/merge_beds/tests/main.nf.test @@ -0,0 +1,68 @@ +nextflow_process { + + name "Test Process MERGE_BEDS" + script "../main.nf" + process "MERGE_BEDS" + + tag "modules" + tag "modules_local" + tag "merge_beds" + + test("homo_sapiens - bed") { + + config "./nextflow.config" + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.bed, checkIfExists:true) + ] + input[1] = [ + [ id:'fai' ], + file(params.fai, checkIfExists:true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("homo_sapiens - bed - stub") { + + options "-stub" + config "./nextflow.config" + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.bed, checkIfExists:true) + ] + input[1] = [ + [ id:'fai' ], + file(params.fai, checkIfExists:true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/local/merge_beds/tests/main.nf.test.snap b/modules/local/merge_beds/tests/main.nf.test.snap new file 
mode 100644 index 00000000..a6eb4e35 --- /dev/null +++ b/modules/local/merge_beds/tests/main.nf.test.snap @@ -0,0 +1,72 @@ +{ + "homo_sapiens - bed": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bed:md5,fa245abf8add7a80650566a1de67ec04" + ] + ], + "1": [ + "versions.yml:md5,273f887b675fb5feb6073a5313a191a9" + ], + "bed": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bed:md5,fa245abf8add7a80650566a1de67ec04" + ] + ], + "versions": [ + "versions.yml:md5,273f887b675fb5feb6073a5313a191a9" + ] + } + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-14T09:49:08.980488978" + }, + "homo_sapiens - bed - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bed:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,273f887b675fb5feb6073a5313a191a9" + ], + "bed": [ + [ + { + "id": "test", + "single_end": false + }, + "test.bed:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,273f887b675fb5feb6073a5313a191a9" + ] + } + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-14T09:49:17.089675797" + } +} \ No newline at end of file diff --git a/modules/local/merge_beds/tests/nextflow.config b/modules/local/merge_beds/tests/nextflow.config new file mode 100644 index 00000000..4fda00bd --- /dev/null +++ b/modules/local/merge_beds/tests/nextflow.config @@ -0,0 +1,3 @@ +process { + ext.args = "-d 150" +} diff --git a/modules/local/updio/main.nf b/modules/local/updio/main.nf index 2dfd7ef9..44e882f5 100644 --- a/modules/local/updio/main.nf +++ b/modules/local/updio/main.nf @@ -9,12 +9,12 @@ process UPDIO { tuple val(meta2), path(cnv) output: - tuple val(meta), path("${prefix}"), emit: updio - path "versions.yml" , emit: versions + tuple val(meta), path("${task.ext.prefix ?: meta.id}"), emit: updio + path "versions.yml" , emit: versions script: def args = task.ext.args ?: '' - prefix = task.ext.prefix ?: "${meta.id}" + def prefix = task.ext.prefix ?: "${meta.id}" def common_cnv_file = cnv ? 
"--common_cnv_file $cnv" : "--common_cnv_file /usr/local/lib/updio/sample_data/common_dels_1percent_liftover.tsv" def VERSION = "1.0.0" @@ -33,7 +33,7 @@ process UPDIO { """ stub: - prefix = task.ext.prefix ?: "${meta.id}" + def prefix = task.ext.prefix ?: "${meta.id}" def VERSION = "1.0.0" """ diff --git a/modules/local/updio/tests/main.nf.test b/modules/local/updio/tests/main.nf.test new file mode 100644 index 00000000..50df983e --- /dev/null +++ b/modules/local/updio/tests/main.nf.test @@ -0,0 +1,63 @@ +nextflow_process { + + name "Test Process UPDIO" + script "../main.nf" + process "UPDIO" + + tag "modules" + tag "modules_local" + tag "updio" + + test("homo_sapiens - vcf, tbi, []") { + + config "./nextflow.config" + + when { + process { + """ + input[0] = [ + [ id:'test', child:'NA24385' ], // meta map + file(params.famvcf, checkIfExists:true), + file(params.famtbi, checkIfExists:true) + ] + input[1] = [[],[]] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("homo_sapiens - vcf, tbi, [] - stub") { + + options "-stub" + config "./nextflow.config" + + when { + process { + """ + input[0] = [ + [ id:'test', child:'NA24385' ], // meta map + file(params.famvcf, checkIfExists:true), + file(params.famtbi, checkIfExists:true) + ] + input[1] = [[],[]] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } +} diff --git a/modules/local/updio/tests/main.nf.test.snap b/modules/local/updio/tests/main.nf.test.snap new file mode 100644 index 00000000..dc030bb0 --- /dev/null +++ b/modules/local/updio/tests/main.nf.test.snap @@ -0,0 +1,92 @@ +{ + "homo_sapiens - vcf, tbi, [] - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "child": "NA24385" + }, + [ + "NA24385.events_list:md5,d41d8cd98f00b204e9800998ecf8427e", + "NA24385.log:md5,d41d8cd98f00b204e9800998ecf8427e", + "NA24385.table:md5,d41d8cd98f00b204e9800998ecf8427e", + "NA24385.upd:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "1": [ + "versions.yml:md5,8a807f51c31f64de1c707210fabe7029" + ], + "updio": [ + [ + { + "id": "test", + "child": "NA24385" + }, + [ + "NA24385.events_list:md5,d41d8cd98f00b204e9800998ecf8427e", + "NA24385.log:md5,d41d8cd98f00b204e9800998ecf8427e", + "NA24385.table:md5,d41d8cd98f00b204e9800998ecf8427e", + "NA24385.upd:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "versions": [ + "versions.yml:md5,8a807f51c31f64de1c707210fabe7029" + ] + } + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-14T09:57:26.032830013" + }, + "homo_sapiens - vcf, tbi, []": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "child": "NA24385" + }, + [ + "NA24385.events_list:md5,4a26c133ef193d31eddfd18dae94f0a0", + "NA24385.log:md5,a40a24f379127a9cde7e40a1ce1032ec", + "NA24385.table:md5,ca8165fc7869a113ca034396de7cf579", + "NA24385.upd:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "1": [ + "versions.yml:md5,8a807f51c31f64de1c707210fabe7029" + ], + "updio": [ + [ + { + "id": "test", + "child": "NA24385" + }, + [ + "NA24385.events_list:md5,4a26c133ef193d31eddfd18dae94f0a0", + "NA24385.log:md5,a40a24f379127a9cde7e40a1ce1032ec", + "NA24385.table:md5,ca8165fc7869a113ca034396de7cf579", + "NA24385.upd:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ] + ], + "versions": [ + "versions.yml:md5,8a807f51c31f64de1c707210fabe7029" + ] + } + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": 
"2024-11-14T09:57:17.522197249" + } +} \ No newline at end of file diff --git a/modules/local/updio/tests/nextflow.config b/modules/local/updio/tests/nextflow.config new file mode 100644 index 00000000..ed9191f4 --- /dev/null +++ b/modules/local/updio/tests/nextflow.config @@ -0,0 +1,3 @@ +process { + ext.args = "--childID NA24385 --momID NA24149 --dadID NA24143" +} diff --git a/modules/nf-core/bcftools/annotate/bcftools-annotate.diff b/modules/nf-core/bcftools/annotate/bcftools-annotate.diff index 518cee56..045aa783 100644 --- a/modules/nf-core/bcftools/annotate/bcftools-annotate.diff +++ b/modules/nf-core/bcftools/annotate/bcftools-annotate.diff @@ -19,8 +19,516 @@ Changes in 'bcftools/annotate/main.nf': 'modules/nf-core/bcftools/annotate/tests/vcf.config' is unchanged 'modules/nf-core/bcftools/annotate/tests/vcf_gz_index_csi.config' is unchanged 'modules/nf-core/bcftools/annotate/tests/vcf_gz_index_tbi.config' is unchanged -'modules/nf-core/bcftools/annotate/tests/main.nf.test' is unchanged +Changes in 'bcftools/annotate/tests/main.nf.test': +--- modules/nf-core/bcftools/annotate/tests/main.nf.test ++++ modules/nf-core/bcftools/annotate/tests/main.nf.test +@@ -21,9 +21,9 @@ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), +- file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) +- ] +- input[1] = [] ++ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), ++ [] ++ ] + """ + } + } +@@ -52,9 +52,9 @@ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + [], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), +- file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) +- ] +- input[1] = [] ++ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), ++ [] ++ ] + """ + } + } +@@ -82,9 +82,9 @@ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), +- file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) +- ] +- input[1] = [] ++ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), ++ [] ++ ] + """ + } + } +@@ -116,9 +116,9 @@ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), +- file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) +- ] +- input[1] = [] ++ file(params.modules_testdata_base_path + 
'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), ++ [] ++ ] + """ + } + } +@@ -150,9 +150,9 @@ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), +- file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) +- ] +- input[1] = [] ++ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), ++ [] ++ ] + """ + } + } +@@ -178,17 +178,18 @@ + when { + process { + """ +- input[0] = [ ++ input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + [], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) +- ] +- input[1] = Channel.of( +- '##INFO=', +- '##INFO=' +- ).collectFile(name:"headers.vcf", newLine:true) ++ ]).join( ++ Channel.of( ++ '##INFO=', ++ '##INFO=' ++ ).collectFile(name:"headers.vcf", newLine:true) ++ ) + """ + } + } +@@ -218,9 +219,9 @@ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), +- file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) +- ] +- input[1] = [] ++ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), ++ [] ++ ] + """ + } + } +@@ -247,9 +248,9 @@ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), +- file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) +- ] +- input[1] = [] ++ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), ++ [] ++ ] + """ + } + } +@@ -277,9 +278,9 @@ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), +- file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) +- ] +- input[1] = [] ++ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), ++ [] ++ ] + """ + } + } +@@ -307,9 +308,9 @@ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 
'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), +- file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) +- ] +- input[1] = [] ++ file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), ++ [] ++ ] + """ + } + } + 'modules/nf-core/bcftools/annotate/tests/bcf.config' is unchanged 'modules/nf-core/bcftools/annotate/tests/vcf_gz_index.config' is unchanged -'modules/nf-core/bcftools/annotate/tests/main.nf.test.snap' is unchanged +Changes in 'bcftools/annotate/tests/main.nf.test.snap': +--- modules/nf-core/bcftools/annotate/tests/main.nf.test.snap ++++ modules/nf-core/bcftools/annotate/tests/main.nf.test.snap +@@ -1,26 +1,5 @@ + { +- "bcf": { +- "content": [ +- [ +- [ +- { +- "id": "test", +- "single_end": false +- }, +- "test_ann.bcf" +- ] +- ], +- [ +- "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" +- ] +- ], +- "meta": { +- "nf-test": "0.8.4", +- "nextflow": "24.04.2" +- }, +- "timestamp": "2024-06-12T16:39:33.331888" +- }, +- "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index": { ++ "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_csi": { + "content": [ + [ + [ +@@ -51,9 +30,9 @@ + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, +- "timestamp": "2024-08-15T10:07:59.658031137" +- }, +- "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_csi - stub": { ++ "timestamp": "2024-08-15T10:08:10.581301219" ++ }, ++ "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - stub": { + "content": [ + { + "0": [ +@@ -69,25 +48,13 @@ + + ], + "2": [ +- [ +- { +- "id": "test", +- "single_end": false +- }, +- "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" +- ] ++ + ], + "3": [ + "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" + ], + "csi": [ +- [ +- { +- "id": "test", +- "single_end": false +- }, +- "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" +- ] ++ + ], + "tbi": [ + +@@ -110,9 +77,78 @@ + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, +- "timestamp": "2024-08-15T10:09:05.096883418" +- }, +- "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_csi": { ++ "timestamp": "2024-08-15T10:08:43.975017625" ++ }, ++ "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_tbi": { ++ "content": [ ++ [ ++ [ ++ { ++ "id": "test", ++ "single_end": false ++ }, ++ "test_vcf.vcf.gz" ++ ] ++ ], ++ [ ++ [ ++ { ++ "id": "test", ++ "single_end": false ++ }, ++ "test_vcf.vcf.gz.tbi" ++ ] ++ ], ++ [ ++ ++ ], ++ [ ++ "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" ++ ] ++ ], ++ "meta": { ++ "nf-test": "0.8.4", ++ "nextflow": "24.04.2" ++ }, ++ "timestamp": "2024-08-15T10:08:21.354059092" ++ }, ++ "bcf": { ++ "content": [ ++ [ ++ ++ ], ++ [ ++ ++ ] ++ ], ++ "meta": { ++ "nf-test": "0.9.1", ++ "nextflow": "24.10.0" ++ }, ++ "timestamp": "2024-11-20T13:52:35.607526048" ++ }, ++ "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_output": { ++ "content": [ ++ [ ++ [ ++ { ++ "id": "test", ++ "single_end": false ++ }, ++ "test_vcf.vcf.gz" ++ ] ++ ], ++ [ ++ "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" ++ ] ++ ], ++ "meta": { ++ "nf-test": "0.8.4", ++ "nextflow": "24.04.2" ++ }, ++ "timestamp": "2024-08-15T10:07:37.788393317" ++ }, ++ "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index": { + "content": [ + [ + [ +@@ -143,9 
+179,30 @@ + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, +- "timestamp": "2024-08-15T10:08:10.581301219" +- }, +- "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - stub": { ++ "timestamp": "2024-08-15T10:07:59.658031137" ++ }, ++ "sarscov2 - [vcf, [], annotation, annotation_tbi], [] - vcf_output": { ++ "content": [ ++ [ ++ [ ++ { ++ "id": "test", ++ "single_end": false ++ }, ++ "test_vcf.vcf.gz" ++ ] ++ ], ++ [ ++ "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" ++ ] ++ ], ++ "meta": { ++ "nf-test": "0.8.4", ++ "nextflow": "24.04.2" ++ }, ++ "timestamp": "2024-08-15T10:07:48.500746325" ++ }, ++ "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_tbi - stub": { + "content": [ + { + "0": [ +@@ -158,7 +215,13 @@ + ] + ], + "1": [ +- ++ [ ++ { ++ "id": "test", ++ "single_end": false ++ }, ++ "test_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" ++ ] + ], + "2": [ + +@@ -170,7 +233,13 @@ + + ], + "tbi": [ +- ++ [ ++ { ++ "id": "test", ++ "single_end": false ++ }, ++ "test_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" ++ ] + ], + "vcf": [ + [ +@@ -190,84 +259,9 @@ + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, +- "timestamp": "2024-08-15T10:08:43.975017625" +- }, +- "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_tbi": { +- "content": [ +- [ +- [ +- { +- "id": "test", +- "single_end": false +- }, +- "test_vcf.vcf.gz" +- ] +- ], +- [ +- [ +- { +- "id": "test", +- "single_end": false +- }, +- "test_vcf.vcf.gz.tbi" +- ] +- ], +- [ +- +- ], +- [ +- "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" +- ] +- ], +- "meta": { +- "nf-test": "0.8.4", +- "nextflow": "24.04.2" +- }, +- "timestamp": "2024-08-15T10:08:21.354059092" +- }, +- "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_output": { +- "content": [ +- [ +- [ +- { +- "id": "test", +- "single_end": false +- }, +- "test_vcf.vcf.gz" +- ] +- ], +- [ +- "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" +- ] +- ], +- "meta": { +- "nf-test": "0.8.4", +- "nextflow": "24.04.2" +- }, +- "timestamp": "2024-08-15T10:07:37.788393317" +- }, +- "sarscov2 - [vcf, [], annotation, annotation_tbi], [] - vcf_output": { +- "content": [ +- [ +- [ +- { +- "id": "test", +- "single_end": false +- }, +- "test_vcf.vcf.gz" +- ] +- ], +- [ +- "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" +- ] +- ], +- "meta": { +- "nf-test": "0.8.4", +- "nextflow": "24.04.2" +- }, +- "timestamp": "2024-08-15T10:07:48.500746325" +- }, +- "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_tbi - stub": { ++ "timestamp": "2024-08-15T10:09:16.094918834" ++ }, ++ "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_csi - stub": { + "content": [ + { + "0": [ +@@ -280,31 +274,31 @@ + ] + ], + "1": [ +- [ +- { +- "id": "test", +- "single_end": false +- }, +- "test_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" +- ] ++ + ], + "2": [ +- ++ [ ++ { ++ "id": "test", ++ "single_end": false ++ }, ++ "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" ++ ] + ], + "3": [ + "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" + ], + "csi": [ +- ++ [ ++ { ++ "id": "test", ++ "single_end": false ++ }, ++ "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" ++ ] + ], + "tbi": [ +- [ +- { +- "id": "test", +- "single_end": false +- }, +- "test_vcf.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" +- ] ++ + ], + "vcf": [ + [ +@@ -324,7 +318,7 @@ + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, +- "timestamp": "2024-08-15T10:09:16.094918834" ++ "timestamp": 
"2024-08-15T10:09:05.096883418" + }, + "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index - stub": { + "content": [ + ************************************************************ diff --git a/modules/nf-core/bcftools/annotate/main.nf b/modules/nf-core/bcftools/annotate/main.nf index 2cfe29a1..890aa8c5 100644 --- a/modules/nf-core/bcftools/annotate/main.nf +++ b/modules/nf-core/bcftools/annotate/main.nf @@ -58,12 +58,12 @@ process BCFTOOLS_ANNOTATE { args.contains("--output-type z") || args.contains("-Oz") ? "vcf.gz" : args.contains("--output-type v") || args.contains("-Ov") ? "vcf" : "vcf" - def index = args.contains("--write-index=tbi") || args.contains("-W=tbi") ? "tbi" : - args.contains("--write-index=csi") || args.contains("-W=csi") ? "csi" : - args.contains("--write-index") || args.contains("-W") ? "csi" : - "" + def index_extension = args.contains("--write-index=tbi") || args.contains("-W=tbi") ? "tbi" : + args.contains("--write-index=csi") || args.contains("-W=csi") ? "csi" : + args.contains("--write-index") || args.contains("-W") ? "csi" : + "" def create_cmd = extension.endsWith(".gz") ? "echo '' | gzip >" : "touch" - def create_index = extension.endsWith(".gz") && index.matches("csi|tbi") ? "touch ${prefix}.${extension}.${index}" : "" + def create_index = extension.endsWith(".gz") && index_extension.matches("csi|tbi") ? "touch ${prefix}.${extension}.${index_extension}" : "" """ ${create_cmd} ${prefix}.${extension} diff --git a/modules/nf-core/bcftools/annotate/meta.yml b/modules/nf-core/bcftools/annotate/meta.yml index 248eee0c..5bfccd2b 100644 --- a/modules/nf-core/bcftools/annotate/meta.yml +++ b/modules/nf-core/bcftools/annotate/meta.yml @@ -13,49 +13,64 @@ tools: documentation: https://samtools.github.io/bcftools/bcftools.html#annotate doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:bcftools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: Query VCF or BCF file, can be either uncompressed or compressed - - index: - type: file - description: Index of the query VCF or BCF file - - annotations: - type: file - description: Bgzip-compressed file with annotations - - annotations_index: - type: file - description: Index of the annotations file - - header_lines: - type: file - description: Contains lines to append to the output VCF header + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: Query VCF or BCF file, can be either uncompressed or compressed + - index: + type: file + description: Index of the query VCF or BCF file + - annotations: + type: file + description: Bgzip-compressed file with annotations + - annotations_index: + type: file + description: Index of the annotations file + - - header_lines: + type: file + description: Contains lines to append to the output VCF header output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - vcf: - type: file - description: Compressed annotated VCF file - pattern: "*{vcf,vcf.gz,bcf,bcf.gz}" - - csi: - type: file - description: Default VCF file index - pattern: "*.csi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.{vcf,vcf.gz,bcf,bcf.gz}": + type: file + description: Compressed annotated VCF file + pattern: "*{vcf,vcf.gz,bcf,bcf.gz}" - tbi: - type: file - description: Alternative VCF file index - pattern: "*.tbi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tbi": + type: file + description: Alternative VCF file index + pattern: "*.tbi" + - csi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.csi": + type: file + description: Default VCF file index + pattern: "*.csi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@projectoriented" - "@ramprasadn" diff --git a/modules/nf-core/bcftools/annotate/tests/main.nf.test b/modules/nf-core/bcftools/annotate/tests/main.nf.test index 3a5c4933..ed21a14d 100644 --- a/modules/nf-core/bcftools/annotate/tests/main.nf.test +++ b/modules/nf-core/bcftools/annotate/tests/main.nf.test @@ -21,9 +21,9 @@ nextflow_process { file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), + [] ] - input[1] = [] """ } } @@ -52,9 +52,9 @@ nextflow_process { file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), [], file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), + [] ] - input[1] = [] """ } } @@ -82,9 +82,9 @@ nextflow_process { file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), + [] ] - input[1] = [] """ } } @@ -116,9 +116,9 @@ nextflow_process { file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', 
checkIfExists: true), + [] ] - input[1] = [] """ } } @@ -150,9 +150,9 @@ nextflow_process { file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), + [] ] - input[1] = [] """ } } @@ -178,17 +178,18 @@ nextflow_process { when { process { """ - input[0] = [ + input[0] = Channel.of([ [ id:'test', single_end:false ], // meta map file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), [], file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) - ] - input[1] = Channel.of( - '##INFO=', - '##INFO=' - ).collectFile(name:"headers.vcf", newLine:true) + ]).join( + Channel.of( + '##INFO=', + '##INFO=' + ).collectFile(name:"headers.vcf", newLine:true) + ) """ } } @@ -218,9 +219,9 @@ nextflow_process { file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), + [] ] - input[1] = [] """ } } @@ -247,9 +248,9 @@ nextflow_process { file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), + [] ] - input[1] = [] """ } } @@ -277,9 +278,9 @@ nextflow_process { file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), + [] ] - input[1] = [] """ } } @@ -307,9 +308,9 @@ nextflow_process { file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true), 
file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), + [] ] - input[1] = [] """ } } diff --git a/modules/nf-core/bcftools/annotate/tests/main.nf.test.snap b/modules/nf-core/bcftools/annotate/tests/main.nf.test.snap index bac2224a..d7cf1c01 100644 --- a/modules/nf-core/bcftools/annotate/tests/main.nf.test.snap +++ b/modules/nf-core/bcftools/annotate/tests/main.nf.test.snap @@ -1,26 +1,5 @@ { - "bcf": { - "content": [ - [ - [ - { - "id": "test", - "single_end": false - }, - "test_ann.bcf" - ] - ], - [ - "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" - }, - "timestamp": "2024-06-12T16:39:33.331888" - }, - "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index": { + "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_csi": { "content": [ [ [ @@ -51,9 +30,9 @@ "nf-test": "0.8.4", "nextflow": "24.04.2" }, - "timestamp": "2024-08-15T10:07:59.658031137" + "timestamp": "2024-08-15T10:08:10.581301219" }, - "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_csi - stub": { + "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - stub": { "content": [ { "0": [ @@ -69,25 +48,13 @@ ], "2": [ - [ - { - "id": "test", - "single_end": false - }, - "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" - ] + ], "3": [ "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" ], "csi": [ - [ - { - "id": "test", - "single_end": false - }, - "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" - ] + ], "tbi": [ @@ -110,9 +77,9 @@ "nf-test": "0.8.4", "nextflow": "24.04.2" }, - "timestamp": "2024-08-15T10:09:05.096883418" + "timestamp": "2024-08-15T10:08:43.975017625" }, - "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_csi": { + "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_tbi": { "content": [ [ [ @@ -122,9 +89,6 @@ }, "test_vcf.vcf.gz" ] - ], - [ - ], [ [ @@ -132,8 +96,11 @@ "id": "test", "single_end": false }, - "test_vcf.vcf.gz.csi" + "test_vcf.vcf.gz.tbi" ] + ], + [ + ], [ "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" @@ -143,56 +110,24 @@ "nf-test": "0.8.4", "nextflow": "24.04.2" }, - "timestamp": "2024-08-15T10:08:10.581301219" + "timestamp": "2024-08-15T10:08:21.354059092" }, - "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - stub": { + "bcf": { "content": [ - { - "0": [ - [ - { - "id": "test", - "single_end": false - }, - "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" - ] - ], - "1": [ - - ], - "2": [ - - ], - "3": [ - "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" - ], - "csi": [ - - ], - "tbi": [ - - ], - "vcf": [ - [ - { - "id": "test", - "single_end": false - }, - "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" - ] - ], - "versions": [ - "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" - ] - } + [ + + ], + [ + + ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-08-15T10:08:43.975017625" + "timestamp": "2024-11-20T13:52:35.607526048" }, - "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_tbi": { + "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_output": { "content": [ [ [ @@ -202,18 +137,6 @@ }, 
"test_vcf.vcf.gz" ] - ], - [ - [ - { - "id": "test", - "single_end": false - }, - "test_vcf.vcf.gz.tbi" - ] - ], - [ - ], [ "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" @@ -223,9 +146,9 @@ "nf-test": "0.8.4", "nextflow": "24.04.2" }, - "timestamp": "2024-08-15T10:08:21.354059092" + "timestamp": "2024-08-15T10:07:37.788393317" }, - "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_output": { + "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index": { "content": [ [ [ @@ -236,6 +159,18 @@ "test_vcf.vcf.gz" ] ], + [ + + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.csi" + ] + ], [ "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" ] @@ -244,7 +179,7 @@ "nf-test": "0.8.4", "nextflow": "24.04.2" }, - "timestamp": "2024-08-15T10:07:37.788393317" + "timestamp": "2024-08-15T10:07:59.658031137" }, "sarscov2 - [vcf, [], annotation, annotation_tbi], [] - vcf_output": { "content": [ @@ -326,6 +261,65 @@ }, "timestamp": "2024-08-15T10:09:16.094918834" }, + "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index_csi - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + + ], + "2": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "3": [ + "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" + ], + "csi": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz.csi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "test", + "single_end": false + }, + "test_vcf.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,ea53f98610d42597cf384ff1fa3eb204" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.2" + }, + "timestamp": "2024-08-15T10:09:05.096883418" + }, "sarscov2 - [vcf, tbi, annotation, annotation_tbi], [] - vcf_gz_index - stub": { "content": [ { diff --git a/modules/nf-core/bcftools/filter/main.nf b/modules/nf-core/bcftools/filter/main.nf index 8f92c51a..36cbf8c2 100644 --- a/modules/nf-core/bcftools/filter/main.nf +++ b/modules/nf-core/bcftools/filter/main.nf @@ -8,7 +8,7 @@ process BCFTOOLS_FILTER { 'biocontainers/bcftools:1.20--h8b25389_0' }" input: - tuple val(meta), path(vcf) + tuple val(meta), path(vcf), path(tbi) output: tuple val(meta), path("*.${extension}"), emit: vcf diff --git a/modules/nf-core/bcftools/filter/meta.yml b/modules/nf-core/bcftools/filter/meta.yml index d67c0257..d72f2755 100644 --- a/modules/nf-core/bcftools/filter/meta.yml +++ b/modules/nf-core/bcftools/filter/meta.yml @@ -12,38 +12,57 @@ tools: documentation: http://www.htslib.org/doc/bcftools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:bcftools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: VCF input file - pattern: "*.{vcf,bcf,vcf.gz,bcf.gz}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: VCF input file + pattern: "*.{vcf,bcf,vcf.gz,bcf.gz}" + - tbi: + type: file + description: VCF index file + pattern: "*.tbi" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - vcf: - type: file - description: VCF filtered output file - pattern: "*.{vcf,bcf,vcf.gz,bcf.gz}" - - csi: - type: file - description: Default VCF file index - pattern: "*.csi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.${extension}": + type: file + description: VCF filtered output file + pattern: "*.{vcf,bcf,vcf.gz,bcf.gz}" - tbi: - type: file - description: Alternative VCF file index - pattern: "*.tbi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tbi": + type: file + description: Alternative VCF file index + pattern: "*.tbi" + - csi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.csi": + type: file + description: Default VCF file index + pattern: "*.csi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/bcftools/filter/tests/main.nf.test b/modules/nf-core/bcftools/filter/tests/main.nf.test index 646f37ad..fadff0e3 100644 --- a/modules/nf-core/bcftools/filter/tests/main.nf.test +++ b/modules/nf-core/bcftools/filter/tests/main.nf.test @@ -18,7 +18,8 @@ nextflow_process { """ input[0] = [ [id:"vcf_test"], - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true), + [] ] """ } @@ -42,7 +43,8 @@ nextflow_process { """ input[0] = [ [id:"vcf_test"], - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true), + [] ] """ } @@ -72,7 +74,8 @@ nextflow_process { """ input[0] = [ [id:"vcf_test"], - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true), + [] ] """ } @@ -102,7 +105,8 @@ nextflow_process { """ input[0] = [ [id:"vcf_test"], - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true), + [] ] """ } @@ -132,7 +136,8 @@ nextflow_process { """ input[0] = [ [id:"bcf_test"], - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true), + [] ] """ } @@ -147,6 +152,31 @@ nextflow_process { } + test("sarscov2 - vcf.gz, tbi - region filter") { + + config "./region_filter.config" + + when { + process { + """ + input[0] = [ + [id:"bcf_test"], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf.gz.tbi', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match("region filter") } + ) + } + + } + test("sarscov2 - vcf - stub") { 
config "./nextflow.config" @@ -157,7 +187,8 @@ nextflow_process { """ input[0] = [ [id:"vcf_test"], - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true), + [] ] """ } @@ -182,7 +213,8 @@ nextflow_process { """ input[0] = [ [id:"vcf_test"], - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true), + [] ] """ } @@ -208,7 +240,8 @@ nextflow_process { """ input[0] = [ [id:"vcf_test"], - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true), + [] ] """ } @@ -234,7 +267,8 @@ nextflow_process { """ input[0] = [ [id:"vcf_test"], - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true), + [] ] """ } diff --git a/modules/nf-core/bcftools/filter/tests/main.nf.test.snap b/modules/nf-core/bcftools/filter/tests/main.nf.test.snap index 3d7605f2..640907e4 100644 --- a/modules/nf-core/bcftools/filter/tests/main.nf.test.snap +++ b/modules/nf-core/bcftools/filter/tests/main.nf.test.snap @@ -1,4 +1,49 @@ { + "region filter": { + "content": [ + { + "0": [ + [ + { + "id": "bcf_test" + }, + "bcf_test_vcf.vcf.gz:md5,8e722884ffb75155212a3fc053918766" + ] + ], + "1": [ + + ], + "2": [ + + ], + "3": [ + "versions.yml:md5,9a336d1ee26b527d7a2bdbeead155f64" + ], + "csi": [ + + ], + "tbi": [ + + ], + "vcf": [ + [ + { + "id": "bcf_test" + }, + "bcf_test_vcf.vcf.gz:md5,8e722884ffb75155212a3fc053918766" + ] + ], + "versions": [ + "versions.yml:md5,9a336d1ee26b527d7a2bdbeead155f64" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-08T09:14:47.394005264" + }, "sarscov2 - vcf_gz_index_tbi - stub": { "content": [ { diff --git a/modules/nf-core/bcftools/filter/tests/region_filter.config b/modules/nf-core/bcftools/filter/tests/region_filter.config new file mode 100644 index 00000000..b18fb4bf --- /dev/null +++ b/modules/nf-core/bcftools/filter/tests/region_filter.config @@ -0,0 +1,4 @@ +process { + ext.prefix = { "${meta.id}_vcf" } + ext.args = "--output-type z -r MT192765.1 --no-version" +} diff --git a/modules/nf-core/bcftools/norm/meta.yml b/modules/nf-core/bcftools/norm/meta.yml index a0cdeaf1..b6edeb4a 100644 --- a/modules/nf-core/bcftools/norm/meta.yml +++ b/modules/nf-core/bcftools/norm/meta.yml @@ -13,54 +13,70 @@ tools: documentation: http://www.htslib.org/doc/bcftools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:bcftools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: | - The vcf file to be normalized - e.g. 'file1.vcf' - pattern: "*.{vcf,vcf.gz}" - - tbi: - type: file - description: | - An optional index of the VCF file (for when the VCF is compressed) - pattern: "*.vcf.gz.tbi" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. 
[ id:'genome' ] - - fasta: - type: file - description: FASTA reference file - pattern: "*.{fasta,fa}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: | + The vcf file to be normalized + e.g. 'file1.vcf' + pattern: "*.{vcf,vcf.gz}" + - tbi: + type: file + description: | + An optional index of the VCF file (for when the VCF is compressed) + pattern: "*.vcf.gz.tbi" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fasta: + type: file + description: FASTA reference file + pattern: "*.{fasta,fa}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - vcf: - type: file - description: One of uncompressed VCF (.vcf), compressed VCF (.vcf.gz), compressed BCF (.bcf.gz) or uncompressed BCF (.bcf) normalized output file - pattern: "*.{vcf,vcf.gz,bcf,bcf.gz}" - - csi: - type: file - description: Default VCF file index - pattern: "*.csi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.{vcf,vcf.gz,bcf,bcf.gz}": + type: file + description: One of uncompressed VCF (.vcf), compressed VCF (.vcf.gz), compressed + BCF (.bcf.gz) or uncompressed BCF (.bcf) normalized output file + pattern: "*.{vcf,vcf.gz,bcf,bcf.gz}" - tbi: - type: file - description: Alternative VCF file index - pattern: "*.tbi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tbi": + type: file + description: Alternative VCF file index + pattern: "*.tbi" + - csi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.csi": + type: file + description: Default VCF file index + pattern: "*.csi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@abhi18av" - "@ramprasadn" diff --git a/modules/nf-core/bcftools/pluginscatter/meta.yml b/modules/nf-core/bcftools/pluginscatter/meta.yml index 71805c03..5a31dacc 100644 --- a/modules/nf-core/bcftools/pluginscatter/meta.yml +++ b/modules/nf-core/bcftools/pluginscatter/meta.yml @@ -1,4 +1,3 @@ ---- # yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/yaml-schema.json name: "bcftools_pluginscatter" description: Split VCF by chunks or regions, creating multiple VCFs. @@ -15,65 +14,81 @@ tools: documentation: http://samtools.github.io/bcftools/bcftools.html#reheader doi: 10.1093/gigascience/giab008 licence: ["MIT"] + identifier: biotools:bcftools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - vcf: - type: file - description: The input VCF to scatter - pattern: "*.{vcf,vcf.gz,bcf,bcf.gz}" - - tbi: - type: file - description: Optional index of the input VCF - pattern: "*.tbi" - - sites_per_chunk: - type: integer - description: | - How many variants should be in each output file - Either this or `scatter` or `scatter_file` have to be given - - scatter: - type: string - description: | - A comma delimited list of regions to scatter into - Either this or `sites_per_chunk` or `scatter_file` have to be given - - scatter_file: - type: file - description: | - A file containing a region on each line with an optional second column containing the filename - Either this or `sites_per_chunk` or `scatter` have to be given - - regions: - type: file - description: Optional file containing the regions to work on - pattern: "*.bed" - - targets: - type: file - description: Optional file containing the regions to work on (but streams instead of index-jumping) - pattern: "*.bed" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: The input VCF to scatter + pattern: "*.{vcf,vcf.gz,bcf,bcf.gz}" + - tbi: + type: file + description: Optional index of the input VCF + pattern: "*.tbi" + - - sites_per_chunk: + type: integer + description: | + How many variants should be in each output file + Either this or `scatter` or `scatter_file` have to be given + - - scatter: + type: string + description: | + A comma delimited list of regions to scatter into + Either this or `sites_per_chunk` or `scatter_file` have to be given + - - scatter_file: + type: file + description: | + A file containing a region on each line with an optional second column containing the filename + Either this or `sites_per_chunk` or `scatter` have to be given + - - regions: + type: file + description: Optional file containing the regions to work on + pattern: "*.bed" + - - targets: + type: file + description: Optional file containing the regions to work on (but streams instead + of index-jumping) + pattern: "*.bed" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - scatter: - type: file - description: The resulting files of the scattering - pattern: "*.{vcf,vcf.gz,bcf,bcf.gz}" - - csi: - type: file - description: Default VCF file index - pattern: "*.csi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*{vcf,vcf.gz,bcf,bcf.gz}": + type: file + description: The resulting files of the scattering + pattern: "*.{vcf,vcf.gz,bcf,bcf.gz}" - tbi: - type: file - description: Alternative VCF file index - pattern: "*.tbi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tbi": + type: file + description: Alternative VCF file index + pattern: "*.tbi" + - csi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ]
+      - "*.csi":
+          type: file
+          description: Default VCF file index
+          pattern: "*.csi"
+  - versions:
+      - versions.yml:
+          type: file
+          description: File containing software versions
+          pattern: "versions.yml"
 authors:
   - "@nvnieuwk"
 maintainers:
diff --git a/modules/nf-core/bcftools/query/meta.yml b/modules/nf-core/bcftools/query/meta.yml
index 303ef610..279b3205 100644
--- a/modules/nf-core/bcftools/query/meta.yml
+++ b/modules/nf-core/bcftools/query/meta.yml
@@ -1,5 +1,6 @@
 name: bcftools_query
-description: Extracts fields from VCF or BCF files and outputs them in user-defined format.
+description: Extracts fields from VCF or BCF files and outputs them in user-defined
+  format.
 keywords:
   - query
   - variant calling
@@ -13,48 +14,51 @@
       documentation: http://www.htslib.org/doc/bcftools.html
       doi: 10.1093/bioinformatics/btp352
       licence: ["MIT"]
+      identifier: biotools:bcftools
 input:
-  - meta:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g. [ id:'test', single_end:false ]
-  - vcf:
-      type: file
-      description: |
-        The vcf file to be qeuried.
-      pattern: "*.{vcf.gz, vcf}"
-  - tbi:
-      type: file
-      description: |
-        The tab index for the VCF file to be inspected.
-      pattern: "*.tbi"
-  - regions:
-      type: file
-      description: |
-        Optionally, restrict the operation to regions listed in this file.
-  - targets:
-      type: file
-      description: |
-        Optionally, restrict the operation to regions listed in this file (doesn't rely upon index files)
-  - samples:
-      type: file
-      description: |
-        Optional, file of sample names to be included or excluded.
-        e.g. 'file.tsv'
+  - - meta:
+        type: map
+        description: |
+          Groovy Map containing sample information
+          e.g. [ id:'test', single_end:false ]
+    - vcf:
+        type: file
+        description: |
+          The vcf file to be queried.
+        pattern: "*.{vcf.gz, vcf}"
+    - tbi:
+        type: file
+        description: |
+          The tab index for the VCF file to be inspected.
+        pattern: "*.tbi"
+  - - regions:
+        type: file
+        description: |
+          Optionally, restrict the operation to regions listed in this file.
+  - - targets:
+        type: file
+        description: |
+          Optionally, restrict the operation to regions listed in this file (doesn't rely upon index files)
+  - - samples:
+        type: file
+        description: |
+          Optional, file of sample names to be included or excluded.
+          e.g. 'file.tsv'
 output:
-  - meta:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g. [ id:'test', single_end:false ]
   - output:
-      type: file
-      description: BCFTools query output file
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+      - "*.${suffix}":
+          type: file
+          description: BCFTools query output file
   - versions:
-      type: file
-      description: File containing software versions
-      pattern: "versions.yml"
+      - versions.yml:
+          type: file
+          description: File containing software versions
+          pattern: "versions.yml"
 authors:
   - "@abhi18av"
   - "@drpatelh"
diff --git a/modules/nf-core/bcftools/reheader/meta.yml b/modules/nf-core/bcftools/reheader/meta.yml
index d903cc0f..47e5344c 100644
--- a/modules/nf-core/bcftools/reheader/meta.yml
+++ b/modules/nf-core/bcftools/reheader/meta.yml
@@ -12,51 +12,60 @@
       documentation: http://samtools.github.io/bcftools/bcftools.html#reheader
       doi: 10.1093/gigascience/giab008
       licence: ["MIT"]
+      identifier: biotools:bcftools
 input:
-  - meta:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g.
[ id:'test', single_end:false ] - - vcf: - type: file - description: VCF/BCF file - pattern: "*.{vcf.gz,vcf,bcf}" - - header: - type: file - description: New header to add to the VCF - pattern: "*.{header.txt}" - - samples: - type: file - description: File containing sample names to update (one sample per line) - pattern: "*.{samples.txt}" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fai: - type: file - description: Fasta index to update header sequences with - pattern: "*.{fai}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: VCF/BCF file + pattern: "*.{vcf.gz,vcf,bcf}" + - header: + type: file + description: New header to add to the VCF + pattern: "*.{header.txt}" + - samples: + type: file + description: File containing sample names to update (one sample per line) + pattern: "*.{samples.txt}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fai: + type: file + description: Fasta index to update header sequences with + pattern: "*.{fai}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - vcf: - type: file - description: VCF with updated header, bgzipped per default - pattern: "*.{vcf,vcf.gz,bcf,bcf.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.{vcf,vcf.gz,bcf,bcf.gz}": + type: file + description: VCF with updated header, bgzipped per default + pattern: "*.{vcf,vcf.gz,bcf,bcf.gz}" - index: - type: file - description: Index of VCF with updated header - pattern: "*.{csi,tbi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.{csi,tbi}": + type: file + description: Index of VCF with updated header + pattern: "*.{csi,tbi}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@bjohnnyd" - "@jemten" diff --git a/modules/nf-core/bcftools/sort/meta.yml b/modules/nf-core/bcftools/sort/meta.yml index dfbddcba..f7a6eff1 100644 --- a/modules/nf-core/bcftools/sort/meta.yml +++ b/modules/nf-core/bcftools/sort/meta.yml @@ -12,38 +12,53 @@ tools: tool_dev_url: https://github.com/samtools/bcftools doi: "10.1093/bioinformatics/btp352" licence: ["MIT"] + identifier: biotools:bcftools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: The VCF/BCF file to be sorted - pattern: "*.{vcf.gz,vcf,bcf}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: The VCF/BCF file to be sorted + pattern: "*.{vcf.gz,vcf,bcf}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - vcf: - type: file - description: Sorted VCF file - pattern: "*.{vcf.gz}" - - csi: - type: file - description: Default VCF file index - pattern: "*.csi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.{vcf,vcf.gz,bcf,bcf.gz}": + type: file + description: Sorted VCF file + pattern: "*.{vcf.gz}" - tbi: - type: file - description: Alternative VCF file index - pattern: "*.tbi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tbi": + type: file + description: Alternative VCF file index + pattern: "*.tbi" + - csi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.csi": + type: file + description: Default VCF file index + pattern: "*.csi" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@Gwennid" maintainers: diff --git a/modules/nf-core/bcftools/stats/meta.yml b/modules/nf-core/bcftools/stats/meta.yml index 7ea2103e..655a61c5 100644 --- a/modules/nf-core/bcftools/stats/meta.yml +++ b/modules/nf-core/bcftools/stats/meta.yml @@ -13,58 +13,86 @@ tools: documentation: http://www.htslib.org/doc/bcftools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:bcftools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: VCF input file - pattern: "*.{vcf}" - - tbi: - type: file - description: | - The tab index for the VCF file to be inspected. Optional: only required when parameter regions is chosen. - pattern: "*.tbi" - - regions: - type: file - description: | - Optionally, restrict the operation to regions listed in this file. (VCF, BED or tab-delimited) - - targets: - type: file - description: | - Optionally, restrict the operation to regions listed in this file (doesn't rely upon tbi index files) - - samples: - type: file - description: | - Optional, file of sample names to be included or excluded. - e.g. 'file.tsv' - - exons: - type: file - description: | - Tab-delimited file with exons for indel frameshifts (chr,beg,end; 1-based, inclusive, optionally bgzip compressed). - e.g. 'exons.tsv.gz' - - fasta: - type: file - description: | - Faidx indexed reference sequence file to determine INDEL context. - e.g. 'reference.fa' + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: VCF input file + pattern: "*.{vcf}" + - tbi: + type: file + description: | + The tab index for the VCF file to be inspected. Optional: only required when parameter regions is chosen. + pattern: "*.tbi" + - - meta2: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - regions: + type: file + description: | + Optionally, restrict the operation to regions listed in this file. (VCF, BED or tab-delimited) + - - meta3: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - targets: + type: file + description: | + Optionally, restrict the operation to regions listed in this file (doesn't rely upon tbi index files) + - - meta4: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - samples: + type: file + description: | + Optional, file of sample names to be included or excluded. + e.g. 'file.tsv' + - - meta5: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - exons: + type: file + description: | + Tab-delimited file with exons for indel frameshifts (chr,beg,end; 1-based, inclusive, optionally bgzip compressed). + e.g. 'exons.tsv.gz' + - - meta6: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: | + Faidx indexed reference sequence file to determine INDEL context. + e.g. 'reference.fa' output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - stats: - type: file - description: Text output file containing stats - pattern: "*_{stats.txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*stats.txt": + type: file + description: Text output file containing stats + pattern: "*_{stats.txt}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/bedtools/intersect/meta.yml b/modules/nf-core/bedtools/intersect/meta.yml index 0939cb54..45ecf377 100644 --- a/modules/nf-core/bedtools/intersect/meta.yml +++ b/modules/nf-core/bedtools/intersect/meta.yml @@ -10,43 +10,47 @@ tools: A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types. documentation: https://bedtools.readthedocs.io/en/latest/content/tools/intersect.html licence: ["MIT"] + identifier: biotools:bedtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - intervals1: - type: file - description: BAM/BED/GFF/VCF - pattern: "*.{bam|bed|gff|vcf}" - - intervals2: - type: file - description: BAM/BED/GFF/VCF - pattern: "*.{bam|bed|gff|vcf}" - - meta2: - type: map - description: | - Groovy Map containing reference chromosome sizes - e.g. [ id:'test' ] - - chrom_sizes: - type: file - description: Chromosome sizes file - pattern: "*{.sizes,.txt}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - intervals1: + type: file + description: BAM/BED/GFF/VCF + pattern: "*.{bam|bed|gff|vcf}" + - intervals2: + type: file + description: BAM/BED/GFF/VCF + pattern: "*.{bam|bed|gff|vcf}" + - - meta2: + type: map + description: | + Groovy Map containing reference chromosome sizes + e.g. [ id:'test' ] + - chrom_sizes: + type: file + description: Chromosome sizes file + pattern: "*{.sizes,.txt}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - intersect: - type: file - description: File containing the description of overlaps found between the two features - pattern: "*.${extension}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.${extension}": + type: file + description: File containing the description of overlaps found between the two + features + pattern: "*.${extension}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@edmundmiller" - "@sruthipsuresh" diff --git a/modules/nf-core/bedtools/intersect/tests/main.nf.test b/modules/nf-core/bedtools/intersect/tests/main.nf.test new file mode 100644 index 00000000..cd770946 --- /dev/null +++ b/modules/nf-core/bedtools/intersect/tests/main.nf.test @@ -0,0 +1,90 @@ +nextflow_process { + + name "Test Process BEDTOOLS_INTERSECT" + script "../main.nf" + process "BEDTOOLS_INTERSECT" + config "./nextflow.config" + + tag "modules" + tag "modules_nfcore" + tag "bedtools" + tag "bedtools/intersect" + + test("sarscov2 - bed - bed") { + + when { + process { + """ + input[0] = [ + [ id:'test' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test.bed', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test2.bed', checkIfExists: true) + ] + + input[1] = [[:], []] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - bam - bam") { + + when { + process { + """ + input[0] = [ + [ id:'test' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.paired_end.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/baits.bed', checkIfExists: true) + ] + + input[1] = [[:], []] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("sarscov2 - bed - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test' ], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test.bed', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test2.bed', checkIfExists: true) + ] + + input[1] = [[:], []] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/bedtools/intersect/tests/main.nf.test.snap b/modules/nf-core/bedtools/intersect/tests/main.nf.test.snap new file mode 100644 index 00000000..b748dd49 --- /dev/null +++ b/modules/nf-core/bedtools/intersect/tests/main.nf.test.snap @@ -0,0 +1,101 @@ +{ + "sarscov2 - bam - bam": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_out.bam:md5,738324efe2b1e442ceb6539a630c3fe6" + ] + ], + "1": [ + "versions.yml:md5,42ba439339672f4a9193f0f0fe7a7f64" + ], + "intersect": [ + [ + { + "id": "test" + }, + "test_out.bam:md5,738324efe2b1e442ceb6539a630c3fe6" + ] + ], + "versions": [ + "versions.yml:md5,42ba439339672f4a9193f0f0fe7a7f64" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-17T20:55:57.454847668" + }, + "sarscov2 - bed - bed": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + 
"test_out.bed:md5,afcbf01c2f2013aad71dbe8e34f2c15c" + ] + ], + "1": [ + "versions.yml:md5,42ba439339672f4a9193f0f0fe7a7f64" + ], + "intersect": [ + [ + { + "id": "test" + }, + "test_out.bed:md5,afcbf01c2f2013aad71dbe8e34f2c15c" + ] + ], + "versions": [ + "versions.yml:md5,42ba439339672f4a9193f0f0fe7a7f64" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-17T20:55:49.072132931" + }, + "sarscov2 - bed - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test_out.bed:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,42ba439339672f4a9193f0f0fe7a7f64" + ], + "intersect": [ + [ + { + "id": "test" + }, + "test_out.bed:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,42ba439339672f4a9193f0f0fe7a7f64" + ] + } + ], + "meta": { + "nf-test": "0.8.4", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-17T20:56:06.259192552" + } +} \ No newline at end of file diff --git a/modules/nf-core/bedtools/intersect/tests/nextflow.config b/modules/nf-core/bedtools/intersect/tests/nextflow.config new file mode 100644 index 00000000..f1f9e693 --- /dev/null +++ b/modules/nf-core/bedtools/intersect/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: BEDTOOLS_INTERSECT { + ext.prefix = { "${meta.id}_out" } + } +} diff --git a/modules/nf-core/bedtools/intersect/tests/tags.yml b/modules/nf-core/bedtools/intersect/tests/tags.yml new file mode 100644 index 00000000..6219cc40 --- /dev/null +++ b/modules/nf-core/bedtools/intersect/tests/tags.yml @@ -0,0 +1,2 @@ +bedtools/intersect: + - "modules/nf-core/bedtools/intersect/**" diff --git a/modules/nf-core/bedtools/merge/meta.yml b/modules/nf-core/bedtools/merge/meta.yml index d7463e3d..6da54205 100644 --- a/modules/nf-core/bedtools/merge/meta.yml +++ b/modules/nf-core/bedtools/merge/meta.yml @@ -1,5 +1,6 @@ name: bedtools_merge -description: combines overlapping or “book-ended” features in an interval file into a single feature which spans all of the combined features. +description: combines overlapping or “book-ended” features in an interval file into + a single feature which spans all of the combined features. keywords: - bed - merge @@ -11,30 +12,33 @@ tools: A set of tools for genomic analysis tasks, specifically enabling genome arithmetic (merge, count, complement) on various file types. documentation: https://bedtools.readthedocs.io/en/latest/content/tools/merge.html licence: ["MIT"] + identifier: biotools:bedtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bed: - type: file - description: Input BED file - pattern: "*.{bed}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bed: + type: file + description: Input BED file + pattern: "*.{bed}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - bed: - type: file - description: Overlapped bed file with combined features - pattern: "*.{bed}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.bed": + type: file + description: Overlapped bed file with combined features + pattern: "*.{bed}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@edmundmiller" - "@sruthipsuresh" diff --git a/modules/nf-core/bedtools/split/main.nf b/modules/nf-core/bedtools/split/main.nf index b555024c..2b3af64f 100644 --- a/modules/nf-core/bedtools/split/main.nf +++ b/modules/nf-core/bedtools/split/main.nf @@ -38,9 +38,9 @@ process BEDTOOLS_SPLIT { stub: def prefix = task.ext.prefix ?: "${meta.id}" - create_beds = (1..count).collect { - number = "0".multiply(4 - it.toString().size()) + "${it}" - " touch ${prefix}.${number}.bed" + def create_beds = (1..count).collect { number -> + def numberString = "0".multiply(4 - number.toString().size()) + "${number}" + " touch ${prefix}.${numberString}.bed" }.join("\n") """ diff --git a/modules/nf-core/bedtools/split/meta.yml b/modules/nf-core/bedtools/split/meta.yml index 725bb9a2..7e126d22 100644 --- a/modules/nf-core/bedtools/split/meta.yml +++ b/modules/nf-core/bedtools/split/meta.yml @@ -9,30 +9,36 @@ tools: description: "A powerful toolset for genome arithmetic" documentation: "https://bedtools.readthedocs.io/en/latest/content/tools/sort.html" licence: ["MIT", "GPL v2"] + identifier: biotools:bedtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bed: - type: file - description: BED file - pattern: "*.bed" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bed: + type: file + description: BED file + pattern: "*.bed" + - count: + type: integer + description: Number of lines per split file output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - beds: - type: list - description: list of split BED files - pattern: "*.bed" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bed": + type: list + description: list of split BED files + pattern: "*.bed" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@nvnieuwk" maintainers: diff --git a/modules/nf-core/elprep/fastatoelfasta/environment.yml b/modules/nf-core/elprep/fastatoelfasta/environment.yml new file mode 100644 index 00000000..6ab3f8fc --- /dev/null +++ b/modules/nf-core/elprep/fastatoelfasta/environment.yml @@ -0,0 +1,7 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json +channels: + - conda-forge + - bioconda +dependencies: + - "bioconda::elprep=5.1.3" diff --git a/modules/nf-core/elprep/fastatoelfasta/main.nf b/modules/nf-core/elprep/fastatoelfasta/main.nf new file mode 100644 index 00000000..861350bf --- /dev/null +++ b/modules/nf-core/elprep/fastatoelfasta/main.nf @@ -0,0 +1,50 @@ +process ELPREP_FASTATOELFASTA { + tag "$meta.id" + label 'process_low' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/elprep:5.1.3--he881be0_1': + 'biocontainers/elprep:5.1.3--he881be0_1' }" + + input: + tuple val(meta), path(fasta) + + output: + tuple val(meta), path("*.elfasta") , emit: elfasta + tuple val(meta), path("logs/elprep/elprep*"), emit: log + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def prefix = task.ext.prefix ?: "${meta.id}" + """ + elprep fasta-to-elfasta \\ + $fasta \\ + ${prefix}.elfasta \\ + --log-path ./ + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + elprep: \$(elprep 2>&1 | head -n2 | tail -n1 |sed 's/^.*version //;s/ compiled.*\$//') + END_VERSIONS + """ + + stub: + def prefix = task.ext.prefix ?: "${meta.id}" + def timestamp = "${java.time.OffsetDateTime.now().format(java.time.format.DateTimeFormatter.ISO_DATE_TIME)}" + + """ + mkdir -p logs/elprep + + touch ${prefix}.elfasta + touch logs/elprep/elprep-${timestamp}.log + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + elprep: \$(elprep 2>&1 | head -n2 | tail -n1 |sed 's/^.*version //;s/ compiled.*\$//') + END_VERSIONS + """ +} diff --git a/modules/nf-core/elprep/fastatoelfasta/meta.yml b/modules/nf-core/elprep/fastatoelfasta/meta.yml new file mode 100644 index 00000000..41a8be31 --- /dev/null +++ b/modules/nf-core/elprep/fastatoelfasta/meta.yml @@ -0,0 +1,55 @@ +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/meta-schema.json +name: "elprep_fastatoelfasta" +description: Convert a file in FASTA format to the ELFASTA format +keywords: + - fasta + - elfasta + - elprep +tools: + - "elprep": + description: "elPrep is a high-performance tool for preparing .sam/.bam files + for variant calling in sequencing pipelines. It can be used as a drop-in replacement + for SAMtools/Picard/GATK4." + homepage: "https://github.com/ExaScience/elprep" + documentation: "https://github.com/ExaScience/elprep" + tool_dev_url: "https://github.com/ExaScience/elprep" + doi: "10.1371/journal.pone.0244471" + licence: ["AGPL v3"] + identifier: biotools:elprep + +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + - fasta: + type: file + description: FASTA file + pattern: "*.{fasta,fa,fna}" +output: + - elfasta: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. `[ id:'sample1', single_end:false ]` + pattern: "*.elfasta" + - "*.elfasta": + type: map + description: | + Groovy Map containing sample information + e.g. 
`[ id:'sample1', single_end:false ]` + pattern: "*.elfasta" + - log: + - meta: {} + - logs/elprep/elprep*: {} + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@nvnieuwk" +maintainers: + - "@nvnieuwk" diff --git a/modules/nf-core/elprep/fastatoelfasta/tests/main.nf.test b/modules/nf-core/elprep/fastatoelfasta/tests/main.nf.test new file mode 100644 index 00000000..d22f6d9d --- /dev/null +++ b/modules/nf-core/elprep/fastatoelfasta/tests/main.nf.test @@ -0,0 +1,66 @@ +nextflow_process { + + name "Test Process ELPREP_FASTATOELFASTA" + script "../main.nf" + process "ELPREP_FASTATOELFASTA" + + tag "modules" + tag "modules_nfcore" + tag "elprep" + tag "elprep/fastatoelfasta" + + test("sarscov2 - fasta") { + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true), + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.elfasta, + process.out.log.collect { [it[0], file(it[1]).exists()] }, + process.out.versions + ).match() } + ) + } + + } + + test("sarscov2 - fasta - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true), + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + process.out.elfasta, + process.out.log.collect { [it[0], file(it[1]).exists()] }, + process.out.versions + ).match() } + ) + } + + } + +} diff --git a/modules/nf-core/elprep/fastatoelfasta/tests/main.nf.test.snap b/modules/nf-core/elprep/fastatoelfasta/tests/main.nf.test.snap new file mode 100644 index 00000000..799bb0fb --- /dev/null +++ b/modules/nf-core/elprep/fastatoelfasta/tests/main.nf.test.snap @@ -0,0 +1,62 @@ +{ + "sarscov2 - fasta - stub": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test.elfasta:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + true + ] + ], + [ + "versions.yml:md5,bf313ed1289a8969464c5593b0ff67be" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-18T14:29:50.861439255" + }, + "sarscov2 - fasta": { + "content": [ + [ + [ + { + "id": "test", + "single_end": false + }, + "test.elfasta:md5,09a6f76bed84ee211ef0d962e26c77f1" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + true + ] + ], + [ + "versions.yml:md5,bf313ed1289a8969464c5593b0ff67be" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-18T14:25:24.238816922" + } +} \ No newline at end of file diff --git a/modules/nf-core/elprep/filter/elprep-filter.diff b/modules/nf-core/elprep/filter/elprep-filter.diff new file mode 100644 index 00000000..c877a871 --- /dev/null +++ b/modules/nf-core/elprep/filter/elprep-filter.diff @@ -0,0 +1,81 @@ +Changes in module 'nf-core/elprep/filter' +Changes in 'elprep/filter/main.nf': +--- modules/nf-core/elprep/filter/main.nf ++++ modules/nf-core/elprep/filter/main.nf +@@ -20,7 +20,6 @@ + + + output: +- tuple val(meta), path("*.{bam,sam}") , emit: bam + tuple val(meta), path("*.log") , emit: logs + tuple val(meta), path("*.metrics.txt") , optional: true, emit: metrics + tuple val(meta), path("*.recall") , optional: true, emit: recall +@@ -65,7 +64,7 @@ + 
if ("$bam" == "${prefix}.${suffix}") error "Input and output names are the same, set prefix in module configuration to disambiguate!" + + """ +- elprep filter ${bam} ${prefix}.${suffix} \\ ++ elprep filter ${bam} /dev/null \\ + ${reference_sequences_cmd} \\ + ${filter_regions_cmd} \\ + ${markdup_cmd} \\ +@@ -106,7 +105,6 @@ + if ("$bam" == "${prefix}.${suffix}") error "Input and output names are the same, set prefix in module configuration to disambiguate!" + + """ +- touch ${prefix}.${suffix} + touch elprep-${timestamp}.log + ${markdup_cmd} + ${bqsr_cmd} + +'modules/nf-core/elprep/filter/environment.yml' is unchanged +'modules/nf-core/elprep/filter/meta.yml' is unchanged +'modules/nf-core/elprep/filter/tests/main.nf.test' is unchanged +Changes in 'elprep/filter/tests/main.nf.test.snap': +--- modules/nf-core/elprep/filter/tests/main.nf.test.snap ++++ modules/nf-core/elprep/filter/tests/main.nf.test.snap +@@ -2,13 +2,7 @@ + "test-elprep-filter": { + "content": [ + [ +- [ +- { +- "id": "test", +- "single_end": false +- }, +- "test.bam,readsMD5:463ac3b905fbf4ddf113a94dbfa8d69f" +- ] ++ + ], + [ + +@@ -57,22 +51,14 @@ + ] + ], + "meta": { +- "nf-test": "0.9.0", +- "nextflow": "24.04.4" ++ "nf-test": "0.9.1", ++ "nextflow": "24.10.0" + }, +- "timestamp": "2024-10-22T11:05:45.927224502" ++ "timestamp": "2024-11-05T15:16:40.979143203" + }, + "test-elprep-filter-stub": { + "content": [ +- [ +- [ +- { +- "id": "test", +- "single_end": false +- }, +- "test.bam:md5,d41d8cd98f00b204e9800998ecf8427e" +- ] +- ], ++ null, + [ + + ], + +'modules/nf-core/elprep/filter/tests/nextflow.config' is unchanged +************************************************************ diff --git a/modules/nf-core/snpeff/snpeff/environment.yml b/modules/nf-core/elprep/filter/environment.yml similarity index 66% rename from modules/nf-core/snpeff/snpeff/environment.yml rename to modules/nf-core/elprep/filter/environment.yml index f2ad9251..38dd4f47 100644 --- a/modules/nf-core/snpeff/snpeff/environment.yml +++ b/modules/nf-core/elprep/filter/environment.yml @@ -2,4 +2,4 @@ channels: - conda-forge - bioconda dependencies: - - bioconda::snpeff=5.1 + - bioconda::elprep=5.1.3 diff --git a/modules/nf-core/elprep/filter/main.nf b/modules/nf-core/elprep/filter/main.nf new file mode 100644 index 00000000..df445339 --- /dev/null +++ b/modules/nf-core/elprep/filter/main.nf @@ -0,0 +1,121 @@ +process ELPREP_FILTER { + tag "$meta.id" + label 'process_high' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ 'https://depot.galaxyproject.org/singularity/elprep:5.1.3--he881be0_1': + 'biocontainers/elprep:5.1.3--he881be0_1' }" + + input: + tuple val(meta), path(bam), path(bai), path(target_regions_bed), path(filter_regions_bed), path(intermediate_bqsr_tables), path(recall_file) + tuple val(meta2), path(reference_sequences) + tuple val(meta3), path(reference_elfasta) + tuple val(meta4), path(known_sites_elsites) + val(run_haplotypecaller) + val(run_bqsr) + val(bqsr_tables_only) + val(get_activity_profile) + val(get_assembly_regions) + + + output: + tuple val(meta), path("*.log") , emit: logs + tuple val(meta), path("*.metrics.txt") , optional: true, emit: metrics + tuple val(meta), path("*.recall") , optional: true, emit: recall + tuple val(meta), path("*.vcf.gz") , optional: true, emit: gvcf + tuple val(meta), path("*.table") , optional: true, emit: table + tuple val(meta), path("*.activity_profile.igv") , optional: true, emit: activity_profile + tuple val(meta), path("*.assembly_regions.igv") , optional: true, emit: assembly_regions + path "versions.yml" , emit: versions + + when: + task.ext.when == null || task.ext.when + + script: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def suffix = args.contains("--output-type sam") ? "sam" : "bam" + + // filter args + def reference_sequences_cmd = reference_sequences ? "--replace-reference-sequences ${reference_sequences}" : "" + def filter_regions_cmd = filter_regions_bed ? "--filter-non-overlapping-reads ${filter_regions_bed}" : "" + + // markdup args + def markdup_cmd = args.contains("--mark-duplicates") ? "--mark-optical-duplicates ${prefix}.metrics.txt": "" + + // variant calling args + def haplotyper_cmd = run_haplotypecaller ? "--haplotypecaller ${prefix}.g.vcf.gz": "" + + def fasta_cmd = reference_elfasta ? "--reference ${reference_elfasta}": "" + def known_sites_cmd = known_sites_elsites ? "--known-sites ${known_sites_elsites}": "" + def target_regions_cmd = target_regions_bed ? "--target-regions ${target_regions_bed}": "" + + // bqsr args + def bqsr_cmd = run_bqsr ? "--bqsr ${prefix}.recall": "" + def bqsr_tables_only_cmd = bqsr_tables_only ? "--bqsr-tables-only ${prefix}.table": "" + + def intermediate_bqsr_cmd = intermediate_bqsr_tables ? "--bqsr-apply .": "" + def input_recall_cmd = recall_file ? "--recal-file $recall_file" : "" + // misc + def activity_profile_cmd = get_activity_profile ? "--activity-profile ${prefix}.activity_profile.igv": "" + def assembly_regions_cmd = get_assembly_regions ? "--assembly-regions ${prefix}.assembly_regions.igv": "" + + if ("$bam" == "${prefix}.${suffix}") error "Input and output names are the same, set prefix in module configuration to disambiguate!" + + """ + elprep filter ${bam} /dev/null \\ + ${reference_sequences_cmd} \\ + ${filter_regions_cmd} \\ + ${markdup_cmd} \\ + ${haplotyper_cmd} \\ + ${fasta_cmd} \\ + ${known_sites_cmd} \\ + ${target_regions_cmd} \\ + ${bqsr_cmd} \\ + ${bqsr_tables_only_cmd} \\ + ${intermediate_bqsr_cmd} \\ + ${input_recall_cmd} \\ + ${activity_profile_cmd} \\ + ${assembly_regions_cmd} \\ + --nr-of-threads ${task.cpus} \\ + --log-path ./ \\ + $args + + mv logs/elprep/*.log . + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + elprep: \$(elprep 2>&1 | head -n2 | tail -n1 |sed 's/^.*version //;s/ compiled.*\$//') + END_VERSIONS + """ + + stub: + def args = task.ext.args ?: '' + def prefix = task.ext.prefix ?: "${meta.id}" + def suffix = args.contains("--output-type sam") ? 
"sam" : "bam" + def timestamp = "${java.time.OffsetDateTime.now().format(java.time.format.DateTimeFormatter.ISO_DATE_TIME)}" + def markdup_cmd = args.contains("--mark-duplicates") ? "touch ${prefix}.metrics.txt": "" + def bqsr_cmd = run_bqsr ? "touch ${prefix}.recall": "" + def haplotyper_cmd = run_haplotypecaller ? "echo | gzip > ${prefix}.g.vcf.gz": "" + def bqsr_tables_only_cmd = bqsr_tables_only ? "echo | gzip > ${prefix}.table": "" + def activity_profile_cmd = get_activity_profile ? "touch ${prefix}.activity_profile.igv": "" + def assembly_regions_cmd = get_assembly_regions ? "touch ${prefix}.assembly_regions.igv": "" + + if ("$bam" == "${prefix}.${suffix}") error "Input and output names are the same, set prefix in module configuration to disambiguate!" + + """ + touch elprep-${timestamp}.log + ${markdup_cmd} + ${bqsr_cmd} + ${haplotyper_cmd} + ${bqsr_tables_only_cmd} + ${activity_profile_cmd} + ${assembly_regions_cmd} + + cat <<-END_VERSIONS > versions.yml + "${task.process}": + elprep: \$(elprep 2>&1 | head -n2 | tail -n1 |sed 's/^.*version //;s/ compiled.*\$//') + END_VERSIONS + """ +} diff --git a/modules/nf-core/elprep/filter/meta.yml b/modules/nf-core/elprep/filter/meta.yml new file mode 100644 index 00000000..2af3b8b1 --- /dev/null +++ b/modules/nf-core/elprep/filter/meta.yml @@ -0,0 +1,212 @@ +name: "elprep_filter" +description: "Filter, sort and markdup sam/bam files, with optional BQSR and variant + calling." +keywords: + - sort + - bam + - sam + - filter + - variant calling +tools: + - "elprep": + description: "elPrep is a high-performance tool for preparing .sam/.bam files + for variant calling in sequencing pipelines. It can be used as a drop-in replacement + for SAMtools/Picard/GATK4." + homepage: "https://github.com/ExaScience/elprep" + documentation: "https://github.com/ExaScience/elprep" + tool_dev_url: "https://github.com/ExaScience/elprep" + doi: "10.1371/journal.pone.0244471" + licence: ["AGPL v3"] + identifier: biotools:elprep +input: + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bam: + type: file + description: Input SAM/BAM file + pattern: "*.{bam,sam}" + - bai: + type: file + description: Input BAM file index + pattern: "*.bai" + - target_regions_bed: + type: file + description: Optional BED file containing target regions for BQSR and variant + calling. + pattern: "*.bed" + - filter_regions_bed: + type: file + description: Optional BED file containing regions to filter. + pattern: "*.bed" + - intermediate_bqsr_tables: + type: file + description: Optional list of BQSR tables, used when parsing files created by + `elprep split` + pattern: "*.table" + - recall_file: + type: file + description: Recall file with intermediate results for bqsr + pattern: "*.recall" + - - meta2: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - reference_sequences: + type: file + description: Optional SAM header to replace existing header. + pattern: "*.sam" + - - meta3: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - reference_elfasta: + type: file + description: Elfasta file, required for BQSR and variant calling. + pattern: "*.elfasta" + - - meta4: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - known_sites_elsites: + type: file + description: Optional elsites file containing known SNPs for BQSR. 
+        pattern: "*.elsites"
+  - - run_haplotypecaller:
+        type: boolean
+        description: Run variant calling on the input files. Needed to generate gvcf
+          output.
+  - - run_bqsr:
+        type: boolean
+        description: Run BQSR on the input files. Needed to generate recall metrics.
+  - - bqsr_tables_only:
+        type: boolean
+        description: Write intermediate BQSR tables, used when parsing files created
+          by `elprep split`.
+  - - get_activity_profile:
+        type: boolean
+        description: Write the activity profile calculated by the haplotypecaller to
+          the given file in IGV format.
+  - - get_assembly_regions:
+        type: boolean
+        description: Write the assembly regions calculated by the haplotypecaller to
+          the specified file in IGV format.
+output:
+  - bam:
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+          pattern: "*.{bam,sam}"
+      - "*.{bam,sam}":
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+          pattern: "*.{bam,sam}"
+  - logs:
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+          pattern: "elprep-*.log"
+      - "*.log":
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+          pattern: "elprep-*.log"
+  - metrics:
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+          pattern: "*.{metrics.txt}"
+      - "*.metrics.txt":
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+          pattern: "*.{metrics.txt}"
+  - recall:
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+          pattern: "*.{recall}"
+      - "*.recall":
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+          pattern: "*.{recall}"
+  - gvcf:
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+          pattern: "*.{vcf.gz}"
+      - "*.vcf.gz":
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+          pattern: "*.{vcf.gz}"
+  - table:
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+          pattern: "*.{table}"
+      - "*.table":
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+          pattern: "*.{table}"
+  - activity_profile:
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+          pattern: "*.{activity_profile.igv}"
+      - "*.activity_profile.igv":
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+          pattern: "*.{activity_profile.igv}"
+  - assembly_regions:
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g.
[ id:'test', single_end:false ] + pattern: "*.{assembly_regions.igv}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" +authors: + - "@matthdsm" +maintainers: + - "@matthdsm" diff --git a/modules/nf-core/elprep/filter/tests/main.nf.test b/modules/nf-core/elprep/filter/tests/main.nf.test new file mode 100644 index 00000000..84f6e16c --- /dev/null +++ b/modules/nf-core/elprep/filter/tests/main.nf.test @@ -0,0 +1,120 @@ + +nextflow_process { + + name "Test Process ELPREP_FILTER" + script "../main.nf" + process "ELPREP_FILTER" + config "./nextflow.config" + + tag "modules" + tag "modules_nfcore" + tag "elprep" + tag "elprep/filter" + + test("test-elprep-filter") { + + when { + process { + """ + input[0] = Channel.of([ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true), + [], + [], + [] + ]) // meta, bam, bai, target_regions, bqsr_table, recall + input[1] = [[],[]] // reference sequences + input[2] = [ + [ id:'elfasta' ], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.elfasta', checkIfExists: true) + ] // meta2, reference_elfasta + input[3] = [ + [ id: 'sites' ], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.elsites', checkIfExists: true) + ] // elsites + input[4] = true // haplotypecaller + input[5] = true // bqsr + input[6] = false // bqsr_tables_only + input[7] = true // get_activity_profile + input[8] = true // get_assembly_regions + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert process.out.logs }, // name is unstable + { assert snapshot( + process.out.bam.collect { [it[0], "${file(it[1]).name},readsMD5:${bam(it[1]).getReadsMD5()}"] }, + process.out.metrics.collect { [it[0], file(it[1]).readLines()[10..20]] }, + process.out.recall, + process.out.gvcf.collect { [ it[0], "${file(it[1]).name},variantsMD5:${path(it[1]).vcf.variantsMD5}" ] }, + process.out.table, + process.out.activity_profile, + process.out.assembly_regions, + process.out.versions + ).match() + } + ) + } + } + + test("test-elprep-filter-stub") { + options '-stub' + + when { + process { + """ + input[0] = [ + [ id:'test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true), + [], + [], + [] + ] + input[1] = [ + [ id:'ref_seq'], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.dict', checkIfExists: true) + ] // reference sequences + input[2] = [ + [ id:'elfasta' ], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.elfasta', checkIfExists: true) + ] // meta2, reference_elfasta + input[3] = [ + [ id: 'sites' ], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/vcf/dbsnp_146.hg38.elsites', checkIfExists: true) + ] // elsites + input[4] = true // haplotypecaller + 
input[5] = false // bqsr + input[6] = false // bqsr_tables_only + input[7] = true // get_activity_profile + input[8] = true // get_assembly_regions + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert process.out.logs }, // name is unstable + { assert snapshot( + process.out.bam, + process.out.metrics, + process.out.recall, + process.out.gvcf, + process.out.table, + process.out.activity_profile, + process.out.assembly_regions, + process.out.versions + ).match() + } + ) + } + } + +} diff --git a/modules/nf-core/elprep/filter/tests/main.nf.test.snap b/modules/nf-core/elprep/filter/tests/main.nf.test.snap new file mode 100644 index 00000000..d4848abc --- /dev/null +++ b/modules/nf-core/elprep/filter/tests/main.nf.test.snap @@ -0,0 +1,108 @@ +{ + "test-elprep-filter": { + "content": [ + [ + + ], + [ + + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.recall:md5,9a7921cc49a7a3f6c20e0278eaf3f235" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.g.vcf.gz,variantsMD5:b74f219f1f3ca2e59d6edfabf503a6a9" + ] + ], + [ + + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.activity_profile.igv:md5,c4b77c1bebcffd7822cafb8b90f70cde" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.assembly_regions.igv:md5,7ec2070b4d4af26532cffbc1c465ba93" + ] + ], + [ + "versions.yml:md5,8193703d0cedd662b76ea48940dac55d" + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-05T15:16:40.979143203" + }, + "test-elprep-filter-stub": { + "content": [ + null, + [ + + ], + [ + + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.g.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + [ + + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.activity_profile.igv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + [ + [ + { + "id": "test", + "single_end": false + }, + "test.assembly_regions.igv:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + [ + "versions.yml:md5,8193703d0cedd662b76ea48940dac55d" + ] + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-22T10:45:09.343805413" + } +} \ No newline at end of file diff --git a/modules/nf-core/elprep/filter/tests/nextflow.config b/modules/nf-core/elprep/filter/tests/nextflow.config new file mode 100644 index 00000000..bcb2dae0 --- /dev/null +++ b/modules/nf-core/elprep/filter/tests/nextflow.config @@ -0,0 +1,5 @@ +process { + withName: ELPREP_FILTER { + ext.args = "--reference-confidence GVCF" + } +} diff --git a/modules/nf-core/ensemblvep/download/ensemblvep-download.diff b/modules/nf-core/ensemblvep/download/ensemblvep-download.diff index 6d346d7b..a249d33d 100644 --- a/modules/nf-core/ensemblvep/download/ensemblvep-download.diff +++ b/modules/nf-core/ensemblvep/download/ensemblvep-download.diff @@ -14,21 +14,69 @@ Changes in 'ensemblvep/download/main.nf': input: tuple val(meta), val(assembly), val(species), val(cache_version) -Changes in 'ensemblvep/download/environment.yml': ---- modules/nf-core/ensemblvep/download/environment.yml -+++ modules/nf-core/ensemblvep/download/environment.yml -@@ -1,7 +1,5 @@ --name: ensemblvep_download - channels: - - conda-forge - - bioconda -- - defaults - dependencies: - - bioconda::ensembl-vep=112.0 - +'modules/nf-core/ensemblvep/download/environment.yml' is unchanged 'modules/nf-core/ensemblvep/download/meta.yml' is unchanged 'modules/nf-core/ensemblvep/download/tests/tags.yml' is unchanged 
'modules/nf-core/ensemblvep/download/tests/main.nf.test' is unchanged -'modules/nf-core/ensemblvep/download/tests/main.nf.test.snap' is unchanged +Changes in 'ensemblvep/download/tests/main.nf.test.snap': +--- modules/nf-core/ensemblvep/download/tests/main.nf.test.snap ++++ modules/nf-core/ensemblvep/download/tests/main.nf.test.snap +@@ -136,7 +136,7 @@ + ] + ], + "1": [ +- "versions.yml:md5,e32852e9cba2a298b7518ce610011b14" ++ "versions.yml:md5,44c3b4926fae35dfcf138d9bf26acfd1" + ], + "cache": [ + [ +@@ -272,15 +272,15 @@ + ] + ], + "versions": [ +- "versions.yml:md5,e32852e9cba2a298b7518ce610011b14" ++ "versions.yml:md5,44c3b4926fae35dfcf138d9bf26acfd1" + ] + } + ], + "meta": { +- "nf-test": "0.8.4", +- "nextflow": "24.04.4" ++ "nf-test": "0.9.1", ++ "nextflow": "24.10.0" + }, +- "timestamp": "2024-09-02T13:19:08.690863" ++ "timestamp": "2024-11-20T14:09:48.300197368" + }, + "celegans - download - stub": { + "content": [ +@@ -296,7 +296,7 @@ + ] + ], + "1": [ +- "versions.yml:md5,e32852e9cba2a298b7518ce610011b14" ++ "versions.yml:md5,44c3b4926fae35dfcf138d9bf26acfd1" + ], + "cache": [ + [ +@@ -309,14 +309,14 @@ + ] + ], + "versions": [ +- "versions.yml:md5,e32852e9cba2a298b7518ce610011b14" ++ "versions.yml:md5,44c3b4926fae35dfcf138d9bf26acfd1" + ] + } + ], + "meta": { +- "nf-test": "0.8.4", +- "nextflow": "24.04.4" ++ "nf-test": "0.9.1", ++ "nextflow": "24.10.0" + }, +- "timestamp": "2024-09-02T13:19:23.308683" ++ "timestamp": "2024-11-20T14:10:02.855040367" + } + } 'modules/nf-core/ensemblvep/download/tests/nextflow.config' is unchanged ************************************************************ diff --git a/modules/nf-core/ensemblvep/download/meta.yml b/modules/nf-core/ensemblvep/download/meta.yml index a4277ad7..8da9621c 100644 --- a/modules/nf-core/ensemblvep/download/meta.yml +++ b/modules/nf-core/ensemblvep/download/meta.yml @@ -1,5 +1,6 @@ name: ensemblvep_download -description: Ensembl Variant Effect Predictor (VEP). The cache downloading options are controlled through `task.ext.args`. +description: Ensembl Variant Effect Predictor (VEP). The cache downloading options + are controlled through `task.ext.args`. keywords: - annotation - cache @@ -12,33 +13,40 @@ tools: homepage: https://www.ensembl.org/info/docs/tools/vep/index.html documentation: https://www.ensembl.org/info/docs/tools/vep/script/index.html licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - assembly: - type: string - description: | - Genome assembly - - species: - type: string - description: | - Specie - - cache_version: - type: string - description: | - cache version + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ]
+    - assembly:
+        type: string
+        description: |
+          Genome assembly
+    - species:
+        type: string
+        description: |
+          Species
+    - cache_version:
+        type: string
+        description: |
+          cache version
 output:
   - cache:
-      type: file
-      description: cache
-      pattern: "*"
+      - meta:
+          type: file
+          description: cache
+          pattern: "*"
+      - prefix:
+          type: file
+          description: cache
+          pattern: "*"
   - versions:
-      type: file
-      description: File containing software versions
-      pattern: "versions.yml"
+      - versions.yml:
+          type: file
+          description: File containing software versions
+          pattern: "versions.yml"
 authors:
   - "@maxulysse"
 maintainers:
diff --git a/modules/nf-core/ensemblvep/download/tests/main.nf.test.snap b/modules/nf-core/ensemblvep/download/tests/main.nf.test.snap
index 9e303c54..0445c84a 100644
--- a/modules/nf-core/ensemblvep/download/tests/main.nf.test.snap
+++ b/modules/nf-core/ensemblvep/download/tests/main.nf.test.snap
@@ -136,7 +136,7 @@
                 ]
             ],
             "1": [
-                "versions.yml:md5,e32852e9cba2a298b7518ce610011b14"
+                "versions.yml:md5,44c3b4926fae35dfcf138d9bf26acfd1"
             ],
             "cache": [
                 [
@@ -272,15 +272,15 @@
                 ]
             ],
             "versions": [
-                "versions.yml:md5,e32852e9cba2a298b7518ce610011b14"
+                "versions.yml:md5,44c3b4926fae35dfcf138d9bf26acfd1"
             ]
         }
     ],
     "meta": {
-        "nf-test": "0.8.4",
-        "nextflow": "24.04.4"
+        "nf-test": "0.9.1",
+        "nextflow": "24.10.0"
     },
-    "timestamp": "2024-09-02T13:19:08.690863"
+    "timestamp": "2024-11-20T14:09:48.300197368"
 },
 "celegans - download - stub": {
     "content": [
@@ -296,7 +296,7 @@
                 ]
             ],
             "1": [
-                "versions.yml:md5,e32852e9cba2a298b7518ce610011b14"
+                "versions.yml:md5,44c3b4926fae35dfcf138d9bf26acfd1"
             ],
             "cache": [
                 [
@@ -309,14 +309,14 @@
                 ]
             ],
             "versions": [
-                "versions.yml:md5,e32852e9cba2a298b7518ce610011b14"
+                "versions.yml:md5,44c3b4926fae35dfcf138d9bf26acfd1"
             ]
         }
     ],
     "meta": {
-        "nf-test": "0.8.4",
-        "nextflow": "24.04.4"
+        "nf-test": "0.9.1",
+        "nextflow": "24.10.0"
     },
-    "timestamp": "2024-09-02T13:19:23.308683"
+    "timestamp": "2024-11-20T14:10:02.855040367"
     }
 }
\ No newline at end of file
diff --git a/modules/nf-core/ensemblvep/vep/ensemblvep-vep.diff b/modules/nf-core/ensemblvep/vep/ensemblvep-vep.diff
index e09aa620..bcc6ba95 100644
--- a/modules/nf-core/ensemblvep/vep/ensemblvep-vep.diff
+++ b/modules/nf-core/ensemblvep/vep/ensemblvep-vep.diff
@@ -14,22 +14,60 @@ Changes in 'ensemblvep/vep/main.nf':
 
      input:
      tuple val(meta), path(vcf), path(custom_extra_files)
 
-Changes in 'ensemblvep/vep/environment.yml':
---- modules/nf-core/ensemblvep/vep/environment.yml
-+++ modules/nf-core/ensemblvep/vep/environment.yml
-@@ -2,6 +2,5 @@
- channels:
-   - conda-forge
-   - bioconda
--  - defaults
- dependencies:
-   - bioconda::ensembl-vep=112.0
-
+'modules/nf-core/ensemblvep/vep/environment.yml' is unchanged
 'modules/nf-core/ensemblvep/vep/meta.yml' is unchanged
 'modules/nf-core/ensemblvep/vep/tests/tags.yml' is unchanged
 'modules/nf-core/ensemblvep/vep/tests/tab.gz.config' is unchanged
 'modules/nf-core/ensemblvep/vep/tests/vcf.config' is unchanged
-'modules/nf-core/ensemblvep/vep/tests/main.nf.test' is unchanged
-'modules/nf-core/ensemblvep/vep/tests/main.nf.test.snap' is unchanged
+Changes in 'ensemblvep/vep/tests/main.nf.test':
+--- modules/nf-core/ensemblvep/vep/tests/main.nf.test
++++ modules/nf-core/ensemblvep/vep/tests/main.nf.test
+@@ -107,7 +107,7 @@
+         assertAll(
+             { assert process.success },
+             { assert snapshot(process.out.versions).match() },
+-            { assert
path(process.out.tab.get(0).get(1)).linesGzip.contains("## ENSEMBL VARIANT EFFECT PREDICTOR v105.0") } + ) + } + } + +Changes in 'ensemblvep/vep/tests/main.nf.test.snap': +--- modules/nf-core/ensemblvep/vep/tests/main.nf.test.snap ++++ modules/nf-core/ensemblvep/vep/tests/main.nf.test.snap +@@ -2,25 +2,25 @@ + "test_ensemblvep_vep_fasta_tab_gz": { + "content": [ + [ +- "versions.yml:md5,d06f1eb60f534489026d682eb3aa5559" ++ "versions.yml:md5,c6d58a35e7be5e6ab46a3f9757f6e259" + ] + ], + "meta": { +- "nf-test": "0.8.4", +- "nextflow": "24.04.4" ++ "nf-test": "0.9.1", ++ "nextflow": "24.10.0" + }, +- "timestamp": "2024-09-02T10:15:18.228927" ++ "timestamp": "2024-11-20T14:10:59.846254319" + }, + "test_ensemblvep_vep_fasta_vcf": { + "content": [ + [ +- "versions.yml:md5,d06f1eb60f534489026d682eb3aa5559" ++ "versions.yml:md5,c6d58a35e7be5e6ab46a3f9757f6e259" + ] + ], + "meta": { +- "nf-test": "0.8.4", +- "nextflow": "24.04.4" ++ "nf-test": "0.9.1", ++ "nextflow": "24.10.0" + }, +- "timestamp": "2024-09-02T10:14:50.193861" ++ "timestamp": "2024-11-20T14:10:44.092773407" + } + } 'modules/nf-core/ensemblvep/vep/tests/nextflow.config' is unchanged ************************************************************ diff --git a/modules/nf-core/ensemblvep/vep/environment.yml b/modules/nf-core/ensemblvep/vep/environment.yml index 87b88372..283a45bb 100644 --- a/modules/nf-core/ensemblvep/vep/environment.yml +++ b/modules/nf-core/ensemblvep/vep/environment.yml @@ -1,4 +1,3 @@ -name: ensemblvep_vep channels: - conda-forge - bioconda diff --git a/modules/nf-core/ensemblvep/vep/meta.yml b/modules/nf-core/ensemblvep/vep/meta.yml index d8ff8d14..9288a938 100644 --- a/modules/nf-core/ensemblvep/vep/meta.yml +++ b/modules/nf-core/ensemblvep/vep/meta.yml @@ -1,5 +1,6 @@ name: ensemblvep_vep -description: Ensembl Variant Effect Predictor (VEP). The output-file-format is controlled through `task.ext.args`. +description: Ensembl Variant Effect Predictor (VEP). The output-file-format is controlled + through `task.ext.args`. keywords: - annotation - vcf @@ -13,75 +14,96 @@ tools: homepage: https://www.ensembl.org/info/docs/tools/vep/index.html documentation: https://www.ensembl.org/info/docs/tools/vep/script/index.html licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: | - vcf to annotate - - custom_extra_files: - type: file - description: | - extra sample-specific files to be used with the `--custom` flag to be configured with ext.args - (optional) - - genome: - type: string - description: | - which genome to annotate with - - species: - type: string - description: | - which species to annotate with - - cache_version: - type: integer - description: | - which version of the cache to annotate with - - cache: - type: file - description: | - path to VEP cache (optional) - - meta2: - type: map - description: | - Groovy Map containing fasta reference information - e.g. [ id:'test' ] - - fasta: - type: file - description: | - reference FASTA file (optional) - pattern: "*.{fasta,fa}" - - extra_files: - type: file - description: | - path to file(s) needed for plugins (optional) + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - vcf: + type: file + description: | + vcf to annotate + - custom_extra_files: + type: file + description: | + extra sample-specific files to be used with the `--custom` flag to be configured with ext.args + (optional) + - - genome: + type: string + description: | + which genome to annotate with + - - species: + type: string + description: | + which species to annotate with + - - cache_version: + type: integer + description: | + which version of the cache to annotate with + - - cache: + type: file + description: | + path to VEP cache (optional) + - - meta2: + type: map + description: | + Groovy Map containing fasta reference information + e.g. [ id:'test' ] + - fasta: + type: file + description: | + reference FASTA file (optional) + pattern: "*.{fasta,fa}" + - - extra_files: + type: file + description: | + path to file(s) needed for plugins (optional) output: - vcf: - type: file - description: | - annotated vcf (optional) - pattern: "*.ann.vcf.gz" + - meta: + type: file + description: | + annotated vcf (optional) + pattern: "*.ann.vcf.gz" + - "*.vcf.gz": + type: file + description: | + annotated vcf (optional) + pattern: "*.ann.vcf.gz" - tab: - type: file - description: | - tab file with annotated variants (optional) - pattern: "*.ann.tab.gz" + - meta: + type: file + description: | + tab file with annotated variants (optional) + pattern: "*.ann.tab.gz" + - "*.tab.gz": + type: file + description: | + tab file with annotated variants (optional) + pattern: "*.ann.tab.gz" - json: - type: file - description: | - json file with annotated variants (optional) - pattern: "*.ann.json.gz" + - meta: + type: file + description: | + json file with annotated variants (optional) + pattern: "*.ann.json.gz" + - "*.json.gz": + type: file + description: | + json file with annotated variants (optional) + pattern: "*.ann.json.gz" - report: - type: file - description: VEP report file - pattern: "*.html" + - "*.html": + type: file + description: VEP report file + pattern: "*.html" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@maxulysse" - "@matthdsm" diff --git a/modules/nf-core/ensemblvep/vep/tests/main.nf.test b/modules/nf-core/ensemblvep/vep/tests/main.nf.test index e68fff3c..f66e867e 100644 --- a/modules/nf-core/ensemblvep/vep/tests/main.nf.test +++ b/modules/nf-core/ensemblvep/vep/tests/main.nf.test @@ -107,7 +107,7 @@ nextflow_process { assertAll( { assert process.success }, { assert snapshot(process.out.versions).match() }, - { assert path(process.out.tab.get(0).get(1)).linesGzip.contains("## ENSEMBL VARIANT EFFECT PREDICTOR v112.0") } + { assert path(process.out.tab.get(0).get(1)).linesGzip.contains("## ENSEMBL VARIANT EFFECT PREDICTOR v105.0") } ) } } diff --git a/modules/nf-core/ensemblvep/vep/tests/main.nf.test.snap b/modules/nf-core/ensemblvep/vep/tests/main.nf.test.snap index 1c4c0e4e..2d215500 100644 --- a/modules/nf-core/ensemblvep/vep/tests/main.nf.test.snap +++ b/modules/nf-core/ensemblvep/vep/tests/main.nf.test.snap @@ -2,25 +2,25 @@ "test_ensemblvep_vep_fasta_tab_gz": { "content": [ [ - "versions.yml:md5,d06f1eb60f534489026d682eb3aa5559" + "versions.yml:md5,c6d58a35e7be5e6ab46a3f9757f6e259" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-02T10:15:18.228927" + "timestamp": "2024-11-20T14:10:59.846254319" }, 
"test_ensemblvep_vep_fasta_vcf": { "content": [ [ - "versions.yml:md5,d06f1eb60f534489026d682eb3aa5559" + "versions.yml:md5,c6d58a35e7be5e6ab46a3f9757f6e259" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-02T10:14:50.193861" + "timestamp": "2024-11-20T14:10:44.092773407" } } \ No newline at end of file diff --git a/modules/nf-core/gatk4/calibratedragstrmodel/meta.yml b/modules/nf-core/gatk4/calibratedragstrmodel/meta.yml index bf2ca2d7..cc19131b 100644 --- a/modules/nf-core/gatk4/calibratedragstrmodel/meta.yml +++ b/modules/nf-core/gatk4/calibratedragstrmodel/meta.yml @@ -8,62 +8,65 @@ keywords: - calibratedragstrmodel tools: - gatk4: - description: Genome Analysis Toolkit (GATK4). Developed in the Data Sciences Platform at the Broad Institute, the toolkit offers a wide variety of tools with a primary focus on variant discovery and genotyping. Its powerful processing engine and high-performance computing features make it capable of taking on projects of any size. + description: Genome Analysis Toolkit (GATK4). Developed in the Data Sciences Platform + at the Broad Institute, the toolkit offers a wide variety of tools with a primary + focus on variant discovery and genotyping. Its powerful processing engine and + high-performance computing features make it capable of taking on projects of + any size. homepage: https://gatk.broadinstitute.org/hc/en-us documentation: https://gatk.broadinstitute.org/hc/en-us/articles/360057441571-CalibrateDragstrModel-BETA- tool_dev_url: https://github.com/broadinstitute/gatk doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: # Only when we have meta - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" - - bam_index: - type: file - description: index of the BAM/CRAM/SAM file - pattern: "*.{bai,crai,sai}" - - intervals: - type: file - description: BED file or interval list containing regions (optional) - pattern: "*.{bed,interval_list}" - - fasta: - type: file - description: The reference FASTA file - pattern: "*.{fasta,fa}" - - fasta_fai: - type: file - description: The index of the reference FASTA file - pattern: "*.fai" - - dict: - type: file - description: The sequence dictionary of the reference FASTA file - pattern: "*.dict" - - strtablefile: - type: file - description: The StrTableFile zip folder of the reference FASTA file - pattern: "*.zip" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bam: + type: file + description: BAM/CRAM/SAM file + pattern: "*.{bam,cram,sam}" + - bam_index: + type: file + description: index of the BAM/CRAM/SAM file + pattern: "*.{bai,crai,sai}" + - - fasta: + type: file + description: The reference FASTA file + pattern: "*.{fasta,fa}" + - - fasta_fai: + type: file + description: The index of the reference FASTA file + pattern: "*.fai" + - - dict: + type: file + description: The sequence dictionary of the reference FASTA file + pattern: "*.dict" + - - strtablefile: + type: file + description: The StrTableFile zip folder of the reference FASTA file + pattern: "*.zip" output: #Only when we have meta - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - dragstr_model: - type: file - description: The DragSTR model - pattern: "*.txt" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.txt": + type: file + description: The DragSTR model + pattern: "*.txt" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@nvnieuwk" maintainers: diff --git a/modules/nf-core/gatk4/composestrtablefile/meta.yml b/modules/nf-core/gatk4/composestrtablefile/meta.yml index 249aed24..fd56c797 100644 --- a/modules/nf-core/gatk4/composestrtablefile/meta.yml +++ b/modules/nf-core/gatk4/composestrtablefile/meta.yml @@ -1,5 +1,7 @@ name: "gatk4_composestrtablefile" -description: This tool looks for low-complexity STR sequences along the reference that are later used to estimate the Dragstr model during single sample auto calibration CalibrateDragstrModel. +description: This tool looks for low-complexity STR sequences along the reference + that are later used to estimate the Dragstr model during single sample auto calibration + CalibrateDragstrModel. keywords: - composestrtablefile - dragstr @@ -14,28 +16,31 @@ tools: documentation: https://gatk.broadinstitute.org/hc/en-us/articles/4405451249819-ComposeSTRTableFile doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - fasta: - type: file - description: FASTA reference file - pattern: "*.{fasta,fa}" - - fasta_fai: - type: file - description: index of the FASTA reference file - pattern: "*.fai" - - dict: - type: file - description: Sequence dictionary of the FASTA reference file - pattern: "*.dict" + - - fasta: + type: file + description: FASTA reference file + pattern: "*.{fasta,fa}" + - - fasta_fai: + type: file + description: index of the FASTA reference file + pattern: "*.fai" + - - dict: + type: file + description: Sequence dictionary of the FASTA reference file + pattern: "*.dict" output: - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - str_table: - type: file - description: A zipped folder containing the STR table files - pattern: "*.zip" + - "*.zip": + type: file + description: A zipped folder containing the STR table files + pattern: "*.zip" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@nvnieuwk" maintainers: diff --git a/modules/nf-core/gatk4/createsequencedictionary/gatk4-createsequencedictionary.diff b/modules/nf-core/gatk4/createsequencedictionary/gatk4-createsequencedictionary.diff new file mode 100644 index 00000000..74ff3550 --- /dev/null +++ b/modules/nf-core/gatk4/createsequencedictionary/gatk4-createsequencedictionary.diff @@ -0,0 +1,19 @@ +Changes in module 'nf-core/gatk4/createsequencedictionary' +'modules/nf-core/gatk4/createsequencedictionary/environment.yml' is unchanged +'modules/nf-core/gatk4/createsequencedictionary/meta.yml' is unchanged +Changes in 'gatk4/createsequencedictionary/main.nf': +--- modules/nf-core/gatk4/createsequencedictionary/main.nf ++++ modules/nf-core/gatk4/createsequencedictionary/main.nf +@@ -1,6 +1,6 @@ + process GATK4_CREATESEQUENCEDICTIONARY { + tag "$fasta" +- label 'process_medium' ++ label 'process_low' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && 
!task.ext.singularity_pull_docker_container ? + +'modules/nf-core/gatk4/createsequencedictionary/tests/main.nf.test.snap' is unchanged +'modules/nf-core/gatk4/createsequencedictionary/tests/tags.yml' is unchanged +'modules/nf-core/gatk4/createsequencedictionary/tests/main.nf.test' is unchanged +************************************************************ diff --git a/modules/nf-core/gatk4/createsequencedictionary/main.nf b/modules/nf-core/gatk4/createsequencedictionary/main.nf index c7f1d75b..ab58da95 100644 --- a/modules/nf-core/gatk4/createsequencedictionary/main.nf +++ b/modules/nf-core/gatk4/createsequencedictionary/main.nf @@ -1,6 +1,6 @@ process GATK4_CREATESEQUENCEDICTIONARY { tag "$fasta" - label 'process_medium' + label 'process_low' conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? diff --git a/modules/nf-core/gatk4/createsequencedictionary/meta.yml b/modules/nf-core/gatk4/createsequencedictionary/meta.yml index f9d70be0..7b5156bb 100644 --- a/modules/nf-core/gatk4/createsequencedictionary/meta.yml +++ b/modules/nf-core/gatk4/createsequencedictionary/meta.yml @@ -15,25 +15,32 @@ tools: documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fasta: - type: file - description: Input fasta file - pattern: "*.{fasta,fa}" + - - meta: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fasta: + type: file + description: Input fasta file + pattern: "*.{fasta,fa}" output: - dict: - type: file - description: gatk dictionary file - pattern: "*.{dict}" + - meta: + type: file + description: gatk dictionary file + pattern: "*.{dict}" + - "*.dict": + type: file + description: gatk dictionary file + pattern: "*.{dict}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@maxulysse" - "@ramprasadn" diff --git a/modules/nf-core/gatk4/genomicsdbimport/gatk4-genomicsdbimport.diff b/modules/nf-core/gatk4/genomicsdbimport/gatk4-genomicsdbimport.diff new file mode 100644 index 00000000..1ff710cd --- /dev/null +++ b/modules/nf-core/gatk4/genomicsdbimport/gatk4-genomicsdbimport.diff @@ -0,0 +1,20 @@ +Changes in module 'nf-core/gatk4/genomicsdbimport' +'modules/nf-core/gatk4/genomicsdbimport/environment.yml' is unchanged +'modules/nf-core/gatk4/genomicsdbimport/meta.yml' is unchanged +Changes in 'gatk4/genomicsdbimport/main.nf': +--- modules/nf-core/gatk4/genomicsdbimport/main.nf ++++ modules/nf-core/gatk4/genomicsdbimport/main.nf +@@ -1,6 +1,6 @@ + process GATK4_GENOMICSDBIMPORT { + tag "$meta.id" +- label 'process_medium' ++ label 'process_low' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
+ +'modules/nf-core/gatk4/genomicsdbimport/tests/main.nf.test.snap' is unchanged +'modules/nf-core/gatk4/genomicsdbimport/tests/tags.yml' is unchanged +'modules/nf-core/gatk4/genomicsdbimport/tests/nextflow.config' is unchanged +'modules/nf-core/gatk4/genomicsdbimport/tests/main.nf.test' is unchanged +************************************************************ diff --git a/modules/nf-core/gatk4/genomicsdbimport/main.nf b/modules/nf-core/gatk4/genomicsdbimport/main.nf index 6f1d4c53..fb756a90 100644 --- a/modules/nf-core/gatk4/genomicsdbimport/main.nf +++ b/modules/nf-core/gatk4/genomicsdbimport/main.nf @@ -1,6 +1,6 @@ process GATK4_GENOMICSDBIMPORT { tag "$meta.id" - label 'process_medium' + label 'process_low' conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? diff --git a/modules/nf-core/gatk4/genomicsdbimport/meta.yml b/modules/nf-core/gatk4/genomicsdbimport/meta.yml index 11e565b1..174ae2eb 100644 --- a/modules/nf-core/gatk4/genomicsdbimport/meta.yml +++ b/modules/nf-core/gatk4/genomicsdbimport/meta.yml @@ -1,5 +1,6 @@ name: gatk4_genomicsdbimport -description: merge GVCFs from multiple samples. For use in joint genotyping or somatic panel of normal creation. +description: merge GVCFs from multiple samples. For use in joint genotyping or somatic + panel of normal creation. keywords: - gatk4 - genomicsdb @@ -15,61 +16,99 @@ tools: homepage: https://gatk.broadinstitute.org/hc/en-us documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test'] - - vcf: - type: list - description: either a list of vcf files to be used to create or update a genomicsdb, or a file that contains a map to vcf files to be used. - pattern: "*.vcf.gz" - - tbi: - type: list - description: list of tbi files that match with the input vcf files - pattern: "*.vcf.gz_tbi" - - wspace: - type: file - description: path to an existing genomicsdb to be used in update db mode or get intervals mode. This WILL NOT specify name of a new genomicsdb in create db mode. - pattern: "/path/to/existing/gendb" - - interval_file: - type: file - description: file containing the intervals to be used when creating the genomicsdb - pattern: "*.interval_list" - - interval_value: - type: string - description: if an intervals file has not been spcified, the value enetered here will be used as an interval via the "-L" argument - pattern: "example: chr1:1000-10000" - - run_intlist: - type: boolean - description: Specify whether to run get interval list mode, this option cannot be specified at the same time as run_updatewspace. - pattern: "true/false" - - run_updatewspace: - type: boolean - description: Specify whether to run update genomicsdb mode, this option takes priority over run_intlist. - pattern: "true/false" - - input_map: - type: boolean - description: Specify whether the vcf input is providing a list of vcf file(s) or a single file containing a map of paths to vcf files to be used to create or update a genomicsdb. - pattern: "*.sample_map" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test'] + - vcf: + type: list + description: either a list of vcf files to be used to create or update a genomicsdb, + or a file that contains a map to vcf files to be used. 
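+        # Editorial note (hedged addition, not from the original module docs): when
+        # input_map (declared below) is true, this channel is expected to carry a GATK
+        # sample map instead, i.e. a tab-separated text file with one
+        # `sample_name<TAB>/path/to/sample.vcf.gz` entry per line.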
+        pattern: "*.vcf.gz"
+    - tbi:
+        type: list
+        description: list of tbi files that match with the input vcf files
+        pattern: "*.vcf.gz_tbi"
+    - interval_file:
+        type: file
+        description: file containing the intervals to be used when creating the genomicsdb
+        pattern: "*.interval_list"
+    - interval_value:
+        type: string
+        description: if an intervals file has not been specified, the value entered
+          here will be used as an interval via the "-L" argument
+        pattern: "example: chr1:1000-10000"
+    - wspace:
+        type: file
+        description: path to an existing genomicsdb to be used in update db mode or
+          get intervals mode. This WILL NOT specify the name of a new genomicsdb in
+          create db mode.
+        pattern: "/path/to/existing/gendb"
+  - - run_intlist:
+        type: boolean
+        description: Specify whether to run get interval list mode, this option cannot
+          be specified at the same time as run_updatewspace.
+        pattern: "true/false"
+  - - run_updatewspace:
+        type: boolean
+        description: Specify whether to run update genomicsdb mode, this option takes
+          priority over run_intlist.
+        pattern: "true/false"
+  - - input_map:
+        type: boolean
+        description: Specify whether the vcf input is providing a list of vcf file(s)
+          or a single file containing a map of paths to vcf files to be used to create
+          or update a genomicsdb.
+        pattern: "*.sample_map"
 output:
   - genomicsdb:
-      type: directory
-      description: Directory containing the files that compose the genomicsdb workspace, this is only output for create mode, as update changes an existing db
-      pattern: "*/$prefix"
+      - meta:
+          type: directory
+          description: Directory containing the files that compose the genomicsdb workspace,
+            this is only output for create mode, as update changes an existing db
+          pattern: "*/$prefix"
+      - $prefix:
+          type: directory
+          description: Directory containing the files that compose the genomicsdb workspace,
+            this is only output for create mode, as update changes an existing db
+          pattern: "*/$prefix"
   - updatedb:
-      type: directory
-      description: Directory containing the files that compose the updated genomicsdb workspace, this is only output for update mode, and should be the same path as the input wspace.
-      pattern: "same/path/as/wspace"
+      - meta:
+          type: directory
+          description: Directory containing the files that compose the updated genomicsdb
+            workspace, this is only output for update mode, and should be the same path
+            as the input wspace.
+          pattern: "same/path/as/wspace"
+      - $updated_db:
+          type: directory
+          description: Directory containing the files that compose the updated genomicsdb
+            workspace, this is only output for update mode, and should be the same path
+            as the input wspace.
+          pattern: "same/path/as/wspace"
   - intervallist:
-      type: file
-      description: File containing the intervals used to generate the genomicsdb, only created by get intervals mode.
-      pattern: "*.interval_list"
+      - meta:
+          type: file
+          description: File containing the intervals used to generate the genomicsdb,
+            only created by get intervals mode.
+          pattern: "*.interval_list"
+      - "*.interval_list":
+          type: file
+          description: File containing the intervals used to generate the genomicsdb,
+            only created by get intervals mode.
+          pattern: "*.interval_list"
+      - list:
+          type: file
+          description: File containing the intervals used to generate the genomicsdb,
+            only created by get intervals mode.
+ pattern: "*.interval_list" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@GCJMackenzie" maintainers: diff --git a/modules/nf-core/gatk4/genotypegvcfs/gatk4-genotypegvcfs.diff b/modules/nf-core/gatk4/genotypegvcfs/gatk4-genotypegvcfs.diff new file mode 100644 index 00000000..c099c7aa --- /dev/null +++ b/modules/nf-core/gatk4/genotypegvcfs/gatk4-genotypegvcfs.diff @@ -0,0 +1,18 @@ +Changes in module 'nf-core/gatk4/genotypegvcfs' +'modules/nf-core/gatk4/genotypegvcfs/environment.yml' is unchanged +'modules/nf-core/gatk4/genotypegvcfs/meta.yml' is unchanged +Changes in 'gatk4/genotypegvcfs/main.nf': +--- modules/nf-core/gatk4/genotypegvcfs/main.nf ++++ modules/nf-core/gatk4/genotypegvcfs/main.nf +@@ -1,6 +1,6 @@ + process GATK4_GENOTYPEGVCFS { + tag "$meta.id" +- label 'process_high' ++ label 'process_single' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? + +'modules/nf-core/gatk4/genotypegvcfs/tests/main.nf.test.snap' is unchanged +'modules/nf-core/gatk4/genotypegvcfs/tests/main.nf.test' is unchanged +************************************************************ diff --git a/modules/nf-core/gatk4/genotypegvcfs/main.nf b/modules/nf-core/gatk4/genotypegvcfs/main.nf index f180f749..b3684ce3 100644 --- a/modules/nf-core/gatk4/genotypegvcfs/main.nf +++ b/modules/nf-core/gatk4/genotypegvcfs/main.nf @@ -1,6 +1,6 @@ process GATK4_GENOTYPEGVCFS { tag "$meta.id" - label 'process_high' + label 'process_single' conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? diff --git a/modules/nf-core/gatk4/genotypegvcfs/meta.yml b/modules/nf-core/gatk4/genotypegvcfs/meta.yml index eb704364..0c1fe491 100644 --- a/modules/nf-core/gatk4/genotypegvcfs/meta.yml +++ b/modules/nf-core/gatk4/genotypegvcfs/meta.yml @@ -14,91 +14,101 @@ tools: tool_dev_url: https://github.com/broadinstitute/gatk doi: "10.1158/1538-7445.AM2017-3590" licence: ["BSD-3-clause"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: | - gVCF(.gz) file or a GenomicsDB - pattern: "*.{vcf,vcf.gz}" - - gvcf_index: - type: file - description: | - index of gvcf file, or empty when providing GenomicsDB - pattern: "*.{idx,tbi}" - - intervals: - type: file - description: Interval file with the genomic regions included in the library (optional) - - intervals_index: - type: file - description: Interval index file (optional) - - meta2: - type: map - description: | - Groovy Map containing fasta information - e.g. [ id:'test' ] - - fasta: - type: file - description: Reference fasta file - pattern: "*.fasta" - - meta3: - type: map - description: | - Groovy Map containing fai information - e.g. [ id:'test' ] - - fai: - type: file - description: Reference fasta index file - pattern: "*.fai" - - meta4: - type: map - description: | - Groovy Map containing dict information - e.g. [ id:'test' ] - - dict: - type: file - description: Reference fasta sequence dict file - pattern: "*.dict" - - meta5: - type: map - description: | - Groovy Map containing dbsnp information - e.g. 
[ id:'test' ] - - dbsnp: - type: file - description: dbSNP VCF file - pattern: "*.vcf.gz" - - meta6: - type: map - description: | - Groovy Map containing dbsnp tbi information - e.g. [ id:'test' ] - - dbsnp_tbi: - type: file - description: dbSNP VCF index file - pattern: "*.tbi" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: | + gVCF(.gz) file or a GenomicsDB + pattern: "*.{vcf,vcf.gz}" + - gvcf_index: + type: file + description: | + index of gvcf file, or empty when providing GenomicsDB + pattern: "*.{idx,tbi}" + - intervals: + type: file + description: Interval file with the genomic regions included in the library + (optional) + - intervals_index: + type: file + description: Interval index file (optional) + - - meta2: + type: map + description: | + Groovy Map containing fasta information + e.g. [ id:'test' ] + - fasta: + type: file + description: Reference fasta file + pattern: "*.fasta" + - - meta3: + type: map + description: | + Groovy Map containing fai information + e.g. [ id:'test' ] + - fai: + type: file + description: Reference fasta index file + pattern: "*.fai" + - - meta4: + type: map + description: | + Groovy Map containing dict information + e.g. [ id:'test' ] + - dict: + type: file + description: Reference fasta sequence dict file + pattern: "*.dict" + - - meta5: + type: map + description: | + Groovy Map containing dbsnp information + e.g. [ id:'test' ] + - dbsnp: + type: file + description: dbSNP VCF file + pattern: "*.vcf.gz" + - - meta6: + type: map + description: | + Groovy Map containing dbsnp tbi information + e.g. [ id:'test' ] + - dbsnp_tbi: + type: file + description: dbSNP VCF index file + pattern: "*.tbi" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - vcf: - type: file - description: Genotyped VCF file - pattern: "*.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.vcf.gz": + type: file + description: Genotyped VCF file + pattern: "*.vcf.gz" - tbi: - type: file - description: Tbi index for VCF file - pattern: "*.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tbi": + type: file + description: Tbi index for VCF file + pattern: "*.vcf.gz.tbi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@santiagorevale" - "@maxulysse" diff --git a/modules/nf-core/gatk4/haplotypecaller/gatk4-haplotypecaller.diff b/modules/nf-core/gatk4/haplotypecaller/gatk4-haplotypecaller.diff new file mode 100644 index 00000000..28465529 --- /dev/null +++ b/modules/nf-core/gatk4/haplotypecaller/gatk4-haplotypecaller.diff @@ -0,0 +1,21 @@ +Changes in module 'nf-core/gatk4/haplotypecaller' +--- modules/nf-core/gatk4/haplotypecaller/main.nf ++++ modules/nf-core/gatk4/haplotypecaller/main.nf +@@ -1,6 +1,6 @@ + process GATK4_HAPLOTYPECALLER { + tag "$meta.id" +- label 'process_medium' ++ label 'process_low' + + conda "${moduleDir}/environment.yml" + container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ?
+@@ -44,7 +44,6 @@ + --input $input \\ + --output ${prefix}.vcf.gz \\ + --reference $fasta \\ +- --native-pair-hmm-threads ${task.cpus} \\ + $dbsnp_command \\ + $interval_command \\ + $dragstr_command \\ + +************************************************************ diff --git a/modules/nf-core/gatk4/haplotypecaller/main.nf b/modules/nf-core/gatk4/haplotypecaller/main.nf index 3043ee07..51701376 100644 --- a/modules/nf-core/gatk4/haplotypecaller/main.nf +++ b/modules/nf-core/gatk4/haplotypecaller/main.nf @@ -1,6 +1,6 @@ process GATK4_HAPLOTYPECALLER { tag "$meta.id" - label 'process_medium' + label 'process_low' conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? diff --git a/modules/nf-core/gatk4/haplotypecaller/meta.yml b/modules/nf-core/gatk4/haplotypecaller/meta.yml index 703b99a0..9d4a05e9 100644 --- a/modules/nf-core/gatk4/haplotypecaller/meta.yml +++ b/modules/nf-core/gatk4/haplotypecaller/meta.yml @@ -14,92 +14,108 @@ tools: documentation: https://gatk.broadinstitute.org/hc/en-us/categories/360002369672s doi: 10.1158/1538-7445.AM2017-3590 licence: ["Apache-2.0"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM file from alignment - pattern: "*.{bam,cram}" - - input_index: - type: file - description: BAI/CRAI file from alignment - pattern: "*.{bai,crai}" - - intervals: - type: file - description: Bed file with the genomic regions included in the library (optional) - - dragstr_model: - type: file - description: Text file containing the DragSTR model of the used BAM/CRAM file (optional) - pattern: "*.txt" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'test_reference' ] - - fasta: - type: file - description: The reference fasta file - pattern: "*.fasta" - - meta3: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'test_reference' ] - - fai: - type: file - description: Index of reference fasta file - pattern: "fasta.fai" - - meta4: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'test_reference' ] - - dict: - type: file - description: GATK sequence dictionary - pattern: "*.dict" - - meta5: - type: map - description: | - Groovy Map containing dbsnp information - e.g. [ id:'test_dbsnp' ] - - dbsnp: - type: file - description: VCF file containing known sites (optional) - - meta6: - type: map - description: | - Groovy Map containing dbsnp information - e.g. [ id:'test_dbsnp' ] - - dbsnp_tbi: - type: file - description: VCF index of dbsnp (optional) + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM file from alignment + pattern: "*.{bam,cram}" + - input_index: + type: file + description: BAI/CRAI file from alignment + pattern: "*.{bai,crai}" + - intervals: + type: file + description: Bed file with the genomic regions included in the library (optional) + - dragstr_model: + type: file + description: Text file containing the DragSTR model of the used BAM/CRAM file + (optional) + pattern: "*.txt" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. 
[ id:'test_reference' ] + - fasta: + type: file + description: The reference fasta file + pattern: "*.fasta" + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test_reference' ] + - fai: + type: file + description: Index of reference fasta file + pattern: "fasta.fai" + - - meta4: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test_reference' ] + - dict: + type: file + description: GATK sequence dictionary + pattern: "*.dict" + - - meta5: + type: map + description: | + Groovy Map containing dbsnp information + e.g. [ id:'test_dbsnp' ] + - dbsnp: + type: file + description: VCF file containing known sites (optional) + - - meta6: + type: map + description: | + Groovy Map containing dbsnp information + e.g. [ id:'test_dbsnp' ] + - dbsnp_tbi: + type: file + description: VCF index of dbsnp (optional) output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - vcf: - type: file - description: Compressed VCF file - pattern: "*.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.vcf.gz": + type: file + description: Compressed VCF file + pattern: "*.vcf.gz" - tbi: - type: file - description: Index of VCF file - pattern: "*.vcf.gz.tbi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tbi": + type: file + description: Index of VCF file + pattern: "*.vcf.gz.tbi" - bam: - type: file - description: Assembled haplotypes and locally realigned reads - pattern: "*.realigned.bam" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.realigned.bam": + type: file + description: Assembled haplotypes and locally realigned reads + pattern: "*.realigned.bam" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@suzannejin" - "@FriederikeHanssen" diff --git a/modules/nf-core/gawk/meta.yml b/modules/nf-core/gawk/meta.yml index 2b6033b0..05170082 100644 --- a/modules/nf-core/gawk/meta.yml +++ b/modules/nf-core/gawk/meta.yml @@ -16,34 +16,40 @@ tools: documentation: "https://www.gnu.org/software/gawk/manual/" tool_dev_url: "https://www.gnu.org/prep/ftp.html" licence: ["GPL v3"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: The input file - Specify the logic that needs to be executed on this file on the `ext.args2` or in the program file - pattern: "*" - - program_file: - type: file - description: Optional file containing logic for awk to execute. If you don't wish to use a file, you can use `ext.args2` to specify the logic. - pattern: "*" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: The input file - Specify the logic that needs to be executed on + this file on the `ext.args2` or in the program file + pattern: "*" + - - program_file: + type: file + description: Optional file containing logic for awk to execute. If you don't + wish to use a file, you can use `ext.args2` to specify the logic. 
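The resource downgrades patched in above (GATK4_GENOTYPEGVCFS to process_single, GATK4_HAPLOTYPECALLER to process_low) and the ext.args2 hook just described are all plumbed through standard Nextflow configuration, so they can be overridden without touching the modules. A minimal sketch of such a user config, assuming these process names; the resource values are purely illustrative:

// custom.config, passed with: nextflow run ... -c custom.config
process {
    // restore larger requests for the downgraded GATK steps if a dataset needs them
    withName: 'GATK4_HAPLOTYPECALLER' {
        cpus   = 4
        memory = 24.GB
    }
    withName: 'GATK4_GENOTYPEGVCFS' {
        cpus   = 2
        memory = 12.GB
    }
    // supply the awk logic to GAWK without a program file,
    // and name its output via ext.prefix / ext.suffix
    withName: 'GAWK' {
        ext.args2  = "'{ print \$1 }'"
        ext.prefix = { "${meta.id}.filtered" }
        ext.suffix = 'txt'
    }
}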
+ pattern: "*" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - output: - type: file - description: The output file - specify the name of this file using `ext.prefix` and the extension using `ext.suffix` - pattern: "*" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.${suffix}: + type: file + description: The output file - specify the name of this file using `ext.prefix` + and the extension using `ext.suffix` + pattern: "*" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@nvnieuwk" maintainers: diff --git a/modules/nf-core/mosdepth/meta.yml b/modules/nf-core/mosdepth/meta.yml index 9caaf2cd..dc783c90 100644 --- a/modules/nf-core/mosdepth/meta.yml +++ b/modules/nf-core/mosdepth/meta.yml @@ -12,91 +12,161 @@ tools: documentation: https://github.com/brentp/mosdepth doi: 10.1093/bioinformatics/btx699 licence: ["MIT"] + identifier: biotools:mosdepth input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: Input BAM/CRAM file - pattern: "*.{bam,cram}" - - bai: - type: file - description: Index for BAM/CRAM file - pattern: "*.{bai,crai}" - - bed: - type: file - description: BED file with intersected intervals - pattern: "*.{bed}" - - meta2: - type: map - description: | - Groovy Map containing bed information - e.g. [ id:'test' ] - - fasta: - type: file - description: Reference genome FASTA file - pattern: "*.{fa,fasta}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - bam: + type: file + description: Input BAM/CRAM file + pattern: "*.{bam,cram}" + - bai: + type: file + description: Index for BAM/CRAM file + pattern: "*.{bai,crai}" + - bed: + type: file + description: BED file with intersected intervals + pattern: "*.{bed}" + - - meta2: + type: map + description: | + Groovy Map containing bed information + e.g. [ id:'test' ] + - fasta: + type: file + description: Reference genome FASTA file + pattern: "*.{fa,fasta}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - global_txt: - type: file - description: Text file with global cumulative coverage distribution - pattern: "*.{global.dist.txt}" - - regions_txt: - type: file - description: Text file with region cumulative coverage distribution - pattern: "*.{region.dist.txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.global.dist.txt": + type: file + description: Text file with global cumulative coverage distribution + pattern: "*.{global.dist.txt}" - summary_txt: - type: file - description: Text file with summary mean depths per chromosome and regions - pattern: "*.{summary.txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.summary.txt": + type: file + description: Text file with summary mean depths per chromosome and regions + pattern: "*.{summary.txt}" + - regions_txt: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.region.dist.txt": + type: file + description: Text file with region cumulative coverage distribution + pattern: "*.{region.dist.txt}" + - per_base_d4: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.per-base.d4": + type: file + description: D4 file with per-base coverage + pattern: "*.{per-base.d4}" - per_base_bed: - type: file - description: BED file with per-base coverage - pattern: "*.{per-base.bed.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.per-base.bed.gz": + type: file + description: BED file with per-base coverage + pattern: "*.{per-base.bed.gz}" - per_base_csi: - type: file - description: Index file for BED file with per-base coverage - pattern: "*.{per-base.bed.gz.csi}" - - per_base_d4: - type: file - description: D4 file with per-base coverage - pattern: "*.{per-base.d4}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.per-base.bed.gz.csi": + type: file + description: Index file for BED file with per-base coverage + pattern: "*.{per-base.bed.gz.csi}" - regions_bed: - type: file - description: BED file with per-region coverage - pattern: "*.{regions.bed.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.regions.bed.gz": + type: file + description: BED file with per-region coverage + pattern: "*.{regions.bed.gz}" - regions_csi: - type: file - description: Index file for BED file with per-region coverage - pattern: "*.{regions.bed.gz.csi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.regions.bed.gz.csi": + type: file + description: Index file for BED file with per-region coverage + pattern: "*.{regions.bed.gz.csi}" - quantized_bed: - type: file - description: BED file with binned coverage - pattern: "*.{quantized.bed.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.quantized.bed.gz": + type: file + description: BED file with binned coverage + pattern: "*.{quantized.bed.gz}" - quantized_csi: - type: file - description: Index file for BED file with binned coverage - pattern: "*.{quantized.bed.gz.csi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.quantized.bed.gz.csi": + type: file + description: Index file for BED file with binned coverage + pattern: "*.{quantized.bed.gz.csi}" - thresholds_bed: - type: file - description: BED file with the number of bases in each region that are covered at or above each threshold - pattern: "*.{thresholds.bed.gz}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.thresholds.bed.gz": + type: file + description: BED file with the number of bases in each region that are covered + at or above each threshold + pattern: "*.{thresholds.bed.gz}" - thresholds_csi: - type: file - description: Index file for BED file with threshold coverage - pattern: "*.{thresholds.bed.gz.csi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.thresholds.bed.gz.csi": + type: file + description: Index file for BED file with threshold coverage + pattern: "*.{thresholds.bed.gz.csi}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/mosdepth/mosdepth.diff b/modules/nf-core/mosdepth/mosdepth.diff new file mode 100644 index 00000000..049e0f1a --- /dev/null +++ b/modules/nf-core/mosdepth/mosdepth.diff @@ -0,0 +1,64 @@ +Changes in module 'nf-core/mosdepth' +'modules/nf-core/mosdepth/main.nf' is unchanged +'modules/nf-core/mosdepth/environment.yml' is unchanged +'modules/nf-core/mosdepth/meta.yml' is unchanged +'modules/nf-core/mosdepth/tests/tags.yml' is unchanged +'modules/nf-core/mosdepth/tests/quantized.config' is unchanged +'modules/nf-core/mosdepth/tests/threshold.config' is unchanged +'modules/nf-core/mosdepth/tests/window.config' is unchanged +'modules/nf-core/mosdepth/tests/main.nf.test' is unchanged +Changes in 'mosdepth/tests/main.nf.test.snap': +--- modules/nf-core/mosdepth/tests/main.nf.test.snap ++++ modules/nf-core/mosdepth/tests/main.nf.test.snap +@@ -471,7 +471,7 @@ + "id": "test", + "single_end": true + }, +- "test.quantized.bed.gz:md5,f037c215449d361112efc10108fcc17c" ++ "test.quantized.bed.gz:md5,b083304a7964b43313a2789c762738df" + ] + ], + "9": [ +@@ -480,7 +480,7 @@ + "id": "test", + "single_end": true + }, +- "test.quantized.bed.gz.csi:md5,4f69e6ace50206a2768be66ded3a56f0" ++ "test.quantized.bed.gz.csi:md5,3c5e7a03ab29089f33ac4aa5c44bcb8b" + ] + ], + "global_txt": [ +@@ -519,7 +519,7 @@ + "id": "test", + "single_end": true + }, +- "test.quantized.bed.gz:md5,f037c215449d361112efc10108fcc17c" ++ "test.quantized.bed.gz:md5,b083304a7964b43313a2789c762738df" + ] + ], + "quantized_csi": [ +@@ -528,7 +528,7 @@ + "id": "test", + "single_end": true + }, +- "test.quantized.bed.gz.csi:md5,4f69e6ace50206a2768be66ded3a56f0" ++ "test.quantized.bed.gz.csi:md5,3c5e7a03ab29089f33ac4aa5c44bcb8b" + ] + ], + "regions_bed": [ +@@ -561,10 +561,10 @@ + } + ], + "meta": { +- "nf-test": "0.8.4", +- "nextflow": "23.10.1" ++ "nf-test": "0.9.1", ++ "nextflow": "24.10.0" + }, +- "timestamp": "2024-04-29T13:33:01.164885111" ++ "timestamp": "2024-11-20T17:10:36.940635059" + }, + "homo_sapiens - bam, bai, bed": { + "content": [ + +************************************************************ diff --git a/modules/nf-core/mosdepth/tests/main.nf.test.snap b/modules/nf-core/mosdepth/tests/main.nf.test.snap index c604540b..21803178 100644 --- a/modules/nf-core/mosdepth/tests/main.nf.test.snap +++ b/modules/nf-core/mosdepth/tests/main.nf.test.snap @@ -471,7 +471,7 @@ "id": "test", "single_end": true }, - "test.quantized.bed.gz:md5,f037c215449d361112efc10108fcc17c" + "test.quantized.bed.gz:md5,b083304a7964b43313a2789c762738df" ] ], "9": [ @@ -480,7 +480,7 @@ "id": "test", "single_end": true }, - "test.quantized.bed.gz.csi:md5,4f69e6ace50206a2768be66ded3a56f0" + "test.quantized.bed.gz.csi:md5,3c5e7a03ab29089f33ac4aa5c44bcb8b" ] ], "global_txt": [ @@ -519,7 +519,7 @@ "id": "test", "single_end": true }, - "test.quantized.bed.gz:md5,f037c215449d361112efc10108fcc17c" + "test.quantized.bed.gz:md5,b083304a7964b43313a2789c762738df" ] ], "quantized_csi": [ @@ -528,7 +528,7 @@ "id": "test", "single_end": true }, - "test.quantized.bed.gz.csi:md5,4f69e6ace50206a2768be66ded3a56f0" + 
"test.quantized.bed.gz.csi:md5,3c5e7a03ab29089f33ac4aa5c44bcb8b" ] ], "regions_bed": [ @@ -561,10 +561,10 @@ } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-04-29T13:33:01.164885111" + "timestamp": "2024-11-20T17:10:36.940635059" }, "homo_sapiens - bam, bai, bed": { "content": [ diff --git a/modules/nf-core/multiqc/environment.yml b/modules/nf-core/multiqc/environment.yml index f1cd99b0..6f5b867b 100644 --- a/modules/nf-core/multiqc/environment.yml +++ b/modules/nf-core/multiqc/environment.yml @@ -2,4 +2,4 @@ channels: - conda-forge - bioconda dependencies: - - bioconda::multiqc=1.24.1 + - bioconda::multiqc=1.25.1 diff --git a/modules/nf-core/multiqc/main.nf b/modules/nf-core/multiqc/main.nf index ceaec139..9724d2f3 100644 --- a/modules/nf-core/multiqc/main.nf +++ b/modules/nf-core/multiqc/main.nf @@ -3,8 +3,8 @@ process MULTIQC { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/multiqc:1.24.1--pyhdfd78af_0' : - 'biocontainers/multiqc:1.24.1--pyhdfd78af_0' }" + 'https://depot.galaxyproject.org/singularity/multiqc:1.25.1--pyhdfd78af_0' : + 'biocontainers/multiqc:1.25.1--pyhdfd78af_0' }" input: path multiqc_files, stageAs: "?/*" diff --git a/modules/nf-core/multiqc/meta.yml b/modules/nf-core/multiqc/meta.yml index 382c08cb..b16c1879 100644 --- a/modules/nf-core/multiqc/meta.yml +++ b/modules/nf-core/multiqc/meta.yml @@ -1,5 +1,6 @@ name: multiqc -description: Aggregate results from bioinformatics analyses across many samples into a single report +description: Aggregate results from bioinformatics analyses across many samples into + a single report keywords: - QC - bioinformatics tools @@ -12,53 +13,59 @@ tools: homepage: https://multiqc.info/ documentation: https://multiqc.info/docs/ licence: ["GPL-3.0-or-later"] + identifier: biotools:multiqc input: - - multiqc_files: - type: file - description: | - List of reports / files recognised by MultiQC, for example the html and zip output of FastQC - - multiqc_config: - type: file - description: Optional config yml for MultiQC - pattern: "*.{yml,yaml}" - - extra_multiqc_config: - type: file - description: Second optional config yml for MultiQC. Will override common sections in multiqc_config. - pattern: "*.{yml,yaml}" - - multiqc_logo: - type: file - description: Optional logo file for MultiQC - pattern: "*.{png}" - - replace_names: - type: file - description: | - Optional two-column sample renaming file. First column a set of - patterns, second column a set of corresponding replacements. Passed via - MultiQC's `--replace-names` option. - pattern: "*.{tsv}" - - sample_names: - type: file - description: | - Optional TSV file with headers, passed to the MultiQC --sample_names - argument. - pattern: "*.{tsv}" + - - multiqc_files: + type: file + description: | + List of reports / files recognised by MultiQC, for example the html and zip output of FastQC + - - multiqc_config: + type: file + description: Optional config yml for MultiQC + pattern: "*.{yml,yaml}" + - - extra_multiqc_config: + type: file + description: Second optional config yml for MultiQC. Will override common sections + in multiqc_config. + pattern: "*.{yml,yaml}" + - - multiqc_logo: + type: file + description: Optional logo file for MultiQC + pattern: "*.{png}" + - - replace_names: + type: file + description: | + Optional two-column sample renaming file. 
First column a set of + patterns, second column a set of corresponding replacements. Passed via + MultiQC's `--replace-names` option. + pattern: "*.{tsv}" + - - sample_names: + type: file + description: | + Optional TSV file with headers, passed to the MultiQC --sample_names + argument. + pattern: "*.{tsv}" output: - report: - type: file - description: MultiQC report file - pattern: "multiqc_report.html" + - "*multiqc_report.html": + type: file + description: MultiQC report file + pattern: "multiqc_report.html" - data: - type: directory - description: MultiQC data dir - pattern: "multiqc_data" + - "*_data": + type: directory + description: MultiQC data dir + pattern: "multiqc_data" - plots: - type: file - description: Plots created by MultiQC - pattern: "*_data" + - "*_plots": + type: file + description: Plots created by MultiQC + pattern: "*_data" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@abhi18av" - "@bunop" diff --git a/modules/nf-core/multiqc/tests/main.nf.test.snap b/modules/nf-core/multiqc/tests/main.nf.test.snap index 83fa080c..2fcbb5ff 100644 --- a/modules/nf-core/multiqc/tests/main.nf.test.snap +++ b/modules/nf-core/multiqc/tests/main.nf.test.snap @@ -2,14 +2,14 @@ "multiqc_versions_single": { "content": [ [ - "versions.yml:md5,6eb13f3b11bbcbfc98ad3166420ff760" + "versions.yml:md5,41f391dcedce7f93ca188f3a3ffa0916" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-07-10T12:41:34.562023" + "timestamp": "2024-10-02T17:51:46.317523" }, "multiqc_stub": { "content": [ @@ -17,25 +17,25 @@ "multiqc_report.html", "multiqc_data", "multiqc_plots", - "versions.yml:md5,6eb13f3b11bbcbfc98ad3166420ff760" + "versions.yml:md5,41f391dcedce7f93ca188f3a3ffa0916" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-07-10T11:27:11.933869532" + "timestamp": "2024-10-02T17:52:20.680978" }, "multiqc_versions_config": { "content": [ [ - "versions.yml:md5,6eb13f3b11bbcbfc98ad3166420ff760" + "versions.yml:md5,41f391dcedce7f93ca188f3a3ffa0916" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-07-10T11:26:56.709849369" + "timestamp": "2024-10-02T17:52:09.185842" } -} +} \ No newline at end of file diff --git a/modules/nf-core/rtgtools/format/main.nf b/modules/nf-core/rtgtools/format/main.nf index 802d3b20..9cae7f99 100644 --- a/modules/nf-core/rtgtools/format/main.nf +++ b/modules/nf-core/rtgtools/format/main.nf @@ -46,7 +46,6 @@ process RTGTOOLS_FORMAT { """ stub: - def args = task.ext.args ?: '' def prefix = task.ext.prefix ?: "${meta.id}" def avail_mem = "3G" diff --git a/modules/nf-core/rtgtools/format/meta.yml b/modules/nf-core/rtgtools/format/meta.yml index 1991b807..e09aff3a 100644 --- a/modules/nf-core/rtgtools/format/meta.yml +++ b/modules/nf-core/rtgtools/format/meta.yml @@ -1,5 +1,6 @@ name: "rtgtools_format" -description: Converts the contents of sequence data files (FASTA/FASTQ/SAM/BAM) into the RTG Sequence Data File (SDF) format. +description: Converts the contents of sequence data files (FASTA/FASTQ/SAM/BAM) into + the RTG Sequence Data File (SDF) format. 
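rtg format assumes FASTA input by default, so the new tests added below pass the input format explicitly through ext.args (fastq.config uses --format fastq, sam.config uses --format sam-se); the same hook is available in a user config. A sketch, assuming the process name RTGTOOLS_FORMAT:

// user config sketch: declare paired FASTQ input for rtg format
process {
    withName: 'RTGTOOLS_FORMAT' {
        ext.args = '--format fastq'   // or '--format sam-se' for single-end SAM/BAM
    }
}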
keywords: - rtg - fasta @@ -8,43 +9,49 @@ keywords: - sam tools: - "rtgtools": - description: "RealTimeGenomics Tools -- Utilities for accurate VCF comparison and manipulation" + description: "RealTimeGenomics Tools -- Utilities for accurate VCF comparison + and manipulation" homepage: "https://www.realtimegenomics.com/products/rtg-tools" documentation: "https://github.com/RealTimeGenomics/rtg-tools" tool_dev_url: "https://github.com/RealTimeGenomics/rtg-tools" licence: ["BSD"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input1: - type: file - description: FASTA, FASTQ, BAM or SAM file. This should be the left input file when using paired end FASTQ/FASTA data - pattern: "*.{fasta,fa,fna,fastq,fastq.gz,fq,fq.gz,bam,sam}" - - input2: - type: file - description: The right input file when using paired end FASTQ/FASTA data - pattern: "*.{fasta,fa,fna,fastq,fastq.gz,fq,fq.gz}" - - sam_rg: - type: file - description: A file containing a single readgroup header as a SAM header. This can also be supplied as a string in `task.ext.args` as `--sam-rg `. - pattern: "*.{txt,sam}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input1: + type: file + description: FASTA, FASTQ, BAM or SAM file. This should be the left input file + when using paired end FASTQ/FASTA data + pattern: "*.{fasta,fa,fna,fastq,fastq.gz,fq,fq.gz,bam,sam}" + - input2: + type: file + description: The right input file when using paired end FASTQ/FASTA data + pattern: "*.{fasta,fa,fna,fastq,fastq.gz,fq,fq.gz}" + - sam_rg: + type: file + description: A file containing a single readgroup header as a SAM header. This + can also be supplied as a string in `task.ext.args` as `--sam-rg `. + pattern: "*.{txt,sam}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - sdf: - type: directory - description: The sequence dictionary format folder created from the input file(s) - pattern: "*.sdf" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.sdf": + type: directory + description: The RTG Sequence Data File (SDF) folder created from the input file(s) + pattern: "*.sdf" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@nvnieuwk" maintainers: diff --git a/modules/nf-core/rtgtools/format/tests/fastq.config b/modules/nf-core/rtgtools/format/tests/fastq.config new file mode 100644 index 00000000..24540986 --- /dev/null +++ b/modules/nf-core/rtgtools/format/tests/fastq.config @@ -0,0 +1,3 @@ +process { + ext.args = "--format fastq" +} diff --git a/modules/nf-core/rtgtools/format/tests/main.nf.test b/modules/nf-core/rtgtools/format/tests/main.nf.test new file mode 100644 index 00000000..d3b39800 --- /dev/null +++ b/modules/nf-core/rtgtools/format/tests/main.nf.test @@ -0,0 +1,125 @@ +nextflow_process { + + name "Test Process RTGTOOLS_FORMAT" + script "../main.nf" + process "RTGTOOLS_FORMAT" + + tag "modules" + tag "modules_nfcore" + tag "rtgtools" + tag "rtgtools/format" + + test("sarscov2 - fasta") { + + when { + process { + """ + input[0] = [ + [id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true), + [], + [] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + path(process.out.sdf[0][1]).list().collect { file(it.toString()).name }, + process.out.versions + ).match() } + ) + } + + } + + test("sarscov2 - fastqs") { + + config "./fastq.config" + + when { + process { + """ + input[0] = [ + [id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_1.fastq.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/fastq/test_2.fastq.gz', checkIfExists: true), + [] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + path(process.out.sdf[0][1]).list().collect { file(it.toString()).name }, + process.out.versions + ).match() } + ) + } + + } + + test("sarscov2 - bam, rg") { + + config "./sam.config" + + when { + process { + """ + def rg = Channel.of("@RG\tID:READGROUP1\tSM:SAMPLE\tPL:ILLUMINA") + .collectFile(name:'sam_rg.txt') + + input[0] = Channel.of([ + [id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/bam/test.single_end.bam', checkIfExists: true), + [] + ]).combine(rg) + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot( + path(process.out.sdf[0][1]).list().collect { file(it.toString()).name }, + process.out.versions + ).match() } + ) + } + + } + + test("sarscov2 - fasta - stub") { + + options "-stub" + + when { + process { + """ + input[0] = [ + [id:'test'], + file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/genome.fasta', checkIfExists: true), + [], + [] + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/rtgtools/format/tests/main.nf.test.snap b/modules/nf-core/rtgtools/format/tests/main.nf.test.snap new file mode 100644 index 00000000..1494221b --- /dev/null +++
b/modules/nf-core/rtgtools/format/tests/main.nf.test.snap @@ -0,0 +1,118 @@ +{ + "sarscov2 - fasta - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.sdf:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "1": [ + "versions.yml:md5,2ed5b05daa28126a8c34ab9e30f0f3b5" + ], + "sdf": [ + [ + { + "id": "test" + }, + "test.sdf:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "versions": [ + "versions.yml:md5,2ed5b05daa28126a8c34ab9e30f0f3b5" + ] + } + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-04T16:20:31.290508593" + }, + "sarscov2 - fasta": { + "content": [ + [ + "done", + "format.log", + "mainIndex", + "nameIndex0", + "namedata0", + "namepointer0", + "progress", + "seqdata0", + "seqpointer0", + "sequenceIndex0", + "suffixIndex0", + "suffixdata0", + "suffixpointer0", + "summary.txt" + ], + [ + "versions.yml:md5,2ed5b05daa28126a8c34ab9e30f0f3b5" + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-05T12:26:34.333493927" + }, + "sarscov2 - fastqs": { + "content": [ + [ + "done", + "format.log", + "mainIndex", + "nameIndex0", + "namedata0", + "namepointer0", + "progress", + "qualitydata0", + "seqdata0", + "seqpointer0", + "sequenceIndex0", + "suffixIndex0", + "suffixdata0", + "suffixpointer0", + "summary.txt" + ], + [ + "versions.yml:md5,2ed5b05daa28126a8c34ab9e30f0f3b5" + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-05T12:26:56.018604071" + }, + "sarscov2 - bam, rg": { + "content": [ + [ + "done", + "format.log", + "mainIndex", + "nameIndex0", + "namedata0", + "namepointer0", + "progress", + "qualitydata0", + "seqdata0", + "seqpointer0", + "sequenceIndex0", + "summary.txt" + ], + [ + "versions.yml:md5,2ed5b05daa28126a8c34ab9e30f0f3b5" + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-05T12:27:10.103855977" + } +} \ No newline at end of file diff --git a/modules/nf-core/rtgtools/format/tests/sam.config b/modules/nf-core/rtgtools/format/tests/sam.config new file mode 100644 index 00000000..cd57e8b6 --- /dev/null +++ b/modules/nf-core/rtgtools/format/tests/sam.config @@ -0,0 +1,3 @@ +process { + ext.args = "--format sam-se" +} \ No newline at end of file diff --git a/modules/nf-core/rtgtools/pedfilter/meta.yml b/modules/nf-core/rtgtools/pedfilter/meta.yml index 4e90fd7b..c8eee99d 100644 --- a/modules/nf-core/rtgtools/pedfilter/meta.yml +++ b/modules/nf-core/rtgtools/pedfilter/meta.yml @@ -7,37 +7,41 @@ keywords: - ped tools: - "rtgtools": - description: "RealTimeGenomics Tools -- Utilities for accurate VCF comparison and manipulation" + description: "RealTimeGenomics Tools -- Utilities for accurate VCF comparison + and manipulation" homepage: "https://www.realtimegenomics.com/products/rtg-tools" documentation: "https://github.com/RealTimeGenomics/rtg-tools" tool_dev_url: "https://github.com/RealTimeGenomics/rtg-tools" licence: ["BSD"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: The input file, can be either a PED or a VCF file - pattern: "*.{vcf,vcf.gz,ped}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - input: + type: file + description: The input file, can be either a PED or a VCF file + pattern: "*.{vcf,vcf.gz,ped}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - output: - type: file - description: | - The output file, can be either a filtered PED file - or a VCF file containing the PED headers (needs --vcf as argument) - pattern: "*.{vcf.gz,ped}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.{vcf.gz,ped}": + type: file + description: | + The output file, can be either a filtered PED file + or a VCF file containing the PED headers (needs --vcf as argument) + pattern: "*.{vcf.gz,ped}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@nvnieuwk" maintainers: diff --git a/modules/nf-core/rtgtools/pedfilter/rtgtools-pedfilter.diff b/modules/nf-core/rtgtools/pedfilter/rtgtools-pedfilter.diff index 18590bf8..518588a2 100644 --- a/modules/nf-core/rtgtools/pedfilter/rtgtools-pedfilter.diff +++ b/modules/nf-core/rtgtools/pedfilter/rtgtools-pedfilter.diff @@ -14,6 +14,71 @@ Changes in 'rtgtools/pedfilter/main.nf': 'modules/nf-core/rtgtools/pedfilter/environment.yml' is unchanged 'modules/nf-core/rtgtools/pedfilter/meta.yml' is unchanged 'modules/nf-core/rtgtools/pedfilter/tests/main.nf.test' is unchanged -'modules/nf-core/rtgtools/pedfilter/tests/main.nf.test.snap' is unchanged +Changes in 'rtgtools/pedfilter/tests/main.nf.test.snap': +--- modules/nf-core/rtgtools/pedfilter/tests/main.nf.test.snap ++++ modules/nf-core/rtgtools/pedfilter/tests/main.nf.test.snap +@@ -8,7 +8,7 @@ + "id": "test", + "single_end": false + }, +- "test.ped:md5,342135c8bf22e573367b75ef5e1c5e6b" ++ "test.ped:md5,16e5773a0aaaa27870b0601e572f24b5" + ] + ], + "1": [ +@@ -20,7 +20,7 @@ + "id": "test", + "single_end": false + }, +- "test.ped:md5,342135c8bf22e573367b75ef5e1c5e6b" ++ "test.ped:md5,16e5773a0aaaa27870b0601e572f24b5" + ] + ], + "versions": [ +@@ -29,10 +29,10 @@ + } + ], + "meta": { +- "nf-test": "0.8.4", +- "nextflow": "24.04.4" ++ "nf-test": "0.9.1", ++ "nextflow": "24.10.0" + }, +- "timestamp": "2024-08-23T16:11:01.797205" ++ "timestamp": "2024-11-20T13:14:39.858764858" + }, + "test-rtgtools-pedfilter-vcf-output": { + "content": [ +@@ -62,7 +62,7 @@ + "id": "test", + "single_end": false + }, +- "test.ped:md5,a8b8f6538e1738d6e06fddfe15d36f09" ++ "test.ped:md5,88d121b6ce2b3d8b0a6b0098b85db865" + ] + ], + "1": [ +@@ -74,7 +74,7 @@ + "id": "test", + "single_end": false + }, +- "test.ped:md5,a8b8f6538e1738d6e06fddfe15d36f09" ++ "test.ped:md5,88d121b6ce2b3d8b0a6b0098b85db865" + ] + ], + "versions": [ +@@ -83,9 +83,9 @@ + } + ], + "meta": { +- "nf-test": "0.8.4", +- "nextflow": "24.04.4" ++ "nf-test": "0.9.1", ++ "nextflow": "24.10.0" + }, +- "timestamp": "2024-08-23T16:10:53.473351" ++ "timestamp": "2024-11-20T13:14:31.005949979" + } + } 'modules/nf-core/rtgtools/pedfilter/tests/nextflow.config' is unchanged ************************************************************ diff --git a/modules/nf-core/rtgtools/pedfilter/tests/main.nf.test.snap b/modules/nf-core/rtgtools/pedfilter/tests/main.nf.test.snap index 7475fa8b..6b0a6970 100644 --- a/modules/nf-core/rtgtools/pedfilter/tests/main.nf.test.snap +++ 
b/modules/nf-core/rtgtools/pedfilter/tests/main.nf.test.snap @@ -8,7 +8,7 @@ "id": "test", "single_end": false }, - "test.ped:md5,342135c8bf22e573367b75ef5e1c5e6b" + "test.ped:md5,16e5773a0aaaa27870b0601e572f24b5" ] ], "1": [ @@ -20,7 +20,7 @@ "id": "test", "single_end": false }, - "test.ped:md5,342135c8bf22e573367b75ef5e1c5e6b" + "test.ped:md5,16e5773a0aaaa27870b0601e572f24b5" ] ], "versions": [ @@ -29,10 +29,10 @@ } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-08-23T16:11:01.797205" + "timestamp": "2024-11-20T13:14:39.858764858" }, "test-rtgtools-pedfilter-vcf-output": { "content": [ @@ -62,7 +62,7 @@ "id": "test", "single_end": false }, - "test.ped:md5,a8b8f6538e1738d6e06fddfe15d36f09" + "test.ped:md5,88d121b6ce2b3d8b0a6b0098b85db865" ] ], "1": [ @@ -74,7 +74,7 @@ "id": "test", "single_end": false }, - "test.ped:md5,a8b8f6538e1738d6e06fddfe15d36f09" + "test.ped:md5,88d121b6ce2b3d8b0a6b0098b85db865" ] ], "versions": [ @@ -83,9 +83,9 @@ } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-08-23T16:10:53.473351" + "timestamp": "2024-11-20T13:14:31.005949979" } } \ No newline at end of file diff --git a/modules/nf-core/rtgtools/rocplot/meta.yml b/modules/nf-core/rtgtools/rocplot/meta.yml index 2b4d43d7..8fccb318 100644 --- a/modules/nf-core/rtgtools/rocplot/meta.yml +++ b/modules/nf-core/rtgtools/rocplot/meta.yml @@ -1,5 +1,6 @@ name: "rtgtools_rocplot" -description: Plot ROC curves from vcfeval ROC data files, either to an image, or an interactive GUI. The interactive GUI isn't possible for nextflow. +description: Plot ROC curves from vcfeval ROC data files, either to an image, or an + interactive GUI. The interactive GUI isn't possible for nextflow. keywords: - rtgtools - rocplot @@ -7,39 +8,49 @@ keywords: - vcf tools: - "rtgtools": - description: "RealTimeGenomics Tools -- Utilities for accurate VCF comparison and manipulation" + description: "RealTimeGenomics Tools -- Utilities for accurate VCF comparison + and manipulation" homepage: "https://www.realtimegenomics.com/products/rtg-tools" documentation: "https://github.com/RealTimeGenomics/rtg-tools" tool_dev_url: "https://github.com/RealTimeGenomics/rtg-tools" licence: ["BSD"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: Input TSV ROC files created with RTGTOOLS_VCFEVAL - pattern: "*.tsv.gz" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: Input TSV ROC files created with RTGTOOLS_VCFEVAL + pattern: "*.tsv.gz" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - png: - type: file - description: The resulting rocplot in PNG format - pattern: "*.png" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.png": + type: file + description: The resulting rocplot in PNG format + pattern: "*.png" - svg: - type: file - description: The resulting rocplot in SVG format - pattern: "*.svg" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.svg": + type: file + description: The resulting rocplot in SVG format + pattern: "*.svg" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@nvnieuwk" maintainers: diff --git a/modules/nf-core/rtgtools/rocplot/tests/main.nf.test b/modules/nf-core/rtgtools/rocplot/tests/main.nf.test index c7c37084..52c5386d 100644 --- a/modules/nf-core/rtgtools/rocplot/tests/main.nf.test +++ b/modules/nf-core/rtgtools/rocplot/tests/main.nf.test @@ -35,8 +35,8 @@ nextflow_process { file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test2_haplotc.vcf.gz.tbi', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test2_haplotc.ann.vcf.gz', checkIfExists: true), file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test2_haplotc.ann.vcf.gz.tbi', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed', checkIfExists: true), - file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true) + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed', checkIfExists: true) ] input[1] = UNTAR.out.untar println(projectDir) diff --git a/modules/nf-core/rtgtools/vcfeval/main.nf b/modules/nf-core/rtgtools/vcfeval/main.nf index 98f9adb1..330a1f3d 100644 --- a/modules/nf-core/rtgtools/vcfeval/main.nf +++ b/modules/nf-core/rtgtools/vcfeval/main.nf @@ -8,7 +8,7 @@ process RTGTOOLS_VCFEVAL { 'biocontainers/rtg-tools:3.12.1--hdfd78af_0' }" input: - tuple val(meta), path(query_vcf), path(query_vcf_tbi), path(truth_vcf), path(truth_vcf_tbi), path(truth_bed), path(evaluation_bed) + tuple val(meta), path(query_vcf), path(query_vcf_tbi), path(truth_vcf), path(truth_vcf_tbi), path(truth_bed), path(regions_bed) tuple val(meta2), path(sdf) output: @@ -33,8 +33,8 @@ process RTGTOOLS_VCFEVAL { script: def args = task.ext.args ?: "" def prefix = task.ext.prefix ?: "${meta.id}" - def bed_regions = truth_bed ? "--bed-regions=${truth_bed}" : "" - def eval_regions = evaluation_bed ? "--evaluation-regions=${evaluation_bed}" : "" + def bed_regions = regions_bed ? "--bed-regions=${regions_bed}" : "" + def eval_regions = truth_bed ? "--evaluation-regions=${truth_bed}" : "" def truth_index = truth_vcf_tbi ? "" : "rtg index ${truth_vcf}" def query_index = query_vcf_tbi ? 
"" : "rtg index ${query_vcf}" def avail_mem = task.memory.toGiga() + "G" @@ -68,17 +68,17 @@ process RTGTOOLS_VCFEVAL { def prefix = task.ext.prefix ?: "${meta.id}" """ - touch ${prefix}.tp.vcf.gz + echo | gzip > ${prefix}.tp.vcf.gz touch ${prefix}.tp.vcf.gz.tbi - touch ${prefix}.fn.vcf.gz + echo | gzip > ${prefix}.fn.vcf.gz touch ${prefix}.fn.vcf.gz.tbi - touch ${prefix}.fp.vcf.gz + echo | gzip > ${prefix}.fp.vcf.gz touch ${prefix}.fp.vcf.gz.tbi - touch ${prefix}.tp-baseline.vcf.gz + echo | gzip > ${prefix}.tp-baseline.vcf.gz touch ${prefix}.tp-baseline.vcf.gz.tbi - touch ${prefix}.snp_roc.tsv.gz - touch ${prefix}.non_snp_roc.tsv.gz - touch ${prefix}.weighted_roc.tsv.gz + echo | gzip > ${prefix}.snp_roc.tsv.gz + echo | gzip > ${prefix}.non_snp_roc.tsv.gz + echo | gzip > ${prefix}.weighted_roc.tsv.gz touch ${prefix}.summary.txt touch ${prefix}.phasing.txt diff --git a/modules/nf-core/rtgtools/vcfeval/meta.yml b/modules/nf-core/rtgtools/vcfeval/meta.yml index 5023ac91..4c59bab5 100644 --- a/modules/nf-core/rtgtools/vcfeval/meta.yml +++ b/modules/nf-core/rtgtools/vcfeval/meta.yml @@ -1,111 +1,198 @@ name: "rtgtools_vcfeval" -description: The VCFeval tool of RTG tools. It is used to evaluate called variants for agreement with a baseline variant set +description: The VCFeval tool of RTG tools. It is used to evaluate called variants + for agreement with a baseline variant set keywords: - benchmarking - vcf - rtg-tools tools: - "rtgtools": - description: "RealTimeGenomics Tools -- Utilities for accurate VCF comparison and manipulation" + description: "RealTimeGenomics Tools -- Utilities for accurate VCF comparison + and manipulation" homepage: "https://www.realtimegenomics.com/products/rtg-tools" documentation: "https://github.com/RealTimeGenomics/rtg-tools" tool_dev_url: "https://github.com/RealTimeGenomics/rtg-tools" licence: ["BSD"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - query_vcf: - type: file - description: A VCF with called variants to benchmark against the standard - pattern: "*.{vcf,vcf.gz}" - - query_vcf_index: - type: file - description: The index of the called VCF (optional) - pattern: "*.tbi" - - truth_vcf: - type: file - description: A standard VCF to compare against - pattern: "*.{vcf,vcf.gz}" - - truth_vcf_index: - type: file - description: The index of the standard VCF (optional) - pattern: "*.tbi" - - truth_bed: - type: file - description: A BED file containining the strict regions where VCFeval should only evaluate the fully overlapping variants (optional) - pattern: "*.bed" - - evaluation_bed: - type: file - description: A BED file containing the regions where VCFeval will evaluate every fully and partially overlapping variant (optional) - pattern: "*.bed" - - sdf: - type: file - description: The SDF (RTG Sequence Data File) folder of the reference genome + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - query_vcf: + type: file + description: A VCF with called variants to benchmark against the standard + pattern: "*.{vcf,vcf.gz}" + - query_vcf_tbi: + type: file + description: The index of the VCF file with called variants to benchmark against the standard + pattern: "*.{vcf.gz.tbi,vcf.tbi}" + - truth_vcf: + type: file + description: A standard VCF to compare against + pattern: "*.{vcf,vcf.gz}" + - truth_vcf_tbi: + type: file + description: The index of the standard VCF to compare against + pattern: "*.{vcf.gz.tbi,vcf.tbi}" + - truth_bed: + type: file + description: A BED file containing the strict regions where VCFeval should + only evaluate the fully overlapping variants (optional) + This input should be used to provide the golden truth BED files. + pattern: "*.bed" + - regions_bed: + type: file + description: A BED file containing the regions where VCFeval will evaluate every + fully and partially overlapping variant (optional) + This input should be used to provide the regions used by the analysis. + pattern: "*.bed" + - - meta2: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - sdf: + type: file + description: The SDF (RTG Sequence Data File) folder of the reference genome output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - tp_vcf: - type: file - description: A VCF file for the true positive variants - pattern: "*.tp.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tp.vcf.gz": + type: file + description: A VCF file for the true positive variants + pattern: "*.tp.vcf.gz" - tp_tbi: - type: file - description: The index of the VCF file for the true positive variants - pattern: "*.tp.vcf.gz.tbi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tp.vcf.gz.tbi": + type: file + description: The index of the VCF file for the true positive variants + pattern: "*.tp.vcf.gz.tbi" - fn_vcf: - type: file - description: A VCF file for the false negative variants - pattern: "*.fn.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.fn.vcf.gz": + type: file + description: A VCF file for the false negative variants + pattern: "*.fn.vcf.gz" - fn_tbi: - type: file - description: The index of the VCF file for the false negative variants - pattern: "*.fn.vcf.gz.tbi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.fn.vcf.gz.tbi": + type: file + description: The index of the VCF file for the false negative variants + pattern: "*.fn.vcf.gz.tbi" - fp_vcf: - type: file - description: A VCF file for the false positive variants - pattern: "*.fp.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.fp.vcf.gz": + type: file + description: A VCF file for the false positive variants + pattern: "*.fp.vcf.gz" - fp_tbi: - type: file - description: The index of the VCF file for the false positive variants - pattern: "*.fp.vcf.gz.tbi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g.
[ id:'test', single_end:false ] + - "*.fp.vcf.gz.tbi": + type: file + description: The index of the VCF file for the false positive variants + pattern: "*.fp.vcf.gz.tbi" - baseline_vcf: - type: file - description: A VCF file for the true positive variants from the baseline - pattern: "*.tp-baseline.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tp-baseline.vcf.gz": + type: file + description: A VCF file for the true positive variants from the baseline + pattern: "*.tp-baseline.vcf.gz" - baseline_tbi: - type: file - description: The index of the VCF file for the true positive variants from the baseline - pattern: "*.tp-baseline.vcf.gz.tbi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.tp-baseline.vcf.gz.tbi": + type: file + description: The index of the VCF file for the true positive variants from the + baseline + pattern: "*.tp-baseline.vcf.gz.tbi" - snp_roc: - type: file - description: TSV files containing ROC data for the SNPs - pattern: "*.snp_roc.tsv.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.snp_roc.tsv.gz": + type: file + description: TSV files containing ROC data for the SNPs + pattern: "*.snp_roc.tsv.gz" - non_snp_roc: - type: file - description: TSV files containing ROC data for all variants except SNPs - pattern: "*.non_snp_roc.tsv.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.non_snp_roc.tsv.gz": + type: file + description: TSV files containing ROC data for all variants except SNPs + pattern: "*.non_snp_roc.tsv.gz" - weighted_roc: - type: file - description: TSV files containing weighted ROC data for all variants - pattern: "*.weighted_snp_roc.tsv.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.weighted_roc.tsv.gz": + type: file + description: TSV files containing weighted ROC data for all variants + pattern: "*.weighted_snp_roc.tsv.gz" - summary: - type: file - description: A TXT file containing the summary of the evaluation - pattern: "*.summary.txt" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.summary.txt": + type: file + description: A TXT file containing the summary of the evaluation + pattern: "*.summary.txt" - phasing: - type: file - description: A TXT file containing the data on the phasing - pattern: "*.phasing.txt" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.phasing.txt": + type: file + description: A TXT file containing the data on the phasing + pattern: "*.phasing.txt" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@nvnieuwk" maintainers: diff --git a/modules/nf-core/rtgtools/vcfeval/tests/main.nf.test b/modules/nf-core/rtgtools/vcfeval/tests/main.nf.test new file mode 100644 index 00000000..55abc842 --- /dev/null +++ b/modules/nf-core/rtgtools/vcfeval/tests/main.nf.test @@ -0,0 +1,113 @@ +nextflow_process { + + name "Test Process RTGTOOLS_VCFEVAL" + script "../main.nf" + process "RTGTOOLS_VCFEVAL" + + tag "modules" + tag "modules_nfcore" + tag "rtgtools" + tag "rtgtools/vcfeval" + tag "untar" + + setup { + run("UNTAR") { + script "../../../untar/main.nf" + process { + """ + input[0] = Channel.value([ + [id:'test'], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/genome_sdf.tar.gz', checkIfExists:true) + ]) + """ + } + } + } + + test("homo_sapiens - [vcf, tbi, truth, truth_tbi, truth_bed, regions_bed], sdf") { + + when { + process { + """ + input[0] = Channel.of([ + [id:'test'], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test2_haplotc.vcf.gz', checkIfExists:true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test2_haplotc.vcf.gz.tbi', checkIfExists:true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test2_haplotc.ann.vcf.gz', checkIfExists:true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test2_haplotc.ann.vcf.gz.tbi', checkIfExists:true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists:true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed', checkIfExists:true) + ]) + input[1] = UNTAR.out.untar + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("homo_sapiens - [vcf, [], truth, [], [], []], sdf") { + + when { + process { + """ + input[0] = Channel.of([ + [id:'test'], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test2_haplotc.vcf.gz', checkIfExists:true), + [], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test2_haplotc.ann.vcf.gz', checkIfExists:true), + [], + [], + [] + ]) + input[1] = UNTAR.out.untar + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + + test("homo_sapiens - [vcf, tbi, truth, truth_tbi, truth_bed, regions_bed], sdf - stub") { + + options "-stub" + + when { + process { + """ + input[0] = Channel.of([ + [id:'test'], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test2_haplotc.vcf.gz', checkIfExists:true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test2_haplotc.vcf.gz.tbi', checkIfExists:true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test2_haplotc.ann.vcf.gz', checkIfExists:true), + file(params.modules_testdata_base_path + 
'genomics/homo_sapiens/illumina/gatk/haplotypecaller_calls/test2_haplotc.ann.vcf.gz.tbi', checkIfExists:true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists:true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/chr21/sequence/multi_intervals.bed', checkIfExists:true) + ]) + input[1] = UNTAR.out.untar + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + } + + } + +} diff --git a/modules/nf-core/rtgtools/vcfeval/tests/main.nf.test.snap b/modules/nf-core/rtgtools/vcfeval/tests/main.nf.test.snap new file mode 100644 index 00000000..4f39e2d4 --- /dev/null +++ b/modules/nf-core/rtgtools/vcfeval/tests/main.nf.test.snap @@ -0,0 +1,677 @@ +{ + "homo_sapiens - [vcf, tbi, truth, truth_tbi, truth_bed, regions_bed], sdf": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.tp.vcf.gz:md5,5171021307097220337dbcaccc860495" + ] + ], + "1": [ + [ + { + "id": "test" + }, + "test.tp.vcf.gz.tbi:md5,092a7a3162e7cff25d273525751eb284" + ] + ], + "10": [ + [ + { + "id": "test" + }, + "test.weighted_roc.tsv.gz:md5,de36bf613b3dacf4a043311336bb4a94" + ] + ], + "11": [ + [ + { + "id": "test" + }, + "test.summary.txt:md5,f4c8df93c8bdab603036bbc27b4a28c3" + ] + ], + "12": [ + [ + { + "id": "test" + }, + "test.phasing.txt:md5,31988234bee208cacb3de90dabe1797f" + ] + ], + "13": [ + "versions.yml:md5,a228f0d9e8b205b4cc7c485151a77bb0" + ], + "2": [ + [ + { + "id": "test" + }, + "test.fn.vcf.gz:md5,fc419367818700d47df073615aeb9077" + ] + ], + "3": [ + [ + { + "id": "test" + }, + "test.fn.vcf.gz.tbi:md5,092a7a3162e7cff25d273525751eb284" + ] + ], + "4": [ + [ + { + "id": "test" + }, + "test.fp.vcf.gz:md5,5171021307097220337dbcaccc860495" + ] + ], + "5": [ + [ + { + "id": "test" + }, + "test.fp.vcf.gz.tbi:md5,092a7a3162e7cff25d273525751eb284" + ] + ], + "6": [ + [ + { + "id": "test" + }, + "test.tp-baseline.vcf.gz:md5,fc419367818700d47df073615aeb9077" + ] + ], + "7": [ + [ + { + "id": "test" + }, + "test.tp-baseline.vcf.gz.tbi:md5,092a7a3162e7cff25d273525751eb284" + ] + ], + "8": [ + [ + { + "id": "test" + }, + "test.snp_roc.tsv.gz:md5,11d7393a16c25ac0a092382fecafee9b" + ] + ], + "9": [ + [ + { + "id": "test" + }, + "test.non_snp_roc.tsv.gz:md5,eb0910409b8b088655defbd152103b81" + ] + ], + "baseline_tbi": [ + [ + { + "id": "test" + }, + "test.tp-baseline.vcf.gz.tbi:md5,092a7a3162e7cff25d273525751eb284" + ] + ], + "baseline_vcf": [ + [ + { + "id": "test" + }, + "test.tp-baseline.vcf.gz:md5,fc419367818700d47df073615aeb9077" + ] + ], + "fn_tbi": [ + [ + { + "id": "test" + }, + "test.fn.vcf.gz.tbi:md5,092a7a3162e7cff25d273525751eb284" + ] + ], + "fn_vcf": [ + [ + { + "id": "test" + }, + "test.fn.vcf.gz:md5,fc419367818700d47df073615aeb9077" + ] + ], + "fp_tbi": [ + [ + { + "id": "test" + }, + "test.fp.vcf.gz.tbi:md5,092a7a3162e7cff25d273525751eb284" + ] + ], + "fp_vcf": [ + [ + { + "id": "test" + }, + "test.fp.vcf.gz:md5,5171021307097220337dbcaccc860495" + ] + ], + "non_snp_roc": [ + [ + { + "id": "test" + }, + "test.non_snp_roc.tsv.gz:md5,eb0910409b8b088655defbd152103b81" + ] + ], + "phasing": [ + [ + { + "id": "test" + }, + "test.phasing.txt:md5,31988234bee208cacb3de90dabe1797f" + ] + ], + "snp_roc": [ + [ + { + "id": "test" + }, + "test.snp_roc.tsv.gz:md5,11d7393a16c25ac0a092382fecafee9b" + ] + ], + "summary": [ + [ + { + "id": "test" + }, + "test.summary.txt:md5,f4c8df93c8bdab603036bbc27b4a28c3" + ] + ], + "tp_tbi": [ + [ + { + "id": "test" + }, + 
"test.tp.vcf.gz.tbi:md5,092a7a3162e7cff25d273525751eb284" + ] + ], + "tp_vcf": [ + [ + { + "id": "test" + }, + "test.tp.vcf.gz:md5,5171021307097220337dbcaccc860495" + ] + ], + "versions": [ + "versions.yml:md5,a228f0d9e8b205b4cc7c485151a77bb0" + ], + "weighted_roc": [ + [ + { + "id": "test" + }, + "test.weighted_roc.tsv.gz:md5,de36bf613b3dacf4a043311336bb4a94" + ] + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-30T15:17:31.564974666" + }, + "homo_sapiens - [vcf, [], truth, [], [], []], sdf": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.tp.vcf.gz:md5,5125ee41457c9d93f46b19e32788edb4" + ] + ], + "1": [ + [ + { + "id": "test" + }, + "test.tp.vcf.gz.tbi:md5,a0e9ac2d38c04bd591ab8f857c5c9133" + ] + ], + "10": [ + [ + { + "id": "test" + }, + "test.weighted_roc.tsv.gz:md5,5dfacd641b080cc8ad22eebec015c698" + ] + ], + "11": [ + [ + { + "id": "test" + }, + "test.summary.txt:md5,f33feb32f84958fb931063044fba369b" + ] + ], + "12": [ + [ + { + "id": "test" + }, + "test.phasing.txt:md5,133677dbd8be657439ea2b03fdfb8795" + ] + ], + "13": [ + "versions.yml:md5,a228f0d9e8b205b4cc7c485151a77bb0" + ], + "2": [ + [ + { + "id": "test" + }, + "test.fn.vcf.gz:md5,df96e4e4014cdb3050cb6f221f0cdca9" + ] + ], + "3": [ + [ + { + "id": "test" + }, + "test.fn.vcf.gz.tbi:md5,092a7a3162e7cff25d273525751eb284" + ] + ], + "4": [ + [ + { + "id": "test" + }, + "test.fp.vcf.gz:md5,d4bfa2c7271351ca19589f0f57f210b7" + ] + ], + "5": [ + [ + { + "id": "test" + }, + "test.fp.vcf.gz.tbi:md5,092a7a3162e7cff25d273525751eb284" + ] + ], + "6": [ + [ + { + "id": "test" + }, + "test.tp-baseline.vcf.gz:md5,920af25c3c18a438b11440702562fa35" + ] + ], + "7": [ + [ + { + "id": "test" + }, + "test.tp-baseline.vcf.gz.tbi:md5,95938320b425e28cf06c45ab45ad0360" + ] + ], + "8": [ + [ + { + "id": "test" + }, + "test.snp_roc.tsv.gz:md5,85edc0101bb9e8d3edc11abe4fdcda93" + ] + ], + "9": [ + [ + { + "id": "test" + }, + "test.non_snp_roc.tsv.gz:md5,30283ede3bcc5dd247f8a84bf345bf9a" + ] + ], + "baseline_tbi": [ + [ + { + "id": "test" + }, + "test.tp-baseline.vcf.gz.tbi:md5,95938320b425e28cf06c45ab45ad0360" + ] + ], + "baseline_vcf": [ + [ + { + "id": "test" + }, + "test.tp-baseline.vcf.gz:md5,920af25c3c18a438b11440702562fa35" + ] + ], + "fn_tbi": [ + [ + { + "id": "test" + }, + "test.fn.vcf.gz.tbi:md5,092a7a3162e7cff25d273525751eb284" + ] + ], + "fn_vcf": [ + [ + { + "id": "test" + }, + "test.fn.vcf.gz:md5,df96e4e4014cdb3050cb6f221f0cdca9" + ] + ], + "fp_tbi": [ + [ + { + "id": "test" + }, + "test.fp.vcf.gz.tbi:md5,092a7a3162e7cff25d273525751eb284" + ] + ], + "fp_vcf": [ + [ + { + "id": "test" + }, + "test.fp.vcf.gz:md5,d4bfa2c7271351ca19589f0f57f210b7" + ] + ], + "non_snp_roc": [ + [ + { + "id": "test" + }, + "test.non_snp_roc.tsv.gz:md5,30283ede3bcc5dd247f8a84bf345bf9a" + ] + ], + "phasing": [ + [ + { + "id": "test" + }, + "test.phasing.txt:md5,133677dbd8be657439ea2b03fdfb8795" + ] + ], + "snp_roc": [ + [ + { + "id": "test" + }, + "test.snp_roc.tsv.gz:md5,85edc0101bb9e8d3edc11abe4fdcda93" + ] + ], + "summary": [ + [ + { + "id": "test" + }, + "test.summary.txt:md5,f33feb32f84958fb931063044fba369b" + ] + ], + "tp_tbi": [ + [ + { + "id": "test" + }, + "test.tp.vcf.gz.tbi:md5,a0e9ac2d38c04bd591ab8f857c5c9133" + ] + ], + "tp_vcf": [ + [ + { + "id": "test" + }, + "test.tp.vcf.gz:md5,5125ee41457c9d93f46b19e32788edb4" + ] + ], + "versions": [ + "versions.yml:md5,a228f0d9e8b205b4cc7c485151a77bb0" + ], + "weighted_roc": [ + [ + { + "id": "test" + }, + 
"test.weighted_roc.tsv.gz:md5,5dfacd641b080cc8ad22eebec015c698" + ] + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-30T15:18:04.344989466" + }, + "homo_sapiens - [vcf, tbi, truth, truth_tbi, truth_bed, regions_bed], sdf - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.tp.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + [ + { + "id": "test" + }, + "test.tp.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "10": [ + [ + { + "id": "test" + }, + "test.weighted_roc.tsv.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "11": [ + [ + { + "id": "test" + }, + "test.summary.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "12": [ + [ + { + "id": "test" + }, + "test.phasing.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "13": [ + "versions.yml:md5,a228f0d9e8b205b4cc7c485151a77bb0" + ], + "2": [ + [ + { + "id": "test" + }, + "test.fn.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "3": [ + [ + { + "id": "test" + }, + "test.fn.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ + [ + { + "id": "test" + }, + "test.fp.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "5": [ + [ + { + "id": "test" + }, + "test.fp.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "6": [ + [ + { + "id": "test" + }, + "test.tp-baseline.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "7": [ + [ + { + "id": "test" + }, + "test.tp-baseline.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "8": [ + [ + { + "id": "test" + }, + "test.snp_roc.tsv.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "9": [ + [ + { + "id": "test" + }, + "test.non_snp_roc.tsv.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "baseline_tbi": [ + [ + { + "id": "test" + }, + "test.tp-baseline.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "baseline_vcf": [ + [ + { + "id": "test" + }, + "test.tp-baseline.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "fn_tbi": [ + [ + { + "id": "test" + }, + "test.fn.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fn_vcf": [ + [ + { + "id": "test" + }, + "test.fn.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "fp_tbi": [ + [ + { + "id": "test" + }, + "test.fp.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "fp_vcf": [ + [ + { + "id": "test" + }, + "test.fp.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "non_snp_roc": [ + [ + { + "id": "test" + }, + "test.non_snp_roc.tsv.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "phasing": [ + [ + { + "id": "test" + }, + "test.phasing.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "snp_roc": [ + [ + { + "id": "test" + }, + "test.snp_roc.tsv.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "summary": [ + [ + { + "id": "test" + }, + "test.summary.txt:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tp_tbi": [ + [ + { + "id": "test" + }, + "test.tp.vcf.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "tp_vcf": [ + [ + { + "id": "test" + }, + "test.tp.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,a228f0d9e8b205b4cc7c485151a77bb0" + ], + "weighted_roc": [ + [ + { + "id": "test" + }, + "test.weighted_roc.tsv.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-09-30T15:23:21.165461388" + } +} \ No newline at end of file diff --git a/modules/nf-core/rtgtools/vcfeval/tests/nextflow.config 
b/modules/nf-core/rtgtools/vcfeval/tests/nextflow.config new file mode 100644 index 00000000..75635215 --- /dev/null +++ b/modules/nf-core/rtgtools/vcfeval/tests/nextflow.config @@ -0,0 +1,4 @@ +process { + withName: UNTAR { + } +} diff --git a/modules/nf-core/samtools/convert/environment.yml b/modules/nf-core/samtools/convert/environment.yml index da2df5e4..62054fc9 100644 --- a/modules/nf-core/samtools/convert/environment.yml +++ b/modules/nf-core/samtools/convert/environment.yml @@ -1,6 +1,8 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda dependencies: - - bioconda::samtools=1.20 - - bioconda::htslib=1.20 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git a/modules/nf-core/samtools/convert/main.nf b/modules/nf-core/samtools/convert/main.nf index 03b7b525..cf9253d1 100644 --- a/modules/nf-core/samtools/convert/main.nf +++ b/modules/nf-core/samtools/convert/main.nf @@ -4,8 +4,8 @@ process SAMTOOLS_CONVERT { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/samtools:1.20--h50ea8bc_0' : - 'biocontainers/samtools:1.20--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta), path(input), path(index) diff --git a/modules/nf-core/samtools/convert/meta.yml b/modules/nf-core/samtools/convert/meta.yml index 55828971..d5bfa161 100644 --- a/modules/nf-core/samtools/convert/meta.yml +++ b/modules/nf-core/samtools/convert/meta.yml @@ -15,50 +15,85 @@ tools: documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM file - pattern: "*.{bam,cram}" - - index: - type: file - description: BAM/CRAM index file - pattern: "*.{bai,crai}" - - fasta: - type: file - description: Reference file to create the CRAM file - pattern: "*.{fasta,fa}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM file + pattern: "*.{bam,cram}" + - index: + type: file + description: BAM/CRAM index file + pattern: "*.{bai,crai}" + - - meta2: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: Reference file to create the CRAM file + pattern: "*.{fasta,fa}" + - - meta3: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - fai: + type: file + description: Reference index file to create the CRAM file + pattern: "*.{fai}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - bam: - type: file - description: filtered/converted BAM file - pattern: "*{.bam}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.bam": + type: file + description: filtered/converted BAM file + pattern: "*{.bam}" - cram: - type: file - description: filtered/converted CRAM file - pattern: "*{cram}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.cram": + type: file + description: filtered/converted CRAM file + pattern: "*{cram}" - bai: - type: file - description: filtered/converted BAM index - pattern: "*{.bai}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bai": + type: file + description: filtered/converted BAM index + pattern: "*{.bai}" - crai: - type: file - description: filtered/converted CRAM index - pattern: "*{.crai}" - - version: - type: file - description: File containing software version - pattern: "*.{version.txt}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.crai": + type: file + description: filtered/converted CRAM index + pattern: "*{.crai}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@FriederikeHanssen" - "@maxulysse" diff --git a/modules/nf-core/samtools/convert/tests/main.nf.test.snap b/modules/nf-core/samtools/convert/tests/main.nf.test.snap index 51362902..a021254e 100644 --- a/modules/nf-core/samtools/convert/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/convert/tests/main.nf.test.snap @@ -22,26 +22,26 @@ "cram_to_bam_versions": { "content": [ [ - "versions.yml:md5,b1040cd80ce16abb9b2c2902b62d5fcd" + "versions.yml:md5,5bc6eb42ab2a1ea6661f8ee998467ad6" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:44:34.618037105" + "timestamp": "2024-09-16T07:52:35.516411351" }, "bam_to_cram_versions": { "content": [ [ - "versions.yml:md5,b1040cd80ce16abb9b2c2902b62d5fcd" + "versions.yml:md5,5bc6eb42ab2a1ea6661f8ee998467ad6" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:44:29.165839679" + "timestamp": "2024-09-16T07:52:24.694454205" }, "stub": { "content": [ @@ -71,7 +71,7 @@ ] ], "4": [ - "versions.yml:md5,b1040cd80ce16abb9b2c2902b62d5fcd" + "versions.yml:md5,5bc6eb42ab2a1ea6661f8ee998467ad6" ], "bai": [ @@ -98,15 +98,15 @@ ] ], "versions": [ - "versions.yml:md5,b1040cd80ce16abb9b2c2902b62d5fcd" + "versions.yml:md5,5bc6eb42ab2a1ea6661f8ee998467ad6" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:44:40.258233921" + "timestamp": "2024-09-16T07:52:45.799885099" }, "bam_to_cram_index": { "content": [ diff --git a/modules/nf-core/samtools/faidx/environment.yml b/modules/nf-core/samtools/faidx/environment.yml index f8450fa5..62054fc9 100644 --- a/modules/nf-core/samtools/faidx/environment.yml +++ b/modules/nf-core/samtools/faidx/environment.yml @@ -1,10 +1,8 @@ -name: samtools_faidx - +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda - - defaults - dependencies: - - bioconda::htslib=1.20 - - bioconda::samtools=1.20 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git a/modules/nf-core/samtools/faidx/main.nf b/modules/nf-core/samtools/faidx/main.nf 
index bdcdbc95..28c0a81c 100644 --- a/modules/nf-core/samtools/faidx/main.nf +++ b/modules/nf-core/samtools/faidx/main.nf @@ -4,8 +4,8 @@ process SAMTOOLS_FAIDX { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/samtools:1.20--h50ea8bc_0' : - 'biocontainers/samtools:1.20--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta), path(fasta) diff --git a/modules/nf-core/samtools/faidx/meta.yml b/modules/nf-core/samtools/faidx/meta.yml index f3c25de2..6721b2cb 100644 --- a/modules/nf-core/samtools/faidx/meta.yml +++ b/modules/nf-core/samtools/faidx/meta.yml @@ -14,47 +14,62 @@ tools: documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'test' ] - - fasta: - type: file - description: FASTA file - pattern: "*.{fa,fasta}" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'test' ] - - fai: - type: file - description: FASTA index file - pattern: "*.{fai}" + - - meta: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test' ] + - fasta: + type: file + description: FASTA file + pattern: "*.{fa,fasta}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'test' ] + - fai: + type: file + description: FASTA index file + pattern: "*.{fai}" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - fa: - type: file - description: FASTA file - pattern: "*.{fa}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.{fa,fasta}": + type: file + description: FASTA file + pattern: "*.{fa,fasta}" - fai: - type: file - description: FASTA index file - pattern: "*.{fai}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.fai": + type: file + description: FASTA index file + pattern: "*.{fai}" - gzi: - type: file - description: Optional gzip index file for compressed inputs - pattern: "*.gzi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g.
[ id:'test', single_end:false ] + - "*.gzi": + type: file + description: Optional gzip index file for compressed inputs + pattern: "*.gzi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@ewels" diff --git a/modules/nf-core/samtools/faidx/tests/main.nf.test.snap b/modules/nf-core/samtools/faidx/tests/main.nf.test.snap index 3223b72b..1bbb3ec2 100644 --- a/modules/nf-core/samtools/faidx/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/faidx/tests/main.nf.test.snap @@ -18,7 +18,7 @@ ], "3": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ], "fa": [ @@ -36,15 +36,15 @@ ], "versions": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:42:14.779784761" + "timestamp": "2024-09-16T07:57:47.450887871" }, "test_samtools_faidx_bgzip": { "content": [ @@ -71,7 +71,7 @@ ] ], "3": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ], "fa": [ @@ -95,15 +95,15 @@ ] ], "versions": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:42:20.256633877" + "timestamp": "2024-09-16T07:58:04.804905659" }, "test_samtools_faidx_fasta": { "content": [ @@ -124,7 +124,7 @@ ], "3": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ], "fa": [ [ @@ -142,15 +142,15 @@ ], "versions": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:42:25.632577273" + "timestamp": "2024-09-16T07:58:23.831268154" }, "test_samtools_faidx_stub_fasta": { "content": [ @@ -171,7 +171,7 @@ ], "3": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ], "fa": [ [ @@ -189,15 +189,15 @@ ], "versions": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:42:31.058424849" + "timestamp": "2024-09-16T07:58:35.600243706" }, "test_samtools_faidx_stub_fai": { "content": [ @@ -218,7 +218,7 @@ ], "3": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ], "fa": [ @@ -236,14 +236,14 @@ ], "versions": [ - "versions.yml:md5,2db78952923a61e05d50b95518b21856" + "versions.yml:md5,6bbe80a2e14bd61202ca63e12d66027f" ] } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:42:36.479929617" + "timestamp": "2024-09-16T07:58:54.705460167" } } \ No newline at end of file diff --git a/modules/nf-core/samtools/index/environment.yml b/modules/nf-core/samtools/index/environment.yml index da2df5e4..62054fc9 100644 --- 
a/modules/nf-core/samtools/index/environment.yml +++ b/modules/nf-core/samtools/index/environment.yml @@ -1,6 +1,8 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda dependencies: - - bioconda::samtools=1.20 - - bioconda::htslib=1.20 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git a/modules/nf-core/samtools/index/main.nf b/modules/nf-core/samtools/index/main.nf index e002585b..31175610 100644 --- a/modules/nf-core/samtools/index/main.nf +++ b/modules/nf-core/samtools/index/main.nf @@ -4,8 +4,8 @@ process SAMTOOLS_INDEX { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/samtools:1.20--h50ea8bc_0' : - 'biocontainers/samtools:1.20--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta), path(input) diff --git a/modules/nf-core/samtools/index/meta.yml b/modules/nf-core/samtools/index/meta.yml index 01a4ee03..db8df0d5 100644 --- a/modules/nf-core/samtools/index/meta.yml +++ b/modules/nf-core/samtools/index/meta.yml @@ -15,38 +15,52 @@ tools: documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - bam: - type: file - description: BAM/CRAM/SAM file - pattern: "*.{bam,cram,sam}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: input file output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - bai: - type: file - description: BAM/CRAM/SAM index file - pattern: "*.{bai,crai,sai}" - - crai: - type: file - description: BAM/CRAM/SAM index file - pattern: "*.{bai,crai,sai}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.bai": + type: file + description: BAM/CRAM/SAM index file + pattern: "*.{bai,crai,sai}" - csi: - type: file - description: CSI index file - pattern: "*.{csi}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.csi": + type: file + description: CSI index file + pattern: "*.{csi}" + - crai: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.crai": + type: file + description: BAM/CRAM/SAM index file + pattern: "*.{bai,crai,sai}" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@ewels" diff --git a/modules/nf-core/samtools/index/tests/main.nf.test.snap b/modules/nf-core/samtools/index/tests/main.nf.test.snap index 799d199c..72d65e81 100644 --- a/modules/nf-core/samtools/index/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/index/tests/main.nf.test.snap @@ -18,7 +18,7 @@ ], "3": [ - "versions.yml:md5,802c9776d9c5e95314e888cf18e96d77" + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" ], "bai": [ @@ -36,15 +36,15 @@ ] ], "versions": [ - "versions.yml:md5,802c9776d9c5e95314e888cf18e96d77" + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" ] } ], "meta": { "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nextflow": "24.04.4" }, - "timestamp": "2024-07-22T16:51:53.9057" + "timestamp": "2024-09-16T08:21:25.261127166" }, "crai - stub": { "content": [ @@ -65,7 +65,7 @@ ] ], "3": [ - "versions.yml:md5,802c9776d9c5e95314e888cf18e96d77" + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" ], "bai": [ @@ -83,15 +83,15 @@ ], "versions": [ - "versions.yml:md5,802c9776d9c5e95314e888cf18e96d77" + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" ] } ], "meta": { "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nextflow": "24.04.4" }, - "timestamp": "2024-07-22T16:51:45.931558" + "timestamp": "2024-09-16T08:21:12.653194876" }, "bai - stub": { "content": [ @@ -112,7 +112,7 @@ ], "3": [ - "versions.yml:md5,802c9776d9c5e95314e888cf18e96d77" + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" ], "bai": [ [ @@ -130,28 +130,28 @@ ], "versions": [ - "versions.yml:md5,802c9776d9c5e95314e888cf18e96d77" + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" ] } ], "meta": { "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nextflow": "24.04.4" }, - "timestamp": "2024-07-22T16:51:34.807525" + "timestamp": "2024-09-16T08:21:01.854932651" }, "csi": { "content": [ "test.paired_end.sorted.bam.csi", [ - "versions.yml:md5,802c9776d9c5e95314e888cf18e96d77" + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" ] ], "meta": { "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nextflow": "24.04.4" }, - "timestamp": "2024-07-22T16:52:55.688799" + "timestamp": "2024-09-16T08:20:51.485364222" }, "crai": { "content": [ @@ -172,7 +172,7 @@ ] ], "3": [ - "versions.yml:md5,802c9776d9c5e95314e888cf18e96d77" + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" ], "bai": [ @@ -190,15 +190,15 @@ ], "versions": [ - "versions.yml:md5,802c9776d9c5e95314e888cf18e96d77" + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" ] } ], "meta": { "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nextflow": "24.04.4" }, - "timestamp": "2024-07-22T16:51:17.609533" + "timestamp": "2024-09-16T08:20:40.518873972" }, "bai": { "content": [ @@ -219,7 +219,7 @@ ], "3": [ - "versions.yml:md5,802c9776d9c5e95314e888cf18e96d77" + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" ], "bai": [ [ @@ -237,14 +237,14 @@ ], "versions": [ - "versions.yml:md5,802c9776d9c5e95314e888cf18e96d77" + "versions.yml:md5,5e09a6fdf76de396728f877193d72315" ] } ], "meta": { "nf-test": "0.9.0", - "nextflow": "24.04.3" + "nextflow": "24.04.4" }, - "timestamp": "2024-07-22T16:51:04.16585" + "timestamp": "2024-09-16T08:20:21.184050361" } } \ No newline at end of file diff --git 
a/modules/nf-core/samtools/merge/environment.yml b/modules/nf-core/samtools/merge/environment.yml index da2df5e4..62054fc9 100644 --- a/modules/nf-core/samtools/merge/environment.yml +++ b/modules/nf-core/samtools/merge/environment.yml @@ -1,6 +1,8 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda dependencies: - - bioconda::samtools=1.20 - - bioconda::htslib=1.20 + - bioconda::htslib=1.21 + - bioconda::samtools=1.21 diff --git a/modules/nf-core/samtools/merge/main.nf b/modules/nf-core/samtools/merge/main.nf index 693b1d80..34da4c7c 100644 --- a/modules/nf-core/samtools/merge/main.nf +++ b/modules/nf-core/samtools/merge/main.nf @@ -4,8 +4,8 @@ process SAMTOOLS_MERGE { conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/samtools:1.20--h50ea8bc_0' : - 'biocontainers/samtools:1.20--h50ea8bc_0' }" + 'https://depot.galaxyproject.org/singularity/samtools:1.21--h50ea8bc_0' : + 'biocontainers/samtools:1.21--h50ea8bc_0' }" input: tuple val(meta), path(input_files, stageAs: "?/*") diff --git a/modules/nf-core/samtools/merge/meta.yml b/modules/nf-core/samtools/merge/meta.yml index 2e8f3dbb..235aa219 100644 --- a/modules/nf-core/samtools/merge/meta.yml +++ b/modules/nf-core/samtools/merge/meta.yml @@ -15,60 +15,81 @@ tools: documentation: http://www.htslib.org/doc/samtools.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:samtools input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input_files: - type: file - description: BAM/CRAM file - pattern: "*.{bam,cram,sam}" - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fasta: - type: file - description: Reference file the CRAM was created with (optional) - pattern: "*.{fasta,fa}" - - meta3: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'genome' ] - - fai: - type: file - description: Index of the reference file the CRAM was created with (optional) - pattern: "*.fai" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input_files: + type: file + description: BAM/CRAM file + pattern: "*.{bam,cram,sam}" + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fasta: + type: file + description: Reference file the CRAM was created with (optional) + pattern: "*.{fasta,fa}" + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'genome' ] + - fai: + type: file + description: Index of the reference file the CRAM was created with (optional) + pattern: "*.fai" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - bam: - type: file - description: BAM file - pattern: "*.{bam}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - ${prefix}.bam: + type: file + description: BAM file + pattern: "*.{bam}" - cram: - type: file - description: CRAM file - pattern: "*.{cram}" - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${prefix}.cram: + type: file + description: CRAM file + pattern: "*.{cram}" - csi: - type: file - description: BAM index file (optional) - pattern: "*.csi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.csi": + type: file + description: BAM index file (optional) + pattern: "*.csi" - crai: - type: file - description: CRAM index file (optional) - pattern: "*.crai" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.crai": + type: file + description: CRAM index file (optional) + pattern: "*.crai" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@drpatelh" - "@yuukiiwa " diff --git a/modules/nf-core/samtools/merge/tests/main.nf.test.snap b/modules/nf-core/samtools/merge/tests/main.nf.test.snap index 17bc846f..0a41e01a 100644 --- a/modules/nf-core/samtools/merge/tests/main.nf.test.snap +++ b/modules/nf-core/samtools/merge/tests/main.nf.test.snap @@ -80,14 +80,14 @@ "bam_versions": { "content": [ [ - "versions.yml:md5,84dab54b9812780df48f5cecef690c34" + "versions.yml:md5,d51d18a97513e370e43f0c891c51dfc4" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:46:35.851936597" + "timestamp": "2024-09-16T09:16:30.476887194" }, "bams_csi": { "content": [ @@ -124,14 +124,14 @@ "bams_stub_versions": { "content": [ [ - "versions.yml:md5,84dab54b9812780df48f5cecef690c34" + "versions.yml:md5,d51d18a97513e370e43f0c891c51dfc4" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:46:41.405707643" + "timestamp": "2024-09-16T09:16:52.203823961" }, "bam_cram": { "content": [ @@ -158,14 +158,14 @@ "bams_versions": { "content": [ [ - "versions.yml:md5,84dab54b9812780df48f5cecef690c34" + "versions.yml:md5,d51d18a97513e370e43f0c891c51dfc4" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:45:51.695689923" + "timestamp": "2024-09-16T08:29:57.524363148" }, "crams_bam": { "content": [ @@ -182,14 +182,14 @@ "crams_versions": { "content": [ [ - "versions.yml:md5,84dab54b9812780df48f5cecef690c34" + "versions.yml:md5,d51d18a97513e370e43f0c891c51dfc4" ] ], "meta": { - "nf-test": "0.8.4", - "nextflow": "23.10.1" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-05-28T15:46:30.185392319" + "timestamp": "2024-09-16T09:16:06.977096207" }, "bam_csi": { "content": [ diff --git a/modules/nf-core/snpeff/snpeff/main.nf b/modules/nf-core/snpeff/snpeff/main.nf deleted file mode 100644 index 28d13826..00000000 --- a/modules/nf-core/snpeff/snpeff/main.nf +++ /dev/null @@ -1,62 +0,0 @@ -process SNPEFF_SNPEFF { - tag "$meta.id" - label 'process_medium' - - conda "${moduleDir}/environment.yml" - container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? 
- 'https://depot.galaxyproject.org/singularity/snpeff:5.1--hdfd78af_2' : - 'biocontainers/snpeff:5.1--hdfd78af_2' }" - - input: - tuple val(meta), path(vcf) - val db - tuple val(meta2), path(cache) - - output: - tuple val(meta), path("*.ann.vcf"), emit: vcf - tuple val(meta), path("*.csv"), emit: report - tuple val(meta), path("*.html"), emit: summary_html - tuple val(meta), path("*.genes.txt"), emit: genes_txt - path "versions.yml" , emit: versions - - when: - task.ext.when == null || task.ext.when - - script: - def args = task.ext.args ?: '' - def avail_mem = 6144 - if (!task.memory) { - log.info '[snpEff] Available memory not known - defaulting to 6GB. Specify process memory requirements to change this.' - } else { - avail_mem = (task.memory.mega*0.8).intValue() - } - def prefix = task.ext.prefix ?: "${meta.id}" - def cache_command = cache ? "-dataDir \${PWD}/${cache}" : "" - """ - snpEff \\ - -Xmx${avail_mem}M \\ - $db \\ - $args \\ - -csvStats ${prefix}.csv \\ - $cache_command \\ - $vcf \\ - > ${prefix}.ann.vcf - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - snpeff: \$(echo \$(snpEff -version 2>&1) | cut -f 2 -d ' ') - END_VERSIONS - """ - - stub: - def prefix = task.ext.prefix ?: "${meta.id}" - """ - touch ${prefix}.ann.vcf - - cat <<-END_VERSIONS > versions.yml - "${task.process}": - snpeff: \$(echo \$(snpEff -version 2>&1) | cut -f 2 -d ' ') - END_VERSIONS - """ - -} diff --git a/modules/nf-core/snpeff/snpeff/meta.yml b/modules/nf-core/snpeff/snpeff/meta.yml deleted file mode 100644 index 7559c3de..00000000 --- a/modules/nf-core/snpeff/snpeff/meta.yml +++ /dev/null @@ -1,60 +0,0 @@ -name: snpeff_snpeff -description: Genetic variant annotation and functional effect prediction toolbox -keywords: - - annotation - - effect prediction - - snpeff - - variant - - vcf -tools: - - snpeff: - description: | - SnpEff is a variant annotation and effect prediction tool. - It annotates and predicts the effects of genetic variants on genes and proteins (such as amino acid changes). - homepage: https://pcingola.github.io/SnpEff/ - documentation: https://pcingola.github.io/SnpEff/se_introduction/ - licence: ["MIT"] -input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. 
[ id:'test', single_end:false ] - - vcf: - type: file - description: | - vcf to annotate - - db: - type: string - description: | - which db to annotate with - - cache: - type: file - description: | - path to snpEff cache (optional) -output: - - vcf: - type: file - description: | - annotated vcf - pattern: "*.ann.vcf" - - report: - type: file - description: snpEff report csv file - pattern: "*.csv" - - summary_html: - type: file - description: snpEff summary statistics in html file - pattern: "*.html" - - genes_txt: - type: file - description: txt (tab separated) file having counts of the number of variants affecting each transcript and gene - pattern: "*.genes.txt" - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" -authors: - - "@maxulysse" -maintainers: - - "@maxulysse" diff --git a/modules/nf-core/snpeff/snpeff/tests/main.nf.test b/modules/nf-core/snpeff/snpeff/tests/main.nf.test deleted file mode 100644 index 803ff02c..00000000 --- a/modules/nf-core/snpeff/snpeff/tests/main.nf.test +++ /dev/null @@ -1,51 +0,0 @@ -nextflow_process { - - name "Test Process SNPEFF_SNPEFF" - script "../main.nf" - process "SNPEFF_SNPEFF" - config "./nextflow.config" - tag "modules" - tag "modules_nfcore" - tag "modules_snpeff" - tag "snpeff" - tag "snpeff/download" - tag "snpeff/snpeff" - - test("test_SNPEFF_SNPEFF") { - - setup { - run("SNPEFF_DOWNLOAD") { - script "../../download/main.nf" - process { - """ - input[0] = Channel.of([[id:params.snpeff_genome + '.' + params.snpeff_cache_version], params.snpeff_genome, params.snpeff_cache_version]) - """ - } - } - } - - when { - process { - """ - input[0] = Channel.of([ - [ id:'test' ], // meta map - file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test.vcf', checkIfExists: true) - ]) - input[1] = params.snpeff_genome + '.' 
+ params.snpeff_cache_version - input[2] = SNPEFF_DOWNLOAD.out.cache - """ - } - } - - then { - assertAll( - { assert process.success }, - { assert path(process.out.report[0][1]).exists() }, - { assert path(process.out.summary_html[0][1]).exists() }, - { assert path(process.out.vcf[0][1]).exists() }, - { assert snapshot(process.out.genes_txt).match("genes_txt") }, - { assert snapshot(process.out.versions).match("versions") } - ) - } - } -} diff --git a/modules/nf-core/snpeff/snpeff/tests/main.nf.test.snap b/modules/nf-core/snpeff/snpeff/tests/main.nf.test.snap deleted file mode 100644 index 0891b844..00000000 --- a/modules/nf-core/snpeff/snpeff/tests/main.nf.test.snap +++ /dev/null @@ -1,31 +0,0 @@ -{ - "versions": { - "content": [ - [ - "versions.yml:md5,25d44a118d558b331d51ec00be0d997c" - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "24.02.0" - }, - "timestamp": "2024-03-18T17:37:18.879477" - }, - "genes_txt": { - "content": [ - [ - [ - { - "id": "test" - }, - "test.genes.txt:md5,130536bf0237d7f3f746d32aaa32840a" - ] - ] - ], - "meta": { - "nf-test": "0.8.4", - "nextflow": "24.02.0" - }, - "timestamp": "2024-03-18T17:37:18.874822" - } -} \ No newline at end of file diff --git a/modules/nf-core/snpeff/snpeff/tests/nextflow.config b/modules/nf-core/snpeff/snpeff/tests/nextflow.config deleted file mode 100644 index d31ebf6b..00000000 --- a/modules/nf-core/snpeff/snpeff/tests/nextflow.config +++ /dev/null @@ -1,4 +0,0 @@ -params { - snpeff_cache_version = "105" - snpeff_genome = "WBcel235" -} diff --git a/modules/nf-core/snpeff/snpeff/tests/tags.yml b/modules/nf-core/snpeff/snpeff/tests/tags.yml deleted file mode 100644 index 427b588d..00000000 --- a/modules/nf-core/snpeff/snpeff/tests/tags.yml +++ /dev/null @@ -1,2 +0,0 @@ -snpeff/snpeff: - - "modules/nf-core/snpeff/snpeff/**" diff --git a/modules/nf-core/somalier/extract/meta.yml b/modules/nf-core/somalier/extract/meta.yml index aabaf5d6..25621667 100644 --- a/modules/nf-core/somalier/extract/meta.yml +++ b/modules/nf-core/somalier/extract/meta.yml @@ -1,5 +1,6 @@ name: "somalier_extract" -description: Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs +description: Somalier can extract informative sites, evaluate relatedness, and perform + quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs keywords: - relatedness - QC @@ -14,65 +15,69 @@ keywords: - family tools: - "somalier": - description: "Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs" + description: "Somalier can extract informative sites, evaluate relatedness, and + perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs" homepage: "https://github.com/brentp/somalier" documentation: "https://github.com/brentp/somalier/blob/master/README.md" tool_dev_url: "https://github.com/brentp/somalier" doi: "10.1186/s13073-020-00761-2" licence: ["MIT"] + identifier: biotools:somalier input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: BAM/CRAM/SAM/BCF/VCF/GVCF or jointly-called VCF file - - input_index: - type: file - description: index file of the input data, e.g., bam.bai, cram.crai - - meta2: - type: map - description: | - Groovy Map containing reference information - e.g. 
[ id:'hg38' ] - - fasta: - type: file - description: The reference fasta file - pattern: "*.{fasta,fna,fas,fa}" - - meta3: - type: map - description: | - Groovy Map containing reference information - e.g. [ id:'hg38' ] - - fai: - type: file - description: FASTA index file - pattern: "*.fai" - - meta4: - type: map - description: | - Groovy Map containing sites information - e.g. [ id:'hg38' ] - - sites: - type: file - description: sites file in VCF format which can be taken from https://github.com/brentp/somalier - pattern: "*.vcf.gz" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: BAM/CRAM/SAM/BCF/VCF/GVCF or jointly-called VCF file + - input_index: + type: file + description: index file of the input data, e.g., bam.bai, cram.crai + - - meta2: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'hg38' ] + - fasta: + type: file + description: The reference fasta file + pattern: "*.{fasta,fna,fas,fa}" + - - meta3: + type: map + description: | + Groovy Map containing reference information + e.g. [ id:'hg38' ] + - fai: + type: file + description: FASTA index file + pattern: "*.fai" + - - meta4: + type: map + description: | + Groovy Map containing sites information + e.g. [ id:'hg38' ] + - sites: + type: file + description: sites file in VCF format which can be taken from https://github.com/brentp/somalier + pattern: "*.vcf.gz" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - extract: - type: file - description: binary output file based on extracted sites - pattern: "*.{somalier}" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.somalier": + type: file + description: binary output file based on extracted sites + pattern: "*.{somalier}" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@ashotmarg" - "@nvnieuwk" diff --git a/modules/nf-core/somalier/relate/meta.yml b/modules/nf-core/somalier/relate/meta.yml index 42638f4f..0da72821 100644 --- a/modules/nf-core/somalier/relate/meta.yml +++ b/modules/nf-core/somalier/relate/meta.yml @@ -1,5 +1,6 @@ name: "somalier_relate" -description: Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs +description: Somalier can extract informative sites, evaluate relatedness, and perform + quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs keywords: - relatedness - QC @@ -14,47 +15,67 @@ keywords: - family tools: - "somalier": - description: "Somalier can extract informative sites, evaluate relatedness, and perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs" + description: "Somalier can extract informative sites, evaluate relatedness, and + perform quality-control on BAM/CRAM/BCF/VCF/GVCF or from jointly-called VCFs" homepage: "https://github.com/brentp/somalier" documentation: "https://github.com/brentp/somalier/blob/master/README.md" tool_dev_url: "https://github.com/brentp/somalier" doi: "10.1186/s13073-020-00761-2" licence: ["MIT"] + identifier: biotools:somalier input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - extract: - type: file - description: extract file(s) from Somalier extract - pattern: "*.somalier" - - ped: - type: file - description: optional path to a ped or fam file indicating the expected relationships among samples - pattern: "*.{ped,fam}" - - sample_groups: - type: file - description: optional path to expected groups of samples such as tumor normal pairs specified as comma-separated groups per line - pattern: "*.{txt,csv}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - extract: + type: file + description: extract file(s) from Somalier extract + pattern: "*.somalier" + - ped: + type: file + description: optional path to a ped or fam file indicating the expected relationships + among samples + pattern: "*.{ped,fam}" + - - sample_groups: + type: file + description: optional path to expected groups of samples such as tumor normal + pairs specified as comma-separated groups per line + pattern: "*.{txt,csv}" output: - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - html: - type: file - description: html file - pattern: "*.html" + - meta: + type: map + description: | + Groovy Map containing sample information + - "*.html": + type: file + description: html file + pattern: "*.html" - pairs_tsv: - type: file - description: tsv file with output stats for pairs of samples - pattern: "*.pairs.tsv" + - meta: + type: map + description: | + Groovy Map containing sample information + - "*.pairs.tsv": + type: file + description: tsv file with output stats for pairs of samples + pattern: "*.pairs.tsv" - samples_tsv: - type: file - description: tsv file with sample-level information - pattern: "*.samples.tsv" + - meta: + type: map + description: | + Groovy Map containing sample information + - "*.samples.tsv": + type: file + description: tsv file with sample-level information + pattern: "*.samples.tsv" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@ashotmarg" - "@nvnieuwk" diff --git a/modules/nf-core/somalier/relate/somalier-relate.diff b/modules/nf-core/somalier/relate/somalier-relate.diff index 918d80b2..571c6446 100644 --- a/modules/nf-core/somalier/relate/somalier-relate.diff +++ b/modules/nf-core/somalier/relate/somalier-relate.diff @@ -32,5 +32,190 @@ Changes in 'somalier/relate/main.nf': 'modules/nf-core/somalier/relate/meta.yml' is unchanged 'modules/nf-core/somalier/relate/tests/tags.yml' is unchanged 'modules/nf-core/somalier/relate/tests/main.nf.test' is unchanged -'modules/nf-core/somalier/relate/tests/main.nf.test.snap' is unchanged +Changes in 'somalier/relate/tests/main.nf.test.snap': +--- modules/nf-core/somalier/relate/tests/main.nf.test.snap ++++ modules/nf-core/somalier/relate/tests/main.nf.test.snap +@@ -30,6 +30,15 @@ + ] + ], + "3": [ ++ [ ++ { ++ "id": "cohort", ++ "single_end": false ++ }, ++ "cohort_somalier.ped:md5,377126dd9cfb8218ec6783fa68d53e67" ++ ] ++ ], ++ "4": [ + "versions.yml:md5,59d805a9f89558414535c136c814bea6" + ], + "html": [ +@@ -48,6 +57,15 @@ + "single_end": false + }, + "cohort.pairs.tsv:md5,54d1e9fca1bf9d747d4254c6fa98edcf" ++ ] ++ ], ++ "ped": [ ++ [ ++ { ++ "id": "cohort", ++ "single_end": false ++ }, ++ "cohort_somalier.ped:md5,377126dd9cfb8218ec6783fa68d53e67" + ] + ], + "samples_tsv": [ +@@ -65,10 +83,10 @@ + } + ], + "meta": { +- "nf-test": "0.8.4", +- "nextflow": "24.04.2" ++ "nf-test": "0.9.1", ++ "nextflow": "24.10.0" + }, +- "timestamp": "2024-07-02T05:29:21.162582556" ++ "timestamp": "2024-11-20T13:00:49.22698226" + }, + "[ delete_me, [] ], [] -stub": { + "content": [ +@@ -101,6 +119,15 @@ + ] + ], + "3": [ ++ [ ++ { ++ "id": "cohort", ++ "single_end": false ++ }, ++ "cohort_somalier.ped:md5,d41d8cd98f00b204e9800998ecf8427e" ++ ] ++ ], ++ "4": [ + "versions.yml:md5,59d805a9f89558414535c136c814bea6" + ], + "html": [ +@@ -119,6 +146,15 @@ + "single_end": false + }, +
"cohort.pairs.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" ++ ] ++ ], ++ "ped": [ ++ [ ++ { ++ "id": "cohort", ++ "single_end": false ++ }, ++ "cohort_somalier.ped:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "samples_tsv": [ +@@ -136,10 +172,10 @@ + } + ], + "meta": { +- "nf-test": "0.8.4", +- "nextflow": "24.04.2" ++ "nf-test": "0.9.1", ++ "nextflow": "24.10.0" + }, +- "timestamp": "2024-07-02T05:29:43.887124223" ++ "timestamp": "2024-11-20T13:01:06.098709152" + }, + "[ delete_me, ped ], groups -stub": { + "content": [ +@@ -172,6 +208,15 @@ + ] + ], + "3": [ ++ [ ++ { ++ "id": "cohort", ++ "single_end": false ++ }, ++ "cohort_somalier.ped:md5,d41d8cd98f00b204e9800998ecf8427e" ++ ] ++ ], ++ "4": [ + "versions.yml:md5,59d805a9f89558414535c136c814bea6" + ], + "html": [ +@@ -190,6 +235,15 @@ + "single_end": false + }, + "cohort.pairs.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" ++ ] ++ ], ++ "ped": [ ++ [ ++ { ++ "id": "cohort", ++ "single_end": false ++ }, ++ "cohort_somalier.ped:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "samples_tsv": [ +@@ -207,10 +261,10 @@ + } + ], + "meta": { +- "nf-test": "0.8.4", +- "nextflow": "24.04.2" ++ "nf-test": "0.9.1", ++ "nextflow": "24.10.0" + }, +- "timestamp": "2024-07-02T05:29:55.034913513" ++ "timestamp": "2024-11-20T13:01:14.932484293" + }, + "[ delete_me, ped ], groups": { + "content": [ +@@ -243,6 +297,15 @@ + ] + ], + "3": [ ++ [ ++ { ++ "id": "cohort", ++ "single_end": false ++ }, ++ "cohort_somalier.ped:md5,1eec67a4157fff88730161a080430ba5" ++ ] ++ ], ++ "4": [ + "versions.yml:md5,59d805a9f89558414535c136c814bea6" + ], + "html": [ +@@ -261,6 +324,15 @@ + "single_end": false + }, + "cohort.pairs.tsv:md5,8655714f1e5359329188e9f501168131" ++ ] ++ ], ++ "ped": [ ++ [ ++ { ++ "id": "cohort", ++ "single_end": false ++ }, ++ "cohort_somalier.ped:md5,1eec67a4157fff88730161a080430ba5" + ] + ], + "samples_tsv": [ +@@ -278,9 +350,9 @@ + } + ], + "meta": { +- "nf-test": "0.8.4", +- "nextflow": "24.04.2" ++ "nf-test": "0.9.1", ++ "nextflow": "24.10.0" + }, +- "timestamp": "2024-07-02T05:29:32.451456985" ++ "timestamp": "2024-11-20T13:00:57.673492011" + } + } ************************************************************ diff --git a/modules/nf-core/somalier/relate/tests/main.nf.test.snap b/modules/nf-core/somalier/relate/tests/main.nf.test.snap index 54a73033..fe3ff467 100644 --- a/modules/nf-core/somalier/relate/tests/main.nf.test.snap +++ b/modules/nf-core/somalier/relate/tests/main.nf.test.snap @@ -30,6 +30,15 @@ ] ], "3": [ + [ + { + "id": "cohort", + "single_end": false + }, + "cohort_somalier.ped:md5,377126dd9cfb8218ec6783fa68d53e67" + ] + ], + "4": [ "versions.yml:md5,59d805a9f89558414535c136c814bea6" ], "html": [ @@ -50,6 +59,15 @@ "cohort.pairs.tsv:md5,54d1e9fca1bf9d747d4254c6fa98edcf" ] ], + "ped": [ + [ + { + "id": "cohort", + "single_end": false + }, + "cohort_somalier.ped:md5,377126dd9cfb8218ec6783fa68d53e67" + ] + ], "samples_tsv": [ [ { @@ -65,10 +83,10 @@ } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-07-02T05:29:21.162582556" + "timestamp": "2024-11-20T13:00:49.22698226" }, "[ delete_me, [] ], [] -stub": { "content": [ @@ -101,6 +119,15 @@ ] ], "3": [ + [ + { + "id": "cohort", + "single_end": false + }, + "cohort_somalier.ped:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ "versions.yml:md5,59d805a9f89558414535c136c814bea6" ], "html": [ @@ -121,6 +148,15 @@ "cohort.pairs.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], + "ped": [ + [ + { + "id": 
"cohort", + "single_end": false + }, + "cohort_somalier.ped:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], "samples_tsv": [ [ { @@ -136,10 +172,10 @@ } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-07-02T05:29:43.887124223" + "timestamp": "2024-11-20T13:01:06.098709152" }, "[ delete_me, ped ], groups -stub": { "content": [ @@ -172,6 +208,15 @@ ] ], "3": [ + [ + { + "id": "cohort", + "single_end": false + }, + "cohort_somalier.ped:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], + "4": [ "versions.yml:md5,59d805a9f89558414535c136c814bea6" ], "html": [ @@ -192,6 +237,15 @@ "cohort.pairs.tsv:md5,d41d8cd98f00b204e9800998ecf8427e" ] ], + "ped": [ + [ + { + "id": "cohort", + "single_end": false + }, + "cohort_somalier.ped:md5,d41d8cd98f00b204e9800998ecf8427e" + ] + ], "samples_tsv": [ [ { @@ -207,10 +261,10 @@ } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-07-02T05:29:55.034913513" + "timestamp": "2024-11-20T13:01:14.932484293" }, "[ delete_me, ped ], groups": { "content": [ @@ -243,6 +297,15 @@ ] ], "3": [ + [ + { + "id": "cohort", + "single_end": false + }, + "cohort_somalier.ped:md5,1eec67a4157fff88730161a080430ba5" + ] + ], + "4": [ "versions.yml:md5,59d805a9f89558414535c136c814bea6" ], "html": [ @@ -263,6 +326,15 @@ "cohort.pairs.tsv:md5,8655714f1e5359329188e9f501168131" ] ], + "ped": [ + [ + { + "id": "cohort", + "single_end": false + }, + "cohort_somalier.ped:md5,1eec67a4157fff88730161a080430ba5" + ] + ], "samples_tsv": [ [ { @@ -278,9 +350,9 @@ } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-07-02T05:29:32.451456985" + "timestamp": "2024-11-20T13:00:57.673492011" } } \ No newline at end of file diff --git a/modules/nf-core/tabix/bgzip/environment.yml b/modules/nf-core/tabix/bgzip/environment.yml index c863e920..017c259d 100644 --- a/modules/nf-core/tabix/bgzip/environment.yml +++ b/modules/nf-core/tabix/bgzip/environment.yml @@ -1,6 +1,7 @@ channels: - conda-forge - bioconda + dependencies: - - bioconda::tabix=1.11 - bioconda::htslib=1.20 + - bioconda::tabix=1.11 diff --git a/modules/nf-core/tabix/bgzip/meta.yml b/modules/nf-core/tabix/bgzip/meta.yml index 621d49ea..131e92cf 100644 --- a/modules/nf-core/tabix/bgzip/meta.yml +++ b/modules/nf-core/tabix/bgzip/meta.yml @@ -13,33 +13,42 @@ tools: documentation: http://www.htslib.org/doc/bgzip.html doi: 10.1093/bioinformatics/btp352 licence: ["MIT"] + identifier: biotools:tabix input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - input: - type: file - description: file to compress or to decompress + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: file to compress or to decompress output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - output: - type: file - description: Output compressed/decompressed file - pattern: "*." + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${output}: + type: file + description: Output compressed/decompressed file + pattern: "*." 
- gzi: - type: file - description: Optional gzip index file for compressed inputs - pattern: "*.gzi" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - ${output}.gzi: + type: file + description: Optional gzip index file for compressed inputs + pattern: "*.gzi" - versions: - type: file - description: File containing software versions - pattern: "versions.yml" + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@joseespinosa" - "@drpatelh" diff --git a/modules/nf-core/tabix/bgziptabix/environment.yml b/modules/nf-core/tabix/bgziptabix/environment.yml index c863e920..017c259d 100644 --- a/modules/nf-core/tabix/bgziptabix/environment.yml +++ b/modules/nf-core/tabix/bgziptabix/environment.yml @@ -1,6 +1,7 @@ channels: - conda-forge - bioconda + dependencies: - - bioconda::tabix=1.11 - bioconda::htslib=1.20 + - bioconda::tabix=1.11 diff --git a/modules/nf-core/tabix/bgziptabix/main.nf b/modules/nf-core/tabix/bgziptabix/main.nf index 05041f49..22f37a77 100644 --- a/modules/nf-core/tabix/bgziptabix/main.nf +++ b/modules/nf-core/tabix/bgziptabix/main.nf @@ -34,10 +34,11 @@ process TABIX_BGZIPTABIX { stub: def prefix = task.ext.prefix ?: "${meta.id}" + def args2 = task.ext.args2 ?: '' + def index = args2.contains("-C ") || args2.contains("--csi") ? "csi" : "tbi" """ echo "" | gzip > ${prefix}.${input.getExtension()}.gz - touch ${prefix}.${input.getExtension()}.gz.tbi - touch ${prefix}.${input.getExtension()}.gz.csi + touch ${prefix}.${input.getExtension()}.gz.${index} cat <<-END_VERSIONS > versions.yml "${task.process}": diff --git a/modules/nf-core/tabix/bgziptabix/meta.yml b/modules/nf-core/tabix/bgziptabix/meta.yml index 438aba4d..806fbc12 100644 --- a/modules/nf-core/tabix/bgziptabix/meta.yml +++ b/modules/nf-core/tabix/bgziptabix/meta.yml @@ -13,38 +13,50 @@ tools: documentation: https://www.htslib.org/doc/tabix.1.html doi: 10.1093/bioinformatics/btq671 licence: ["MIT"] + identifier: biotools:tabix input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - tab: - type: file - description: TAB-delimited genome position file - pattern: "*.{bed,gff,sam,vcf}" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - input: + type: file + description: Sorted tab-delimited genome file output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - gz: - type: file - description: Output compressed file - pattern: "*.{gz}" - - tbi: - type: file - description: tabix index file - pattern: "*.{gz.tbi}" - - csi: - type: file - description: tabix alternate index file - pattern: "*.{gz.csi}" + - gz_tbi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.gz": + type: file + description: bgzipped tab-delimited genome file + pattern: "*.gz" + - "*.tbi": + type: file + description: tabix index file + pattern: "*.tbi" + - gz_csi: + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
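Note on the stub change above: the stub now derives the index extension from `task.ext.args2` instead of blindly touching both a `.tbi` and a `.csi` file, so stub runs mirror what a real run would emit. A minimal config sketch of how a pipeline would opt into a CSI index (the selector is illustrative; `--csi` is the long form of tabix's `-C` flag that the new logic looks for):

    process {
        withName: 'TABIX_BGZIPTABIX' {
            // picked up by the stub's args2.contains("--csi") check,
            // so only ${prefix}.*.gz.csi is created
            ext.args2 = '--csi'
        }
    }
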
diff --git a/modules/nf-core/tabix/bgziptabix/meta.yml b/modules/nf-core/tabix/bgziptabix/meta.yml
index 438aba4d..806fbc12 100644
--- a/modules/nf-core/tabix/bgziptabix/meta.yml
+++ b/modules/nf-core/tabix/bgziptabix/meta.yml
@@ -13,38 +13,50 @@ tools:
       documentation: https://www.htslib.org/doc/tabix.1.html
       doi: 10.1093/bioinformatics/btq671
       licence: ["MIT"]
+  identifier: biotools:tabix
 input:
-  - meta:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g. [ id:'test', single_end:false ]
-  - tab:
-      type: file
-      description: TAB-delimited genome position file
-      pattern: "*.{bed,gff,sam,vcf}"
+  - - meta:
+        type: map
+        description: |
+          Groovy Map containing sample information
+          e.g. [ id:'test', single_end:false ]
+    - input:
+        type: file
+        description: Sorted tab-delimited genome file
 output:
-  - meta:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g. [ id:'test', single_end:false ]
-  - gz:
-      type: file
-      description: Output compressed file
-      pattern: "*.{gz}"
-  - tbi:
-      type: file
-      description: tabix index file
-      pattern: "*.{gz.tbi}"
-  - csi:
-      type: file
-      description: tabix alternate index file
-      pattern: "*.{gz.csi}"
+  - gz_tbi:
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+      - "*.gz":
+          type: file
+          description: bgzipped tab-delimited genome file
+          pattern: "*.gz"
+      - "*.tbi":
+          type: file
+          description: tabix index file
+          pattern: "*.tbi"
+  - gz_csi:
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+      - "*.gz":
+          type: file
+          description: bgzipped tab-delimited genome file
+          pattern: "*.gz"
+      - "*.csi":
+          type: file
+          description: csi index file
+          pattern: "*.csi"
   - versions:
-      type: file
-      description: File containing software versions
-      pattern: "versions.yml"
+      - versions.yml:
+          type: file
+          description: File containing software versions
+          pattern: "versions.yml"
 authors:
   - "@maxulysse"
   - "@DLBPointon"
diff --git a/modules/nf-core/tabix/bgziptabix/tests/main.nf.test b/modules/nf-core/tabix/bgziptabix/tests/main.nf.test
index 1a84d74f..4d4130dc 100644
--- a/modules/nf-core/tabix/bgziptabix/tests/main.nf.test
+++ b/modules/nf-core/tabix/bgziptabix/tests/main.nf.test
@@ -91,4 +91,33 @@ nextflow_process {
 
     }
 
+    test("sarscov2_bed_tbi_stub") {
+        config "./tabix_tbi.config"
+
+        options "-stub"
+
+        when {
+            process {
+                """
+                input[0] = [
+                    [ id:'test' ],
+                    [ file(params.modules_testdata_base_path + 'genomics/sarscov2/genome/bed/test.bed', checkIfExists: true) ]
+                ]
+                """
+            }
+        }
+
+        then {
+            assertAll (
+                { assert process.success },
+                { assert snapshot(process.out).match() },
+                { assert snapshot(
+                    file(process.out.gz_tbi[0][1]).name
+                    ).match("tbi_stub")
+                }
+            )
+        }
+
+    }
+
 }
diff --git a/modules/nf-core/tabix/bgziptabix/tests/main.nf.test.snap b/modules/nf-core/tabix/bgziptabix/tests/main.nf.test.snap
index c166ea72..fb87799b 100644
--- a/modules/nf-core/tabix/bgziptabix/tests/main.nf.test.snap
+++ b/modules/nf-core/tabix/bgziptabix/tests/main.nf.test.snap
@@ -91,6 +91,47 @@
         },
         "timestamp": "2024-02-19T14:51:00.548801"
     },
+    "sarscov2_bed_tbi_stub": {
+        "content": [
+            {
+                "0": [
+                    [
+                        {
+                            "id": "test"
+                        },
+                        "test.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940",
+                        "test.bed.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e"
+                    ]
+                ],
+                "1": [
+
+                ],
+                "2": [
+                    "versions.yml:md5,736e7c3b16a3ac525253e5b5f5d8fdfa"
+                ],
+                "gz_csi": [
+
+                ],
+                "gz_tbi": [
+                    [
+                        {
+                            "id": "test"
+                        },
+                        "test.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940",
+                        "test.bed.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e"
+                    ]
+                ],
+                "versions": [
+                    "versions.yml:md5,736e7c3b16a3ac525253e5b5f5d8fdfa"
+                ]
+            }
+        ],
+        "meta": {
+            "nf-test": "0.9.0",
+            "nextflow": "24.04.4"
+        },
+        "timestamp": "2024-09-25T14:45:18.533169949"
+    },
     "csi_stub": {
         "content": [
             "test.bed.gz"
@@ -101,6 +142,16 @@
         },
         "timestamp": "2024-02-19T14:51:09.218454"
     },
+    "tbi_stub": {
+        "content": [
+            "test.bed.gz"
+        ],
+        "meta": {
+            "nf-test": "0.9.0",
+            "nextflow": "24.04.4"
+        },
+        "timestamp": "2024-09-25T14:45:18.550930179"
+    },
     "tbi_test": {
         "content": [
             "tbi_test.bed.gz"
@@ -115,13 +166,7 @@
         "content": [
             {
                 "0": [
-                    [
-                        {
-                            "id": "test"
-                        },
-                        "test.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940",
-                        "test.bed.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e"
-                    ]
+
                 ],
                 "1": [
                     [
@@ -145,13 +190,7 @@
                     ]
                 ],
                 "gz_tbi": [
-                    [
-                        {
-                            "id": "test"
-                        },
-                        "test.bed.gz:md5,68b329da9893e34099c7d8ad5cb9c940",
-                        "test.bed.gz.tbi:md5,d41d8cd98f00b204e9800998ecf8427e"
-                    ]
+
                 ],
                 "versions": [
                     "versions.yml:md5,736e7c3b16a3ac525253e5b5f5d8fdfa"
@@ -159,9 +198,9 @@
             }
         ],
         "meta": {
-            "nf-test": "0.8.4",
-            "nextflow": "24.04.2"
+            "nf-test": "0.9.0",
+            "nextflow": "24.04.4"
         },
-        "timestamp": "2024-07-19T11:29:45.105209991"
+        "timestamp": "2024-09-25T14:44:19.786135972"
     }
 }
\ No newline at end of file
diff --git a/modules/nf-core/tabix/tabix/environment.yml b/modules/nf-core/tabix/tabix/environment.yml
index 4d1f9dd4..017c259d 100644
--- a/modules/nf-core/tabix/tabix/environment.yml
+++ b/modules/nf-core/tabix/tabix/environment.yml
@@ -1,9 +1,6 @@
-name: tabix_tabix
-
 channels:
   - conda-forge
   - bioconda
-  - defaults
 dependencies:
   - bioconda::htslib=1.20
diff --git a/modules/nf-core/tabix/tabix/main.nf b/modules/nf-core/tabix/tabix/main.nf
index 13acd670..fd09383b 100644
--- a/modules/nf-core/tabix/tabix/main.nf
+++ b/modules/nf-core/tabix/tabix/main.nf
@@ -23,6 +23,7 @@ process TABIX_TABIX {
     """
     tabix \\
         --threads $task.cpus \\
+        --force \\
         $args \\
         $tab
 
diff --git a/modules/nf-core/tabix/tabix/meta.yml b/modules/nf-core/tabix/tabix/meta.yml
index ae5b4f43..7864832d 100644
--- a/modules/nf-core/tabix/tabix/meta.yml
+++ b/modules/nf-core/tabix/tabix/meta.yml
@@ -11,34 +11,43 @@ tools:
       documentation: https://www.htslib.org/doc/tabix.1.html
       doi: 10.1093/bioinformatics/btq671
       licence: ["MIT"]
+  identifier: biotools:tabix
 input:
-  - meta:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g. [ id:'test', single_end:false ]
-  - tab:
-      type: file
-      description: TAB-delimited genome position file compressed with bgzip
-      pattern: "*.{bed.gz,gff.gz,sam.gz,vcf.gz}"
+  - - meta:
+        type: map
+        description: |
+          Groovy Map containing sample information
+          e.g. [ id:'test', single_end:false ]
+    - tab:
+        type: file
+        description: TAB-delimited genome position file compressed with bgzip
+        pattern: "*.{bed.gz,gff.gz,sam.gz,vcf.gz}"
 output:
-  - meta:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g. [ id:'test', single_end:false ]
   - tbi:
-      type: file
-      description: tabix index file
-      pattern: "*.{tbi}"
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+      - "*.tbi":
+          type: file
+          description: tabix index file
+          pattern: "*.{tbi}"
   - csi:
-      type: file
-      description: coordinate sorted index file
-      pattern: "*.{csi}"
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+      - "*.csi":
+          type: file
+          description: coordinate sorted index file
+          pattern: "*.{csi}"
   - versions:
-      type: file
-      description: File containing software versions
-      pattern: "versions.yml"
+      - versions.yml:
+          type: file
+          description: File containing software versions
+          pattern: "versions.yml"
 authors:
   - "@joseespinosa"
   - "@drpatelh"
diff --git a/modules/nf-core/tabix/tabix/tabix-tabix.diff b/modules/nf-core/tabix/tabix/tabix-tabix.diff
new file mode 100644
index 00000000..135c80ce
--- /dev/null
+++ b/modules/nf-core/tabix/tabix/tabix-tabix.diff
@@ -0,0 +1,23 @@
+Changes in module 'nf-core/tabix/tabix'
+'modules/nf-core/tabix/tabix/environment.yml' is unchanged
+'modules/nf-core/tabix/tabix/meta.yml' is unchanged
+Changes in 'tabix/tabix/main.nf':
+--- modules/nf-core/tabix/tabix/main.nf
++++ modules/nf-core/tabix/tabix/main.nf
+@@ -23,6 +23,7 @@
+     """
+     tabix \\
+         --threads $task.cpus \\
++        --force \\
+         $args \\
+         $tab
+
+
+'modules/nf-core/tabix/tabix/tests/main.nf.test.snap' is unchanged
+'modules/nf-core/tabix/tabix/tests/tabix_bed.config' is unchanged
+'modules/nf-core/tabix/tabix/tests/tags.yml' is unchanged
+'modules/nf-core/tabix/tabix/tests/tabix_gff.config' is unchanged
+'modules/nf-core/tabix/tabix/tests/tabix_vcf_tbi.config' is unchanged
+'modules/nf-core/tabix/tabix/tests/main.nf.test' is unchanged
+'modules/nf-core/tabix/tabix/tests/tabix_vcf_csi.config' is unchanged
+************************************************************
diff --git a/modules/nf-core/untar/environment.yml b/modules/nf-core/untar/environment.yml
index 4f498244..c7794856 100644
--- a/modules/nf-core/untar/environment.yml
+++ b/modules/nf-core/untar/environment.yml
@@ -1,8 +1,6 @@
-name: untar
 channels:
   - conda-forge
   - bioconda
-  - defaults
 dependencies:
   - conda-forge::grep=3.11
   - conda-forge::sed=4.8
diff --git a/modules/nf-core/untar/meta.yml b/modules/nf-core/untar/meta.yml
index a9a2110f..290346b3 100644
--- a/modules/nf-core/untar/meta.yml
+++ b/modules/nf-core/untar/meta.yml
@@ -10,30 +10,33 @@ tools:
       Extract tar.gz files.
     documentation: https://www.gnu.org/software/tar/manual/
     licence: ["GPL-3.0-or-later"]
+  identifier: ""
 input:
-  - meta:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g. [ id:'test', single_end:false ]
-  - archive:
-      type: file
-      description: File to be untar
-      pattern: "*.{tar}.{gz}"
+  - - meta:
+        type: map
+        description: |
+          Groovy Map containing sample information
+          e.g. [ id:'test', single_end:false ]
+    - archive:
+        type: file
+        description: File to be untar
+        pattern: "*.{tar}.{gz}"
 output:
-  - meta:
-      type: map
-      description: |
-        Groovy Map containing sample information
-        e.g. [ id:'test', single_end:false ]
   - untar:
-      type: directory
-      description: Directory containing contents of archive
-      pattern: "*/"
+      - meta:
+          type: map
+          description: |
+            Groovy Map containing sample information
+            e.g. [ id:'test', single_end:false ]
+      - $prefix:
+          type: directory
+          description: Directory containing contents of archive
+          pattern: "*/"
   - versions:
-      type: file
-      description: File containing software versions
-      pattern: "versions.yml"
+      - versions.yml:
+          type: file
+          description: File containing software versions
+          pattern: "versions.yml"
 authors:
   - "@joseespinosa"
   - "@drpatelh"
diff --git a/modules/nf-core/vardictjava/main.nf b/modules/nf-core/vardictjava/main.nf
index 6329391c..a2c7666b 100644
--- a/modules/nf-core/vardictjava/main.nf
+++ b/modules/nf-core/vardictjava/main.nf
@@ -14,7 +14,7 @@ process VARDICTJAVA {
 
     output:
     tuple val(meta), path("*.vcf.gz"), emit: vcf
-    path "versions.yml" , emit: versions
+    path "versions.yml"              , emit: versions
 
     when:
     task.ext.when == null || task.ext.when
@@ -25,9 +25,12 @@
     def args3 = task.ext.args3 ?: ''
     def prefix = task.ext.prefix ?: "${meta.id}"
 
+    // Don't run test scripts when -fisher has been used by vardictjava
+    def run_test = !args.contains("-fisher")
+
     def somatic = bams instanceof List && bams.size() == 2 ? true : false
     def input = somatic ? "-b \"${bams[0]}|${bams[1]}\"" : "-b ${bams}"
-    def filter = somatic ? "testsomatic.R" : "teststrandbias.R"
+    def test = run_test ? somatic ? "| testsomatic.R" : "| teststrandbias.R" : ""
    def convert_to_vcf = somatic ? "var2vcf_paired.pl" : "var2vcf_valid.pl"
    """
    export JAVA_OPTS='"-Xms${task.memory.toMega()/4}m" "-Xmx${task.memory.toGiga()}g" "-Dsamjdk.reference_fasta=${fasta}"'
@@ -37,7 +40,7 @@
         -th ${task.cpus} \\
         -G ${fasta} \\
         ${bed} \\
-        | ${filter} \\
+        ${test} \\
         | ${convert_to_vcf} \\
             ${args2} \\
         | bgzip ${args3} --threads ${task.cpus} > ${prefix}.vcf.gz
@@ -50,9 +53,6 @@
     """
 
     stub:
-    def args = task.ext.args ?: '-c 1 -S 2 -E 3'
-    def args2 = task.ext.args2 ?: ''
-    def args3 = task.ext.args3 ?: ''
     def prefix = task.ext.prefix ?: "${meta.id}"
 
     """
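For clarity, a sketch of the single-sample command the rewritten script block assembles (the `vardict-java` invocation line sits above the hunk shown, so its exact shape here is an assumption; paths and thread counts are illustrative). Without `-fisher` in `ext.args`, `${test}` expands to the piped R step; with `-fisher` it expands to an empty string and the R stage disappears, since VarDict then performs the statistical test itself:

    # default args ('-c 1 -S 2 -E 3'): run_test == true
    vardict-java -c 1 -S 2 -E 3 -th 4 -G genome.fasta -b test.bam regions.bed \
        | teststrandbias.R \
        | var2vcf_valid.pl \
        | bgzip --threads 4 > test.vcf.gz

    # ext.args = '-c 1 -S 2 -E 3 -fisher': run_test == false, "| teststrandbias.R" is dropped
    vardict-java -c 1 -S 2 -E 3 -fisher -th 4 -G genome.fasta -b test.bam regions.bed \
        | var2vcf_valid.pl \
        | bgzip --threads 4 > test.vcf.gz
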
+ pattern: "*.bam" + - bais: + type: file + description: Index/indices of the BAM file(s) + pattern: "*.bai" + - bed: + type: file + description: BED with the regions of interest + pattern: "*.bed" + - - meta2: + type: map + description: | + Groovy Map containing fasta information + e.g. [ id:'test', single_end:false ] + - fasta: + type: file + description: FASTA of the reference genome + pattern: "*.{fa,fasta}" + - - meta3: + type: map + description: | + Groovy Map containing fasta information + e.g. [ id:'test', single_end:false ] + - fasta_fai: + type: file + description: The index of the FASTA of the reference genome + pattern: "*.fai" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - vcf: - type: file - description: VCF file output - pattern: "*.vcf.gz" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.vcf.gz": + type: file + description: VCF file output + pattern: "*.vcf.gz" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@nvnieuwk" maintainers: diff --git a/modules/nf-core/vardictjava/tests/main.nf.test b/modules/nf-core/vardictjava/tests/main.nf.test index 8e5fec10..31e1058c 100644 --- a/modules/nf-core/vardictjava/tests/main.nf.test +++ b/modules/nf-core/vardictjava/tests/main.nf.test @@ -10,9 +10,6 @@ nextflow_process { test("homo_sapiens - [bam, bai, bed] - fasta - fai") { when { - params { - outdir = $outputDir - } process { """ input[0] = Channel.value([ @@ -47,9 +44,6 @@ nextflow_process { test("homo_sapiens - [[bam, bam], [bai, bai], bed] - fasta - fai") { when { - params { - outdir = $outputDir - } process { """ input[0] = Channel.value([ @@ -87,4 +81,75 @@ nextflow_process { } + test("homo_sapiens - [bam, bai, bed] - fasta - fai - fisher") { + + config "./nextflow.config" + when { + process { + """ + input[0] = Channel.value([ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true) + ]) + input[1] = [ + [id:"ref"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) + ] + input[2] = [ + [id:"ref"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + + + } + + } + + test("homo_sapiens - [bam, bai, bed] - fasta - fai - stub") { + + options "-stub" + + when { + process { + """ + input[0] = Channel.value([ + [ id:'test' ], // meta map + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/illumina/bam/test.paired_end.sorted.bam.bai', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.bed', checkIfExists: true) + ]) + input[1] = [ + [id:"ref"], + 
file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta', checkIfExists: true) + ] + input[2] = [ + [id:"ref"], + file(params.modules_testdata_base_path + 'genomics/homo_sapiens/genome/genome.fasta.fai', checkIfExists: true) + ] + """ + } + } + + then { + assertAll( + { assert process.success }, + { assert snapshot(process.out).match() } + ) + + + } + + } + } diff --git a/modules/nf-core/vardictjava/tests/main.nf.test.snap b/modules/nf-core/vardictjava/tests/main.nf.test.snap index c32a68b7..35674ed1 100644 --- a/modules/nf-core/vardictjava/tests/main.nf.test.snap +++ b/modules/nf-core/vardictjava/tests/main.nf.test.snap @@ -27,10 +27,10 @@ } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-07-04T19:08:38.328190023" + "timestamp": "2024-10-07T16:05:15.117453312" }, "homo_sapiens - [[bam, bam], [bai, bai], bed] - fasta - fai": { "content": [ @@ -60,9 +60,75 @@ } ], "meta": { - "nf-test": "0.8.4", - "nextflow": "24.04.2" + "nf-test": "0.9.0", + "nextflow": "24.04.4" }, - "timestamp": "2024-07-04T19:08:54.416661915" + "timestamp": "2024-10-07T16:05:26.932438089" + }, + "homo_sapiens - [bam, bai, bed] - fasta - fai - fisher": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.vcf.gz:md5,e8411ecae49b4f6afa6ea0b681ea506e" + ] + ], + "1": [ + "versions.yml:md5,6bf7aa0cbaac4a6e2acab2c475ec2389" + ], + "vcf": [ + [ + { + "id": "test" + }, + "test.vcf.gz:md5,e8411ecae49b4f6afa6ea0b681ea506e" + ] + ], + "versions": [ + "versions.yml:md5,6bf7aa0cbaac4a6e2acab2c475ec2389" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-07T16:05:38.456816851" + }, + "homo_sapiens - [bam, bai, bed] - fasta - fai - stub": { + "content": [ + { + "0": [ + [ + { + "id": "test" + }, + "test.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "1": [ + "versions.yml:md5,6bf7aa0cbaac4a6e2acab2c475ec2389" + ], + "vcf": [ + [ + { + "id": "test" + }, + "test.vcf.gz:md5,68b329da9893e34099c7d8ad5cb9c940" + ] + ], + "versions": [ + "versions.yml:md5,6bf7aa0cbaac4a6e2acab2c475ec2389" + ] + } + ], + "meta": { + "nf-test": "0.9.0", + "nextflow": "24.04.4" + }, + "timestamp": "2024-10-07T16:05:48.440804849" } } \ No newline at end of file diff --git a/modules/nf-core/vardictjava/tests/nextflow.config b/modules/nf-core/vardictjava/tests/nextflow.config new file mode 100644 index 00000000..c6e8571b --- /dev/null +++ b/modules/nf-core/vardictjava/tests/nextflow.config @@ -0,0 +1,3 @@ +process { + ext.args = "-c 1 -S 2 -E 3 -fisher" +} \ No newline at end of file diff --git a/modules/nf-core/vcf2db/environment.yml b/modules/nf-core/vcf2db/environment.yml index 58c477f9..01fc2793 100644 --- a/modules/nf-core/vcf2db/environment.yml +++ b/modules/nf-core/vcf2db/environment.yml @@ -1,5 +1,16 @@ +--- +# yaml-language-server: $schema=https://raw.githubusercontent.com/nf-core/modules/master/modules/environment-schema.json channels: - conda-forge - bioconda dependencies: + # renovate: datasource=conda depName=conda-forge/python + - conda-forge::python=2.7 + # renovate: datasource=conda depName=conda-forge/python-snappy + - conda-forge::python-snappy=0.5.4 + # renovate: datasource=conda depName=conda-forge/snappy + - conda-forge::snappy=1.1.8 + # renovate: datasource=conda depName=bioconda/cyvcf2 + - bioconda::cyvcf2=0.20.9 + # renovate: datasource=conda depName=bioconda/vcf2db - bioconda::vcf2db=2020.02.24 diff --git a/modules/nf-core/vcf2db/main.nf 
b/modules/nf-core/vcf2db/main.nf index 56c26bb6..a23204b2 100644 --- a/modules/nf-core/vcf2db/main.nf +++ b/modules/nf-core/vcf2db/main.nf @@ -1,12 +1,12 @@ process VCF2DB { tag "$meta.id" - label 'process_medium' + label 'process_single' // WARN: Version information not provided by tool on CLI. Please update version string below when bumping container versions. conda "${moduleDir}/environment.yml" container "${ workflow.containerEngine == 'singularity' && !task.ext.singularity_pull_docker_container ? - 'https://depot.galaxyproject.org/singularity/vcf2db:2020.02.24--pl5321hdfd78af_3': - 'biocontainers/vcf2db:2020.02.24--pl5321hdfd78af_3' }" + 'https://community-cr-prod.seqera.io/docker/registry/v2/blobs/sha256/30/3013992b36b50c203acfd01b000d37f3753aee640238f6dd39d5e47f58e54d98/data': + 'community.wave.seqera.io/library/python_python-snappy_snappy_cyvcf2_vcf2db:9c1d7f361187f21a' }" input: tuple val(meta), path(vcf), path(ped) diff --git a/modules/nf-core/vcf2db/meta.yml b/modules/nf-core/vcf2db/meta.yml index b6529a6a..48795c21 100644 --- a/modules/nf-core/vcf2db/meta.yml +++ b/modules/nf-core/vcf2db/meta.yml @@ -11,34 +11,37 @@ tools: documentation: "https://github.com/quinlan-lab/vcf2db" tool_dev_url: "https://github.com/quinlan-lab/vcf2db" licence: ["MIT"] + identifier: "" input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: VCF file - pattern: "*.vcf.gz" - - ped: - type: file - description: PED file - pattern: "*.ped" + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: VCF file + pattern: "*.vcf.gz" + - ped: + type: file + description: PED file + pattern: "*.ped" output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - db: - type: file - description: Gemini-compatible database file - pattern: "*.db" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - "*.db": + type: file + description: Gemini-compatible database file + pattern: "*.db" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@nvnieuwk" maintainers: diff --git a/modules/nf-core/vcf2db/vcf2db.diff b/modules/nf-core/vcf2db/vcf2db.diff new file mode 100644 index 00000000..45757135 --- /dev/null +++ b/modules/nf-core/vcf2db/vcf2db.diff @@ -0,0 +1,19 @@ +Changes in module 'nf-core/vcf2db' +'modules/nf-core/vcf2db/environment.yml' is unchanged +'modules/nf-core/vcf2db/meta.yml' is unchanged +Changes in 'vcf2db/main.nf': +--- modules/nf-core/vcf2db/main.nf ++++ modules/nf-core/vcf2db/main.nf +@@ -1,6 +1,6 @@ + process VCF2DB { + tag "$meta.id" +- label 'process_medium' ++ label 'process_single' + + // WARN: Version information not provided by tool on CLI. Please update version string below when bumping container versions. 
+ conda "${moduleDir}/environment.yml" + +'modules/nf-core/vcf2db/tests/main.nf.test.snap' is unchanged +'modules/nf-core/vcf2db/tests/tags.yml' is unchanged +'modules/nf-core/vcf2db/tests/main.nf.test' is unchanged +************************************************************ diff --git a/modules/nf-core/vcfanno/meta.yml b/modules/nf-core/vcfanno/meta.yml index 89c781ad..18d27127 100644 --- a/modules/nf-core/vcfanno/meta.yml +++ b/modules/nf-core/vcfanno/meta.yml @@ -1,5 +1,6 @@ name: vcfanno -description: quickly annotate your VCF with any number of INFO fields from any number of VCFs or BED files +description: quickly annotate your VCF with any number of INFO fields from any number + of VCFs or BED files keywords: - vcf - bed @@ -14,48 +15,53 @@ tools: tool_dev_url: https://github.com/brentp/vcfanno doi: "10.1186/s13059-016-0973-5" license: ["MIT"] + identifier: biotools:vcfanno input: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - vcf: - type: file - description: query VCF file - pattern: "*.{vcf, vcf.gz}" - - vcf_tabix: - type: file - description: tabix index of query VCF - only needed if vcf is compressed - pattern: "*.vcf.gz.tbi" - - specific_resources: - type: map - description: A list of sample specific reference files defined in toml config, must also include indices if bgzipped. - - toml: - type: file - description: configuration file with reference file basenames - pattern: "*.toml" - - lua: - type: file - description: Lua file for custom annotations - pattern: "*.lua" - - resources: - type: map - description: List of reference files defined in toml config, must also include indices if bgzipped. + - - meta: + type: map + description: | + Groovy Map containing sample information + e.g. [ id:'test', single_end:false ] + - vcf: + type: file + description: query VCF file + pattern: "*.{vcf, vcf.gz}" + - tbi: + type: file + description: tabix index file for the query VCF + pattern: "*.tbi" + - specific_resources: + type: map + description: A list of sample specific reference files defined in toml config, + must also include indices if bgzipped. + - - toml: + type: file + description: configuration file with reference file basenames + pattern: "*.toml" + - - lua: + type: file + description: Lua file for custom annotations + pattern: "*.lua" + - - resources: + type: map + description: List of reference files defined in toml config, must also include + indices if bgzipped. output: - - meta: - type: map - description: | - Groovy Map containing sample information - e.g. [ id:'test', single_end:false ] - - versions: - type: file - description: File containing software versions - pattern: "versions.yml" - vcf: - type: file - description: Annotated VCF file - pattern: "*.vcf" + - meta: + type: map + description: | + Groovy Map containing sample information + e.g. 
[ id:'test', single_end:false ] + - "*.vcf": + type: file + description: Annotated VCF file + pattern: "*.vcf" + - versions: + - versions.yml: + type: file + description: File containing software versions + pattern: "versions.yml" authors: - "@projectoriented" - "@matthdsm" diff --git a/nextflow.config b/nextflow.config index ea1909cb..44c544b1 100644 --- a/nextflow.config +++ b/nextflow.config @@ -22,20 +22,18 @@ params { add_ped = false validate = false roi = null - project = null - skip_date_project = false only_call = false only_merge = false output_genomicsdb = false callers = "haplotypecaller" vardict_min_af = 0.1 // Minimum allele frequency for VarDict normalize = false - output_suffix = "" only_pass = false keep_alt_contigs = false updio = false automap = false hc_phasing = false + min_callable_coverage = 5 // Module specific parameters dragstr = false @@ -77,6 +75,7 @@ params { multiqc_methods_description = null // References + elsites = null cmgg_config_base = "/conf/" igenomes_base = null //'s3://ngi-igenomes/igenomes' igenomes_ignore = true @@ -91,14 +90,17 @@ params { email_on_fail = null plaintext_email = false monochrome_logs = false - hook_url = null + hook_url = System.getenv('HOOK_URL') help = false + help_full = false + show_hidden = false version = false pipelines_testdata_base_path = 'https://raw.githubusercontent.com/nf-core/test-datasets/' // Config options config_profile_name = null config_profile_description = null + custom_config_version = 'master' custom_config_base = "https://raw.githubusercontent.com/nf-core/configs/${params.custom_config_version}" config_profile_contact = null @@ -132,6 +134,7 @@ profiles { podman.enabled = false shifter.enabled = false charliecloud.enabled = false + channels = ['conda-forge', 'bioconda'] apptainer.enabled = false } mamba { @@ -221,37 +224,25 @@ profiles { nf_test { includeConfig 'conf/nf_test.config' } seqplorer { includeConfig 'conf/seqplorer.config' } - seqcap { includeConfig 'conf/seqcap.config' } hypercap { includeConfig 'conf/hypercap.config' } + WES { includeConfig 'conf/wes.config' } + copgt { includeConfig 'conf/copgt.config' } } -// Set default registry for Apptainer, Docker, Podman and Singularity independent of -profile -// Will not be used unless Apptainer / Docker / Podman / Singularity are enabled +// Load nf-core custom profiles from different Institutions +includeConfig !System.getenv('NXF_OFFLINE') && params.custom_config_base ? "${params.custom_config_base}/nfcore_custom.config" : "/dev/null" + +// Load nf-cmgg/germline custom profiles from different institutions. + +// Set default registry for Apptainer, Docker, Podman, Charliecloud and Singularity independent of -profile +// Will not be used unless Apptainer / Docker / Podman / Charliecloud / Singularity are enabled // Set to your registry if you have a mirror of containers apptainer.registry = 'quay.io' docker.registry = 'quay.io' podman.registry = 'quay.io' singularity.registry = 'quay.io' -// Nextflow plugins -plugins { - id 'nf-schema@2.1.0' // Validation of pipeline parameters and creation of an input channel from a sample sheet -} - -validation { - failUnrecognisedParams = false - lenientMode = false - defaultIgnoreParams = ['genomes','igenomes_base','test_data'] - showHiddenParams = false - help { - enabled = true - //beforeText = TODO - //afterText = TODO - command = "nextflow run nf-cmgg/germline -profile docker --input --outdir " - } -} - includeConfig !params.igenomes_ignore ? "conf/igenomes.config" : !params.genomes_ignore ? 
"https://raw.githubusercontent.com/nf-cmgg/configs/main/conf/Hsapiens/genomes.config" : "conf/empty_genomes.config" // Export these variables to prevent local Python/R libraries from conflicting with those in the container @@ -265,15 +256,35 @@ env { JULIA_DEPOT_PATH = "/usr/local/share/julia" } -// Capture exit codes from upstream processes when piping -process.shell = ['/bin/bash', '-euo', 'pipefail'] +// Set bash options +process.shell = """\ +bash + +set -e # Exit if a tool returns a non-zero status/exit code +set -u # Treat unset variables and parameters as an error +set -o pipefail # Returns the status of the last command to exit with a non-zero status or zero if all successfully execute +set -C # No clobber - prevent output redirection from overwriting files. +""" // Disable process selector warnings by default. Use debug profile to enable warnings. nextflow.enable.configProcessNamesValidation = false +manifest { + name = 'nf-cmgg/germline' + author = """nvnieuwk""" + homePage = 'https://github.com/nf-cmgg/germline' + description = """A nextflow pipeline for calling and annotating small germline variants from short DNA reads for WES and WGS data""" + mainScript = 'main.nf' + nextflowVersion = '!>=24.10.0' + version = '1.9.0' + doi = '' +} + +params.unique_out = "v${manifest.version.replace('.', '_')}_${new Date().format("yyyy_MM_dd")}" + timeline { enabled = true - file = "${params.outdir}/pipeline_info/execution_timeline_${new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss')}.html" + file = "${params.outdir}/${params.unique_out}/execution_timeline_${new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss')}.html" } report { enabled = true @@ -288,16 +299,37 @@ dag { file = timeline.file.replace("execution_timeline", "pipeline_dag") } -manifest { - name = 'nf-cmgg/germline' - author = """nvnieuwk""" - homePage = 'https://github.com/nf-cmgg/germline' - description = """A nextflow pipeline for calling and annotating small germline variants from short DNA reads for WES and WGS data""" - mainScript = 'main.nf' - nextflowVersion = '!>=24.04.0' - version = '1.8.2' - doi = '' +// Nextflow plugins +plugins { + id 'nf-schema@2.2.0' // Validation of pipeline parameters and creation of an input channel from a sample sheet +} + +validation { + defaultIgnoreParams = ["genomes", "test_data", "igenomes_base"] + help { + enabled = true + command = "nextflow run $manifest.name -profile --input samplesheet.csv --outdir " + fullParameter = "help_full" + showHiddenParameter = "show_hidden" + beforeText = """ +-\033[2m----------------------------------------------------\033[0m- + \033[0;34m ///\033[0;32m/// \033[0m +\033[0;34m ___ __ _ _ __ __ \033[0;34m ///\033[0;32m///// \033[0m +\033[0;34m |\\ | |__ __ / ` | \\/ | / _` / _` \033[0;34m////\033[0;32m////// \033[0m +\033[0;34m | \\| | \\__, | | \\__| \\__| \033[0;34m///\033[0;32m///// \033[0m + \033[0;34m///\033[0;32m/// \033[0m +\033[0;35m ${manifest.name} ${manifest.version}\033[0m +-\033[2m----------------------------------------------------\033[0m- +""" + } + summary { + beforeText = validation.help.beforeText + hideParams = ["genomes"] + } } // Load modules.config for DSL2 module specific options includeConfig 'conf/modules.config' + +workflow.output.mode = params.publish_dir_mode +outputDir = params.outdir diff --git a/nextflow_schema.json b/nextflow_schema.json index 31c7b78e..4e061f2a 100644 --- a/nextflow_schema.json +++ b/nextflow_schema.json @@ -31,7 +31,7 @@ "watchdir": { "type": "string", "format": "directory-path", - "description": "A folder to 
diff --git a/nextflow_schema.json b/nextflow_schema.json
index 31c7b78e..4e061f2a 100644
--- a/nextflow_schema.json
+++ b/nextflow_schema.json
@@ -31,7 +31,7 @@
         "watchdir": {
             "type": "string",
             "format": "directory-path",
-            "description": "A folder to watch for the creation of files that start with `watch:` in the samplesheet",
+            "description": "A folder to watch for the creation of files that start with `watch:` in the samplesheet.",
             "fa_icon": "fas fa-folder-open"
         },
         "email": {
@@ -46,7 +46,8 @@
             "format": "file-path",
             "exists": true,
             "pattern": "^\\S+\\.ped$",
-            "description": "Path to a pedigree file for all samples in the run"
+            "description": "Path to a pedigree file for all samples in the run. All relational data will be fetched from this file.",
+            "help": "A PED file given in the samplesheet will be used above this PED file."
         }
     }
 },
@@ -59,7 +60,7 @@
         "genome": {
             "type": "string",
             "default": "GRCh38",
-            "description": "Reference genome build",
+            "description": "Reference genome build. Used to fetch the right reference files.",
             "help_text": "Requires a Genome Reference Consortium reference ID (e.g. GRCh38)"
         },
         "fasta": {
@@ -83,27 +84,53 @@
         "dict": {
             "type": "string",
             "pattern": "^\\S+\\.dict$",
-            "description": "Path to the sequence dictionary generated from the FASTA reference",
+            "description": "Path to the sequence dictionary generated from the FASTA reference. This is only used when `haplotypecaller` is one of the specified callers.",
+            "help": "The pipeline will autogenerate this file when missing.",
             "fa_icon": "far fa-file-code",
             "format": "file-path",
             "mimetype": "text/plain"
         },
         "strtablefile": {
             "type": "string",
-            "description": "Path to the STR table file generated from the FASTA reference",
+            "description": "Path to the STR table file generated from the FASTA reference. This is only used when `--dragstr` has been given.",
+            "help": "The pipeline will autogenerate this file when missing.",
             "fa_icon": "fas fa-folder",
             "format": "path"
         },
         "sdf": {
             "type": "string",
-            "description": "Path to the SDF folder generated from the reference FASTA file",
+            "description": "Path to the SDF folder generated from the reference FASTA file. This is only required when using `--validate`.",
+            "help": "The pipeline will autogenerate this file when missing.",
             "format": "path",
             "fa_icon": "fas fa-folder"
         },
+        "elfasta": {
+            "type": "string",
+            "format": "file-path",
+            "exists": true,
+            "mimetype": "text/plain",
+            "pattern": "^\\S+\\.elfasta$",
+            "description": "Path to the ELFASTA genome file. This is used when `elprep` is part of the callers and will be automatically generated when missing.",
+            "fa_icon": "far fa-file-code"
+        },
+        "elsites": {
+            "type": "string",
+            "format": "file-path",
+            "exists": true,
+            "mimetype": "text/plain",
+            "pattern": "^\\S+\\.elsites$",
+            "description": "Path to the elsites file. This is used when `elprep` is part of the callers.",
+            "fa_icon": "far fa-file-code"
+        },
+        "genomes": {
+            "type": "object",
+            "hidden": true,
+            "description": "Object for genomes"
+        },
         "genomes_base": {
             "type": "string",
             "default": "/references/",
-            "description": "Directory base for CMGG reference store (used when --genomes_ignore false is specified)",
+            "description": "Directory base for CMGG reference store (used when `--genomes_ignore false` is specified)",
             "fa_icon": "fas fa-download",
             "format": "directory-path"
         },
@@ -117,7 +144,7 @@
         "genomes_ignore": {
             "type": "boolean",
             "hidden": true,
-            "description": "Do not load the local references from the path specified with --genomes_base",
+            "description": "Do not load the local references from the path specified with `--genomes_base`",
             "fa_icon": "fas fa-ban"
         },
         "igenomes_base": {
@@ -153,8 +180,9 @@
         "merge_distance": {
             "type": "integer",
             "default": 100000,
-            "description": "The merge distance for genotype BED files",
-            "help_text": "Increase this parameter if GenomicsDBImport is running slow. This defines the maximum distance between intervals that should be merged. The less intervals GenomicsDBImport actually gets, the faster it will run."
+            "description": "The merge distance for family BED files",
+            "help_text": "Increase this parameter if GenomicsDBImport is running slow. This defines the maximum distance between intervals that should be merged. The fewer intervals GenomicsDBImport actually gets, the faster it will run.",
+            "minimum": 1
         },
         "dragstr": {
             "type": "boolean",
@@ -164,135 +192,125 @@
         "validate": {
             "type": "boolean",
             "description": "Validate the found variants",
-            "help_text": "This only validates individual sample GVCFs that have truth VCF supplied to them via the samplesheet (in row `truth_vcf`, with an optional index in the `truth_tbi` row)"
+            "help": "A sample should have at least a `truth_vcf` supplied along with it in the samplesheet for it to be validated."
         },
         "filter": {
             "type": "boolean",
-            "description": "Filter the found variants"
+            "description": "Filter the found variants."
         },
         "annotate": {
             "type": "boolean",
-            "description": "Annotate the found variants"
+            "description": "Annotate the found variants using Ensembl VEP."
         },
         "add_ped": {
             "type": "boolean",
-            "description": "Add PED INFO header lines to the final VCFs"
+            "description": "Add PED INFO header lines to the final VCFs."
         },
         "gemini": {
             "type": "boolean",
-            "description": "Create a Gemini databases from the final VCFs"
+            "description": "Create Gemini databases from the final VCFs."
         },
         "mosdepth_slow": {
             "type": "boolean",
             "description": "Don't run mosdepth in fast-mode",
-            "help_text": "This is advised if you need exact coverage BED files as output"
+            "help_text": "This is advised if you need exact coverage BED files as output."
         },
-        "project": {
-            "type": "string",
-            "description": "The name of the project.",
-            "help_text": "This will be used to specify the final output files folder in the output directory."
-        },
-        "skip_date_project": {
-            "type": "boolean",
-            "description": "Don't add the current date to the output project folder"
-        },
         "roi": {
             "type": "string",
-            "description": "Path to the default ROI (regions of interest) BED file to be used for WES analysis",
+            "description": "Path to the default ROI (regions of interest) BED file to be used for WES analysis.",
             "help_text": "This will be used for all samples that do not have a specific ROI file supplied to them through the samplesheet. Don't supply an ROI file to run the analysis as WGS.",
             "format": "file-path",
             "pattern": "^\\S+\\.bed(\\.gz)?$",
-            "mimetype": "text/plain"
+            "exists": true
         },
         "dbsnp": {
             "type": "string",
-            "description": "Path to the dbSNP VCF file",
+            "description": "Path to the dbSNP VCF file. This will be used to set the variant IDs.",
             "fa_icon": "far fa-file-alt",
             "format": "file-path",
             "pattern": "^\\S+\\.vcf\\.gz$",
-            "mimetype": "text/plain"
+            "exists": true
         },
         "dbsnp_tbi": {
             "type": "string",
-            "description": "Path to the index of the dbSNP VCF file",
+            "description": "Path to the index of the dbSNP VCF file.",
             "fa_icon": "far fa-file-alt",
             "format": "file-path",
             "pattern": "^\\S+\\.tbi$",
-            "mimetype": "text/plain"
+            "exists": true
         },
         "somalier_sites": {
             "type": "string",
             "default": "https://github.com/brentp/somalier/files/3412456/sites.hg38.vcf.gz",
             "fa_icon": "far fa-file-alt",
-            "description": "Path to the VCF file with sites for Somalier to use",
+            "description": "Path to the VCF file with sites for Somalier to use.",
             "pattern": "^\\S+\\.vcf\\.gz",
             "format": "file-path",
-            "mimetype": "text/plain"
+            "exists": true
         },
         "only_call": {
             "type": "boolean",
-            "description": "Only call the variants without doing any post-processing"
+            "description": "Only call the variants without doing any post-processing."
         },
         "only_merge": {
             "type": "boolean",
-            "description": "Only run the pipeline until the creation of the genomicsdbs and output them"
+            "description": "Only run the pipeline until the creation of the genomicsdbs and output them."
         },
         "output_genomicsdb": {
             "type": "boolean",
-            "description": "Output the genomicsDB together with the joint-genotyped VCF"
+            "description": "Output the genomicsDB together with the joint-genotyped VCF."
         },
         "callers": {
             "type": "string",
-            "description": "A comma delimited string of the available callers. Current options are: 'haplotypecaller' and 'vardict'",
+            "description": "A comma-delimited string of the available callers. Current options are: `haplotypecaller` and `vardict`.",
             "default": "haplotypecaller"
         },
         "vardict_min_af": {
             "type": "number",
-            "description": "The minimum allele frequency for VarDict when no `vardict_min_af` is supplied in the samplesheet",
-            "default": 0.1
+            "description": "The minimum allele frequency for VarDict when no `vardict_min_af` is supplied in the samplesheet.",
+            "default": 0.1,
+            "minimum": 0
         },
         "normalize": {
             "type": "boolean",
-            "description": "Normalize the VCFs"
-        },
-        "output_suffix": {
-            "type": "string",
-            "description": "A custom suffix to add to the basename of the output files"
+            "description": "Normalize the variants in the final VCFs."
         },
         "only_pass": {
             "type": "boolean",
-            "description": "Filter out all variants that don't have the PASS filter for vardict. This only works when --filter is also given"
+            "description": "Filter out all variants that don't have the PASS filter for vardict. This only works when `--filter` is also given."
         },
         "keep_alt_contigs": {
             "type": "boolean",
-            "description": "Keep all aditional contigs for calling instead of filtering them out before"
+            "description": "Keep all additional contigs for calling instead of filtering them out before."
         },
         "updio": {
             "type": "boolean",
-            "description": "Run UPDio analysis on the resulting VCFs"
+            "description": "Run UPDio analysis on the final VCFs."
         },
         "updio_common_cnvs": {
             "type": "string",
-            "description": "A TSV file containing common CNVs to be used by UPDio",
+            "description": "A TSV file containing common CNVs to be used by UPDio.",
             "format": "file-path",
             "exists": true,
             "pattern": "^\\S+\\.tsv$"
         },
         "automap": {
             "type": "boolean",
-            "description": "Run AutoMap analysis on the resulting VCFs"
+            "description": "Run AutoMap analysis on the final VCFs."
         },
         "automap_repeats": {
             "type": "string",
             "description": "BED file with repeat regions in the genome.",
             "help_text": "This file will be automatically generated for hg38/GRCh38 and hg19/GRCh37 when this parameter has not been given.",
-            "pattern": "^\\S+\\.bed$"
+            "pattern": "^\\S+\\.bed$",
+            "exists": true
         },
         "automap_panel": {
             "type": "string",
             "description": "TXT file with gene panel regions to be used by AutoMap.",
             "help_text": "By default the CMGG gene panel list will be used.",
-            "pattern": "^\\S+\\.txt$"
+            "pattern": "^\\S+\\.txt$",
+            "exists": true
         },
         "automap_panel_name": {
             "type": "string",
@@ -301,7 +319,18 @@
         },
         "hc_phasing": {
             "type": "boolean",
-            "description": "Perform phasing with HaplotypeCaller"
+            "description": "Perform phasing with HaplotypeCaller."
+        },
+        "min_callable_coverage": {
+            "type": "integer",
+            "description": "The lowest callable coverage to determine callable regions.",
+            "default": 5,
+            "minimum": 0
+        },
+        "unique_out": {
+            "type": "string",
+            "description": "Don't change this value",
+            "hidden": true
         }
     }
 },
@@ -360,11 +389,6 @@
         "description": "Less common options for the pipeline, typically set in a config file.",
         "help_text": "These options are common to all nf-core pipelines and allow you to customise some of the core preferences for how the pipeline runs.\n\nTypically these options would be set in a Nextflow config file loaded for all pipeline runs, such as `~/.nextflow/config`.",
         "properties": {
-            "help": {
-                "type": "boolean",
-                "description": "Display help text.",
-                "fa_icon": "fas fa-question-circle"
-            },
             "version": {
                 "type": "boolean",
                 "description": "Display version and exit.",
@@ -458,187 +482,188 @@
             "vep_chunk_size": {
                 "type": "integer",
                 "default": 50000,
-                "description": "The amount of sites per split VCF as input to VEP"
+                "description": "The number of sites per split VCF as input to VEP.",
+                "minimum": 1
             },
             "species": {
                 "type": "string",
                 "default": "homo_sapiens",
-                "description": "The species of the samples",
+                "description": "The species of the samples.",
                 "fa_icon": "fas fa-user-circle",
                 "pattern": "^[a-z_]*$",
-                "help_text": "Must be lower case and have underscores as spaces"
+                "help_text": "Must be lower case and have underscores as spaces."
             },
             "vep_merged": {
                 "type": "boolean",
                 "default": true,
-                "description": "Specify if the VEP cache is a merged cache"
+                "description": "Specify if the VEP cache is a merged cache."
             },
             "vep_cache": {
                 "type": "string",
-                "description": "The path to the VEP cache",
+                "description": "The path to the VEP cache.",
                 "format": "path"
             },
             "vep_dbnsfp": {
                 "type": "boolean",
-                "description": "Use the dbNSFP plugin with Ensembl VEP",
+                "description": "Use the dbNSFP plugin with Ensembl VEP.",
                 "fa_icon": "fas fa-question-circle",
                 "help_text": "The '--dbnsfp' and '--dbnsfp_tbi' parameters need to be specified when using this parameter."
             },
             "vep_spliceai": {
                 "type": "boolean",
-                "description": "Use the SpliceAI plugin with Ensembl VEP",
+                "description": "Use the SpliceAI plugin with Ensembl VEP.",
                 "fa_icon": "fas fa-question-circle",
                 "help_text": "The '--spliceai_indel', '--spliceai_indel_tbi', '--spliceai_snv' and '--spliceai_snv_tbi' parameters need to be specified when using this parameter."
             },
             "vep_spliceregion": {
                 "type": "boolean",
-                "description": "Use the SpliceRegion plugin with Ensembl VEP",
+                "description": "Use the SpliceRegion plugin with Ensembl VEP.",
                 "fa_icon": "fas fa-question-circle"
             },
             "vep_mastermind": {
                 "type": "boolean",
-                "description": "Use the Mastermind plugin with Ensembl VEP",
+                "description": "Use the Mastermind plugin with Ensembl VEP.",
                 "fa_icon": "fas fa-question-circle",
                 "help_text": "The '--mastermind' and '--mastermind_tbi' parameters need to be specified when using this parameter."
             },
             "vep_maxentscan": {
                 "type": "boolean",
-                "description": "Use the MaxEntScan plugin with Ensembl VEP",
+                "description": "Use the MaxEntScan plugin with Ensembl VEP.",
                 "fa_icon": "fas fa-question-circle",
                 "help_text": "The '--maxentscan' parameter need to be specified when using this parameter."
             },
             "vep_eog": {
                 "type": "boolean",
-                "description": "Use the custom EOG annotation with Ensembl VEP",
+                "description": "Use the custom EOG annotation with Ensembl VEP.",
                 "fa_icon": "fas fa-question-circle",
                 "help_text": "The '--eog' and '--eog_tbi' parameters need to be specified when using this parameter."
             },
             "vep_alphamissense": {
                 "type": "boolean",
-                "description": "Use the AlphaMissense plugin with Ensembl VEP",
+                "description": "Use the AlphaMissense plugin with Ensembl VEP.",
                 "fa_icon": "fas fa-question-circle",
                 "help_text": "The '--alphamissense' and '--alphamissense_tbi' parameters need to be specified when using this parameter."
             },
             "vep_version": {
                 "type": "number",
                 "default": 105.0,
-                "description": "The version of the VEP tool to be used",
+                "description": "The version of the VEP tool to be used.",
                 "fa_icon": "fas fa-code-branch"
             },
             "vep_cache_version": {
                 "type": "integer",
                 "default": 105,
-                "description": "The version of the VEP cache to be used",
+                "description": "The version of the VEP cache to be used.",
                 "fa_icon": "fas fa-code-branch"
             },
             "dbnsfp": {
                 "type": "string",
-                "description": "Path to the dbSNFP file",
+                "description": "Path to the dbNSFP file.",
                 "format": "file-path",
                 "fa_icon": "far fa-file-alt",
-                "mimetype": "text/plain",
-                "pattern": "^\\S+\\.gz$"
+                "pattern": "^\\S+\\.gz$",
+                "exists": true
             },
             "dbnsfp_tbi": {
                 "type": "string",
                 "format": "file-path",
-                "description": "Path to the index of the dbSNFP file",
+                "description": "Path to the index of the dbNSFP file.",
                 "fa_icon": "far fa-file-alt",
                 "pattern": "^\\S+\\.(csi|tbi)$",
-                "mimetype": "text/plain"
+                "exists": true
             },
             "spliceai_indel": {
                 "type": "string",
                 "format": "file-path",
-                "description": "Path to the VCF containing indels for spliceAI",
+                "description": "Path to the VCF containing indels for spliceAI.",
                 "fa_icon": "far fa-file-alt",
                 "pattern": "^\\S+\\.vcf\\.gz$",
-                "mimetype": "text/plain"
+                "exists": true
             },
             "spliceai_indel_tbi": {
                 "type": "string",
                 "format": "file-path",
-                "description": "Path to the index of the VCF containing indels for spliceAI",
+                "description": "Path to the index of the VCF containing indels for spliceAI.",
                 "pattern": "^\\S+\\.(csi|tbi)$",
-                "mimetype": "text/plain"
+                "exists": true
             },
             "spliceai_snv": {
                 "type": "string",
                 "format": "file-path",
-                "description": "Path to the VCF containing SNVs for spliceAI",
+                "description": "Path to the VCF containing SNVs for spliceAI.",
                 "pattern": "^\\S+\\.vcf\\.gz$",
-                "mimetype": "text/plain"
+                "exists": true
             },
             "spliceai_snv_tbi": {
                 "type": "string",
                 "format": "file-path",
-                "description": "Path to the index of the VCF containing SNVs for spliceAI",
+                "description": "Path to the index of the VCF containing SNVs for spliceAI.",
                 "pattern": "^\\S+\\.(csi|tbi)$",
-                "mimetype": "text/plain"
+                "exists": true
             },
             "mastermind": {
                 "type": "string",
                 "format": "file-path",
-                "description": "Path to the VCF for Mastermind",
+                "description": "Path to the VCF for Mastermind.",
                 "pattern": "^\\S+\\.vcf\\.gz$",
-                "mimetype": "text/plain"
+                "exists": true
             },
             "mastermind_tbi": {
                 "type": "string",
                 "format": "file-path",
-                "description": "Path to the index of the VCF for Mastermind",
+                "description": "Path to the index of the VCF for Mastermind.",
                 "pattern": "^\\S+\\.(csi|tbi)$",
-                "mimetype": "text/plain"
+                "exists": true
             },
             "alphamissense": {
                 "type": "string",
                 "format": "file-path",
-                "description": "Path to the TSV for AlphaMissense",
+                "description": "Path to the TSV for AlphaMissense.",
                 "pattern": "^\\S+\\.tsv\\.gz$",
-                "mimetype": "text/plain"
+                "exists": true
             },
             "alphamissense_tbi": {
                 "type": "string",
                 "format": "file-path",
-                "description": "Path to the index of the TSV for AlphaMissense",
+                "description": "Path to the index of the TSV for AlphaMissense.",
                 "pattern": "^\\S+\\.(csi|tbi)$",
-                "mimetype": "text/plain"
+                "exists": true
             },
             "eog": {
                 "type": "string",
                 "format": "file-path",
-                "description": "Path to the VCF containing EOG annotations",
+                "description": "Path to the VCF containing EOG annotations.",
                 "pattern": "^\\S+\\.vcf\\.gz$",
-                "mimetype": "text/plain"
+                "exists": true
             },
             "eog_tbi": {
                 "type": "string",
                 "format": "file-path",
-                "description": "Path to the index of the VCF containing EOG annotations",
+                "description": "Path to the index of the VCF containing EOG annotations.",
                 "pattern": "^\\S+\\.(csi|tbi)$",
-                "mimetype": "text/plain"
+                "exists": true
             },
             "vcfanno": {
                 "type": "boolean",
-                "description": "Run annotations with vcfanno"
+                "description": "Run annotations with vcfanno."
             },
             "vcfanno_config": {
                 "type": "string",
-                "description": "The path to the VCFanno config TOML",
+                "description": "The path to the VCFanno config TOML.",
                 "pattern": "^\\S+\\.toml$",
                 "format": "file-path",
-                "mimetype": "text/plain"
+                "exists": true
             },
             "vcfanno_lua": {
                 "type": "string",
-                "description": "The path to a Lua script to be used in VCFanno",
+                "description": "The path to a Lua script to be used in VCFanno.",
                 "pattern": "^\\S+\\.lua$",
                 "format": "file-path",
-                "mimetype": "text/plain"
+                "exists": true
             },
             "vcfanno_resources": {
                 "type": "string",
-                "description": "A semicolon-seperated list of resource files for VCFanno, please also supply their indices using this parameter"
+                "description": "A semicolon-separated list of resource files for VCFanno, please also supply their indices using this parameter."
             }
         },
         "help_text": "Annotation will only run when `--annotate true` is specified."
diff --git a/nf-test.config b/nf-test.config
index 6d58c41d..8b08d5cd 100644
--- a/nf-test.config
+++ b/nf-test.config
@@ -1,12 +1,15 @@
 config {
-    testsDir "tests"
+    testsDir "."
     workDir ".nf-test"
     configFile "tests/nextflow.config"
     profile "nf_test,docker"
 
     plugins {
-        load "nft-bam@0.1.1"
+        load "nft-bam@0.4.0"
+        load "nft-vcf@1.0.7"
     }
+
+    triggers "conf/modules.config", "nextflow.config"
+
 }
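A note on the plugin bumps above, since the new subworkflow tests that follow depend on them: `nft-vcf` provides the VCF-aware `path(...).vcf.variantsMD5` helper those tests snapshot, which (as used here) digests the variant records rather than the raw gzipped file, so a stable hash survives header-level churn such as dates and command lines. A sketch of the assertion pattern, lifted from the tests below:

    assert snapshot(
        workflow.out.gvcfs.collect { [it[0], "variantsMD5:${path(it[1]).vcf.variantsMD5}"] }
    ).match()
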
[[],[]], + [[],[]], + [[],[]], + [[],[]], + [[],[]] + ) + ch_versions = ch_versions.mix(BCFTOOLS_STATS.out.versions.first()) + + emit: + gvcfs = ch_annotated // channel: [ val(meta), path(vcf), path(tbi) ] + reports = BCFTOOLS_STATS.out.stats // channel: [ val(meta), path(stats) ] + versions = ch_versions // channel: [ versions.yml ] + +} diff --git a/subworkflows/local/bam_call_elprep/tests/main.nf.test b/subworkflows/local/bam_call_elprep/tests/main.nf.test new file mode 100644 index 00000000..a096fd32 --- /dev/null +++ b/subworkflows/local/bam_call_elprep/tests/main.nf.test @@ -0,0 +1,107 @@ +nextflow_workflow { + + name "Test Workflow BAM_CALL_ELPREP" + script "../main.nf" + workflow "BAM_CALL_ELPREP" + + tag "subworkflows" + tag "subworkflows_local" + tag "bam_call_elprep" + tag "vcf_dbsnp_vcfanno" + + test("bam_call_elprep - default") { + + + when { + params { + callers = "elprep" + } + workflow { + """ + input[0] = Channel.of([ + [id:"NA24143.00001", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], + file(params.bam1, checkIfExists:true), + file(params.bai1, checkIfExists:true), + file(params.split1, checkIfExists:true) + ],[ + [id:"NA24143.00002", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], + file(params.bam1, checkIfExists:true), + file(params.bai1, checkIfExists:true), + file(params.split2, checkIfExists:true) + ],[ + [id:"NA24143.00003", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], + file(params.bam1, checkIfExists:true), + file(params.bai1, checkIfExists:true), + file(params.split3, checkIfExists:true) + ]) + input[1] = Channel.value([ + [id:"fasta"], + file(params.elfasta, checkIfExists:true) + ]) + input[2] = [[],[]] + input[3] = [[],[]] + input[4] = [[],[]] + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot( + workflow.out.gvcfs.collect { [it[0], "variantsMD5:${path(it[1]).vcf.variantsMD5}", it[2][-12..-1]] }, + workflow.out.reports + ).match() } + ) + } + + } + + test("bam_call_elprep - dbsnp") { + + + when { + params { + callers = "elprep" + } + workflow { + """ + input[0] = Channel.of([ + [id:"NA24143.00001", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], + file(params.bam1, checkIfExists:true), + file(params.bai1, checkIfExists:true), + file(params.split1, checkIfExists:true) + ],[ + [id:"NA24143.00002", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], + file(params.bam1, checkIfExists:true), + file(params.bai1, checkIfExists:true), + file(params.split2, checkIfExists:true) + ],[ + [id:"NA24143.00003", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], + file(params.bam1, checkIfExists:true), + file(params.bai1, checkIfExists:true), + file(params.split3, checkIfExists:true) + ]) + input[1] = Channel.value([ + [id:"fasta"], + file(params.elfasta, checkIfExists:true) + ]) + input[2] = [[],[]] + input[3] = [[id:'dbsnp'], file(params.vcf1, checkIfExists:true)] + input[4] = [[id:'dbsnp'], file(params.tbi1, checkIfExists:true)] + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot( + workflow.out.gvcfs.collect { [it[0], "variantsMD5:${path(it[1]).vcf.variantsMD5}", it[2][-12..-1]] }, + workflow.out.reports + ).match() } + ) + } + + } +} diff --git a/subworkflows/local/bam_call_elprep/tests/main.nf.test.snap b/subworkflows/local/bam_call_elprep/tests/main.nf.test.snap new file mode 100644 index 
00000000..2d85d7f1 --- /dev/null +++ b/subworkflows/local/bam_call_elprep/tests/main.nf.test.snap @@ -0,0 +1,70 @@ +{ + "bam_call_elprep - dbsnp": { + "content": [ + [ + [ + { + "id": "NA24143", + "sample": "NA24143", + "family": "Ashkenazim", + "family_samples": "NA24143", + "caller": "elprep" + }, + "variantsMD5:974ed65cfad6264db7c6589d6b7d7d74", + "g.vcf.gz.tbi" + ] + ], + [ + [ + { + "id": "NA24143", + "sample": "NA24143", + "family": "Ashkenazim", + "family_samples": "NA24143", + "caller": "elprep" + }, + "NA24143.elprep.bcftools_stats.txt:md5,36b9f979c03b24d87e2dc710baf3672b" + ] + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-06T16:23:49.669427501" + }, + "bam_call_elprep - default": { + "content": [ + [ + [ + { + "id": "NA24143", + "sample": "NA24143", + "family": "Ashkenazim", + "family_samples": "NA24143", + "caller": "elprep" + }, + "variantsMD5:974ed65cfad6264db7c6589d6b7d7d74", + "g.vcf.gz.tbi" + ] + ], + [ + [ + { + "id": "NA24143", + "sample": "NA24143", + "family": "Ashkenazim", + "family_samples": "NA24143", + "caller": "elprep" + }, + "NA24143.elprep.bcftools_stats.txt:md5,36b9f979c03b24d87e2dc710baf3672b" + ] + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-06T16:23:17.425264939" + } +} \ No newline at end of file diff --git a/subworkflows/local/bam_call_vardictjava/main.nf b/subworkflows/local/bam_call_vardictjava/main.nf new file mode 100644 index 00000000..d24fe6d2 --- /dev/null +++ b/subworkflows/local/bam_call_vardictjava/main.nf @@ -0,0 +1,63 @@ +include { VARDICTJAVA } from '../../../modules/nf-core/vardictjava/main' +include { TABIX_BGZIP } from '../../../modules/nf-core/tabix/bgzip/main' +include { BCFTOOLS_REHEADER } from '../../../modules/nf-core/bcftools/reheader/main' +include { VCFANNO } from '../../../modules/nf-core/vcfanno/main' +include { TABIX_TABIX } from '../../../modules/nf-core/tabix/tabix/main' +include { BCFTOOLS_STATS } from '../../../modules/nf-core/bcftools/stats/main' + +include { VCF_CONCAT_BCFTOOLS } from '../vcf_concat_bcftools/main' +include { VCF_FILTER_BCFTOOLS } from '../vcf_filter_bcftools/main' +include { VCF_DBSNP_VCFANNO } from '../vcf_dbsnp_vcfanno/main' + +workflow BAM_CALL_VARDICTJAVA { + take: + ch_input // channel: [mandatory] [ val(meta), path(bam), path(bai), path(bed) ] => sample BAM files and their indexes with the split bed files + ch_fasta // channel: [mandatory] [ val(meta), path(fasta) ] => fasta reference + ch_fai // channel: [mandatory] [ val(meta), path(fai) ] => fasta reference index + ch_dbsnp // channel: [optional] [ path(vcf) ] => the dbsnp vcf file + ch_dbsnp_tbi // channel: [optional] [ path(tbi) ] => the dbsnp vcf index file + + main: + def ch_versions = Channel.empty() + + VARDICTJAVA( + ch_input.map { meta, bam, bai, bed -> + def new_meta = meta + [caller:'vardict'] + [ new_meta, bam, bai, bed ] + }, + ch_fasta, + ch_fai + ) + ch_versions = ch_versions.mix(VARDICTJAVA.out.versions.first()) + + VCF_CONCAT_BCFTOOLS( + VARDICTJAVA.out.vcf, + true + ) + ch_versions = ch_versions.mix(VCF_CONCAT_BCFTOOLS.out.versions) + + def ch_annotated = Channel.empty() + if(!(ch_dbsnp instanceof List)) { + VCF_DBSNP_VCFANNO( + VCF_CONCAT_BCFTOOLS.out.vcfs, + ch_dbsnp, + ch_dbsnp_tbi + ) + ch_versions = ch_versions.mix(VCF_DBSNP_VCFANNO.out.versions) + ch_annotated = VCF_DBSNP_VCFANNO.out.vcfs + } else { + ch_annotated = VCF_CONCAT_BCFTOOLS.out.vcfs + } + + def ch_vcfs = ch_annotated + .map { meta, vcf, tbi -> + def new_meta = meta + [family_samples:
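// VardictJava calling is per sample, so family_samples is reset to just this sample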
meta.sample] + [ new_meta, vcf, tbi ] + } + + emit: + vcfs = ch_vcfs // channel: [ val(meta), path(vcf), path(tbi) ] + + versions = ch_versions // channel: [ path(versions.yml) ] + +} diff --git a/subworkflows/local/bam_call_vardictjava/tests/main.nf.test b/subworkflows/local/bam_call_vardictjava/tests/main.nf.test new file mode 100644 index 00000000..46ec0aed --- /dev/null +++ b/subworkflows/local/bam_call_vardictjava/tests/main.nf.test @@ -0,0 +1,61 @@ +nextflow_workflow { + + name "Test Workflow BAM_CALL_VARDICTJAVA" + script "../main.nf" + workflow "BAM_CALL_VARDICTJAVA" + + tag "subworkflows" + tag "subworkflows_local" + tag "bam_call_vardictjava" + tag "vcf_concat_bcftools" + + test("bam_call_vardictjava - default") { + + when { + params { + callers = "vardict" + } + workflow { + """ + input[0] = Channel.of([ + [id:"NA24143.00001", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], + file(params.bam1, checkIfExists:true), + file(params.bai1, checkIfExists:true), + file(params.split1, checkIfExists:true) + ],[ + [id:"NA24143.00002", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], + file(params.bam1, checkIfExists:true), + file(params.bai1, checkIfExists:true), + file(params.split2, checkIfExists:true) + ],[ + [id:"NA24143.00003", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], + file(params.bam1, checkIfExists:true), + file(params.bai1, checkIfExists:true), + file(params.split3, checkIfExists:true) + ]) + input[1] = Channel.value([ + [id:"fasta"], + file(params.fasta, checkIfExists:true) + ]) + input[2] = Channel.value([ + [id:"fai"], + file(params.fai, checkIfExists:true) + ]) + input[3] = [[],[]] + input[4] = [[],[]] + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot( + workflow.out.vcfs.collect { [it[0], "variantsMD5:${path(it[1]).vcf.variantsMD5}", it[2][-10..-1]] } + ).match() } + ) + } + + } + +} diff --git a/subworkflows/local/bam_call_vardictjava/tests/main.nf.test.snap b/subworkflows/local/bam_call_vardictjava/tests/main.nf.test.snap new file mode 100644 index 00000000..5d37ef35 --- /dev/null +++ b/subworkflows/local/bam_call_vardictjava/tests/main.nf.test.snap @@ -0,0 +1,24 @@ +{ + "bam_call_vardictjava - default": { + "content": [ + [ + [ + { + "id": "NA24143", + "sample": "NA24143", + "family": "Ashkenazim", + "family_samples": "NA24143", + "caller": "vardict" + }, + "variantsMD5:98497d2c15c6e3781f5ddeb81bf6288f", + "vcf.gz.tbi" + ] + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-06T16:29:46.9755281" + } +} \ No newline at end of file diff --git a/subworkflows/local/cram_call_gatk4/main.nf b/subworkflows/local/cram_call_gatk4/main.nf index 308b69f0..a63564a2 100644 --- a/subworkflows/local/cram_call_gatk4/main.nf +++ b/subworkflows/local/cram_call_gatk4/main.nf @@ -4,9 +4,9 @@ include { GATK4_CALIBRATEDRAGSTRMODEL } from '../../../modules/nf-core/gatk4/calibratedragstrmodel/main' include { GATK4_HAPLOTYPECALLER } from '../../../modules/nf-core/gatk4/haplotypecaller/main' -include { BCFTOOLS_STATS as BCFTOOLS_STATS_SINGLE } from '../../../modules/nf-core/bcftools/stats/main' +include { BCFTOOLS_STATS } from '../../../modules/nf-core/bcftools/stats/main' -include { VCF_CONCAT_BCFTOOLS } from '../vcf_concat_bcftools/main' +include { VCF_CONCAT_BCFTOOLS } from '../vcf_concat_bcftools/main' workflow CRAM_CALL_GATK4 { take: @@ -21,13 +21,13 @@ workflow CRAM_CALL_GATK4 { main: - ch_versions = 
Channel.empty() + def ch_versions = Channel.empty() // // Generate DRAGSTR models (if --dragstr is specified) // - ch_cram_models = Channel.empty() + def ch_cram_models = Channel.empty() if (dragstr) { ch_input @@ -43,26 +43,24 @@ workflow CRAM_CALL_GATK4 { .set { ch_dragstr_input } GATK4_CALIBRATEDRAGSTRMODEL( - ch_dragstr_input.map { meta, cram, crai, beds -> [ meta, cram, crai ] }, - ch_fasta.map { meta, fasta -> fasta }, - ch_fai.map { meta, fai -> fai }, - ch_dict.map { meta, dict -> dict }, - ch_strtablefile.map { meta, str -> str } + ch_dragstr_input.map { meta, cram, crai, _beds -> [ meta, cram, crai ] }, + ch_fasta.map { _meta, fasta -> fasta }, + ch_fai.map { _meta, fai -> fai }, + ch_dict.map { _meta, dict -> dict }, + ch_strtablefile.map { _meta, str -> str } ) ch_versions = ch_versions.mix(GATK4_CALIBRATEDRAGSTRMODEL.out.versions.first()) - ch_original + ch_cram_models = ch_original .combine(GATK4_CALIBRATEDRAGSTRMODEL.out.dragstr_model, by: 0) .map { meta, cram, crai, bed, dragstr_model -> def new_meta = meta + [id:bed.baseName] [ new_meta, cram, crai, bed, dragstr_model ] } - .set { ch_cram_models } } else { - ch_input + ch_cram_models = ch_input .map { meta, cram, crai, bed -> [ meta, cram, crai, bed, [] ] } - .set { ch_cram_models } } GATK4_HAPLOTYPECALLER( @@ -89,7 +87,7 @@ workflow CRAM_CALL_GATK4 { ) ch_versions = ch_versions.mix(VCF_CONCAT_BCFTOOLS.out.versions) - BCFTOOLS_STATS_SINGLE( + BCFTOOLS_STATS( VCF_CONCAT_BCFTOOLS.out.vcfs, [[],[]], [[],[]], @@ -97,13 +95,12 @@ workflow CRAM_CALL_GATK4 { [[],[]], [[],[]] ) - ch_versions = ch_versions.mix(BCFTOOLS_STATS_SINGLE.out.versions.first()) + ch_versions = ch_versions.mix(BCFTOOLS_STATS.out.versions.first()) - reports = BCFTOOLS_STATS_SINGLE.out.stats.collect{ meta, report -> report} emit: gvcfs = VCF_CONCAT_BCFTOOLS.out.vcfs // channel: [ val(meta), path(vcf), path(tbi) ] - reports // channel: [ path(stats) ] + reports = BCFTOOLS_STATS.out.stats // channel: [ val(meta), path(stats) ] versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/local/cram_call_gatk4/tests/main.nf.test b/subworkflows/local/cram_call_gatk4/tests/main.nf.test new file mode 100644 index 00000000..1807948a --- /dev/null +++ b/subworkflows/local/cram_call_gatk4/tests/main.nf.test @@ -0,0 +1,129 @@ +nextflow_workflow { + + name "Test Workflow CRAM_CALL_GATK4" + script "../main.nf" + workflow "CRAM_CALL_GATK4" + + tag "subworkflows" + tag "subworkflows_local" + tag "cram_call_gatk4" + tag "vcf_concat_bcftools" + + test("cram_call_gatk4 - default") { + + + when { + params { + callers = "haplotypecaller" + } + workflow { + """ + input[0] = Channel.of([ + [id:"NA24143.00001", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], + file(params.cram1, checkIfExists:true), + file(params.crai1, checkIfExists:true), + file(params.split1, checkIfExists:true) + ],[ + [id:"NA24143.00002", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], + file(params.cram1, checkIfExists:true), + file(params.crai1, checkIfExists:true), + file(params.split2, checkIfExists:true) + ],[ + [id:"NA24143.00003", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], + file(params.cram1, checkIfExists:true), + file(params.crai1, checkIfExists:true), + file(params.split3, checkIfExists:true) + ]) + input[1] = Channel.value([ + [id:"fasta"], + file(params.fasta, checkIfExists:true) + ]) + input[2] = Channel.value([ + [id:"fai"], + file(params.fai, checkIfExists:true) + ]) + 
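+ // input[3] below is the sequence dictionary; input[4]-input[6] are the optional STR table and dbSNP VCF/index (left as empty [[],[]] tuples here) and input[7] toggles DRAGSTR model calibration +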
input[3] = Channel.value([ + [id:"dict"], + file(params.dict, checkIfExists:true) + ]) + input[4] = [[],[]] + input[5] = [[],[]] + input[6] = [[],[]] + input[7] = false + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot( + workflow.out.gvcfs.collect { [it[0], "variantsMD5:${path(it[1]).vcf.variantsMD5}", it[2][-12..-1]] }, + workflow.out.reports + ).match() } + ) + } + + } + + test("cram_call_gatk4 - dragstr") { + + + when { + params { + callers = "haplotypecaller" + } + workflow { + """ + input[0] = Channel.of([ + [id:"NA24143.00001", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], + file(params.cram1, checkIfExists:true), + file(params.crai1, checkIfExists:true), + file(params.split1, checkIfExists:true) + ],[ + [id:"NA24143.00002", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], + file(params.cram1, checkIfExists:true), + file(params.crai1, checkIfExists:true), + file(params.split2, checkIfExists:true) + ],[ + [id:"NA24143.00003", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], + file(params.cram1, checkIfExists:true), + file(params.crai1, checkIfExists:true), + file(params.split3, checkIfExists:true) + ]) + input[1] = Channel.value([ + [id:"fasta"], + file(params.fasta, checkIfExists:true) + ]) + input[2] = Channel.value([ + [id:"fai"], + file(params.fai, checkIfExists:true) + ]) + input[3] = Channel.value([ + [id:"dict"], + file(params.dict, checkIfExists:true) + ]) + input[4] = Channel.value([ + [id:"str"], + file(params.strtablefile, checkIfExists:true) + ]) + input[5] = [[],[]] + input[6] = [[],[]] + input[7] = true + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot( + workflow.out.gvcfs.collect { [it[0], "variantsMD5:${path(it[1]).vcf.variantsMD5}", it[2][-12..-1]] }, + workflow.out.reports + ).match() } + ) + } + + } + +} diff --git a/subworkflows/local/cram_call_gatk4/tests/main.nf.test.snap b/subworkflows/local/cram_call_gatk4/tests/main.nf.test.snap new file mode 100644 index 00000000..51d0bde2 --- /dev/null +++ b/subworkflows/local/cram_call_gatk4/tests/main.nf.test.snap @@ -0,0 +1,70 @@ +{ + "cram_call_gatk4 - default": { + "content": [ + [ + [ + { + "id": "NA24143", + "sample": "NA24143", + "family": "Ashkenazim", + "family_samples": "NA24143", + "caller": "haplotypecaller" + }, + "variantsMD5:57a0b3ce429f38292730f965277d28d5", + "g.vcf.gz.tbi" + ] + ], + [ + [ + { + "id": "NA24143", + "sample": "NA24143", + "family": "Ashkenazim", + "family_samples": "NA24143", + "caller": "haplotypecaller" + }, + "NA24143.haplotypecaller.bcftools_stats.txt:md5,09b4e7674e0f5b98b1e548df3002250e" + ] + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-06T16:31:34.986729048" + }, + "cram_call_gatk4 - dragstr": { + "content": [ + [ + [ + { + "id": "NA24143", + "sample": "NA24143", + "family": "Ashkenazim", + "family_samples": "NA24143", + "caller": "haplotypecaller" + }, + "variantsMD5:69601e4deb53c65d30fff9d260e31bb9", + "g.vcf.gz.tbi" + ] + ], + [ + [ + { + "id": "NA24143", + "sample": "NA24143", + "family": "Ashkenazim", + "family_samples": "NA24143", + "caller": "haplotypecaller" + }, + "NA24143.haplotypecaller.bcftools_stats.txt:md5,c4dad5b8e05871dda66df42b1f6c89ff" + ] + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-06T16:32:34.211560941" + } +} \ No newline at end of file diff --git 
a/subworkflows/local/cram_call_genotype_gatk4/main.nf b/subworkflows/local/cram_call_genotype_gatk4/main.nf deleted file mode 100644 index 2a33b0ab..00000000 --- a/subworkflows/local/cram_call_genotype_gatk4/main.nf +++ /dev/null @@ -1,91 +0,0 @@ -// -// Call and genotype variants with GATK4 tooling -// - -include { CRAM_CALL_GATK4 } from '../cram_call_gatk4/main' -include { GVCF_JOINT_GENOTYPE_GATK4 } from '../gvcf_joint_genotype_gatk4/main' -include { VCF_FILTER_BCFTOOLS } from '../vcf_filter_bcftools/main' - -workflow CRAM_CALL_GENOTYPE_GATK4 { - take: - ch_input // channel: [mandatory] [ val(meta), path(cram), path(crai), path(bed) ] => sample CRAM files and their indexes with the split bed files - ch_gvcfs // channel: [mandatory] [ val(meta), path(gvcf), path(tbi) ] => earlier called GVCFs with their indices - ch_fasta // channel: [mandatory] [ val(meta), path(fasta) ] => fasta reference - ch_fai // channel: [mandatory] [ val(meta), path(fai) ] => fasta reference index - ch_dict // channel: [mandatory] [ val(meta), path(dict) ] => sequence dictionary - ch_strtablefile // channel: [optional] [ path(strtablefile) ] => STR table file - ch_dbsnp // channel: [optional] [ path(dbsnp) ] => The VCF containing the dbsnp variants - ch_dbsnp_tbi // channel: [optional] [ path(dbsnp_tbi) ] => The index of the dbsnp VCF - dragstr // boolean: create a DragSTR model and run haplotypecaller with it - only_call // boolean: only run the variant calling - only_merge // boolean: run until the family merging - filter // boolean: filter the VCFs - scatter_count // integer: the amount of times the VCFs should be scattered - - main: - - ch_versions = Channel.empty() - ch_vcfs = Channel.empty() - ch_reports = Channel.empty() - - CRAM_CALL_GATK4( - ch_input, - ch_fasta, - ch_fai, - ch_dict, - ch_strtablefile, - ch_dbsnp, - ch_dbsnp_tbi, - dragstr - ) - ch_versions = ch_versions.mix(CRAM_CALL_GATK4.out.versions) - ch_reports = ch_reports.mix(CRAM_CALL_GATK4.out.reports) - - ch_gvcfs_ready = ch_gvcfs - .map { meta, gvcf, tbi -> - def new_meta = meta + [caller:"haplotypecaller"] - [ new_meta, gvcf, tbi ] - } - .mix(CRAM_CALL_GATK4.out.gvcfs) - - if(!only_call) { - - GVCF_JOINT_GENOTYPE_GATK4( - ch_gvcfs_ready, - ch_fasta, - ch_fai, - ch_dict, - ch_dbsnp, - ch_dbsnp_tbi, - only_merge, - scatter_count - ) - ch_versions = ch_versions.mix(GVCF_JOINT_GENOTYPE_GATK4.out.versions) - - } - - if(!only_call && !only_merge) { - - if(filter) { - VCF_FILTER_BCFTOOLS( - GVCF_JOINT_GENOTYPE_GATK4.out.vcfs, - true - ) - ch_versions = ch_versions.mix(VCF_FILTER_BCFTOOLS.out.versions) - - VCF_FILTER_BCFTOOLS.out.vcfs - .set { ch_vcfs } - } else { - GVCF_JOINT_GENOTYPE_GATK4.out.vcfs - .set { ch_vcfs } - } - - } - - emit: - vcfs = ch_vcfs // channel: [ val(meta), path(vcf), path(tbi) ] - - reports = ch_reports // channel: [ path(reports) ] - versions = ch_versions // channel: [ versions.yml ] - -} diff --git a/subworkflows/local/cram_call_vardictjava/main.nf b/subworkflows/local/cram_call_vardictjava/main.nf deleted file mode 100644 index 203bdc7c..00000000 --- a/subworkflows/local/cram_call_vardictjava/main.nf +++ /dev/null @@ -1,148 +0,0 @@ -include { SAMTOOLS_CONVERT } from '../../../modules/nf-core/samtools/convert/main' -include { VARDICTJAVA } from '../../../modules/nf-core/vardictjava/main' -include { TABIX_TABIX as TABIX_SPLIT } from '../../../modules/nf-core/tabix/tabix/main' -include { TABIX_BGZIPTABIX as TABIX_VCFANNO } from '../../../modules/nf-core/tabix/bgziptabix/main' -include { BCFTOOLS_REHEADER } from 
'../../../modules/nf-core/bcftools/reheader/main' -include { VCFANNO } from '../../../modules/nf-core/vcfanno/main' -include { TABIX_TABIX } from '../../../modules/nf-core/tabix/tabix/main' -include { BCFTOOLS_STATS } from '../../../modules/nf-core/bcftools/stats/main' - -include { VCF_CONCAT_BCFTOOLS } from '../vcf_concat_bcftools/main' -include { VCF_FILTER_BCFTOOLS } from '../vcf_filter_bcftools/main' - -workflow CRAM_CALL_VARDICTJAVA { - take: - ch_crams // channel: [mandatory] [ val(meta), path(cram), path(crai) ] => sample CRAM files and their indexes - ch_input // channel: [mandatory] [ val(meta), path(cram), path(crai), path(bed) ] => sample CRAM files and their indexes - ch_fasta // channel: [mandatory] [ val(meta), path(fasta) ] => fasta reference - ch_fai // channel: [mandatory] [ val(meta), path(fai) ] => fasta reference index - ch_dbsnp // channel: [optional] [ path(vcf) ] => the dbnsp vcf file - ch_dbsnp_tbi // channel: [optional] [ path(tbi) ] => the dbsnp vcf index file - filter // boolean: filter the VCFs - - main: - ch_versions = Channel.empty() - - ch_crams - .map { meta, cram, crai -> - def new_meta = meta + [caller:"vardict"] - [ new_meta, cram, crai ] - } - .set { ch_crams } - - ch_crams - .branch { meta, cram, crai -> - bam: cram.extension == "bam" - cram: cram.extension == "cram" - } - .set { ch_cram_bam } - - SAMTOOLS_CONVERT( - ch_cram_bam.cram, - ch_fasta, - ch_fai - ) - ch_versions = ch_versions.mix(SAMTOOLS_CONVERT.out.versions.first()) - - ch_input - .map { meta, cram, crai, bed -> - def new_meta = meta - meta.subMap("split_count") + [caller:"vardict", id:meta.sample] - [ new_meta, cram, crai, bed, meta.split_count ] - } - .set { ch_vardict_crams } - - ch_cram_bam.bam - .mix(SAMTOOLS_CONVERT.out.bam.join(SAMTOOLS_CONVERT.out.bai, failOnMismatch:true, failOnDuplicate:true)) - .combine(ch_vardict_crams, by:0) - .map { meta, bam, bai, cram, crai, bed, split_count -> - def new_meta = meta + [id:bed.baseName, split_count:split_count] - [ new_meta, bam, bai, bed ] - } - .set { ch_vardict_input } - - VARDICTJAVA( - ch_vardict_input, - ch_fasta, - ch_fai - ) - ch_versions = ch_versions.mix(VARDICTJAVA.out.versions.first()) - - TABIX_SPLIT( - VARDICTJAVA.out.vcf - ) - ch_versions = ch_versions.mix(TABIX_SPLIT.out.versions.first()) - - VCF_CONCAT_BCFTOOLS( - VARDICTJAVA.out.vcf.join(TABIX_SPLIT.out.tbi, failOnMismatch:true, failOnDuplicate:true), - false - ) - ch_versions = ch_versions.mix(VCF_CONCAT_BCFTOOLS.out.versions) - - ch_dbsnp_annotated = Channel.empty() - if(ch_dbsnp != [[],[]]) { - ch_dbsnp - .map { meta, dbsnp -> [ get_vcfanno_config(dbsnp) ] } - .collect() - .set { ch_vcfanno_toml } - - ch_dbsnp - .combine(ch_dbsnp_tbi) - .collect() - .set { ch_vcfanno_resources } - - VCFANNO( - VCF_CONCAT_BCFTOOLS.out.vcfs.map { meta, vcf -> [ meta, vcf, [], [] ] }, - ch_vcfanno_toml, - [], - ch_vcfanno_resources - ) - ch_versions = ch_versions.mix(VCFANNO.out.versions.first()) - - TABIX_VCFANNO( - VCFANNO.out.vcf - ) - ch_versions = ch_versions.mix(TABIX_VCFANNO.out.versions.first()) - - TABIX_VCFANNO.out.gz_tbi.set { ch_dbsnp_annotated } - } else { - VCF_CONCAT_BCFTOOLS.out.vcfs.set { ch_dbsnp_annotated } - } - - if(filter) { - VCF_FILTER_BCFTOOLS( - ch_dbsnp_annotated, - false - ) - ch_versions = ch_versions.mix(VCF_FILTER_BCFTOOLS.out.versions) - ch_filter_output = VCF_FILTER_BCFTOOLS.out.vcfs - } else { - ch_filter_output = ch_dbsnp_annotated - } - - TABIX_TABIX( - ch_filter_output - ) - ch_versions = ch_versions.mix(TABIX_TABIX.out.versions.first()) - - 
ch_filter_output - .join(TABIX_TABIX.out.tbi, failOnDuplicate: true, failOnMismatch: true) - .map { meta, vcf, tbi -> - def new_meta = meta + [samples: meta.sample] - [ new_meta, vcf, tbi ] - } - .set { ch_vcfs } - - emit: - vcfs = ch_vcfs // channel: [ val(meta), path(vcf), path(tbi) ] - - versions = ch_versions // channel: [ path(versions.yml) ] - -} - -def get_vcfanno_config(vcf) { - def old_toml = file("${projectDir}/assets/dbsnp.toml", checkIfExists: true) - old_toml.copyTo("${workDir}/vcfanno/dbsnp.toml") - def new_toml = file("${workDir}/vcfanno/dbsnp.toml") - new_toml.text = old_toml.text.replace("DBSNP_FILE", vcf.getName()) - return new_toml -} diff --git a/subworkflows/local/cram_prepare_samtools_bedtools/main.nf b/subworkflows/local/cram_prepare_samtools_bedtools/main.nf index 94e24a7c..62b241ee 100644 --- a/subworkflows/local/cram_prepare_samtools_bedtools/main.nf +++ b/subworkflows/local/cram_prepare_samtools_bedtools/main.nf @@ -8,6 +8,7 @@ include { FILTER_BEDS } from '../../../modules/local/filte include { SAMTOOLS_MERGE } from '../../../modules/nf-core/samtools/merge/main' include { SAMTOOLS_INDEX } from '../../../modules/nf-core/samtools/index/main' +include { SAMTOOLS_CONVERT } from '../../../modules/nf-core/samtools/convert/main' include { TABIX_TABIX } from '../../../modules/nf-core/tabix/tabix/main' include { TABIX_BGZIP as UNZIP_ROI } from '../../../modules/nf-core/tabix/bgzip/main' include { BEDTOOLS_INTERSECT } from '../../../modules/nf-core/bedtools/intersect/main' @@ -20,17 +21,18 @@ workflow CRAM_PREPARE_SAMTOOLS_BEDTOOLS { ch_fasta // channel: [mandatory] [ path(fasta) ] => fasta reference ch_fai // channel: [mandatory] [ path(fai) ] => fasta reference index ch_default_roi // channel: [optional] [ path(roi) ] => bed containing regions of interest to be used as default + output_bam // boolean: Also output BAM files main: - ch_versions = Channel.empty() - ch_reports = Channel.empty() + def ch_versions = Channel.empty() + def ch_reports = Channel.empty() // // Merge the CRAM files if there are multiple per sample // - ch_crams + def ch_cram_branch = ch_crams .map { meta, cram, crai -> [ groupKey(meta, meta.duplicate_count), cram, crai] } @@ -41,10 +43,6 @@ workflow CRAM_PREPARE_SAMTOOLS_BEDTOOLS { single: cram.size() == 1 return [meta.target, cram[0], crai[0]] } - .set { ch_cram_branch } - - ch_cram_branch.multiple.dump(tag:'cram_branch_multiple', pretty:true) - ch_cram_branch.single.dump(tag:'cram_branch_single', pretty:true) SAMTOOLS_MERGE( ch_cram_branch.multiple, @@ -57,7 +55,7 @@ workflow CRAM_PREPARE_SAMTOOLS_BEDTOOLS { // Index the CRAM files which have no index // - SAMTOOLS_MERGE.out.cram + def ch_merged_crams = SAMTOOLS_MERGE.out.cram .mix(ch_cram_branch.single) .branch { meta, cram, crai=[] -> not_indexed: crai == [] @@ -65,27 +63,37 @@ workflow CRAM_PREPARE_SAMTOOLS_BEDTOOLS { indexed: crai != [] return [ meta, cram, crai ] } - .set { ch_merged_crams } - - ch_merged_crams.not_indexed.dump(tag:'merged_crams_not_indexed', pretty:true) - ch_merged_crams.indexed.dump(tag:'merged_crams_indexed', pretty:true) SAMTOOLS_INDEX( ch_merged_crams.not_indexed ) ch_versions = ch_versions.mix(SAMTOOLS_INDEX.out.versions.first()) - ch_merged_crams.not_indexed + def ch_ready_crams = ch_merged_crams.not_indexed .join(SAMTOOLS_INDEX.out.crai, failOnDuplicate: true, failOnMismatch: true) .mix(ch_merged_crams.indexed) - .dump(tag:'ready_crams', pretty:true) - .set { ch_ready_crams } + + // + // Optionally convert the CRAM files to BAM + // + + def ch_ready_bams = 
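// stays empty unless output_bam is set; it is then filled with BAM/BAI pairs from SAMTOOLS_CONVERT, joined with failOnMismatch so every BAM is guaranteed an index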
Channel.empty() + if(output_bam) { + SAMTOOLS_CONVERT( + ch_ready_crams, + ch_fasta, + ch_fai + ) + ch_versions = ch_versions.mix(SAMTOOLS_CONVERT.out.versions.first()) + + ch_ready_bams = SAMTOOLS_CONVERT.out.bam.join(SAMTOOLS_CONVERT.out.bai, failOnDuplicate:true, failOnMismatch:true) + } // // Preprocess the ROI BED files => sort and merge overlapping regions // - ch_roi + def ch_roi_branch = ch_roi .map { meta, roi -> [ groupKey(meta, meta.duplicate_count), roi ] } @@ -95,20 +103,12 @@ workflow CRAM_PREPARE_SAMTOOLS_BEDTOOLS { // It's possible that a sample is given multiple times in the samplesheet, in which // case they have been merged earlier. This code checks if at least one entry of the same // sample contains an ROI file - def is_present = false - def output_roi = [] - roi.each { entry -> - if(entry != []){ - output_roi.add(entry) - is_present = true - } - } - found: is_present + def output_roi = roi.findAll { entry -> entry != [] } + found: output_roi.size() > 0 return [ meta.target, output_roi ] - missing: !is_present + missing: output_roi.size() == 0 return [ meta.target, [] ] } - .set { ch_roi_branch } MERGE_ROI_SAMPLE( ch_roi_branch.found, @@ -118,7 +118,7 @@ workflow CRAM_PREPARE_SAMTOOLS_BEDTOOLS { // Add the default ROI file to all samples without an ROI file // if an ROI BED file has been given through the --roi parameter - ch_missing_rois = Channel.empty() + def ch_missing_rois = Channel.empty() if (ch_default_roi) { MERGE_ROI_PARAMS( ch_default_roi.map { bed -> @@ -128,31 +128,23 @@ workflow CRAM_PREPARE_SAMTOOLS_BEDTOOLS { ) ch_versions = ch_versions.mix(MERGE_ROI_PARAMS.out.versions) - ch_roi_branch.missing - .map { meta, bed -> - [ groupKey(meta, meta.duplicate_count), bed ] - } - .groupTuple() - .combine(MERGE_ROI_PARAMS.out.bed.map { meta, bed -> bed }) - .map { meta, missing, default_roi -> - [ meta.target, default_roi ] + ch_missing_rois = ch_roi_branch.missing + .combine(MERGE_ROI_PARAMS.out.bed.map { _meta, bed -> bed }) + .map { meta, _missing, default_roi -> + [ meta, default_roi ] } - .set { ch_missing_rois } } else { - ch_roi_branch.missing.set { ch_missing_rois } + ch_missing_rois = ch_roi_branch.missing } - ch_missing_rois - .mix(MERGE_ROI_SAMPLE.out.bed) - .set { ch_ready_rois } + def ch_ready_rois = ch_missing_rois.mix(MERGE_ROI_SAMPLE.out.bed) // // Create callable regions // - ch_ready_crams + def ch_mosdepth_input = ch_ready_crams .join(ch_ready_rois, failOnDuplicate:true, failOnMismatch:true) - .set { ch_mosdepth_input } MOSDEPTH( ch_mosdepth_input, @@ -160,25 +152,23 @@ workflow CRAM_PREPARE_SAMTOOLS_BEDTOOLS { ) ch_versions = ch_versions.mix(MOSDEPTH.out.versions.first()) - ch_ready_rois + def ch_beds_to_filter = ch_ready_rois .join(MOSDEPTH.out.quantized_bed, failOnDuplicate:true, failOnMismatch:true) - .set { ch_beds_to_filter } // Filter out the regions with no coverage FILTER_BEDS( - ch_beds_to_filter.map { meta, roi, callable -> [ meta, callable ]} + ch_beds_to_filter.map { meta, _roi, callable -> [ meta, callable ]} ) ch_versions = ch_versions.mix(FILTER_BEDS.out.versions) - FILTER_BEDS.out.bed + def ch_beds_to_intersect = FILTER_BEDS.out.bed .join(ch_beds_to_filter, failOnDuplicate:true, failOnMismatch:true) - .branch { meta, filtered_callable, roi, callable -> + .branch { meta, filtered_callable, roi, _callable -> roi: roi return [ meta, roi, filtered_callable ] no_roi: !roi return [ meta, filtered_callable ] } - .set { ch_beds_to_intersect } // Intersect the ROI with the callable regions BEDTOOLS_INTERSECT( @@ -187,12 +177,12 @@ 
workflow CRAM_PREPARE_SAMTOOLS_BEDTOOLS { ) ch_versions = ch_versions.mix(BEDTOOLS_INTERSECT.out.versions) - ch_beds_to_intersect.no_roi + def ch_ready_beds = ch_beds_to_intersect.no_roi .mix(BEDTOOLS_INTERSECT.out.intersect) - .set { ch_ready_beds } emit: ready_crams = ch_ready_crams // [ val(meta), path(cram), path(crai) ] + ready_bams = ch_ready_bams // [ val(meta), path(bam), path(bai) ] ready_beds = ch_ready_beds // [ val(meta), path(bed) ] versions = ch_versions // [ path(versions) ] reports = ch_reports // [ path(reports) ] diff --git a/tests/subworkflows/local/cram_prepare_samtools_bedtools/main.nf.test b/subworkflows/local/cram_prepare_samtools_bedtools/tests/main.nf.test similarity index 74% rename from tests/subworkflows/local/cram_prepare_samtools_bedtools/main.nf.test rename to subworkflows/local/cram_prepare_samtools_bedtools/tests/main.nf.test index 0a69a066..cdb50e44 100644 --- a/tests/subworkflows/local/cram_prepare_samtools_bedtools/main.nf.test +++ b/subworkflows/local/cram_prepare_samtools_bedtools/tests/main.nf.test @@ -1,7 +1,7 @@ nextflow_workflow { name "Test Workflow CRAM_PREPARE_SAMTOOLS_BEDTOOLS" - script "subworkflows/local/cram_prepare_samtools_bedtools/main.nf" + script "../main.nf" workflow "CRAM_PREPARE_SAMTOOLS_BEDTOOLS" tag "subworkflows" @@ -31,18 +31,25 @@ nextflow_workflow { file(params.fai, checkIfExists:true) ]) input[4] = [] + input[5] = false """ } } then { + def fasta = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/genome/hg38_chr21_22000000_23000000.fasta" assertAll( { assert workflow.success }, { assert snapshot( - workflow.out.ready_crams.collect { it.collect { it instanceof Map ? it : file(it).name } }, + workflow.out.ready_crams.collect { + [ it[0], it[1], file(it[2]).name ] + }, + workflow.out.ready_bams.collect { + [ it[0], "${file(it[1]).name},readsMD5:${bam(it[1]).getReadsMD5()}", file(it[2]).name ] + }, workflow.out.ready_beds, workflow.out.reports - ).match("default - WGS") } + ).match() } ) } @@ -71,18 +78,25 @@ nextflow_workflow { file(params.fai, checkIfExists:true) ]) input[4] = Channel.fromPath(params.bed, checkIfExists:true) + input[5] = true """ } } then { + def fasta = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/genome/hg38_chr21_22000000_23000000.fasta" assertAll( { assert workflow.success }, { assert snapshot( - workflow.out.ready_crams.collect { it.collect { it instanceof Map ? it : file(it).name } }, + workflow.out.ready_crams.collect { + [ it[0], it[1], file(it[2]).name ] + }, + workflow.out.ready_bams.collect { + [ it[0], "${file(it[1]).name},readsMD5:${bam(it[1]).getReadsMD5()}", file(it[2]).name ] + }, workflow.out.ready_beds, workflow.out.reports - ).match("default - WES common ROI") } + ).match() } ) } @@ -111,18 +125,25 @@ nextflow_workflow { file(params.fai, checkIfExists:true) ]) input[4] = [] + input[5] = false """ } } then { + def fasta = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/genome/hg38_chr21_22000000_23000000.fasta" assertAll( { assert workflow.success }, { assert snapshot( - workflow.out.ready_crams.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, + workflow.out.ready_crams.collect { + [ it[0], it[1], file(it[2]).name ] + }, + workflow.out.ready_bams.collect { + [ it[0], "${file(it[1]).name},readsMD5:${bam(it[1]).getReadsMD5()}", file(it[2]).name ] + }, workflow.out.ready_beds, workflow.out.reports - ).match("default - WES") } + ).match() } ) } @@ -158,6 +179,7 @@ nextflow_workflow { file(params.fai, checkIfExists:true) ]) input[4] = [] + input[5] = true """ } } @@ -168,11 +190,14 @@ nextflow_workflow { { assert workflow.success }, { assert snapshot( workflow.out.ready_crams.collect { - [ it[0], cram(it[1], fasta).reads.size(), file(it[2]).name ] + [ it[0], "${file(it[1]).name},readsMD5:${cram(it[1], fasta).getReadsMD5()}", file(it[2]).name ] + }, + workflow.out.ready_bams.collect { + [ it[0], "${file(it[1]).name},readsMD5:${bam(it[1]).getReadsMD5()}", file(it[2]).name ] }, workflow.out.ready_beds, workflow.out.reports - ).match("default - merge") } + ).match() } ) } diff --git a/tests/subworkflows/local/cram_prepare_samtools_bedtools/main.nf.test.snap b/subworkflows/local/cram_prepare_samtools_bedtools/tests/main.nf.test.snap similarity index 59% rename from tests/subworkflows/local/cram_prepare_samtools_bedtools/main.nf.test.snap rename to subworkflows/local/cram_prepare_samtools_bedtools/tests/main.nf.test.snap index cbd7535e..64b29194 100644 --- a/tests/subworkflows/local/cram_prepare_samtools_bedtools/main.nf.test.snap +++ b/subworkflows/local/cram_prepare_samtools_bedtools/tests/main.nf.test.snap @@ -1,5 +1,5 @@ { - "default - merge": { + "cram_prepare_samtools_bedtools - default - WES common ROI": { "content": [ [ [ @@ -8,9 +8,9 @@ "sample": "NA24143", "family": "Ashkenazim", "family_samples": "NA24143", - "duplicate_count": 2 + "duplicate_count": 1 }, - 798258, + "/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/crams/NA24143.cram", "NA24143.cram.crai" ] ], @@ -21,9 +21,22 @@ "sample": "NA24143", "family": "Ashkenazim", "family_samples": "NA24143", - "duplicate_count": 2 + "duplicate_count": 1 }, - "NA24143_intersect.bed:md5,b87069698afefb15282d069e56110046" + "NA24143.bam,readsMD5:77afffb023e537869c5c6ebf31187ded", + "NA24143.bam.bai" + ] + ], + [ + [ + { + "id": "NA24143", + "sample": "NA24143", + "family": "Ashkenazim", + "family_samples": "NA24143", + "duplicate_count": 1 + }, + "NA24143.intersect.bed:md5,b87069698afefb15282d069e56110046" ] ], [ @@ -31,12 +44,12 @@ ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-23T11:46:20.577603661" + "timestamp": "2024-11-14T16:07:03.620603158" }, - "default - WGS": { + "cram_prepare_samtools_bedtools - default - WES": { "content": [ [ [ @@ -47,9 +60,12 @@ "family_samples": "NA24143", "duplicate_count": 1 }, - "NA24143.cram", + "/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/crams/NA24143.cram", "NA24143.cram.crai" ] + ], + [ + ], [ [ @@ -60,7 +76,7 @@ "family_samples": "NA24143", "duplicate_count": 1 }, - "NA24143.filter.bed:md5,85a5568a6976ed455caa712991b30ac2" + "NA24143.intersect.bed:md5,b87069698afefb15282d069e56110046" ] ], [ @@ -68,12 +84,12 @@ ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-23T14:56:11.141634788" + "timestamp": "2024-11-14T16:07:24.688580575" }, - "default - WES": { + "cram_prepare_samtools_bedtools - default - WGS": { "content": [ [ [ @@ -84,9 +100,12 @@ "family_samples": "NA24143", "duplicate_count": 1 }, - 
"NA24143.cram", + "/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/crams/NA24143.cram", "NA24143.cram.crai" ] + ], + [ + ], [ [ @@ -97,7 +116,7 @@ "family_samples": "NA24143", "duplicate_count": 1 }, - "NA24143_intersect.bed:md5,b87069698afefb15282d069e56110046" + "NA24143.filter.bed:md5,85a5568a6976ed455caa712991b30ac2" ] ], [ @@ -105,12 +124,12 @@ ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-06T14:33:15.988619426" + "timestamp": "2024-11-14T16:06:42.311563291" }, - "default - WES common ROI": { + "cram_prepare_samtools_bedtools - default - merge": { "content": [ [ [ @@ -119,9 +138,9 @@ "sample": "NA24143", "family": "Ashkenazim", "family_samples": "NA24143", - "duplicate_count": 1 + "duplicate_count": 2 }, - "NA24143.cram", + "NA24143.cram,readsMD5:be28f434d6f7bcfa398488a6611d89c1", "NA24143.cram.crai" ] ], @@ -132,9 +151,22 @@ "sample": "NA24143", "family": "Ashkenazim", "family_samples": "NA24143", - "duplicate_count": 1 + "duplicate_count": 2 + }, + "NA24143.bam,readsMD5:be28f434d6f7bcfa398488a6611d89c1", + "NA24143.bam.bai" + ] + ], + [ + [ + { + "id": "NA24143", + "sample": "NA24143", + "family": "Ashkenazim", + "family_samples": "NA24143", + "duplicate_count": 2 }, - "NA24143_intersect.bed:md5,b87069698afefb15282d069e56110046" + "NA24143.intersect.bed:md5,b87069698afefb15282d069e56110046" ] ], [ @@ -142,9 +174,9 @@ ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-06T14:32:49.756585296" + "timestamp": "2024-11-14T16:08:20.712943795" } } \ No newline at end of file diff --git a/subworkflows/local/gvcf_joint_genotype_gatk4/main.nf b/subworkflows/local/gvcf_joint_genotype_gatk4/main.nf index c9114b75..c9b4b7e2 100644 --- a/subworkflows/local/gvcf_joint_genotype_gatk4/main.nf +++ b/subworkflows/local/gvcf_joint_genotype_gatk4/main.nf @@ -16,7 +16,7 @@ include { VCF_CONCAT_BCFTOOLS } from '../vcf_concat_bcftools/main' workflow GVCF_JOINT_GENOTYPE_GATK4 { take: - ch_gvcfs // channel: [mandatory] [ val(meta), path(gvcf), path(tbi) ] => The GVCFs called with HaplotypeCaller + ch_gvcfs // channel: [mandatory] [ val(meta), path(gvcf), path(tbi) ] => The GVCFs ch_fasta // channel: [mandatory] [ path(fasta) ] => fasta reference ch_fai // channel: [mandatory] [ path(fai) ] => fasta reference index ch_dict // channel: [mandatory] [ path(dict) ] => sequence dictionary @@ -27,8 +27,8 @@ workflow GVCF_JOINT_GENOTYPE_GATK4 { main: - ch_versions = Channel.empty() - ch_vcfs = Channel.empty() + def ch_versions = Channel.empty() + def ch_vcfs = Channel.empty() // // Get a BED file containing all contigs @@ -44,18 +44,17 @@ workflow GVCF_JOINT_GENOTYPE_GATK4 { // Create GenomicDBs for each family for each BED file // - ch_gvcfs + def ch_genomicsdbimport_input = ch_gvcfs .map { meta, gvcf, tbi -> // Create the family meta def new_meta = meta.subMap(["family", "family_samples", "caller"]) + [id:meta.family] [ groupKey(new_meta, meta.family_samples.tokenize(",").size()), gvcf, tbi ] } .groupTuple() - .combine(GAWK.out.output.map { meta, bed -> bed }) + .combine(GAWK.out.output.map { _meta, bed -> bed }) .map { meta, gvcfs, tbis, bed -> [ meta, gvcfs, tbis, bed, [], [] ] } - .set { ch_genomicsdbimport_input } GATK4_GENOMICSDBIMPORT( ch_genomicsdbimport_input, @@ -65,6 +64,7 @@ workflow GVCF_JOINT_GENOTYPE_GATK4 { ) ch_versions = ch_versions.mix(GATK4_GENOMICSDBIMPORT.out.versions.first()) + def ch_beds = 
Channel.empty() if(!only_merge) { BCFTOOLS_QUERY( @@ -75,28 +75,27 @@ workflow GVCF_JOINT_GENOTYPE_GATK4 { ) ch_versions = ch_versions.mix(BCFTOOLS_QUERY.out.versions.first()) - BCFTOOLS_QUERY.out.output + def ch_merge_beds_input = BCFTOOLS_QUERY.out.output .map { meta, bed -> // Create the family meta def new_meta = meta.subMap(["family", "family_samples", "caller"]) + [id:meta.family] [ groupKey(new_meta, meta.family_samples.tokenize(",").size()), bed ] } .groupTuple() - .dump(tag:'merge_beds_input', pretty: true) - .set { ch_merge_beds_input } MERGE_BEDS( ch_merge_beds_input, ch_fai ) ch_versions = ch_versions.mix(MERGE_BEDS.out.versions.first()) + ch_beds = MERGE_BEDS.out.bed // // Split BED file into multiple BEDs specified by --scatter_count // INPUT_SPLIT_BEDTOOLS( - MERGE_BEDS.out.bed.map { meta, bed -> + ch_beds.map { meta, bed -> // Multiply the scatter count by the family size to better scatter big families [meta, bed, (scatter_count * meta.family_samples.tokenize(",").size())] }, @@ -104,11 +103,10 @@ workflow GVCF_JOINT_GENOTYPE_GATK4 { ) ch_versions = ch_versions.mix(INPUT_SPLIT_BEDTOOLS.out.versions) - INPUT_SPLIT_BEDTOOLS.out.split - .map { meta, genomicsdb, extra, bed -> + def ch_genotypegvcfs_input = INPUT_SPLIT_BEDTOOLS.out.split + .map { meta, genomicsdb, _extra, bed -> [ meta, genomicsdb, [], bed, [] ] } - .set { ch_genotypegvcfs_input } // // Genotype the genomicsDBs @@ -124,9 +122,8 @@ workflow GVCF_JOINT_GENOTYPE_GATK4 { ) ch_versions = ch_versions.mix(GATK4_GENOTYPEGVCFS.out.versions.first()) - GATK4_GENOTYPEGVCFS.out.vcf + def ch_gather_inputs = GATK4_GENOTYPEGVCFS.out.vcf .join(GATK4_GENOTYPEGVCFS.out.tbi, failOnDuplicate: true, failOnMismatch: true) - .set { ch_gather_inputs } // // Combine the genotyped VCFs from each family back together @@ -138,13 +135,14 @@ workflow GVCF_JOINT_GENOTYPE_GATK4 { ) ch_versions = ch_versions.mix(VCF_CONCAT_BCFTOOLS.out.versions) - VCF_CONCAT_BCFTOOLS.out.vcfs - .set { ch_vcfs } + ch_vcfs = VCF_CONCAT_BCFTOOLS.out.vcfs } emit: - vcfs = ch_vcfs // [ val(meta), path(vcf) ] - versions = ch_versions // [ path(versions) ] + vcfs = ch_vcfs // [ val(meta), path(vcf), path(tbi) ] + genomicsdb = GATK4_GENOMICSDBIMPORT.out.genomicsdb // [ val(meta), path(genomicsdb) ] + beds = ch_beds // [ val(meta), path(bed) ] + versions = ch_versions // [ path(versions) ] } diff --git a/subworkflows/local/gvcf_joint_genotype_gatk4/tests/main.nf.test b/subworkflows/local/gvcf_joint_genotype_gatk4/tests/main.nf.test new file mode 100644 index 00000000..4e68db59 --- /dev/null +++ b/subworkflows/local/gvcf_joint_genotype_gatk4/tests/main.nf.test @@ -0,0 +1,190 @@ +nextflow_workflow { + + name "Test Workflow GVCF_JOINT_GENOTYPE_GATK4" + script "../main.nf" + workflow "GVCF_JOINT_GENOTYPE_GATK4" + + tag "subworkflows" + tag "subworkflows_local" + tag "gvcf_joint_genotype_gatk4" + tag "vcf_concat_bcftools" + tag "input_split_bedtools" + + test("gvcf_joint_genotype_gatk4 - single_sample") { + + when { + workflow { + """ + input[0] = Channel.of([ + [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", caller:"haplotypecaller"], + file(params.gvcf1, checkIfExists:true), + file(params.gtbi1, checkIfExists:true) + ]) + input[1] = Channel.value([ + [id:"fasta"], + file(params.fasta, checkIfExists:true) + ]) + input[2] = Channel.value([ + [id:"fai"], + file(params.fai, checkIfExists:true) + ]) + input[3] = Channel.value([ + [id:"dict"], + file(params.dict, checkIfExists:true) + ]) + input[4] = [[],[]] + input[5] = [[],[]] + input[6] = false 
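+ // input[6] is the only_merge flag and input[7] the scatter_count used to split the joint-genotyping regions (multiplied by the family size in the subworkflow)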
+ input[7] = 2 + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot( + workflow.out.vcfs.collect { [it[0], "variantsMD5:${path(it[1]).vcf.variantsMD5}", it[2][-10..-1]] } + ).match() } + ) + } + + } + + test("gvcf_joint_genotype_gatk4 - family") { + + when { + workflow { + """ + input[0] = Channel.of([ + [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143,NA24149", caller:"haplotypecaller"], + file(params.gvcf1, checkIfExists:true), + file(params.gtbi1, checkIfExists:true) + ],[ + [id:"NA24149", sample:"NA24149", family:"Ashkenazim", family_samples:"NA24143,NA24149", caller:"haplotypecaller"], + file(params.gvcf2, checkIfExists:true), + file(params.gtbi2, checkIfExists:true) + ] + ) + input[1] = Channel.value([ + [id:"fasta"], + file(params.fasta, checkIfExists:true) + ]) + input[2] = Channel.value([ + [id:"fai"], + file(params.fai, checkIfExists:true) + ]) + input[3] = Channel.value([ + [id:"dict"], + file(params.dict, checkIfExists:true) + ]) + input[4] = [[],[]] + input[5] = [[],[]] + input[6] = false + input[7] = 2 + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot( + workflow.out.vcfs.collect { [it[0], "variantsMD5:${path(it[1]).vcf.variantsMD5}", it[2][-10..-1]] } + ).match() } + ) + } + + } + + test("gvcf_joint_genotype_gatk4 - only_merge") { + + when { + workflow { + """ + input[0] = Channel.of([ + [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", caller:"haplotypecaller"], + file(params.gvcf1, checkIfExists:true), + file(params.gtbi1, checkIfExists:true) + ]) + input[1] = Channel.value([ + [id:"fasta"], + file(params.fasta, checkIfExists:true) + ]) + input[2] = Channel.value([ + [id:"fai"], + file(params.fai, checkIfExists:true) + ]) + input[3] = Channel.value([ + [id:"dict"], + file(params.dict, checkIfExists:true) + ]) + input[4] = [[],[]] + input[5] = [[],[]] + input[6] = true + input[7] = 2 + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot( + workflow.out.vcfs + ).match() } + ) + } + + } + + test("gvcf_joint_genotype_gatk4 - single_sample + family") { + + when { + workflow { + """ + input[0] = Channel.of([ + [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143,NA24149", caller:"haplotypecaller"], + file(params.gvcf1, checkIfExists:true), + file(params.gtbi1, checkIfExists:true) + ],[ + [id:"NA24149", sample:"NA24149", family:"Ashkenazim", family_samples:"NA24143,NA24149", caller:"haplotypecaller"], + file(params.gvcf2, checkIfExists:true), + file(params.gtbi2, checkIfExists:true) + ],[ + [id:"NA24385", sample:"NA24385", family:"NA24385", family_samples:"NA24385", caller:"haplotypecaller"], + file(params.gvcf3, checkIfExists:true), + file(params.gtbi3, checkIfExists:true) + ] + ) + input[1] = Channel.value([ + [id:"fasta"], + file(params.fasta, checkIfExists:true) + ]) + input[2] = Channel.value([ + [id:"fai"], + file(params.fai, checkIfExists:true) + ]) + input[3] = Channel.value([ + [id:"dict"], + file(params.dict, checkIfExists:true) + ]) + input[4] = [[],[]] + input[5] = [[],[]] + input[6] = false + input[7] = 2 + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot( + workflow.out.vcfs.collect { [it[0], "variantsMD5:${path(it[1]).vcf.variantsMD5}", it[2][-10..-1]] } + ).match() } + ) + } + + } +} diff --git a/subworkflows/local/gvcf_joint_genotype_gatk4/tests/main.nf.test.snap b/subworkflows/local/gvcf_joint_genotype_gatk4/tests/main.nf.test.snap 
new file mode 100644 index 00000000..c35e117a --- /dev/null +++ b/subworkflows/local/gvcf_joint_genotype_gatk4/tests/main.nf.test.snap @@ -0,0 +1,87 @@ +{ + "gvcf_joint_genotype_gatk4 - single_sample + family": { + "content": [ + [ + [ + { + "family": "Ashkenazim", + "family_samples": "NA24143,NA24149", + "caller": "haplotypecaller", + "id": "Ashkenazim" + }, + "variantsMD5:4dea305eb71decb122709e75af9c833f", + "vcf.gz.tbi" + ], + [ + { + "family": "NA24385", + "family_samples": "NA24385", + "caller": "haplotypecaller", + "id": "NA24385" + }, + "variantsMD5:4ffd515511f59e3561e3fb1b046d7675", + "vcf.gz.tbi" + ] + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-06T16:40:54.696361238" + }, + "gvcf_joint_genotype_gatk4 - single_sample": { + "content": [ + [ + [ + { + "family": "Ashkenazim", + "family_samples": "NA24143", + "caller": "haplotypecaller", + "id": "Ashkenazim" + }, + "variantsMD5:4c6db9171912bcbbaefeec2a24968a", + "vcf.gz.tbi" + ] + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-06T16:38:28.514998644" + }, + "gvcf_joint_genotype_gatk4 - only_merge": { + "content": [ + [ + + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-05T11:43:25.386070284" + }, + "gvcf_joint_genotype_gatk4 - family": { + "content": [ + [ + [ + { + "family": "Ashkenazim", + "family_samples": "NA24143,NA24149", + "caller": "haplotypecaller", + "id": "Ashkenazim" + }, + "variantsMD5:4dea305eb71decb122709e75af9c833f", + "vcf.gz.tbi" + ] + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-06T16:39:15.421025343" + } +} \ No newline at end of file diff --git a/subworkflows/local/input_split_bedtools/main.nf b/subworkflows/local/input_split_bedtools/main.nf index b70140b5..4aab4987 100644 --- a/subworkflows/local/input_split_bedtools/main.nf +++ b/subworkflows/local/input_split_bedtools/main.nf @@ -11,28 +11,30 @@ workflow INPUT_SPLIT_BEDTOOLS { main: - ch_versions = Channel.empty() + def ch_versions = Channel.empty() BEDTOOLS_SPLIT( ch_beds ) ch_versions = ch_versions.mix(BEDTOOLS_SPLIT.out.versions.first()) - ch_inputs + def ch_split_output = ch_inputs .join(BEDTOOLS_SPLIT.out.beds, failOnDuplicate: true, failOnMismatch: true) - .map { meta, input, input_index, beds -> + .map { row -> + def meta = row[0] + def beds = row[-1] // Determine the amount of BED files per sample - bed_is_list = beds instanceof ArrayList + def bed_is_list = beds instanceof ArrayList def new_meta = meta + [split_count: bed_is_list ? beds.size() : 1] - [ new_meta, input, input_index, bed_is_list ? beds : [beds] ] + def bed_output = bed_is_list ? 
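// wrap so that element 1 is always a list of BED files; transpose(by:1) below then emits one channel entry per BED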
[beds] : [[beds]] + return [new_meta] + bed_output + row[1..-2] } - .transpose(by:3) // Create one channel entry for each BED file per sample - .map { meta, input, input_index, bed -> + .transpose(by:1) // Create one channel entry for each BED file per sample + .map { row -> // Set the base name of the BED file as the ID (this will look like sample_id.xxxx, where xxxx are numbers) - def new_meta = meta + [id:bed.baseName] - [ new_meta, input, input_index, bed ] + def new_meta = row[0] + [id:row[1].baseName] + return [ new_meta ] + row[2..-1] + [ row[1] ] } - .set { ch_split_output } emit: split = ch_split_output // channel: [ val(meta), path(input), path(input_index), path(bed) ] diff --git a/tests/subworkflows/local/input_split_bedtools/main.nf.test b/subworkflows/local/input_split_bedtools/tests/main.nf.test similarity index 95% rename from tests/subworkflows/local/input_split_bedtools/main.nf.test rename to subworkflows/local/input_split_bedtools/tests/main.nf.test index 87cc55a6..e03c02d6 100644 --- a/tests/subworkflows/local/input_split_bedtools/main.nf.test +++ b/subworkflows/local/input_split_bedtools/tests/main.nf.test @@ -1,7 +1,7 @@ nextflow_workflow { name "Test Workflow INPUT_SPLIT_BEDTOOLS" - script "subworkflows/local/input_split_bedtools/main.nf" + script "../main.nf" workflow "INPUT_SPLIT_BEDTOOLS" tag "subworkflows" diff --git a/tests/subworkflows/local/input_split_bedtools/main.nf.test.snap b/subworkflows/local/input_split_bedtools/tests/main.nf.test.snap similarity index 100% rename from tests/subworkflows/local/input_split_bedtools/main.nf.test.snap rename to subworkflows/local/input_split_bedtools/tests/main.nf.test.snap diff --git a/subworkflows/local/utils_cmgg_germline_pipeline/main.nf b/subworkflows/local/utils_cmgg_germline_pipeline/main.nf index 6124181e..c7dca893 100644 --- a/subworkflows/local/utils_cmgg_germline_pipeline/main.nf +++ b/subworkflows/local/utils_cmgg_germline_pipeline/main.nf @@ -18,18 +18,16 @@ include { UTILS_NFCORE_PIPELINE } from '../../nf-core/utils_nfcore_pipeline' include { WATCHPATH_HANDLING } from '../watchpath_handling' /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SUBWORKFLOW TO INITIALISE PIPELINE -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ workflow PIPELINE_INITIALISATION { take: version // boolean: Display version and exit - help // boolean: Display help text validate_params // boolean: Boolean whether to validate parameters against the schema at runtime - monochrome_logs // boolean: Do not use coloured log outputs nextflow_cli_args // array: List of positional nextflow CLI args outdir // string: The output directory where the results will be saved input // string: Path to input samplesheet @@ -40,7 +38,7 @@ workflow PIPELINE_INITIALISATION { main: - ch_versions = Channel.empty() + def ch_versions = Channel.empty() // // Print version and exit if required and dump pipeline parameters to JSON file @@ -52,6 +50,7 @@ workflow PIPELINE_INITIALISATION { workflow.profile.tokenize(',').intersect(['conda', 'mamba']).size() >= 1 ) + // // Validate parameters and generate parameter summary to stdout // @@ -61,6 +60,7 @@ workflow PIPELINE_INITIALISATION { null ) + // // Check config provided to the pipeline // @@ -84,7 +84,7 @@ workflow PIPELINE_INITIALISATION { ) // 
Output the samplesheet - file(input).copyTo("${outdir}/samplesheet.csv") + file(input).copyTo("${outdir}/${params.unique_out}/samplesheet.${file(input).extension}") emit: samplesheet = WATCHPATH_HANDLING.out.samplesheet @@ -92,9 +92,9 @@ workflow PIPELINE_INITIALISATION { } /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SUBWORKFLOW FOR PIPELINE COMPLETION -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ workflow PIPELINE_COMPLETION { @@ -103,25 +103,32 @@ workflow PIPELINE_COMPLETION { email // string: email address email_on_fail // string: email address sent on pipeline failure plaintext_email // boolean: Send plain-text email instead of HTML + outdir // path: Path to output directory where results will be published monochrome_logs // boolean: Disable ANSI colour codes in log output hook_url // string: hook URL for notifications multiqc_report // string: Path to MultiQC report main: - - summary_params = paramsSummaryMap(workflow, parameters_schema: "nextflow_schema.json") + def summary_params = paramsSummaryMap(workflow, parameters_schema: "nextflow_schema.json") // // Completion email and summary // workflow.onComplete { if (email || email_on_fail) { - completionEmail(summary_params, email, email_on_fail, plaintext_email, outdir, monochrome_logs, multiqc_report.toList()) + completionEmail( + summary_params, + email, + email_on_fail, + plaintext_email, + outdir, + monochrome_logs, + multiqc_report.toList() + ) } completionSummary(monochrome_logs) - if (hook_url) { imNotification(summary_params, hook_url) } @@ -133,9 +140,9 @@ workflow PIPELINE_COMPLETION { } /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ FUNCTIONS -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ // // Check and validate pipeline parameters @@ -169,7 +176,6 @@ def genomeExistsError(genomesMap, genome) { error(error_string) } } - // // Generate methods description for MultiQC // @@ -185,8 +191,7 @@ def toolCitationText() { def toolBibliographyText() { def reference_text = [ - "
<li>Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics , 32(19), 3047–3048. doi: /10.1093/bioinformatics/btw354</li>",
-        "..."
+        "<li>Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics , 32(19), 3047–3048. doi: /10.1093/bioinformatics/btw354</li>
  • " ].join(' ').trim() return reference_text @@ -205,8 +210,8 @@ def methodsDescriptionText(mqc_methods_yaml) { // Removing ` ` since the manifest.doi is a string and not a proper list def temp_doi_ref = "" def manifest_doi = meta.manifest_map.doi.tokenize(",") - manifest_doi.each { doi_ref -> - temp_doi_ref += "(doi: ${doi_ref.replace("https://doi.org/", "").replace(" ", "")}), " + temp_doi_ref = manifest_doi.collect { doi_ref -> + return "(doi: ${doi_ref.replace("https://doi.org/", "").replace(" ", "")}), " } meta["doi_text"] = temp_doi_ref.substring(0, temp_doi_ref.length() - 2) } else meta["doi_text"] = "" @@ -227,3 +232,4 @@ def methodsDescriptionText(mqc_methods_yaml) { return description_html.toString() } + diff --git a/subworkflows/nf-core/vcf_annotate_ensemblvep_snpeff/main.nf b/subworkflows/local/vcf_annotate_ensemblvep/main.nf similarity index 65% rename from subworkflows/nf-core/vcf_annotate_ensemblvep_snpeff/main.nf rename to subworkflows/local/vcf_annotate_ensemblvep/main.nf index 91cace4d..abb04d99 100644 --- a/subworkflows/nf-core/vcf_annotate_ensemblvep_snpeff/main.nf +++ b/subworkflows/local/vcf_annotate_ensemblvep/main.nf @@ -1,15 +1,15 @@ // -// Run VEP and/or SNPEFF to annotate VCF files +// Run VEP to annotate VCF files // include { ENSEMBLVEP_VEP } from '../../../modules/nf-core/ensemblvep/vep/main' -include { SNPEFF_SNPEFF } from '../../../modules/nf-core/snpeff/snpeff/main' include { TABIX_TABIX } from '../../../modules/nf-core/tabix/tabix/main' +include { TABIX_BGZIP } from '../../../modules/nf-core/tabix/bgzip/main' include { BCFTOOLS_PLUGINSCATTER } from '../../../modules/nf-core/bcftools/pluginscatter/main' include { BCFTOOLS_CONCAT } from '../../../modules/nf-core/bcftools/concat/main' include { BCFTOOLS_SORT } from '../../../modules/nf-core/bcftools/sort/main' -workflow VCF_ANNOTATE_ENSEMBLVEP_SNPEFF { +workflow VCF_ANNOTATE_ENSEMBLVEP { take: ch_vcf // channel: [ val(meta), path(vcf), path(tbi), [path(file1), path(file2)...] ] ch_fasta // channel: [ val(meta2), path(fasta) ] (optional) @@ -18,13 +18,12 @@ workflow VCF_ANNOTATE_ENSEMBLVEP_SNPEFF { val_vep_cache_version // value: cache version to use ch_vep_cache // channel: [ path(cache) ] (optional) ch_vep_extra_files // channel: [ path(file1), path(file2)... ] (optional) - val_snpeff_db // value: the db version to use for snpEff - ch_snpeff_cache // channel: [ path(cache) ] (optional) - val_tools_to_use // value: a list of tools to use options are: ["ensemblvep", "snpeff"] val_sites_per_chunk // value: the amount of variants per scattered VCF main: - ch_versions = Channel.empty() + def ch_versions = Channel.empty() + def ch_vep_input = Channel.empty() + def ch_scatter = Channel.empty() // Check if val_sites_per_chunk is set and scatter if it is if(val_sites_per_chunk) { @@ -32,7 +31,7 @@ workflow VCF_ANNOTATE_ENSEMBLVEP_SNPEFF { // Prepare the input VCF channel for scattering (split VCFs from custom files) // - ch_input = ch_vcf + def ch_input = ch_vcf .multiMap { meta, vcf, tbi, custom_files -> vcf: [ meta, vcf, tbi ] custom: [ meta, custom_files ] @@ -63,7 +62,7 @@ workflow VCF_ANNOTATE_ENSEMBLVEP_SNPEFF { // If multiple files are created, a list will be made as output of the process // So if the output isn't a list, there is always one file and if there is a list, // the amount of files in the list gets counted by .size() - is_list = vcfs instanceof ArrayList + def is_list = vcfs instanceof ArrayList count = is_list ? vcfs.size() : 1 [ meta, is_list ? 
vcfs : [vcfs], count ] // Channel containing the list of VCFs and the size of this list @@ -72,74 +71,51 @@ workflow VCF_ANNOTATE_ENSEMBLVEP_SNPEFF { .combine(ch_input.custom, by: 0) // Re-add the sample specific custom files .multiMap { meta, vcf, count, custom_files -> // Define the new ID. The `_annotated` is to disambiguate the VEP output with its input - new_id = "${meta.id}${vcf.name.replace(meta.id,"").tokenize(".")[0]}_annotated" as String - new_meta = meta + [id:new_id] + def new_id = "${meta.id}${vcf.name.replace(meta.id,"").tokenize(".")[0]}_annotated" as String // Create channels: one with the VEP input and one with the original ID and count of scattered VCFs - input: [ new_meta, vcf, custom_files ] - count: [ new_meta, meta.id, count ] + input: [ meta + [id:new_id], vcf, custom_files ] + count: [ meta + [id:new_id], meta.id, count ] } ch_vep_input = ch_scatter.input } else { // Use the normal input when no scattering has to be performed - ch_vep_input = ch_vcf.map { meta, vcf, tbi, files -> [ meta, vcf, files ] } + ch_vep_input = ch_vcf.map { meta, vcf, _tbi, files -> [ meta, vcf, files ] } } // Annotate with ensemblvep if it's part of the requested tools - if("ensemblvep" in val_tools_to_use){ - ENSEMBLVEP_VEP( - ch_vep_input, - val_vep_genome, - val_vep_species, - val_vep_cache_version, - ch_vep_cache, - ch_fasta, - ch_vep_extra_files - ) - ch_versions = ch_versions.mix(ENSEMBLVEP_VEP.out.versions.first()) - - ch_vep_output = ENSEMBLVEP_VEP.out.vcf - ch_vep_reports = ENSEMBLVEP_VEP.out.report - } else { - ch_vep_output = ch_vep_input.map { meta, vcf, files -> [ meta, vcf ] } - ch_vep_reports = Channel.empty() - } - - // Annotate with snpeff if it's part of the requested tools - if("snpeff" in val_tools_to_use){ - SNPEFF_SNPEFF( - ch_vep_output, - val_snpeff_db, - ch_snpeff_cache - ) - ch_versions = ch_versions.mix(SNPEFF_SNPEFF.out.versions.first()) + ENSEMBLVEP_VEP( + ch_vep_input, + val_vep_genome, + val_vep_species, + val_vep_cache_version, + ch_vep_cache, + ch_fasta, + ch_vep_extra_files + ) + ch_versions = ch_versions.mix(ENSEMBLVEP_VEP.out.versions.first()) - ch_snpeff_output = SNPEFF_SNPEFF.out.vcf - ch_snpeff_reports = SNPEFF_SNPEFF.out.report - ch_snpeff_html = SNPEFF_SNPEFF.out.summary_html - ch_snpeff_genes = SNPEFF_SNPEFF.out.genes_txt - } else { - ch_snpeff_output = ch_vep_output - ch_snpeff_reports = Channel.empty() - ch_snpeff_html = Channel.empty() - ch_snpeff_genes = Channel.empty() - } + def ch_vep_output = ENSEMBLVEP_VEP.out.vcf + def ch_vep_reports = ENSEMBLVEP_VEP.out.report // Gather the files back together if they were scattered + def ch_ready_vcfs = Channel.empty() if(val_sites_per_chunk) { // // Concatenate the VCFs back together with bcftools concat // - ch_concat_input = ch_snpeff_output + def ch_concat_input = ch_vep_output .join(ch_scatter.count, failOnDuplicate:true, failOnMismatch:true) .map { meta, vcf, id, count -> - new_meta = meta + [id:id] + def new_meta = meta + [id:id] [ groupKey(new_meta, count), vcf ] } .groupTuple() // Group the VCFs which need to be concatenated - .map { it + [[]] } + .map { meta, vcf -> + [ meta, vcf, [] ] + } BCFTOOLS_CONCAT( ch_concat_input @@ -157,14 +133,14 @@ workflow VCF_ANNOTATE_ENSEMBLVEP_SNPEFF { ch_ready_vcfs = BCFTOOLS_SORT.out.vcf } else { - ch_ready_vcfs = ch_snpeff_output + ch_ready_vcfs = ch_vep_output } // // Index the resulting bgzipped VCFs // - ch_tabix_input = ch_ready_vcfs + def ch_tabix_input = ch_ready_vcfs .branch { meta, vcf -> // Split the bgzipped VCFs from the unzipped VCFs (only bgzipped 
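The gather half of the scatter above only works because every scattered chunk carries its expected chunk count inside a `groupKey`; `groupTuple` can then release each group as soon as that many items have arrived, instead of waiting for the whole channel to close. A minimal sketch of the pattern with invented IDs and counts:

```groovy
// Sketch: groupKey(meta, n) tells groupTuple how many items to expect per key.
workflow {
    Channel.of(
        [ [id:'s1'], 's1.part0.vcf.gz', 2 ],
        [ [id:'s2'], 's2.part0.vcf.gz', 1 ],
        [ [id:'s1'], 's1.part1.vcf.gz', 2 ]
    )
    .map { meta, vcf, count -> [ groupKey(meta, count), vcf ] } // attach the expected group size
    .groupTuple() // emits [[id:'s1'], [part0, part1]] as soon as both parts arrived
    .view()
}
```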
VCFs should be indexed) bgzip: vcf.extension == "gz" @@ -177,15 +153,12 @@ workflow VCF_ANNOTATE_ENSEMBLVEP_SNPEFF { ) ch_versions = ch_versions.mix(TABIX_TABIX.out.versions) - ch_vcf_tbi = ch_tabix_input.bgzip + def ch_vcf_tbi = ch_tabix_input.bgzip .join(TABIX_TABIX.out.tbi, failOnDuplicate: true, failOnMismatch: true) .mix(ch_tabix_input.unzip) emit: vcf_tbi = ch_vcf_tbi // channel: [ val(meta), path(vcf), path(tbi) ] vep_reports = ch_vep_reports // channel: [ path(html) ] - snpeff_reports = ch_snpeff_reports // channel: [ path(csv) ] - snpeff_html = ch_snpeff_html // channel: [ path(html) ] - snpeff_genes = ch_snpeff_genes // channel: [ path(genes) ] versions = ch_versions // channel: [ versions.yml ] } diff --git a/subworkflows/nf-core/vcf_annotate_ensemblvep_snpeff/meta.yml b/subworkflows/local/vcf_annotate_ensemblvep/meta.yml similarity index 77% rename from subworkflows/nf-core/vcf_annotate_ensemblvep_snpeff/meta.yml rename to subworkflows/local/vcf_annotate_ensemblvep/meta.yml index 6fe096c0..b0541d66 100644 --- a/subworkflows/nf-core/vcf_annotate_ensemblvep_snpeff/meta.yml +++ b/subworkflows/local/vcf_annotate_ensemblvep/meta.yml @@ -8,9 +8,12 @@ keywords: - ensemblvep - snpeff components: + - ensemblvep/download - ensemblvep/vep + - snpeff/download - snpeff/snpeff - tabix/tabix + - tabix/bgzip - bcftools/pluginscatter - bcftools/concat - bcftools/sort @@ -42,7 +45,9 @@ input: Structure: [ path(file1), path(file2)... ] - val_snpeff_db: type: string - description: database to use for snpeff + description: | + database to use for snpeff, usually consists of the genome and the database version + e.g. WBcel235.105 - ch_snpeff_cache: description: | the root cache folder for snpeff (optional) @@ -60,10 +65,23 @@ output: description: | Compressed vcf file + tabix index Structure: [ val(meta), path(vcf), path(tbi) ] - - reports: + - vep_reports: type: file - description: html reports + description: html reports generated by Ensembl VEP pattern: "*.html" + - snpeff_reports: + description: | + csv reports generated by snpeff + Structure: [ val(meta), path(csv) ] + - snpeff_html: + description: | + html reports generated by snpeff + Structure: [ val(meta), path(html) ] + - snpeff_genes: + description: | + txt (tab separated) file having counts of the number of variants + affecting each transcript and gene + Structure: [ val(meta), path(txt) ] - versions: type: file description: File containing software versions diff --git a/subworkflows/local/vcf_annotate_ensemblvep/tests/main.nf.test b/subworkflows/local/vcf_annotate_ensemblvep/tests/main.nf.test new file mode 100644 index 00000000..60b16a6e --- /dev/null +++ b/subworkflows/local/vcf_annotate_ensemblvep/tests/main.nf.test @@ -0,0 +1,178 @@ +nextflow_workflow { + + name "Test Subworkflow VCF_ANNOTATE_ENSEMBLVEP" + script "../main.nf" + workflow "VCF_ANNOTATE_ENSEMBLVEP" + + tag "subworkflows" + tag "subworkflows_nfcore" + tag "subworkflows/vcf_annotate_ensemblvep" + tag "ensemblvep/download" + tag "ensemblvep/vep" + tag "tabix/tabix" + tag "tabix/bgzip" + tag "bcftools/pluginscatter" + tag "bcftools/concat" + tag "bcftools/sort" + + config "./nextflow.config" + + test("sarscov2 - ensemblvep") { + + setup { + run("ENSEMBLVEP_DOWNLOAD") { + script "../../../../modules/nf-core/ensemblvep/download" + process { + """ + input[0] = [ + [id:"reference"], + "WBcel235", + "caenorhabditis_elegans", + "110" + ] + """ + } + } + } + + when { + workflow { + """ + input[0] = Channel.of([ + [ id:'custom_test', single_end:false ], // meta map + 
file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test3.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test3.vcf.gz.tbi', checkIfExists: true) + ] + ]) + input[1] = [[],[]] + input[2] = "WBcel235" + input[3] = "caenorhabditis_elegans" + input[4] = "110" + input[5] = ENSEMBLVEP_DOWNLOAD.out.cache.map { meta, cache -> cache } + input[6] = [] + input[7] = 5 + """ + } + } + + then { + assertAll( + { assert workflow.success}, + { assert snapshot( + workflow.out.vcf_tbi.collect { [it[0], "${file(it[1]).name},variantsMD5:${path(it[1]).vcf.variantsMD5}", file(it[2]).name] }, + workflow.out.vep_reports.collect { it instanceof String ? file(it).name : it }, + workflow.out.versions.collect { it instanceof String ? file(it).name : it } + ).match()} + ) + } + } + + test("sarscov2 - ensemblvep - large chunks") { + + setup { + run("ENSEMBLVEP_DOWNLOAD") { + script "../../../../modules/nf-core/ensemblvep/download" + process { + """ + input[0] = [ + [id:"reference"], + "WBcel235", + "caenorhabditis_elegans", + "110" + ] + """ + } + } + } + + when { + workflow { + """ + input[0] = Channel.of( [ + [ id:'custom_test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test3.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test3.vcf.gz.tbi', checkIfExists: true) + ] + ]) + input[1] = [[],[]] + input[2] = "WBcel235" + input[3] = "caenorhabditis_elegans" + input[4] = "110" + input[5] = ENSEMBLVEP_DOWNLOAD.out.cache.map { meta, cache -> cache } + input[6] = [] + input[7] = 100 + """ + } + } + + then { + assertAll( + { assert workflow.success}, + { assert snapshot( + workflow.out.vcf_tbi.collect { [it[0], "${file(it[1]).name},variantsMD5:${path(it[1]).vcf.variantsMD5}", file(it[2]).name] }, + workflow.out.vep_reports.collect { it instanceof String ? file(it).name : it }, + workflow.out.versions.collect { it instanceof String ? 
file(it).name : it } + ).match()} + ) + } + } + + test("sarscov2 - ensemblvep - no scatter") { + + setup { + run("ENSEMBLVEP_DOWNLOAD") { + script "../../../../modules/nf-core/ensemblvep/download" + process { + """ + input[0] = [ + [id:"reference"], + "WBcel235", + "caenorhabditis_elegans", + "110" + ] + """ + } + } + } + + when { + workflow { + """ + input[0] = Channel.of([ + [ id:'custom_test', single_end:false ], // meta map + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test2.vcf.gz.tbi', checkIfExists: true), + [ + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test3.vcf.gz', checkIfExists: true), + file(params.modules_testdata_base_path + 'genomics/sarscov2/illumina/vcf/test3.vcf.gz.tbi', checkIfExists: true) + ] + ]) + input[1] = [[],[]] + input[2] = "WBcel235" + input[3] = "caenorhabditis_elegans" + input[4] = "110" + input[5] = ENSEMBLVEP_DOWNLOAD.out.cache.map { meta, cache -> cache } + input[6] = [] + input[7] = [] + """ + } + } + + then { + assertAll( + { assert workflow.success}, + { assert snapshot( + workflow.out.vcf_tbi.collect { [it[0], "${file(it[1]).name},variantsMD5:${path(it[1]).vcf.variantsMD5}", file(it[2]).name] }, + workflow.out.vep_reports.collect { it instanceof String ? file(it).name : it }, + workflow.out.versions.collect { it instanceof String ? file(it).name : it } + ).match()} + ) + } + } +} diff --git a/subworkflows/local/vcf_annotate_ensemblvep/tests/main.nf.test.snap b/subworkflows/local/vcf_annotate_ensemblvep/tests/main.nf.test.snap new file mode 100644 index 00000000..0d73abc0 --- /dev/null +++ b/subworkflows/local/vcf_annotate_ensemblvep/tests/main.nf.test.snap @@ -0,0 +1,93 @@ +{ + "sarscov2 - ensemblvep - large chunks": { + "content": [ + [ + [ + { + "groupSize": 1, + "groupTarget": { + "id": "custom_test", + "single_end": false + } + }, + "custom_test.vcf.gz,variantsMD5:44ed24c4dc4223670a78ffea3c7459e", + "custom_test.vcf.gz.tbi" + ] + ], + [ + "custom_test0_annotated.vep.vcf.gz_summary.html" + ], + [ + "versions.yml", + "versions.yml", + "versions.yml", + "versions.yml", + "versions.yml" + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-20T14:31:50.569767686" + }, + "sarscov2 - ensemblvep - no scatter": { + "content": [ + [ + [ + { + "id": "custom_test", + "single_end": false + }, + "custom_test.vep.vcf.gz,variantsMD5:44ed24c4dc4223670a78ffea3c7459e", + "custom_test.vep.vcf.gz.tbi" + ] + ], + [ + "custom_test.vep.vcf.gz_summary.html" + ], + [ + "versions.yml", + "versions.yml" + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-20T14:32:06.437006334" + }, + "sarscov2 - ensemblvep": { + "content": [ + [ + [ + { + "groupSize": 2, + "groupTarget": { + "id": "custom_test", + "single_end": false + } + }, + "custom_test.vcf.gz,variantsMD5:44ed24c4dc4223670a78ffea3c7459e", + "custom_test.vcf.gz.tbi" + ] + ], + [ + "custom_test0_annotated.vep.vcf.gz_summary.html", + "custom_test1_annotated.vep.vcf.gz_summary.html" + ], + [ + "versions.yml", + "versions.yml", + "versions.yml", + "versions.yml", + "versions.yml" + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-20T14:31:32.997494596" + } +} \ No newline at end of file diff --git a/subworkflows/local/vcf_annotate_ensemblvep/tests/nextflow.config b/subworkflows/local/vcf_annotate_ensemblvep/tests/nextflow.config 
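These tests snapshot `variantsMD5` through the VCF file type exposed to nf-test (the `path(...).vcf` accessor used throughout this suite) rather than the raw file checksum, because bgzip output is not byte-identical across runs. Inside a test's `then` block, an assertion of that shape looks roughly like:

```groovy
// Sketch: hash only the variant records so the snapshot survives re-compression.
then {
    assertAll(
        { assert workflow.success },
        { assert snapshot(
            workflow.out.vcf_tbi.collect { meta, vcf, tbi ->
                [ meta, "variantsMD5:${path(vcf).vcf.variantsMD5}", file(tbi).name ]
            }
        ).match() }
    )
}
```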
new file mode 100644 index 00000000..634ec18a --- /dev/null +++ b/subworkflows/local/vcf_annotate_ensemblvep/tests/nextflow.config @@ -0,0 +1,15 @@ +process { + withName: BCFTOOLS_CONCAT { + ext.prefix = { "${meta.id}_concat" } + } + withName: ENSEMBLVEP_DOWNLOAD { + ext.args = '--AUTO c --CONVERT --NO_BIOPERL --NO_HTSLIB --NO_TEST --NO_UPDATE' + } + withName: ENSEMBLVEP_VEP { + ext.prefix = { "${meta.id}.vep" } + ext.args = {[ + "--vcf --offline", + meta.id.startsWith("custom_test") ? "--custom test3.vcf.gz,,vcf,exact,0,TOPMED" : "" + ].join(" ")} + } +} diff --git a/subworkflows/local/vcf_annotation/main.nf b/subworkflows/local/vcf_annotation/main.nf index 94698157..df3b1925 100644 --- a/subworkflows/local/vcf_annotation/main.nf +++ b/subworkflows/local/vcf_annotation/main.nf @@ -8,13 +8,12 @@ include { TABIX_BGZIP as BGZIP_ANNOTATED_VCFS } from '../../../modules/nf-core/t include { TABIX_TABIX as TABIX_ENSEMBLVEP } from '../../../modules/nf-core/tabix/tabix/main' include { BCFTOOLS_CONCAT } from '../../../modules/nf-core/bcftools/concat/main' -include { VCF_ANNOTATE_ENSEMBLVEP_SNPEFF as VCF_ANNOTATE_ENSEMBLVEP } from '../../../subworkflows/nf-core/vcf_annotate_ensemblvep_snpeff/main' +include { VCF_ANNOTATE_ENSEMBLVEP } from '../../../subworkflows/local/vcf_annotate_ensemblvep/main' workflow VCF_ANNOTATION { take: ch_vcfs // channel: [mandatory] [ val(meta), path(vcf) ] => The post-processed VCFs ch_fasta // channel: [mandatory] [ val(meta2), path(fasta) ] => fasta reference - ch_fai // channel: [mandatory] [ val(meta3), path(fai) ] => fasta index ch_vep_cache // channel: [optional] [ path(vep_cache) ] => The VEP cache to use ch_vep_extra_files // channel: [optional] [ path(file_1, file_2, file_3, ...) ] => All files necessary for using the desired plugins ch_vcfanno_config // channel: [mandatory if params.vcfanno == true] [ path(toml_config_file) ] => The TOML config file for VCFanno @@ -28,30 +27,28 @@ workflow VCF_ANNOTATION { main: - ch_annotated_vcfs = Channel.empty() - ch_reports = Channel.empty() - ch_versions = Channel.empty() + def ch_annotated_vcfs = Channel.empty() + def ch_reports = Channel.empty() + def ch_versions = Channel.empty() - ch_vcfs + def ch_tabix_input = ch_vcfs .branch { meta, vcf, tbi=[] -> tbi: tbi no_tbi: !tbi return [ meta, vcf ] } - .set { ch_tabix_input } TABIX_ENSEMBLVEP( ch_tabix_input.no_tbi ) ch_versions = ch_versions.mix(TABIX_ENSEMBLVEP.out.versions.first()) - ch_tabix_input.no_tbi + def ch_vep_input = ch_tabix_input.no_tbi .join(TABIX_ENSEMBLVEP.out.tbi, failOnDuplicate:true, failOnMismatch:true) .mix(ch_tabix_input.tbi) .map { meta, vcf, tbi -> [ meta, vcf, tbi, [] ] } - .set { ch_vep_input } // // Do the VEP annotation @@ -65,8 +62,6 @@ workflow VCF_ANNOTATION { vep_cache_version, ch_vep_cache, ch_vep_extra_files, - [], [], - ["ensemblvep"], vep_chunk_size ) @@ -79,12 +74,11 @@ workflow VCF_ANNOTATION { if (vcfanno) { - VCF_ANNOTATE_ENSEMBLVEP.out.vcf_tbi + def ch_vcfanno_input = VCF_ANNOTATE_ENSEMBLVEP.out.vcf_tbi .map { meta, vcf, tbi -> [ meta, vcf, tbi, [] ] } .dump(tag:'vcfanno_input', pretty:true) - .set { ch_vcfanno_input } VCFANNO( ch_vcfanno_input, @@ -99,12 +93,11 @@ workflow VCF_ANNOTATION { ) ch_versions = ch_versions.mix(BGZIP_ANNOTATED_VCFS.out.versions.first()) - BGZIP_ANNOTATED_VCFS.out.output.set { ch_annotated_vcfs } + ch_annotated_vcfs = BGZIP_ANNOTATED_VCFS.out.output } else { - VCF_ANNOTATE_ENSEMBLVEP.out.vcf_tbi - .map { meta, vcf, tbi -> [ meta, vcf ]} - .set { ch_annotated_vcfs } + ch_annotated_vcfs = 
VCF_ANNOTATE_ENSEMBLVEP.out.vcf_tbi + .map { meta, vcf, _tbi -> [ meta, vcf ]} } emit: diff --git a/tests/subworkflows/local/vcf_annotation/main.nf.test b/subworkflows/local/vcf_annotation/tests/main.nf.test similarity index 56% rename from tests/subworkflows/local/vcf_annotation/main.nf.test rename to subworkflows/local/vcf_annotation/tests/main.nf.test index 99cedb8e..48757069 100644 --- a/tests/subworkflows/local/vcf_annotation/main.nf.test +++ b/subworkflows/local/vcf_annotation/tests/main.nf.test @@ -1,7 +1,7 @@ nextflow_workflow { name "Test Workflow VCF_ANNOTATION" - script "subworkflows/local/vcf_annotation/main.nf" + script "../main.nf" workflow "VCF_ANNOTATION" tag "subworkflows" @@ -26,20 +26,16 @@ nextflow_workflow { [id:"fasta"], file(params.fasta, checkIfExists:true) ]) - input[2] = Channel.value([ - [id:"fai"], - file(params.fai, checkIfExists:true) - ]) - input[3] = Channel.value(file("vep_cache")) - input[4] = Channel.value([file("file1.txt"), file("file2.txt")]) + input[2] = Channel.value(file("vep_cache")) + input[3] = Channel.value([file("file1.txt"), file("file2.txt")]) + input[4] = [] input[5] = [] input[6] = [] - input[7] = [] - input[8] = "GRCh38" - input[9] = "homo_sapiens" - input[10] = 105 - input[11] = 50000 - input[12] = false + input[7] = "GRCh38" + input[8] = "homo_sapiens" + input[9] = 105 + input[10] = 50000 + input[11] = false """ } } @@ -48,9 +44,9 @@ nextflow_workflow { assertAll( { assert workflow.success }, { assert snapshot( - workflow.out.annotated_vcfs.collect { it.collect { it instanceof Map ? it.groupTarget : file(it).name } }, + workflow.out.annotated_vcfs.collect { [ it[0].groupTarget, it[1][-7..-1] ]}, workflow.out.reports - ).match("default") } + ).match() } ) } @@ -73,20 +69,16 @@ nextflow_workflow { [id:"fasta"], file(params.fasta, checkIfExists:true) ]) - input[2] = Channel.value([ - [id:"fai"], - file(params.fai, checkIfExists:true) - ]) - input[3] = Channel.value(file("vep_cache")) - input[4] = Channel.value([file("file1.txt"), file("file2.txt")]) - input[5] = Channel.value(file("vcfanno.toml")) - input[6] = [] - input[7] = Channel.value([file("file1.txt"), file("file2.txt")]) - input[8] = "GRCh38" - input[9] = "homo_sapiens" - input[10] = 105 - input[11] = 50000 - input[12] = true + input[2] = Channel.value(file("vep_cache")) + input[3] = Channel.value([file("file1.txt"), file("file2.txt")]) + input[4] = Channel.value(file("vcfanno.toml")) + input[5] = [] + input[6] = Channel.value([file("file1.txt"), file("file2.txt")]) + input[7] = "GRCh38" + input[8] = "homo_sapiens" + input[9] = 105 + input[10] = 50000 + input[11] = true """ } } @@ -95,9 +87,9 @@ nextflow_workflow { assertAll( { assert workflow.success }, { assert snapshot( - workflow.out.annotated_vcfs.collect { it.collect { it instanceof Map ? 
it.groupTarget : file(it).name } }, + workflow.out.annotated_vcfs.collect { [ it[0].groupTarget, it[1][-7..-1] ] }, workflow.out.reports - ).match("vcfanno") } + ).match() } ) } diff --git a/tests/subworkflows/local/vcf_annotation/main.nf.test.snap b/subworkflows/local/vcf_annotation/tests/main.nf.test.snap similarity index 77% rename from tests/subworkflows/local/vcf_annotation/main.nf.test.snap rename to subworkflows/local/vcf_annotation/tests/main.nf.test.snap index d3ebc487..1e808b5d 100644 --- a/tests/subworkflows/local/vcf_annotation/main.nf.test.snap +++ b/subworkflows/local/vcf_annotation/tests/main.nf.test.snap @@ -1,5 +1,5 @@ { - "default": { + "vcf_annotation - default": { "content": [ [ [ @@ -9,7 +9,7 @@ "family_samples": "NA24143", "caller": "haplotypecaller" }, - "NA24143.haplotypecaller.vcf.gz" + ".vcf.gz" ] ], [ @@ -19,12 +19,12 @@ ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T17:39:23.107888975" + "timestamp": "2024-11-14T16:27:10.638204373" }, - "vcfanno": { + "vcf_annotation - vcfanno": { "content": [ [ [ @@ -34,7 +34,7 @@ "family_samples": "NA24143", "caller": "haplotypecaller" }, - "NA24143.haplotypecaller.vcf.gz" + ".vcf.gz" ] ], [ @@ -44,9 +44,9 @@ ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T17:39:39.640344999" + "timestamp": "2024-11-14T16:27:36.089100362" } } \ No newline at end of file diff --git a/subworkflows/local/vcf_concat_bcftools/main.nf b/subworkflows/local/vcf_concat_bcftools/main.nf index c88aebe8..d2f3b186 100644 --- a/subworkflows/local/vcf_concat_bcftools/main.nf +++ b/subworkflows/local/vcf_concat_bcftools/main.nf @@ -12,14 +12,17 @@ workflow VCF_CONCAT_BCFTOOLS { main: - ch_versions = Channel.empty() + def ch_versions = Channel.empty() ch_vcfs - .map { meta, vcf, tbi -> + .map { meta, vcf, tbi=[] -> def new_meta = meta + [id:meta.sample ?: meta.family] [ groupKey(new_meta, meta.split_count), vcf, tbi ] } .groupTuple() + .map { meta, vcfs, tbis -> + [ meta, vcfs, tbis.findAll { tbi -> tbi != [] }] + } .set { ch_concat_input } BCFTOOLS_CONCAT( @@ -27,29 +30,27 @@ workflow VCF_CONCAT_BCFTOOLS { ) ch_versions = ch_versions.mix(BCFTOOLS_CONCAT.out.versions.first()) - ch_vcf_tbi = Channel.empty() + def ch_vcf_tbi = Channel.empty() if(val_tabix) { TABIX_TABIX( BCFTOOLS_CONCAT.out.vcf ) ch_versions = ch_versions.mix(TABIX_TABIX.out.versions.first()) - BCFTOOLS_CONCAT.out.vcf + ch_vcf_tbi = BCFTOOLS_CONCAT.out.vcf .join(TABIX_TABIX.out.tbi, failOnDuplicate: true, failOnMismatch: true) .map { meta, vcf, tbi -> // Remove the bed counter from the meta field def new_meta = meta - meta.subMap("split_count") [ new_meta, vcf, tbi ] } - .set { ch_vcf_tbi } } else { - BCFTOOLS_CONCAT.out.vcf + ch_vcf_tbi = BCFTOOLS_CONCAT.out.vcf .map { meta, vcf -> // Remove the bed counter from the meta field def new_meta = meta - meta.subMap("split_count") [ new_meta, vcf ] } - .set { ch_vcf_tbi } } emit: diff --git a/subworkflows/local/vcf_concat_bcftools/tests/main.nf.test b/subworkflows/local/vcf_concat_bcftools/tests/main.nf.test new file mode 100644 index 00000000..deb66854 --- /dev/null +++ b/subworkflows/local/vcf_concat_bcftools/tests/main.nf.test @@ -0,0 +1,75 @@ +nextflow_workflow { + + name "Test Workflow VCF_CONCAT_BCFTOOLS" + script "../main.nf" + workflow "VCF_CONCAT_BCFTOOLS" + + tag "subworkflows" + tag "subworkflows_local" + tag "vcf_concat_bcftools" + + test("vcf_concat_bcftools - 
no_tabix") { + + config "./nextflow.config" + + when { + workflow { + """ + input[0] = Channel.of([ + [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", caller:"haplotypecaller", split_count:2], + file(params.gvcf1, checkIfExists:true), + file(params.gtbi1, checkIfExists:true) + ],[ + [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", caller:"haplotypecaller", split_count:2], + file(params.vcf1, checkIfExists:true), + file(params.tbi1, checkIfExists:true) + ]) + input[1] = false + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot( + workflow.out.vcfs.collect { [it[0], "variantsMD5:${path(it[1]).vcf.variantsMD5}"] } + ).match() } + ) + } + + } + + test("vcf_concat_bcftools - tabix") { + + config "./nextflow.config" + + when { + workflow { + """ + input[0] = Channel.of([ + [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", caller:"haplotypecaller", split_count:2], + file(params.gvcf1, checkIfExists:true), + file(params.gtbi1, checkIfExists:true) + ],[ + [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", caller:"haplotypecaller", split_count:2], + file(params.vcf1, checkIfExists:true), + file(params.tbi1, checkIfExists:true) + ]) + input[1] = true + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot( + workflow.out.vcfs.collect { [it[0], "variantsMD5:${path(it[1]).vcf.variantsMD5}", it[2][-10..-1]] } + ).match() } + ) + } + + } + +} diff --git a/subworkflows/local/vcf_concat_bcftools/tests/main.nf.test.snap b/subworkflows/local/vcf_concat_bcftools/tests/main.nf.test.snap new file mode 100644 index 00000000..cfbc2f82 --- /dev/null +++ b/subworkflows/local/vcf_concat_bcftools/tests/main.nf.test.snap @@ -0,0 +1,45 @@ +{ + "vcf_concat_bcftools - tabix": { + "content": [ + [ + [ + { + "id": "NA24143", + "sample": "NA24143", + "family": "Ashkenazim", + "family_samples": "NA24143", + "caller": "haplotypecaller" + }, + "variantsMD5:843352db8fe3f441ffa026dc72a30c35", + "vcf.gz.tbi" + ] + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-14T13:41:32.794902067" + }, + "vcf_concat_bcftools - no_tabix": { + "content": [ + [ + [ + { + "id": "NA24143", + "sample": "NA24143", + "family": "Ashkenazim", + "family_samples": "NA24143", + "caller": "haplotypecaller" + }, + "variantsMD5:843352db8fe3f441ffa026dc72a30c35" + ] + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-14T13:42:04.623824246" + } +} \ No newline at end of file diff --git a/subworkflows/local/vcf_concat_bcftools/tests/nextflow.config b/subworkflows/local/vcf_concat_bcftools/tests/nextflow.config new file mode 100644 index 00000000..654ac56e --- /dev/null +++ b/subworkflows/local/vcf_concat_bcftools/tests/nextflow.config @@ -0,0 +1,6 @@ +process { + withName: "BCFTOOLS_CONCAT" { + ext.args = "--allow-overlaps --output-type z" + ext.prefix = { "${meta.id}.concat" } + } +} diff --git a/subworkflows/local/vcf_dbsnp_vcfanno/main.nf b/subworkflows/local/vcf_dbsnp_vcfanno/main.nf new file mode 100644 index 00000000..718b3067 --- /dev/null +++ b/subworkflows/local/vcf_dbsnp_vcfanno/main.nf @@ -0,0 +1,46 @@ +include { VCFANNO } from '../../../modules/nf-core/vcfanno/main' +include { TABIX_BGZIPTABIX } from '../../../modules/nf-core/tabix/bgziptabix/main' + +workflow VCF_DBSNP_VCFANNO { + take: + ch_input // channel: [mandatory] [ val(meta), path(vcf), path(tbi), ] => VCF files to 
be annotated
+        ch_dbsnp     // channel: [optional] [ val(meta), path(vcf) ] => the dbsnp vcf file
+        ch_dbsnp_tbi // channel: [optional] [ val(meta), path(tbi) ] => the dbsnp vcf index file
+
+    main:
+        def ch_versions = Channel.empty()
+
+        def ch_vcfanno_toml = ch_dbsnp.map { _meta, dbsnp -> [ get_vcfanno_config(dbsnp) ] }
+            .collect()
+
+        def ch_vcfanno_resources = ch_dbsnp.map { _meta, dbsnp -> dbsnp }
+            .combine(ch_dbsnp_tbi.map { _meta, tbi -> tbi })
+            .collect()
+
+        VCFANNO(
+            ch_input.map { meta, vcf, tbi -> [ meta, vcf, tbi, [] ] },
+            ch_vcfanno_toml,
+            [],
+            ch_vcfanno_resources
+        )
+        ch_versions = ch_versions.mix(VCFANNO.out.versions.first())
+
+        TABIX_BGZIPTABIX(
+            VCFANNO.out.vcf
+        )
+        ch_versions = ch_versions.mix(TABIX_BGZIPTABIX.out.versions.first())
+
+    emit:
+        vcfs = TABIX_BGZIPTABIX.out.gz_tbi // channel: [ val(meta), path(vcf), path(tbi) ]
+
+        versions = ch_versions // channel: [ path(versions.yml) ]
+
+}
+
+def get_vcfanno_config(vcf) {
+    def old_toml = file("${projectDir}/assets/dbsnp.toml", checkIfExists: true)
+    old_toml.copyTo("${workDir}/vcfanno/dbsnp.toml")
+    def new_toml = file("${workDir}/vcfanno/dbsnp.toml")
+    new_toml.text = old_toml.text.replace("DBSNP_FILE", vcf.getName())
+    return new_toml
+}
diff --git a/subworkflows/local/vcf_dbsnp_vcfanno/tests/main.nf.test b/subworkflows/local/vcf_dbsnp_vcfanno/tests/main.nf.test
new file mode 100644
index 00000000..3461e963
--- /dev/null
+++ b/subworkflows/local/vcf_dbsnp_vcfanno/tests/main.nf.test
@@ -0,0 +1,47 @@
+nextflow_workflow {
+
+    name "Test Workflow VCF_DBSNP_VCFANNO"
+    script "../main.nf"
+    workflow "VCF_DBSNP_VCFANNO"
+
+    tag "subworkflows"
+    tag "subworkflows_local"
+    tag "vcf_dbsnp_vcfanno"
+
+    test("vcf_dbsnp_vcfanno - default") {
+
+        when {
+            params {
+                annotate = true
+            }
+            workflow {
+                """
+                input[0] = Channel.of([
+                    [id:"NA24143", family:"NA24143", family_samples:"NA24143", caller:"haplotypecaller"],
+                    file(params.vcf1, checkIfExists:true),
+                    file(params.tbi1, checkIfExists:true)
+                ])
+                input[1] = Channel.value([
+                    [id:"dbsnp"],
+                    file(params.vcf2, checkIfExists:true)
+                ])
+                input[2] = Channel.value([
+                    [id:"dbsnp"],
+                    file(params.tbi2, checkIfExists:true)
+                ])
+                """
+            }
+        }
+
+        then {
+            assertAll(
+                { assert workflow.success },
+                { assert snapshot(
+                    workflow.out.vcfs.collect { [ it[0], "variantsMD5:${path(it[1]).vcf.variantsMD5}", it[2][-4..-1] ] }
+                ).match() }
+            )
+        }
+
+    }
+
+}
diff --git a/subworkflows/local/vcf_dbsnp_vcfanno/tests/main.nf.test.snap b/subworkflows/local/vcf_dbsnp_vcfanno/tests/main.nf.test.snap
new file mode 100644
index 00000000..64683b8a
--- /dev/null
+++ b/subworkflows/local/vcf_dbsnp_vcfanno/tests/main.nf.test.snap
@@ -0,0 +1,23 @@
+{
+    "vcf_dbsnp_vcfanno - default": {
+        "content": [
+            [
+                [
+                    {
+                        "id": "NA24143",
+                        "family": "NA24143",
+                        "family_samples": "NA24143",
+                        "caller": "haplotypecaller"
+                    },
+                    "variantsMD5:b4f76bc67ba0e159489393d4788349b3",
+                    ".tbi"
+                ]
+            ]
+        ],
+        "meta": {
+            "nf-test": "0.9.1",
+            "nextflow": "24.10.0"
+        },
+        "timestamp": "2024-11-14T16:28:22.613197067"
+    }
+}
\ No newline at end of file
diff --git a/subworkflows/local/vcf_extract_relate_somalier/main.nf b/subworkflows/local/vcf_extract_relate_somalier/main.nf
index d78e4915..2d5ec5b9 100644
--- a/subworkflows/local/vcf_extract_relate_somalier/main.nf
+++ b/subworkflows/local/vcf_extract_relate_somalier/main.nf
@@ -11,7 +11,7 @@ workflow VCF_EXTRACT_RELATE_SOMALIER {
     main:
-    ch_versions = Channel.empty()
+    def ch_versions = Channel.empty()
     SOMALIER_EXTRACT(
         ch_vcfs,
@@ -22,12 +22,11 @@ workflow
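`get_vcfanno_config` in the new subworkflow above renders a run-specific vcfanno TOML by copying the template shipped in `assets/` and substituting its `DBSNP_FILE` placeholder with the actual dbSNP file name. The same substitution as a stand-alone helper (function name and paths are illustrative, not part of the pipeline):

```groovy
import java.nio.file.Files
import java.nio.file.Path

// Sketch: fill a vcfanno TOML template whose annotation block references a
// DBSNP_FILE placeholder, writing the rendered copy into a scratch directory.
def renderVcfannoToml(Path template, String dbsnpFileName, Path scratchDir) {
    Files.createDirectories(scratchDir)
    def rendered = scratchDir.resolve('dbsnp.toml')
    rendered.text = template.text.replace('DBSNP_FILE', dbsnpFileName) // Groovy NIO text get/set
    return rendered
}
```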
VCF_EXTRACT_RELATE_SOMALIER { ch_versions = ch_versions.mix(SOMALIER_EXTRACT.out.versions.first()) - SOMALIER_EXTRACT.out.extract + def ch_somalierrelate_input = SOMALIER_EXTRACT.out.extract .join(ch_peds, failOnDuplicate:true, failOnMismatch:true) .map { meta, extract, ped -> [ meta, extract, ped ] } - .set { ch_somalierrelate_input } SOMALIER_RELATE( ch_somalierrelate_input, diff --git a/tests/subworkflows/local/vcf_extract_relate_somalier/main.nf.test b/subworkflows/local/vcf_extract_relate_somalier/tests/main.nf.test similarity index 94% rename from tests/subworkflows/local/vcf_extract_relate_somalier/main.nf.test rename to subworkflows/local/vcf_extract_relate_somalier/tests/main.nf.test index 3e277b57..2e14eda3 100644 --- a/tests/subworkflows/local/vcf_extract_relate_somalier/main.nf.test +++ b/subworkflows/local/vcf_extract_relate_somalier/tests/main.nf.test @@ -1,7 +1,7 @@ nextflow_workflow { name "Test Workflow VCF_EXTRACT_RELATE_SOMALIER" - script "subworkflows/local/vcf_extract_relate_somalier/main.nf" + script "../main.nf" workflow "VCF_EXTRACT_RELATE_SOMALIER" tag "subworkflows" @@ -38,7 +38,7 @@ nextflow_workflow { workflow.out.pairs_tsv, workflow.out.samples_tsv, workflow.out.peds - ).match("default - peds") } + ).match() } ) } @@ -74,7 +74,7 @@ nextflow_workflow { workflow.out.pairs_tsv, workflow.out.samples_tsv, workflow.out.peds - ).match("default - no peds") } + ).match() } ) } diff --git a/tests/subworkflows/local/vcf_extract_relate_somalier/main.nf.test.snap b/subworkflows/local/vcf_extract_relate_somalier/tests/main.nf.test.snap similarity index 93% rename from tests/subworkflows/local/vcf_extract_relate_somalier/main.nf.test.snap rename to subworkflows/local/vcf_extract_relate_somalier/tests/main.nf.test.snap index 49d14e7c..002b354a 100644 --- a/tests/subworkflows/local/vcf_extract_relate_somalier/main.nf.test.snap +++ b/subworkflows/local/vcf_extract_relate_somalier/tests/main.nf.test.snap @@ -1,5 +1,5 @@ { - "default - peds": { + "vcf_extract_relate_somalier - default - peds": { "content": [ [ [ @@ -62,12 +62,12 @@ ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T17:39:52.526057891" + "timestamp": "2024-11-14T16:29:26.72882236" }, - "default - no peds": { + "vcf_extract_relate_somalier - default - no peds": { "content": [ [ [ @@ -130,9 +130,9 @@ ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T17:40:03.603049023" + "timestamp": "2024-11-14T16:29:46.729601248" } } \ No newline at end of file diff --git a/subworkflows/local/vcf_filter_bcftools/main.nf b/subworkflows/local/vcf_filter_bcftools/main.nf index 5b397d6c..cf2cdc1a 100644 --- a/subworkflows/local/vcf_filter_bcftools/main.nf +++ b/subworkflows/local/vcf_filter_bcftools/main.nf @@ -13,31 +13,29 @@ workflow VCF_FILTER_BCFTOOLS { main: - ch_versions = Channel.empty() + def ch_versions = Channel.empty() FILTER_1( - ch_vcfs.map { meta, vcf, tbi=[] -> [ meta, vcf ]} + ch_vcfs.map { meta, vcf, tbi=[] -> [ meta, vcf, tbi ]} ) ch_versions = ch_versions.mix(FILTER_1.out.versions.first()) FILTER_2( - FILTER_1.out.vcf + FILTER_1.out.vcf.map { meta, vcf -> [ meta, vcf, [] ]} ) ch_versions = ch_versions.mix(FILTER_2.out.versions.first()) - ch_filter_vcfs = Channel.empty() + def ch_filter_vcfs = Channel.empty() if(val_tabix) { TABIX_TABIX( FILTER_2.out.vcf ) ch_versions = ch_versions.mix(TABIX_TABIX.out.versions.first()) - FILTER_2.out.vcf + 
ch_filter_vcfs = FILTER_2.out.vcf .join(TABIX_TABIX.out.tbi, failOnDuplicate: true, failOnMismatch: true) - .set { ch_filter_vcfs } } else { - FILTER_2.out.vcf - .set { ch_filter_vcfs } + ch_filter_vcfs = FILTER_2.out.vcf } diff --git a/subworkflows/local/vcf_filter_bcftools/tests/main.nf.test b/subworkflows/local/vcf_filter_bcftools/tests/main.nf.test new file mode 100644 index 00000000..317c60bf --- /dev/null +++ b/subworkflows/local/vcf_filter_bcftools/tests/main.nf.test @@ -0,0 +1,67 @@ +nextflow_workflow { + + name "Test Workflow VCF_FILTER_BCFTOOLS" + script "../main.nf" + workflow "VCF_FILTER_BCFTOOLS" + + tag "subworkflows" + tag "subworkflows_local" + tag "vcf_filter_bcftools" + + test("vcf_filter_bcftools - no_tabix") { + + config "./nextflow.config" + + when { + workflow { + """ + input[0] = Channel.of([ + [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", caller:"haplotypecaller"], + file(params.vcf1, checkIfExists:true), + file(params.tbi1, checkIfExists:true) + ]) + input[1] = false + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot( + workflow.out.vcfs.collect { [it[0], "variantsMD5:${path(it[1]).vcf.variantsMD5}"] } + ).match() } + ) + } + + } + + test("vcf_filter_bcftools - tabix") { + + config "./nextflow.config" + + when { + workflow { + """ + input[0] = Channel.of([ + [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", caller:"haplotypecaller"], + file(params.vcf1, checkIfExists:true), + file(params.tbi1, checkIfExists:true) + ]) + input[1] = true + """ + } + } + + then { + assertAll( + { assert workflow.success }, + { assert snapshot( + workflow.out.vcfs.collect { [it[0], "variantsMD5:${path(it[1]).vcf.variantsMD5}", it[2][-10..-1]] } + ).match() } + ) + } + + } + +} diff --git a/subworkflows/local/vcf_filter_bcftools/tests/main.nf.test.snap b/subworkflows/local/vcf_filter_bcftools/tests/main.nf.test.snap new file mode 100644 index 00000000..9bc09fa6 --- /dev/null +++ b/subworkflows/local/vcf_filter_bcftools/tests/main.nf.test.snap @@ -0,0 +1,45 @@ +{ + "vcf_filter_bcftools - tabix": { + "content": [ + [ + [ + { + "id": "NA24143", + "sample": "NA24143", + "family": "Ashkenazim", + "family_samples": "NA24143", + "caller": "haplotypecaller" + }, + "variantsMD5:2ce8bc96a9b3afbf060cdd89e74c4c82", + "vcf.gz.tbi" + ] + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-14T13:46:42.107550656" + }, + "vcf_filter_bcftools - no_tabix": { + "content": [ + [ + [ + { + "id": "NA24143", + "sample": "NA24143", + "family": "Ashkenazim", + "family_samples": "NA24143", + "caller": "haplotypecaller" + }, + "variantsMD5:2ce8bc96a9b3afbf060cdd89e74c4c82" + ] + ] + ], + "meta": { + "nf-test": "0.9.1", + "nextflow": "24.10.0" + }, + "timestamp": "2024-11-14T13:46:30.33756839" + } +} \ No newline at end of file diff --git a/subworkflows/local/vcf_filter_bcftools/tests/nextflow.config b/subworkflows/local/vcf_filter_bcftools/tests/nextflow.config new file mode 100644 index 00000000..451724b7 --- /dev/null +++ b/subworkflows/local/vcf_filter_bcftools/tests/nextflow.config @@ -0,0 +1,10 @@ +process { + withName: "FILTER_1" { + ext.args = "--output-type z --soft-filter 'GATKCutoffSNP' -e 'TYPE=\"snp\" && (MQRankSum < -12.5 || ReadPosRankSum < -8.0 || QD < 2.0 || FS > 60.0 || MQ < 30.0)' -m '+'" + ext.prefix = { "${meta.id}.filter1" } + } + + withName: "FILTER_2" { + ext.args = '--output-type z --soft-filter \'GATKCutoffIndel\' -e \'TYPE="indel" && 
(ReadPosRankSum < -20.0 || QD < 2.0 || FS > 200.0 || SOR > 10.0 )\' -m \'+\'' + } +} diff --git a/subworkflows/local/vcf_ped_rtgtools/main.nf b/subworkflows/local/vcf_ped_rtgtools/main.nf index 2f762737..f4eba7b7 100644 --- a/subworkflows/local/vcf_ped_rtgtools/main.nf +++ b/subworkflows/local/vcf_ped_rtgtools/main.nf @@ -12,7 +12,7 @@ workflow VCF_PED_RTGTOOLS { main: - ch_versions = Channel.empty() + def ch_versions = Channel.empty() // // Remove extra columns from the samples TSV and convert to a VCF header @@ -27,12 +27,11 @@ workflow VCF_PED_RTGTOOLS { // Add the PED headers to the VCF using bcftools annotate --header-lines // - ch_vcfs + def ch_annotate_input = ch_vcfs .join(RTGTOOLS_PEDFILTER.out.output, failOnDuplicate:true, failOnMismatch:true) - .map { meta, vcf, tbi, ped_vcf -> + .map { meta, vcf, _tbi, ped_vcf -> [ meta, vcf, [], [], [], ped_vcf ] } - .set { ch_annotate_input } BCFTOOLS_ANNOTATE( ch_annotate_input diff --git a/tests/subworkflows/local/vcf_ped_rtgtools/main.nf.test b/subworkflows/local/vcf_ped_rtgtools/tests/main.nf.test similarity index 84% rename from tests/subworkflows/local/vcf_ped_rtgtools/main.nf.test rename to subworkflows/local/vcf_ped_rtgtools/tests/main.nf.test index 24398ed4..48025437 100644 --- a/tests/subworkflows/local/vcf_ped_rtgtools/main.nf.test +++ b/subworkflows/local/vcf_ped_rtgtools/tests/main.nf.test @@ -1,7 +1,7 @@ nextflow_workflow { name "Test Workflow VCF_PED_RTGTOOLS" - script "subworkflows/local/vcf_ped_rtgtools/main.nf" + script "../main.nf" workflow "VCF_PED_RTGTOOLS" tag "subworkflows" @@ -33,8 +33,8 @@ nextflow_workflow { assertAll( { assert workflow.success }, { assert snapshot( - workflow.out.ped_vcfs.collect { it.collect { it instanceof Map ? it : file(it).name } } - ).match("default") } + workflow.out.ped_vcfs.collect { [it[0], "variantsMD5:${path(it[1]).vcf.variantsMD5}"] } + ).match() } ) } diff --git a/tests/subworkflows/local/vcf_ped_rtgtools/main.nf.test.snap b/subworkflows/local/vcf_ped_rtgtools/tests/main.nf.test.snap similarity index 63% rename from tests/subworkflows/local/vcf_ped_rtgtools/main.nf.test.snap rename to subworkflows/local/vcf_ped_rtgtools/tests/main.nf.test.snap index f988cc9f..c6806115 100644 --- a/tests/subworkflows/local/vcf_ped_rtgtools/main.nf.test.snap +++ b/subworkflows/local/vcf_ped_rtgtools/tests/main.nf.test.snap @@ -1,5 +1,5 @@ { - "default": { + "vcf_ped_rtgtools - default": { "content": [ [ [ @@ -9,14 +9,14 @@ "family_samples": "NA24143,NA24149,NA24385", "caller": "haplotypecaller" }, - "Ashkenazim.haplotypecaller.vcf.gz" + "variantsMD5:4367d1c96e701a1cc1a96b615381b278" ] ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T17:40:15.516745396" + "timestamp": "2024-11-14T15:54:56.709943272" } } \ No newline at end of file diff --git a/subworkflows/local/vcf_roh_automap/main.nf b/subworkflows/local/vcf_roh_automap/main.nf index 24998f31..2941fd4e 100644 --- a/subworkflows/local/vcf_roh_automap/main.nf +++ b/subworkflows/local/vcf_roh_automap/main.nf @@ -13,7 +13,7 @@ workflow VCF_ROH_AUTOMAP { val_genome // value: [mandatory] => The genome to be used by automap main: - ch_versions = Channel.empty() + def ch_versions = Channel.empty() def hg_genome = val_genome == "GRCh38" ? "hg38" : val_genome == "GRCh37" ? 
"hg19" : val_genome @@ -22,15 +22,15 @@ workflow VCF_ROH_AUTOMAP { } // Merge the repeat BED files from the container if no container has been given - ch_valid_repeats = Channel.empty() + def ch_valid_repeats = Channel.empty() if (!ch_repeats) { AUTOMAP_REPEATS( Channel.value([[id:"${val_genome}_repeats"], val_genome]) ) ch_versions = ch_versions.mix(AUTOMAP_REPEATS.out.versions) - AUTOMAP_REPEATS.out.repeats.collect().set { ch_valid_repeats } + ch_valid_repeats = AUTOMAP_REPEATS.out.repeats.collect() } else { - ch_repeats.set { ch_valid_repeats } + ch_valid_repeats = ch_repeats } AUTOMAP_AUTOMAP( diff --git a/tests/subworkflows/local/vcf_roh_automap/main.nf.test b/subworkflows/local/vcf_roh_automap/tests/main.nf.test similarity index 97% rename from tests/subworkflows/local/vcf_roh_automap/main.nf.test rename to subworkflows/local/vcf_roh_automap/tests/main.nf.test index 724b1757..a58e50a7 100644 --- a/tests/subworkflows/local/vcf_roh_automap/main.nf.test +++ b/subworkflows/local/vcf_roh_automap/tests/main.nf.test @@ -1,7 +1,7 @@ nextflow_workflow { name "Test Workflow VCF_ROH_AUTOMAP" - script "subworkflows/local/vcf_roh_automap/main.nf" + script "../main.nf" workflow "VCF_ROH_AUTOMAP" tag "subworkflows" diff --git a/tests/subworkflows/local/vcf_roh_automap/main.nf.test.snap b/subworkflows/local/vcf_roh_automap/tests/main.nf.test.snap similarity index 100% rename from tests/subworkflows/local/vcf_roh_automap/main.nf.test.snap rename to subworkflows/local/vcf_roh_automap/tests/main.nf.test.snap diff --git a/subworkflows/local/vcf_upd_updio/main.nf b/subworkflows/local/vcf_upd_updio/main.nf index c7c96d2a..1d0cf66b 100644 --- a/subworkflows/local/vcf_upd_updio/main.nf +++ b/subworkflows/local/vcf_upd_updio/main.nf @@ -2,7 +2,8 @@ // Run UPDio analysis // -include { UPDIO } from '../../../modules/local/updio/main' +include { UPDIO } from '../../../modules/local/updio/main' +include { BCFTOOLS_FILTER } from '../../../modules/nf-core/bcftools/filter' workflow VCF_UPD_UPDIO { take: @@ -12,36 +13,41 @@ workflow VCF_UPD_UPDIO { main: - ch_versions = Channel.empty() + def ch_versions = Channel.empty() // Filter out all families that have less than 3 samples - ch_vcfs - .filter { meta, vcf, tbi -> + def ch_trio_vcfs = ch_vcfs + .filter { meta, _vcf, _tbi -> meta.family_samples.tokenize(",").size() >= 3 } - .set { ch_trio_vcfs } - ch_peds - .filter { meta, ped -> + BCFTOOLS_FILTER( + ch_trio_vcfs + ) + ch_versions = ch_versions.mix(BCFTOOLS_FILTER.out.versions.first()) + + def ch_filter_output = BCFTOOLS_FILTER.out.vcf + .join(BCFTOOLS_FILTER.out.tbi, failOnDuplicate:true, failOnMismatch:true) + + def ch_trio_peds = ch_peds + .filter { meta, _ped -> meta.family_samples.tokenize(",").size() >= 3 } - .set { ch_trio_peds } - CustomChannelOperators.joinOnKeys( - [failOnDuplicate:true, failOnMismatch:true], - ch_trio_vcfs, - ch_trio_peds, - ["id", "family", "family_samples", "caller"] - ) + def ch_trio_vcfs_family = CustomChannelOperators.joinOnKeys( + [failOnDuplicate:true, failOnMismatch:true], + ch_filter_output, + ch_trio_peds, + ["id", "family", "family_samples", "caller"] + ) .map { meta, vcf, tbi, ped -> def meta_list = get_family_data_from_ped(meta, ped) [ meta_list, vcf, tbi ] } - .filter { meta, vcf, tbi -> + .filter { meta, _vcf, _tbi -> meta } .transpose(by:0) - .set { ch_trio_vcfs_family } UPDIO( ch_trio_vcfs_family, @@ -58,9 +64,6 @@ def get_family_data_from_ped(meta, ped) { def output = [] ped.readLines().each { line -> if(line.startsWith("#")) { return } - def child = null - def mother = 
null - def father = null def split_line = line.split("\t") if(split_line[1] != "0" && split_line[2] != "0" && split_line[3] != "0") { output.add(meta + [child:split_line[1], father:split_line[2], mother:split_line[3]]) diff --git a/tests/subworkflows/local/vcf_upd_updio/main.nf.test b/subworkflows/local/vcf_upd_updio/tests/main.nf.test similarity index 97% rename from tests/subworkflows/local/vcf_upd_updio/main.nf.test rename to subworkflows/local/vcf_upd_updio/tests/main.nf.test index 1507c48a..8e4ff3f5 100644 --- a/tests/subworkflows/local/vcf_upd_updio/main.nf.test +++ b/subworkflows/local/vcf_upd_updio/tests/main.nf.test @@ -1,7 +1,7 @@ nextflow_workflow { name "Test Workflow VCF_UPD_UPDIO" - script "subworkflows/local/vcf_upd_updio/main.nf" + script "../main.nf" workflow "VCF_UPD_UPDIO" tag "subworkflows" @@ -12,6 +12,7 @@ nextflow_workflow { when { params { + outdir = "${outputDir}" updio = true } workflow { diff --git a/tests/subworkflows/local/vcf_upd_updio/main.nf.test.snap b/subworkflows/local/vcf_upd_updio/tests/main.nf.test.snap similarity index 89% rename from tests/subworkflows/local/vcf_upd_updio/main.nf.test.snap rename to subworkflows/local/vcf_upd_updio/tests/main.nf.test.snap index 09731d58..fa31d40c 100644 --- a/tests/subworkflows/local/vcf_upd_updio/main.nf.test.snap +++ b/subworkflows/local/vcf_upd_updio/tests/main.nf.test.snap @@ -14,7 +14,7 @@ }, [ "NA24385.events_list:md5,c61379c7a4bb61cbe9612b5cda773cf4", - "NA24385.log:md5,a40a24f379127a9cde7e40a1ce1032ec", + "NA24385.log:md5,ea4c71f55c4a53ecdc74f00659ea161d", "NA24385.table:md5,ca8165fc7869a113ca034396de7cf579", "NA24385.upd:md5,d41d8cd98f00b204e9800998ecf8427e" ] @@ -25,7 +25,7 @@ "nf-test": "0.9.0", "nextflow": "24.04.4" }, - "timestamp": "2024-09-05T17:40:47.828005254" + "timestamp": "2024-10-08T16:42:16.968323682" }, "default - sample": { "content": [ diff --git a/subworkflows/local/vcf_validate_small_variants/main.nf b/subworkflows/local/vcf_validate_small_variants/main.nf index 63ca83fd..1464caaf 100644 --- a/subworkflows/local/vcf_validate_small_variants/main.nf +++ b/subworkflows/local/vcf_validate_small_variants/main.nf @@ -5,16 +5,13 @@ workflow VCF_VALIDATE_SMALL_VARIANTS { take: ch_vcf // [mandatory] channel: [ meta, vcf, tbi, truth_vcf, truth_tbi ] - ch_beds // [mandatory] channel: [ meta, regions_bed, targets_bed ] - ch_fasta // [happy only] channel: [ meta, fasta ] - ch_fasta_fai // [happy only] channel: [ meta, fasta_fai ] + ch_beds // [mandatory] channel: [ meta, truth_bed, region_bed ] ch_vcfeval_sdf // [vcfeval only] channel: [ meta, sdf ] main: - ch_versions = Channel.empty() - - ch_input = ch_vcf.join(ch_beds, failOnDuplicate: true, failOnMismatch: true) + def ch_versions = Channel.empty() + def ch_input = ch_vcf.join(ch_beds, failOnDuplicate: true, failOnMismatch: true) RTGTOOLS_VCFEVAL( ch_input, @@ -22,7 +19,7 @@ workflow VCF_VALIDATE_SMALL_VARIANTS { ) ch_versions = ch_versions.mix(RTGTOOLS_VCFEVAL.out.versions.first()) - ch_rocplot_input = RTGTOOLS_VCFEVAL.out.snp_roc + def ch_rocplot_input = RTGTOOLS_VCFEVAL.out.snp_roc .map { meta, tsv -> [ meta + [roc_type:'snp'], tsv ] } @@ -35,19 +32,19 @@ workflow VCF_VALIDATE_SMALL_VARIANTS { } ) - vcfeval_true_positive_vcf = RTGTOOLS_VCFEVAL.out.tp_vcf - vcfeval_true_positive_vcf_tbi = RTGTOOLS_VCFEVAL.out.tp_tbi - vcfeval_false_negative_vcf = RTGTOOLS_VCFEVAL.out.fn_vcf - vcfeval_false_negative_vcf_tbi = RTGTOOLS_VCFEVAL.out.fn_tbi - vcfeval_false_positive_vcf = RTGTOOLS_VCFEVAL.out.fp_vcf - vcfeval_false_positive_vcf_tbi = 
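`get_family_data_from_ped` above treats a PED row as a complete trio only when the individual, father and mother columns are all non-zero, and emits one meta map per trio child. The core of that filter as a stand-alone snippet (tab-separated PED assumed, with `0` meaning an unknown parent):

```groovy
// Sketch: extract [child, father, mother] triples from PED text.
// PED columns: family, individual, father, mother, sex, phenotype.
def trioChildren(String pedText) {
    pedText.readLines()
        .findAll { line -> !line.startsWith('#') }                         // skip header comments
        .collect { line -> line.split('\t') }
        .findAll { cols -> cols[1] != '0' && cols[2] != '0' && cols[3] != '0' } // keep complete trios
        .collect { cols -> [child: cols[1], father: cols[2], mother: cols[3]] }
}

assert trioChildren('Ashkenazim\tNA24385\tNA24149\tNA24143\t1\t0') ==
    [[child: 'NA24385', father: 'NA24149', mother: 'NA24143']]
```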
RTGTOOLS_VCFEVAL.out.fp_tbi - vcfeval_true_positive_baseline_vcf = RTGTOOLS_VCFEVAL.out.baseline_vcf - vcfeval_true_positive_baseline_vcf_tbi = RTGTOOLS_VCFEVAL.out.baseline_tbi - vcfeval_summary = RTGTOOLS_VCFEVAL.out.summary - vcfeval_phasing = RTGTOOLS_VCFEVAL.out.phasing - vcfeval_snp_roc = RTGTOOLS_VCFEVAL.out.snp_roc - vcfeval_non_snp_roc = RTGTOOLS_VCFEVAL.out.non_snp_roc - vcfeval_weighted_roc = RTGTOOLS_VCFEVAL.out.weighted_roc + def ch_vcfeval_true_positive_vcf = RTGTOOLS_VCFEVAL.out.tp_vcf + def ch_vcfeval_true_positive_vcf_tbi = RTGTOOLS_VCFEVAL.out.tp_tbi + def ch_vcfeval_false_negative_vcf = RTGTOOLS_VCFEVAL.out.fn_vcf + def ch_vcfeval_false_negative_vcf_tbi = RTGTOOLS_VCFEVAL.out.fn_tbi + def ch_vcfeval_false_positive_vcf = RTGTOOLS_VCFEVAL.out.fp_vcf + def ch_vcfeval_false_positive_vcf_tbi = RTGTOOLS_VCFEVAL.out.fp_tbi + def ch_vcfeval_true_positive_baseline_vcf = RTGTOOLS_VCFEVAL.out.baseline_vcf + def ch_vcfeval_true_positive_baseline_vcf_tbi = RTGTOOLS_VCFEVAL.out.baseline_tbi + def ch_vcfeval_summary = RTGTOOLS_VCFEVAL.out.summary + def ch_vcfeval_phasing = RTGTOOLS_VCFEVAL.out.phasing + def ch_vcfeval_snp_roc = RTGTOOLS_VCFEVAL.out.snp_roc + def ch_vcfeval_non_snp_roc = RTGTOOLS_VCFEVAL.out.non_snp_roc + def ch_vcfeval_weighted_roc = RTGTOOLS_VCFEVAL.out.weighted_roc RTGTOOLS_ROCPLOT( ch_rocplot_input @@ -55,55 +52,52 @@ workflow VCF_VALIDATE_SMALL_VARIANTS { ch_versions = ch_versions.mix(RTGTOOLS_ROCPLOT.out.versions.first()) - rocplot_out_png = RTGTOOLS_ROCPLOT.out.png - .branch { meta, png -> - roc_type = meta.roc_type - def new_meta = meta - meta.subMap("roc_type") + def rocplot_out_png = RTGTOOLS_ROCPLOT.out.png + .branch { meta, _png -> + def roc_type = meta.roc_type snp: roc_type == "snp" non_snp: roc_type == "non_snp" weighted: roc_type == "weighted" } - rocplot_out_svg = RTGTOOLS_ROCPLOT.out.svg - .branch { meta, svg -> - roc_type = meta.roc_type - def new_meta = meta - meta.subMap("roc_type") + def rocplot_out_svg = RTGTOOLS_ROCPLOT.out.svg + .branch { meta, _svg -> + def roc_type = meta.roc_type snp: roc_type == "snp" non_snp: roc_type == "non_snp" weighted: roc_type == "weighted" } - rtgtools_snp_png_rocplot = rocplot_out_png.snp - rtgtools_non_snp_png_rocplot = rocplot_out_png.non_snp - rtgtools_weighted_png_rocplot = rocplot_out_png.weighted + def ch_rtgtools_snp_png_rocplot = rocplot_out_png.snp + def ch_rtgtools_non_snp_png_rocplot = rocplot_out_png.non_snp + def ch_rtgtools_weighted_png_rocplot = rocplot_out_png.weighted - rtgtools_snp_svg_rocplot = rocplot_out_svg.snp - rtgtools_non_snp_svg_rocplot = rocplot_out_svg.non_snp - rtgtools_weighted_svg_rocplot = rocplot_out_svg.weighted + def ch_rtgtools_snp_svg_rocplot = rocplot_out_svg.snp + def ch_rtgtools_non_snp_svg_rocplot = rocplot_out_svg.non_snp + def ch_rtgtools_weighted_svg_rocplot = rocplot_out_svg.weighted emit: - vcfeval_true_positive_vcf // channel: [ meta, vcf ] - vcfeval_true_positive_vcf_tbi // channel: [ meta, tbi ] - vcfeval_false_negative_vcf // channel: [ meta, vcf ] - vcfeval_false_negative_vcf_tbi // channel: [ meta, tbi ] - vcfeval_false_positive_vcf // channel: [ meta, vcf ] - vcfeval_false_positive_vcf_tbi // channel: [ meta, tbi ] - vcfeval_true_positive_baseline_vcf // channel: [ meta, vcf ] - vcfeval_true_positive_baseline_vcf_tbi // channel: [ meta, tbi ] - vcfeval_summary // channel: [ meta, summary ] - vcfeval_phasing // channel: [ meta, phasing ] - vcfeval_snp_roc // channel: [ meta, tsv ] - vcfeval_non_snp_roc // channel: [ meta, tsv ] - vcfeval_weighted_roc // channel: [ 
meta, tsv ] - - rtgtools_snp_png_rocplot // channel: [ meta, png ] - rtgtools_non_snp_png_rocplot // channel: [ meta, png ] - rtgtools_weighted_png_rocplot // channel: [ meta, png ] - rtgtools_snp_svg_rocplot // channel: [ meta, svg ] - rtgtools_non_snp_svg_rocplot // channel: [ meta, svg ] - rtgtools_weighted_svg_rocplot // channel: [ meta, svg ] + vcfeval_true_positive_vcf = ch_vcfeval_true_positive_vcf // channel: [ meta, vcf ] + vcfeval_true_positive_vcf_tbi = ch_vcfeval_true_positive_vcf_tbi // channel: [ meta, tbi ] + vcfeval_false_negative_vcf = ch_vcfeval_false_negative_vcf // channel: [ meta, vcf ] + vcfeval_false_negative_vcf_tbi = ch_vcfeval_false_negative_vcf_tbi // channel: [ meta, tbi ] + vcfeval_false_positive_vcf = ch_vcfeval_false_positive_vcf // channel: [ meta, vcf ] + vcfeval_false_positive_vcf_tbi = ch_vcfeval_false_positive_vcf_tbi // channel: [ meta, tbi ] + vcfeval_true_positive_baseline_vcf = ch_vcfeval_true_positive_baseline_vcf // channel: [ meta, vcf ] + vcfeval_true_positive_baseline_vcf_tbi = ch_vcfeval_true_positive_baseline_vcf_tbi // channel: [ meta, tbi ] + vcfeval_summary = ch_vcfeval_summary // channel: [ meta, summary ] + vcfeval_phasing = ch_vcfeval_phasing // channel: [ meta, phasing ] + vcfeval_snp_roc = ch_vcfeval_snp_roc // channel: [ meta, tsv ] + vcfeval_non_snp_roc = ch_vcfeval_non_snp_roc // channel: [ meta, tsv ] + vcfeval_weighted_roc = ch_vcfeval_weighted_roc // channel: [ meta, tsv ] + rtgtools_snp_png_rocplot = ch_rtgtools_snp_png_rocplot // channel: [ meta, png ] + rtgtools_non_snp_png_rocplot = ch_rtgtools_non_snp_png_rocplot // channel: [ meta, png ] + rtgtools_weighted_png_rocplot = ch_rtgtools_weighted_png_rocplot // channel: [ meta, png ] + rtgtools_snp_svg_rocplot = ch_rtgtools_snp_svg_rocplot // channel: [ meta, svg ] + rtgtools_non_snp_svg_rocplot = ch_rtgtools_non_snp_svg_rocplot // channel: [ meta, svg ] + rtgtools_weighted_svg_rocplot = ch_rtgtools_weighted_svg_rocplot // channel: [ meta, svg ] versions = ch_versions // channel: [ versions.yml ] } diff --git a/tests/subworkflows/local/vcf_validate_small_variants/main.nf.test b/subworkflows/local/vcf_validate_small_variants/tests/main.nf.test similarity index 94% rename from tests/subworkflows/local/vcf_validate_small_variants/main.nf.test rename to subworkflows/local/vcf_validate_small_variants/tests/main.nf.test index 6714fe38..e947b04d 100644 --- a/tests/subworkflows/local/vcf_validate_small_variants/main.nf.test +++ b/subworkflows/local/vcf_validate_small_variants/tests/main.nf.test @@ -1,7 +1,7 @@ nextflow_workflow { name "Test Workflow VCF_VALIDATE_SMALL_VARIANTS" - script "subworkflows/local/vcf_validate_small_variants/main.nf" + script "../main.nf" workflow "VCF_VALIDATE_SMALL_VARIANTS" tag "subworkflows" @@ -42,9 +42,7 @@ nextflow_workflow { file(params.bed, checkIfExists:true), [] ]) - input[2] = [[],[]] - input[3] = [[],[]] - input[4] = UNTAR.out.untar + input[2] = UNTAR.out.untar """ } } diff --git a/tests/subworkflows/local/vcf_validate_small_variants/main.nf.test.snap b/subworkflows/local/vcf_validate_small_variants/tests/main.nf.test.snap similarity index 88% rename from tests/subworkflows/local/vcf_validate_small_variants/main.nf.test.snap rename to subworkflows/local/vcf_validate_small_variants/tests/main.nf.test.snap index 7d04aca4..e70830c3 100644 --- a/tests/subworkflows/local/vcf_validate_small_variants/main.nf.test.snap +++ b/subworkflows/local/vcf_validate_small_variants/tests/main.nf.test.snap @@ -9,7 +9,7 @@ "family": "Ashkenazim", "family_samples": 
"NA24143,NA24149,NA24385" }, - "NA24143.tp.vcf.gz:md5,b28d3e84a6efa95bfbf673de310f62c1" + "NA24143.tp.vcf.gz:md5,0449f4185b63d0b988b7f9de6a6b2fb3" ] ], [ @@ -20,7 +20,7 @@ "family": "Ashkenazim", "family_samples": "NA24143,NA24149,NA24385" }, - "NA24143.tp.vcf.gz.tbi:md5,8f8356fd93e093e9deb1c2c4c72cb65e" + "NA24143.tp.vcf.gz.tbi:md5,637a1a87502b3764ad22017b16ccb3ac" ] ], [ @@ -31,7 +31,7 @@ "family": "Ashkenazim", "family_samples": "NA24143,NA24149,NA24385" }, - "NA24143.fn.vcf.gz:md5,99830701ae0191110b706b20abe56857" + "NA24143.fn.vcf.gz:md5,688c27c6785313c1a0ac9fdcf430e0eb" ] ], [ @@ -53,7 +53,7 @@ "family": "Ashkenazim", "family_samples": "NA24143,NA24149,NA24385" }, - "NA24143.fp.vcf.gz:md5,2769bf65afefc821fe54a8b314b5d2be" + "NA24143.fp.vcf.gz:md5,d11b30ab764bd9c3f8294b51043a3e1a" ] ], [ @@ -75,7 +75,7 @@ "family": "Ashkenazim", "family_samples": "NA24143,NA24149,NA24385" }, - "NA24143.tp-baseline.vcf.gz:md5,6de63c7e2fa458ca88ea78c36f579056" + "NA24143.tp-baseline.vcf.gz:md5,748b2b41e2f5916fd0e5fbc2aea60d3d" ] ], [ @@ -86,7 +86,7 @@ "family": "Ashkenazim", "family_samples": "NA24143,NA24149,NA24385" }, - "NA24143.tp-baseline.vcf.gz.tbi:md5,8d399004d8c45fa795b70da32e23a6d5" + "NA24143.tp-baseline.vcf.gz.tbi:md5,8c1a12430a7714b000c11e471175d6e1" ] ], [ @@ -108,7 +108,7 @@ "family": "Ashkenazim", "family_samples": "NA24143,NA24149,NA24385" }, - "NA24143.phasing.txt:md5,38920536b8c3e241e873c07ba61762e6" + "NA24143.phasing.txt:md5,dfcf0fc6ad7e98c737a7a0afaf2c2733" ] ], [ @@ -119,7 +119,7 @@ "family": "Ashkenazim", "family_samples": "NA24143,NA24149,NA24385" }, - "NA24143.snp_roc.tsv.gz:md5,d29e7796d37dada43378e466af98c35a" + "NA24143.snp_roc.tsv.gz:md5,306d33ea1cf1d797d0f52234d2ff901e" ] ], [ @@ -130,7 +130,7 @@ "family": "Ashkenazim", "family_samples": "NA24143,NA24149,NA24385" }, - "NA24143.non_snp_roc.tsv.gz:md5,121b8b67d40efcaf1aad5064c6f3bfe9" + "NA24143.non_snp_roc.tsv.gz:md5,31e6719897dd4f7d3d0a9cb4943693cf" ] ], [ @@ -141,7 +141,7 @@ "family": "Ashkenazim", "family_samples": "NA24143,NA24149,NA24385" }, - "NA24143.weighted_roc.tsv.gz:md5,f016f9a10e793836b9b53d702e591454" + "NA24143.weighted_roc.tsv.gz:md5,e1e87f61f55f976162be76fe993423e1" ] ], [ @@ -221,6 +221,6 @@ "nf-test": "0.9.0", "nextflow": "24.04.4" }, - "timestamp": "2024-09-05T17:41:19.06545291" + "timestamp": "2024-10-08T15:56:00.946806992" } } \ No newline at end of file diff --git a/subworkflows/local/watchpath_handling/main.nf b/subworkflows/local/watchpath_handling/main.nf index eca869b1..09a8f55e 100644 --- a/subworkflows/local/watchpath_handling/main.nf +++ b/subworkflows/local/watchpath_handling/main.nf @@ -37,6 +37,11 @@ workflow WATCHPATH_HANDLING { def samplesheet_list = samplesheetToList(input_samplesheet, samplesheet_schema) // Do some calculations and manipulations here .collect { row -> + // Replace dots with underscores in sample and family names to prevent breaking the multiqc report + row[0].id = row[0].id.replace(".", "_") + row[0].sample = row[0].sample.replace(".", "_") + row[0].family = row[0].family ? 
row[0].family.replace(".", "_") : row[0].family + // Watchpath logic def is_watch = false row = row.collect { input -> diff --git a/subworkflows/nf-core/utils_nextflow_pipeline/main.nf b/subworkflows/nf-core/utils_nextflow_pipeline/main.nf index 28e32b20..2f9bdfc1 100644 --- a/subworkflows/nf-core/utils_nextflow_pipeline/main.nf +++ b/subworkflows/nf-core/utils_nextflow_pipeline/main.nf @@ -3,13 +3,12 @@ // /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SUBWORKFLOW DEFINITION -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ workflow UTILS_NEXTFLOW_PIPELINE { - take: print_version // boolean: print version dump_parameters // boolean: dump parameters @@ -22,7 +21,7 @@ workflow UTILS_NEXTFLOW_PIPELINE { // Print workflow version and exit on --version // if (print_version) { - log.info "${workflow.manifest.name} ${getWorkflowVersion()}" + log.info("${workflow.manifest.name} ${getWorkflowVersion()}") System.exit(0) } @@ -45,9 +44,9 @@ workflow UTILS_NEXTFLOW_PIPELINE { } /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ FUNCTIONS -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ // @@ -72,13 +71,13 @@ def getWorkflowVersion() { // Dump pipeline parameters to a JSON file // def dumpParametersToJSON(outdir) { - def timestamp = new java.util.Date().format( 'yyyy-MM-dd_HH-mm-ss') - def filename = "params_${timestamp}.json" - def temp_pf = new File(workflow.launchDir.toString(), ".${filename}") - def jsonStr = groovy.json.JsonOutput.toJson(params) - temp_pf.text = groovy.json.JsonOutput.prettyPrint(jsonStr) + def timestamp = new java.util.Date().format('yyyy-MM-dd_HH-mm-ss') + def filename = "params_${timestamp}.json" + def temp_pf = new File(workflow.launchDir.toString(), ".${filename}") + def jsonStr = groovy.json.JsonOutput.toJson(params) + temp_pf.text = groovy.json.JsonOutput.prettyPrint(jsonStr) - nextflow.extension.FilesEx.copyTo(temp_pf.toPath(), "${outdir}/pipeline_info/params_${timestamp}.json") + nextflow.extension.FilesEx.copyTo(temp_pf.toPath(), "${outdir}/${params.unique_out}/params_${timestamp}.json") temp_pf.delete() } @@ -91,9 +90,14 @@ def checkCondaChannels() { try { def config = parser.load("conda config --show channels".execute().text) channels = config.channels - } catch(NullPointerException | IOException e) { - log.warn "Could not verify conda channel configuration." 
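
The sanitisation added to the watchpath samplesheet handling above is plain string replacement on each row's meta map; a standalone Groovy check of its effect (the sample values are invented):

    // Made-up samplesheet row: [ meta map, data file ]
    def row = [[id: 'NA24143.run1', sample: 'NA24143.run1', family: 'Ashkenazim.trio'], 'NA24143.cram']

    row[0].id     = row[0].id.replace(".", "_")
    row[0].sample = row[0].sample.replace(".", "_")
    row[0].family = row[0].family ? row[0].family.replace(".", "_") : row[0].family

    // Dots survive only in file names, so MultiQC sample parsing stays intact
    assert row[0] == [id: 'NA24143_run1', sample: 'NA24143_run1', family: 'Ashkenazim_trio']
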
- return + } + catch (NullPointerException e) { + log.warn("Could not verify conda channel configuration.") + return null + } + catch (IOException e) { + log.warn("Could not verify conda channel configuration.") + return null } // Check that all channels are present @@ -102,23 +106,19 @@ def checkCondaChannels() { def channels_missing = ((required_channels_in_order as Set) - (channels as Set)) as Boolean // Check that they are in the right order - def channel_priority_violation = false - - required_channels_in_order.eachWithIndex { channel, index -> - if (index < required_channels_in_order.size() - 1) { - channel_priority_violation |= !(channels.indexOf(channel) < channels.indexOf(required_channels_in_order[index+1])) - } - } + def channel_priority_violation = required_channels_in_order != channels.findAll { ch -> ch in required_channels_in_order } if (channels_missing | channel_priority_violation) { - log.warn "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~\n" + - " There is a problem with your Conda configuration!\n\n" + - " You will need to set-up the conda-forge and bioconda channels correctly.\n" + - " Please refer to https://bioconda.github.io/\n" + - " The observed channel order is \n" + - " ${channels}\n" + - " but the following channel order is required:\n" + - " ${required_channels_in_order}\n" + - "~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" + log.warn """\ + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ + There is a problem with your Conda configuration! + You will need to set-up the conda-forge and bioconda channels correctly. + Please refer to https://bioconda.github.io/ + The observed channel order is + ${channels} + but the following channel order is required: + ${required_channels_in_order} + ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~" + """.stripIndent(true) } } diff --git a/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.workflow.nf.test b/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.workflow.nf.test index ca964ce8..02dbf094 100644 --- a/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.workflow.nf.test +++ b/subworkflows/nf-core/utils_nextflow_pipeline/tests/main.workflow.nf.test @@ -52,10 +52,12 @@ nextflow_workflow { } then { - assertAll( - { assert workflow.success }, - { assert workflow.stdout.contains("nextflow_workflow v9.9.9") } - ) + expect { + with(workflow) { + assert success + assert "nextflow_workflow v9.9.9" in stdout + } + } } } diff --git a/subworkflows/nf-core/utils_nfcore_pipeline/main.nf b/subworkflows/nf-core/utils_nfcore_pipeline/main.nf index cbd8495b..5cb7bafe 100644 --- a/subworkflows/nf-core/utils_nfcore_pipeline/main.nf +++ b/subworkflows/nf-core/utils_nfcore_pipeline/main.nf @@ -3,13 +3,12 @@ // /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ SUBWORKFLOW DEFINITION -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ workflow UTILS_NFCORE_PIPELINE { - take: nextflow_cli_args @@ -22,9 +21,9 @@ workflow UTILS_NFCORE_PIPELINE { } /* -======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ FUNCTIONS 
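
The one-line channel-priority check introduced above works because filtering the observed channel list down to just the required channels must reproduce the required list itself, in order; any swapped entry breaks the equality. In plain Groovy (the channel lists are examples):

    def required_channels_in_order = ['conda-forge', 'bioconda']

    // Correct priority: extra channels are allowed, required order is kept
    def channels_ok = ['conda-forge', 'bioconda', 'defaults']
    assert required_channels_in_order == channels_ok.findAll { ch -> ch in required_channels_in_order }

    // Swapped priority: the filtered list comes back in the wrong order
    def channels_bad = ['bioconda', 'conda-forge']
    assert required_channels_in_order != channels_bad.findAll { ch -> ch in required_channels_in_order }

A missing required channel also breaks the equality, but that case is already reported separately via `channels_missing`.
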
-======================================================================================== +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ */ // @@ -33,12 +32,9 @@ workflow UTILS_NFCORE_PIPELINE { def checkConfigProvided() { def valid_config = true as Boolean if (workflow.profile == 'standard' && workflow.configFiles.size() <= 1) { - log.warn "[$workflow.manifest.name] You are attempting to run the pipeline without any custom configuration!\n\n" + - "This will be dependent on your local compute environment but can be achieved via one or more of the following:\n" + - " (1) Using an existing pipeline profile e.g. `-profile docker` or `-profile singularity`\n" + - " (2) Using an existing nf-core/configs for your Institution e.g. `-profile crick` or `-profile uppmax`\n" + - " (3) Using your own local custom config e.g. `-c /path/to/your/custom.config`\n\n" + - "Please refer to the quick start section and usage docs for the pipeline.\n " + log.warn( + "[${workflow.manifest.name}] You are attempting to run the pipeline without any custom configuration!\n\n" + "This will be dependent on your local compute environment but can be achieved via one or more of the following:\n" + " (1) Using an existing pipeline profile e.g. `-profile docker` or `-profile singularity`\n" + " (2) Using an existing nf-core/configs for your Institution e.g. `-profile crick` or `-profile uppmax`\n" + " (3) Using your own local custom config e.g. `-c /path/to/your/custom.config`\n\n" + "Please refer to the quick start section and usage docs for the pipeline.\n " + ) valid_config = false } return valid_config @@ -49,12 +45,14 @@ def checkConfigProvided() { // def checkProfileProvided(nextflow_cli_args) { if (workflow.profile.endsWith(',')) { - error "The `-profile` option cannot end with a trailing comma, please remove it and re-run the pipeline!\n" + - "HINT: A common mistake is to provide multiple values separated by spaces e.g. `-profile test, docker`.\n" + error( + "The `-profile` option cannot end with a trailing comma, please remove it and re-run the pipeline!\n" + "HINT: A common mistake is to provide multiple values separated by spaces e.g. `-profile test, docker`.\n" + ) } if (nextflow_cli_args[0]) { - log.warn "nf-core pipelines do not accept positional arguments. The positional argument `${nextflow_cli_args[0]}` has been detected.\n" + - "HINT: A common mistake is to provide multiple values separated by spaces e.g. `-profile test, docker`.\n" + log.warn( + "nf-core pipelines do not accept positional arguments. The positional argument `${nextflow_cli_args[0]}` has been detected.\n" + "HINT: A common mistake is to provide multiple values separated by spaces e.g. 
`-profile test, docker`.\n" + ) } } @@ -64,19 +62,13 @@ def checkProfileProvided(nextflow_cli_args) { def workflowCitation() { def temp_doi_ref = "" def manifest_doi = workflow.manifest.doi.tokenize(",") - // Using a loop to handle multiple DOIs + // Handling multiple DOIs // Removing `https://doi.org/` to handle pipelines using DOIs vs DOI resolvers // Removing ` ` since the manifest.doi is a string and not a proper list manifest_doi.each { doi_ref -> temp_doi_ref += " https://doi.org/${doi_ref.replace('https://doi.org/', '').replace(' ', '')}\n" } - return "If you use ${workflow.manifest.name} for your analysis please cite:\n\n" + - "* The pipeline\n" + - temp_doi_ref + "\n" + - "* The nf-core framework\n" + - " https://doi.org/10.1038/s41587-020-0439-x\n\n" + - "* Software dependencies\n" + - " https://github.com/${workflow.manifest.name}/blob/master/CITATIONS.md" + return "If you use ${workflow.manifest.name} for your analysis please cite:\n\n" + "* The pipeline\n" + temp_doi_ref + "\n" + "* The nf-core framework\n" + " https://doi.org/10.1038/s41587-020-0439-x\n\n" + "* Software dependencies\n" + " https://github.com/${workflow.manifest.name}/blob/master/CITATIONS.md" } // @@ -102,7 +94,7 @@ def getWorkflowVersion() { // def processVersionsFromYAML(yaml_file) { def yaml = new org.yaml.snakeyaml.Yaml() - def versions = yaml.load(yaml_file).collectEntries { k, v -> [ k.tokenize(':')[-1], v ] } + def versions = yaml.load(yaml_file).collectEntries { k, v -> [k.tokenize(':')[-1], v] } return yaml.dumpAsMap(versions).trim() } @@ -112,8 +104,8 @@ def processVersionsFromYAML(yaml_file) { def workflowVersionToYAML() { return """ Workflow: - $workflow.manifest.name: ${getWorkflowVersion()} - Nextflow: $workflow.nextflow.version + ${workflow.manifest.name}: ${getWorkflowVersion()} + Nextflow: ${workflow.nextflow.version} """.stripIndent().trim() } @@ -121,11 +113,7 @@ def workflowVersionToYAML() { // Get channel of software versions used in pipeline in YAML format // def softwareVersionsToYAML(ch_versions) { - return ch_versions - .unique() - .map { version -> processVersionsFromYAML(version) } - .unique() - .mix(Channel.of(workflowVersionToYAML())) + return ch_versions.unique().map { version -> processVersionsFromYAML(version) }.unique().mix(Channel.of(workflowVersionToYAML())) } // @@ -133,25 +121,31 @@ def softwareVersionsToYAML(ch_versions) { // def paramsSummaryMultiqc(summary_params) { def summary_section = '' - summary_params.keySet().each { group -> - def group_params = summary_params.get(group) // This gets the parameters of that particular group - if (group_params) { - summary_section += "
    <p style=\"font-size:110%\"><b>$group</b></p>\n"
-            summary_section += "    <dl class=\"dl-horizontal\">\n"
-            group_params.keySet().sort().each { param ->
-                summary_section += "        <dt>$param</dt><dd><samp>${group_params.get(param) ?: '<span style=\"color:#999999;\">N/A</a>'}</samp></dd>\n"
+    summary_params
+        .keySet()
+        .each { group ->
+            def group_params = summary_params.get(group)
+            // This gets the parameters of that particular group
+            if (group_params) {
+                summary_section += "    <p style=\"font-size:110%\"><b>${group}</b></p>\n"
+                summary_section += "    <dl class=\"dl-horizontal\">\n"
+                group_params
+                    .keySet()
+                    .sort()
+                    .each { param ->
+                        summary_section += "        <dt>${param}</dt><dd><samp>${group_params.get(param) ?: '<span style=\"color:#999999;\">N/A</a>'}</samp></dd>\n"
+                    }
+                summary_section += "    </dl>\n"
             }
-            summary_section += "    </dl>\n"
         }
-    }
-    def yaml_file_text = "id: '${workflow.manifest.name.replace('/','-')}-summary'\n" as String
-    yaml_file_text += "description: ' - this information is collected when the pipeline is started.'\n"
-    yaml_file_text += "section_name: '${workflow.manifest.name} Workflow Summary'\n"
-    yaml_file_text += "section_href: 'https://github.com/${workflow.manifest.name}'\n"
-    yaml_file_text += "plot_type: 'html'\n"
-    yaml_file_text += "data: |\n"
-    yaml_file_text += "${summary_section}"
+    def yaml_file_text = "id: '${workflow.manifest.name.replace('/', '-')}-summary'\n" as String
+    yaml_file_text += "description: ' - this information is collected when the pipeline is started.'\n"
+    yaml_file_text += "section_name: '${workflow.manifest.name} Workflow Summary'\n"
+    yaml_file_text += "section_href: 'https://github.com/${workflow.manifest.name}'\n"
+    yaml_file_text += "plot_type: 'html'\n"
+    yaml_file_text += "data: |\n"
+    yaml_file_text += "${summary_section}"
     return yaml_file_text
 }

@@ -199,54 +193,54 @@ def logColours(monochrome_logs=true) {
     colorcodes['hidden'] = monochrome_logs ? '' : "\033[8m"

     // Regular Colors
-    colorcodes['black'] = monochrome_logs ? '' : "\033[0;30m"
-    colorcodes['red'] = monochrome_logs ? '' : "\033[0;31m"
-    colorcodes['green'] = monochrome_logs ? '' : "\033[0;32m"
-    colorcodes['yellow'] = monochrome_logs ? '' : "\033[0;33m"
-    colorcodes['blue'] = monochrome_logs ? '' : "\033[0;34m"
-    colorcodes['purple'] = monochrome_logs ? '' : "\033[0;35m"
-    colorcodes['cyan'] = monochrome_logs ? '' : "\033[0;36m"
-    colorcodes['white'] = monochrome_logs ? '' : "\033[0;37m"
+    colorcodes['black'] = monochrome_logs ? '' : "\033[0;30m"
+    colorcodes['red'] = monochrome_logs ? '' : "\033[0;31m"
+    colorcodes['green'] = monochrome_logs ? '' : "\033[0;32m"
+    colorcodes['yellow'] = monochrome_logs ? '' : "\033[0;33m"
+    colorcodes['blue'] = monochrome_logs ? '' : "\033[0;34m"
+    colorcodes['purple'] = monochrome_logs ? '' : "\033[0;35m"
+    colorcodes['cyan'] = monochrome_logs ? '' : "\033[0;36m"
+    colorcodes['white'] = monochrome_logs ? '' : "\033[0;37m"

     // Bold
-    colorcodes['bblack'] = monochrome_logs ? '' : "\033[1;30m"
-    colorcodes['bred'] = monochrome_logs ? '' : "\033[1;31m"
-    colorcodes['bgreen'] = monochrome_logs ? '' : "\033[1;32m"
-    colorcodes['byellow'] = monochrome_logs ? '' : "\033[1;33m"
-    colorcodes['bblue'] = monochrome_logs ? '' : "\033[1;34m"
-    colorcodes['bpurple'] = monochrome_logs ? '' : "\033[1;35m"
-    colorcodes['bcyan'] = monochrome_logs ? '' : "\033[1;36m"
-    colorcodes['bwhite'] = monochrome_logs ? '' : "\033[1;37m"
+    colorcodes['bblack'] = monochrome_logs ? '' : "\033[1;30m"
+    colorcodes['bred'] = monochrome_logs ? '' : "\033[1;31m"
+    colorcodes['bgreen'] = monochrome_logs ? '' : "\033[1;32m"
+    colorcodes['byellow'] = monochrome_logs ? '' : "\033[1;33m"
+    colorcodes['bblue'] = monochrome_logs ? '' : "\033[1;34m"
+    colorcodes['bpurple'] = monochrome_logs ? '' : "\033[1;35m"
+    colorcodes['bcyan'] = monochrome_logs ? '' : "\033[1;36m"
+    colorcodes['bwhite'] = monochrome_logs ? '' : "\033[1;37m"

     // Underline
-    colorcodes['ublack'] = monochrome_logs ? '' : "\033[4;30m"
-    colorcodes['ured'] = monochrome_logs ? '' : "\033[4;31m"
-    colorcodes['ugreen'] = monochrome_logs ? '' : "\033[4;32m"
-    colorcodes['uyellow'] = monochrome_logs ? '' : "\033[4;33m"
-    colorcodes['ublue'] = monochrome_logs ? '' : "\033[4;34m"
-    colorcodes['upurple'] = monochrome_logs ? '' : "\033[4;35m"
-    colorcodes['ucyan'] = monochrome_logs ? '' : "\033[4;36m"
-    colorcodes['uwhite'] = monochrome_logs ?
'' : "\033[4;37m" + colorcodes['ublack'] = monochrome_logs ? '' : "\033[4;30m" + colorcodes['ured'] = monochrome_logs ? '' : "\033[4;31m" + colorcodes['ugreen'] = monochrome_logs ? '' : "\033[4;32m" + colorcodes['uyellow'] = monochrome_logs ? '' : "\033[4;33m" + colorcodes['ublue'] = monochrome_logs ? '' : "\033[4;34m" + colorcodes['upurple'] = monochrome_logs ? '' : "\033[4;35m" + colorcodes['ucyan'] = monochrome_logs ? '' : "\033[4;36m" + colorcodes['uwhite'] = monochrome_logs ? '' : "\033[4;37m" // High Intensity - colorcodes['iblack'] = monochrome_logs ? '' : "\033[0;90m" - colorcodes['ired'] = monochrome_logs ? '' : "\033[0;91m" - colorcodes['igreen'] = monochrome_logs ? '' : "\033[0;92m" - colorcodes['iyellow'] = monochrome_logs ? '' : "\033[0;93m" - colorcodes['iblue'] = monochrome_logs ? '' : "\033[0;94m" - colorcodes['ipurple'] = monochrome_logs ? '' : "\033[0;95m" - colorcodes['icyan'] = monochrome_logs ? '' : "\033[0;96m" - colorcodes['iwhite'] = monochrome_logs ? '' : "\033[0;97m" + colorcodes['iblack'] = monochrome_logs ? '' : "\033[0;90m" + colorcodes['ired'] = monochrome_logs ? '' : "\033[0;91m" + colorcodes['igreen'] = monochrome_logs ? '' : "\033[0;92m" + colorcodes['iyellow'] = monochrome_logs ? '' : "\033[0;93m" + colorcodes['iblue'] = monochrome_logs ? '' : "\033[0;94m" + colorcodes['ipurple'] = monochrome_logs ? '' : "\033[0;95m" + colorcodes['icyan'] = monochrome_logs ? '' : "\033[0;96m" + colorcodes['iwhite'] = monochrome_logs ? '' : "\033[0;97m" // Bold High Intensity - colorcodes['biblack'] = monochrome_logs ? '' : "\033[1;90m" - colorcodes['bired'] = monochrome_logs ? '' : "\033[1;91m" - colorcodes['bigreen'] = monochrome_logs ? '' : "\033[1;92m" - colorcodes['biyellow'] = monochrome_logs ? '' : "\033[1;93m" - colorcodes['biblue'] = monochrome_logs ? '' : "\033[1;94m" - colorcodes['bipurple'] = monochrome_logs ? '' : "\033[1;95m" - colorcodes['bicyan'] = monochrome_logs ? '' : "\033[1;96m" - colorcodes['biwhite'] = monochrome_logs ? '' : "\033[1;97m" + colorcodes['biblack'] = monochrome_logs ? '' : "\033[1;90m" + colorcodes['bired'] = monochrome_logs ? '' : "\033[1;91m" + colorcodes['bigreen'] = monochrome_logs ? '' : "\033[1;92m" + colorcodes['biyellow'] = monochrome_logs ? '' : "\033[1;93m" + colorcodes['biblue'] = monochrome_logs ? '' : "\033[1;94m" + colorcodes['bipurple'] = monochrome_logs ? '' : "\033[1;95m" + colorcodes['bicyan'] = monochrome_logs ? '' : "\033[1;96m" + colorcodes['biwhite'] = monochrome_logs ? 
'' : "\033[1;97m" return colorcodes } @@ -261,14 +255,15 @@ def attachMultiqcReport(multiqc_report) { mqc_report = multiqc_report.getVal() if (mqc_report.getClass() == ArrayList && mqc_report.size() >= 1) { if (mqc_report.size() > 1) { - log.warn "[$workflow.manifest.name] Found multiple reports from process 'MULTIQC', will use only one" + log.warn("[${workflow.manifest.name}] Found multiple reports from process 'MULTIQC', will use only one") } mqc_report = mqc_report[0] } } - } catch (all) { + } + catch (Exception all) { if (multiqc_report) { - log.warn "[$workflow.manifest.name] Could not attach MultiQC report to summary email" + log.warn("[${workflow.manifest.name}] Could not attach MultiQC report to summary email") } } return mqc_report @@ -280,26 +275,35 @@ def attachMultiqcReport(multiqc_report) { def completionEmail(summary_params, email, email_on_fail, plaintext_email, outdir, monochrome_logs=true, multiqc_report=null) { // Set up the e-mail variables - def subject = "[$workflow.manifest.name] Successful: $workflow.runName" + def subject = "[${workflow.manifest.name}] Successful: ${workflow.runName}" if (!workflow.success) { - subject = "[$workflow.manifest.name] FAILED: $workflow.runName" + subject = "[${workflow.manifest.name}] FAILED: ${workflow.runName}" } def summary = [:] - summary_params.keySet().sort().each { group -> - summary << summary_params[group] - } + summary_params + .keySet() + .sort() + .each { group -> + summary << summary_params[group] + } def misc_fields = [:] misc_fields['Date Started'] = workflow.start misc_fields['Date Completed'] = workflow.complete misc_fields['Pipeline script file path'] = workflow.scriptFile misc_fields['Pipeline script hash ID'] = workflow.scriptId - if (workflow.repository) misc_fields['Pipeline repository Git URL'] = workflow.repository - if (workflow.commitId) misc_fields['Pipeline repository Git Commit'] = workflow.commitId - if (workflow.revision) misc_fields['Pipeline Git branch/tag'] = workflow.revision - misc_fields['Nextflow Version'] = workflow.nextflow.version - misc_fields['Nextflow Build'] = workflow.nextflow.build + if (workflow.repository) { + misc_fields['Pipeline repository Git URL'] = workflow.repository + } + if (workflow.commitId) { + misc_fields['Pipeline repository Git Commit'] = workflow.commitId + } + if (workflow.revision) { + misc_fields['Pipeline Git branch/tag'] = workflow.revision + } + misc_fields['Nextflow Version'] = workflow.nextflow.version + misc_fields['Nextflow Build'] = workflow.nextflow.build misc_fields['Nextflow Compile Timestamp'] = workflow.nextflow.timestamp def email_fields = [:] @@ -337,7 +341,7 @@ def completionEmail(summary_params, email, email_on_fail, plaintext_email, outdi // Render the sendmail template def max_multiqc_email_size = (params.containsKey('max_multiqc_email_size') ? 
params.max_multiqc_email_size : 0) as nextflow.util.MemoryUnit
-    def smail_fields = [ email: email_address, subject: subject, email_txt: email_txt, email_html: email_html, projectDir: "${workflow.projectDir}", mqcFile: mqc_report, mqcMaxSize: max_multiqc_email_size.toBytes() ]
+    def smail_fields = [email: email_address, subject: subject, email_txt: email_txt, email_html: email_html, projectDir: "${workflow.projectDir}", mqcFile: mqc_report, mqcMaxSize: max_multiqc_email_size.toBytes()]
     def sf = new File("${workflow.projectDir}/assets/sendmail_template.txt")
     def sendmail_template = engine.createTemplate(sf).make(smail_fields)
     def sendmail_html = sendmail_template.toString()
@@ -346,30 +350,32 @@ def completionEmail(summary_params, email, email_on_fail, plaintext_email, outdi
     def colors = logColours(monochrome_logs) as Map
     if (email_address) {
         try {
-            if (plaintext_email) { throw new org.codehaus.groovy.GroovyException('Send plaintext e-mail, not HTML') }
+            if (plaintext_email) {
+                throw new org.codehaus.groovy.GroovyException('Send plaintext e-mail, not HTML')
+            }
             // Try to send HTML e-mail using sendmail
             def sendmail_tf = new File(workflow.launchDir.toString(), ".sendmail_tmp.html")
             sendmail_tf.withWriter { w -> w << sendmail_html }
-            [ 'sendmail', '-t' ].execute() << sendmail_html
-            log.info "-${colors.purple}[$workflow.manifest.name]${colors.green} Sent summary e-mail to $email_address (sendmail)-"
-        } catch (all) {
+            ['sendmail', '-t'].execute() << sendmail_html
+            log.info("-${colors.purple}[${workflow.manifest.name}]${colors.green} Sent summary e-mail to ${email_address} (sendmail)-")
+        }
+        catch (Exception all) {
             // Catch failures and try with plaintext
-            def mail_cmd = [ 'mail', '-s', subject, '--content-type=text/html', email_address ]
+            def mail_cmd = ['mail', '-s', subject, '--content-type=text/html', email_address]
             mail_cmd.execute() << email_html
-            log.info "-${colors.purple}[$workflow.manifest.name]${colors.green} Sent summary e-mail to $email_address (mail)-"
+            log.info("-${colors.purple}[${workflow.manifest.name}]${colors.green} Sent summary e-mail to ${email_address} (mail)-")
         }
     }

     // Write summary e-mail HTML to a file
     def output_hf = new File(workflow.launchDir.toString(), ".pipeline_report.html")
     output_hf.withWriter { w -> w << email_html }
-    nextflow.extension.FilesEx.copyTo(output_hf.toPath(), "${outdir}/pipeline_info/pipeline_report.html");
+    nextflow.extension.FilesEx.copyTo(output_hf.toPath(), "${outdir}/pipeline_info/pipeline_report.html")
     output_hf.delete()

     // Write summary e-mail TXT to a file
     def output_tf = new File(workflow.launchDir.toString(), ".pipeline_report.txt")
     output_tf.withWriter { w -> w << email_txt }
-    nextflow.extension.FilesEx.copyTo(output_tf.toPath(), "${outdir}/pipeline_info/pipeline_report.txt");
+    nextflow.extension.FilesEx.copyTo(output_tf.toPath(), "${outdir}/pipeline_info/pipeline_report.txt")
     output_tf.delete()
 }
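
completionEmail keeps its long-standing delivery strategy: try HTML via `sendmail`, fall back to `mail`, and force the fallback when `plaintext_email` is set by throwing before the HTML path runs. The `throw` is required here; a bare `new GroovyException(...)` would construct the exception and discard it, silently disabling the plaintext option. The control flow, reduced to a runnable Groovy sketch with stand-in senders:

    // Stand-in senders; the real code shells out to `sendmail -t` and `mail -s`
    def sendHtml = { -> throw new IOException('sendmail unavailable') }
    def sendPlaintext = { -> println 'sent plaintext e-mail' }

    def plaintext_email = true
    try {
        if (plaintext_email) {
            // Routes delivery straight into the plaintext branch below
            throw new org.codehaus.groovy.GroovyException('Send plaintext e-mail, not HTML')
        }
        sendHtml()
    }
    catch (Exception all) {
        // Any failure in the HTML path (including the deliberate throw) lands here
        sendPlaintext()
    }
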
log.info("-${colors.purple}[${workflow.manifest.name}]${colors.yellow} Pipeline completed successfully, but with errored process(es) ${colors.reset}-") } - } else { - log.info "-${colors.purple}[$workflow.manifest.name]${colors.red} Pipeline completed with errors${colors.reset}-" + } + else { + log.info("-${colors.purple}[${workflow.manifest.name}]${colors.red} Pipeline completed with errors${colors.reset}-") } } @@ -394,21 +402,30 @@ def completionSummary(monochrome_logs=true) { // def imNotification(summary_params, hook_url) { def summary = [:] - summary_params.keySet().sort().each { group -> - summary << summary_params[group] - } + summary_params + .keySet() + .sort() + .each { group -> + summary << summary_params[group] + } def misc_fields = [:] - misc_fields['start'] = workflow.start - misc_fields['complete'] = workflow.complete - misc_fields['scriptfile'] = workflow.scriptFile - misc_fields['scriptid'] = workflow.scriptId - if (workflow.repository) misc_fields['repository'] = workflow.repository - if (workflow.commitId) misc_fields['commitid'] = workflow.commitId - if (workflow.revision) misc_fields['revision'] = workflow.revision - misc_fields['nxf_version'] = workflow.nextflow.version - misc_fields['nxf_build'] = workflow.nextflow.build - misc_fields['nxf_timestamp'] = workflow.nextflow.timestamp + misc_fields['start'] = workflow.start + misc_fields['complete'] = workflow.complete + misc_fields['scriptfile'] = workflow.scriptFile + misc_fields['scriptid'] = workflow.scriptId + if (workflow.repository) { + misc_fields['repository'] = workflow.repository + } + if (workflow.commitId) { + misc_fields['commitid'] = workflow.commitId + } + if (workflow.revision) { + misc_fields['revision'] = workflow.revision + } + misc_fields['nxf_version'] = workflow.nextflow.version + misc_fields['nxf_build'] = workflow.nextflow.build + misc_fields['nxf_timestamp'] = workflow.nextflow.timestamp def msg_fields = [:] msg_fields['version'] = getWorkflowVersion() @@ -433,13 +450,13 @@ def imNotification(summary_params, hook_url) { def json_message = json_template.toString() // POST - def post = new URL(hook_url).openConnection(); + def post = new URL(hook_url).openConnection() post.setRequestMethod("POST") post.setDoOutput(true) post.setRequestProperty("Content-Type", "application/json") - post.getOutputStream().write(json_message.getBytes("UTF-8")); - def postRC = post.getResponseCode(); - if (! 
postRC.equals(200)) { - log.warn(post.getErrorStream().getText()); + post.getOutputStream().write(json_message.getBytes("UTF-8")) + def postRC = post.getResponseCode() + if (!postRC.equals(200)) { + log.warn(post.getErrorStream().getText()) } } diff --git a/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test b/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test index 842dc432..8fb30164 100644 --- a/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test +++ b/subworkflows/nf-core/utils_nfschema_plugin/tests/main.nf.test @@ -42,7 +42,7 @@ nextflow_workflow { params { test_data = '' - outdir = 1 + outdir = null } workflow { @@ -94,7 +94,7 @@ nextflow_workflow { params { test_data = '' - outdir = 1 + outdir = null } workflow { diff --git a/tests/nextflow.config b/tests/nextflow.config index 59ffd5da..80312509 100644 --- a/tests/nextflow.config +++ b/tests/nextflow.config @@ -11,6 +11,7 @@ params { // References for test data fasta = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/genome/hg38_chr21_22000000_23000000.fasta" + elfasta = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/genome/hg38_chr21_22000000_23000000.elfasta" fai = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/genome/hg38_chr21_22000000_23000000.fasta.fai" dict = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/genome/hg38_chr21_22000000_23000000.dict" sdf = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/genome/hg38_chr21_22000000_23000000_sdf.tar.gz" @@ -31,6 +32,13 @@ params { cram3 = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/crams/NA24385.cram" crai3 = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/crams/NA24385.cram.crai" + bam1 = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/bams/NA24143.bam" + bai1 = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/bams/NA24143.bam.bai" + bam2 = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/bams/NA24149.bam" + bai2 = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/bams/NA24149.bam.bai" + bam3 = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/bams/NA24385.bam" + bai3 = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/bams/NA24385.bam.bai" + vcf1 = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/vcfs/NA24143.vcf.gz" tbi1 = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/vcfs/NA24143.vcf.gz.tbi" vcf2 = "https://github.com/nf-cmgg/test-datasets/raw/germline/data/genomics/homo_sapiens/illumina/vcfs/NA24149.vcf.gz" @@ -58,7 +66,8 @@ params { igenomes_ignore = true genomes_ignore = true - validationSchemaIgnoreParams = 'genomes,igenomes_base,test_data,cram1,cram2,cram3,crai1,crai2,crai3,vcf1,vcf2,vcf3,tbi1,tbi2,tbi3,gvcf1,gvcf2,gvcf3,gtbi1,gtbi2,gtbi3,famvcf,famtbi,ped,bed,split1,split2,split3' + validationSchemaIgnoreParams = 'genomes,igenomes_base,test_data,cram1,cram2,cram3,crai1,crai2,crai3,vcf1,vcf2,vcf3,tbi1,tbi2,tbi3,gvcf1,gvcf2,gvcf3,gtbi1,gtbi2,gtbi3,famvcf,famtbi,ped,bed,split1,split2,split3,modules_testdata_base_path' + modules_testdata_base_path = 
'https://raw.githubusercontent.com/nf-core/test-datasets/modules/data/' } process { diff --git a/tests/pipeline/callers/main.nf.test b/tests/pipeline/callers/main.nf.test index 18cfcb3a..cf6bfbbd 100644 --- a/tests/pipeline/callers/main.nf.test +++ b/tests/pipeline/callers/main.nf.test @@ -20,14 +20,19 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, + { assert !workflow.stdout }, { assert snapshot( - workflow.stdout, path("${outputDir}") .list() - .findAll { !it.toString().endsWith("pipeline_info") } .collect { getRecursiveFileNames(it, "${outputDir}") } .flatten() - ).match("vardict") } + .findAll { + !(it.contains("/execution_") || it.contains("/params_") || it.contains("/pipeline_")) + } + .collect { + it.replace(getDynamicOutputName(), "_") + } + ).match() } ) } @@ -45,14 +50,19 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, + { assert !workflow.stdout }, { assert snapshot( - workflow.stdout, path("${outputDir}") .list() - .findAll { !it.toString().endsWith("pipeline_info") } .collect { getRecursiveFileNames(it, "${outputDir}") } .flatten() - ).match("haplotypecaller") } + .findAll { + !(it.contains("/execution_") || it.contains("/params_") || it.contains("/pipeline_")) + } + .collect { + it.replace(getDynamicOutputName(), "_") + } + ).match() } ) } @@ -70,14 +80,19 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, + { assert !workflow.stdout }, { assert snapshot( - workflow.stdout, path("${outputDir}") .list() - .findAll { !it.toString().endsWith("pipeline_info") } .collect { getRecursiveFileNames(it, "${outputDir}") } .flatten() - ).match("vardict + haplotypecaller") } + .findAll { + !(it.contains("/execution_") || it.contains("/params_") || it.contains("/pipeline_")) + } + .collect { + it.replace(getDynamicOutputName(), "_") + } + ).match() } ) } @@ -91,3 +106,9 @@ def getRecursiveFileNames(fileOrDir, outputDir) { } return fileOrDir.toString().replace("${outputDir}/", "") } + +def getDynamicOutputName() { + def Map nfcoreYaml = new groovy.yaml.YamlSlurper().parseText(file(".nf-core.yml").text) + def date = new java.text.SimpleDateFormat("yyyy_MM_dd").format(new Date()) + return "v${nfcoreYaml.template.version.replace('.', '_')}_${date}" as String +} diff --git a/tests/pipeline/callers/main.nf.test.snap b/tests/pipeline/callers/main.nf.test.snap index f5629747..dcd0e17c 100644 --- a/tests/pipeline/callers/main.nf.test.snap +++ b/tests/pipeline/callers/main.nf.test.snap @@ -1,154 +1,110 @@ { - "haplotypecaller": { + "pipeline_callers - haplotypecaller": { "content": [ [ - - ], - [ - "NA24143/NA24143.bed", - "NA24143/NA24143.haplotypecaller.g.vcf.gz", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.csi", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.tbi", - "NA24143/reports/NA24143.global.dist.txt", - "NA24143/reports/NA24143.haplotypecaller.bcftools_stats.txt", - "NA24143/reports/NA24143.summary.txt", - "NA24149/NA24149.bed", - "NA24149/NA24149.haplotypecaller.g.vcf.gz", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.csi", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.tbi", - "NA24149/reports/NA24149.global.dist.txt", - "NA24149/reports/NA24149.haplotypecaller.bcftools_stats.txt", - "NA24149/reports/NA24149.summary.txt", - "NA24385/NA24385.bed", - "NA24385/NA24385.haplotypecaller.g.vcf.gz", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.csi", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.tbi", - "NA24385/reports/NA24385.global.dist.txt", - "NA24385/reports/NA24385.haplotypecaller.bcftools_stats.txt", - 
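
The `getDynamicOutputName()` helper added to these tests reconstructs the date-stamped, template-versioned folder name the pipeline now writes into, so snapshots can normalise it away. A standalone sketch of the idea (the version string below is a made-up example; the tests read the real one from `.nf-core.yml`):

    import java.text.SimpleDateFormat

    // Made-up template version; the tests parse the real value from .nf-core.yml
    def templateVersion = '3.0.2'
    def date = new SimpleDateFormat('yyyy_MM_dd').format(new Date())
    def dynamicName = "v${templateVersion.replace('.', '_')}_${date}" as String

    // Replacing the dynamic part with "_" is what yields the stable
    // "Ashkenazim/NA24143__/..." paths seen in the snapshots below
    def path = "Ashkenazim/NA24143_${dynamicName}/NA24143.bed"
    assert path.replace(dynamicName, '_') == 'Ashkenazim/NA24143__/NA24143.bed'
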
"NA24385/reports/NA24385.summary.txt", - "multiqc/multiqc_plots", - "multiqc/multiqc_report.html", - "samplesheet.csv", - "test/Ashkenazim/Ashkenazim.bed", - "test/Ashkenazim/Ashkenazim.haplotypecaller.ped", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.csi", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.tbi", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.bcftools_stats.txt", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.somalier.html" + "Ashkenazim/NA24143__/NA24143.bed", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24149__/NA24149.bed", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24385__/NA24385.bed", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.bed", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.ped", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz.tbi", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.html", + "_/multiqc_report.html", + "_/samplesheet.csv" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T11:39:15.537730988" + "timestamp": "2024-11-13T14:05:33.245638727" }, - "vardict + haplotypecaller": { + "pipeline_callers - vardict + haplotypecaller": { "content": [ [ - - ], - [ - "NA24143/NA24143.bed", - "NA24143/NA24143.haplotypecaller.g.vcf.gz", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.csi", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.tbi", - "NA24143/reports/NA24143.global.dist.txt", - "NA24143/reports/NA24143.haplotypecaller.bcftools_stats.txt", - "NA24143/reports/NA24143.summary.txt", - "NA24149/NA24149.bed", - "NA24149/NA24149.haplotypecaller.g.vcf.gz", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.csi", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.tbi", - "NA24149/reports/NA24149.global.dist.txt", - "NA24149/reports/NA24149.haplotypecaller.bcftools_stats.txt", - "NA24149/reports/NA24149.summary.txt", - "NA24385/NA24385.bed", - "NA24385/NA24385.haplotypecaller.g.vcf.gz", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.csi", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.tbi", - "NA24385/reports/NA24385.global.dist.txt", - "NA24385/reports/NA24385.haplotypecaller.bcftools_stats.txt", - "NA24385/reports/NA24385.summary.txt", - "multiqc/multiqc_plots", - "multiqc/multiqc_report.html", - "samplesheet.csv", - "test/Ashkenazim/Ashkenazim.bed", - "test/Ashkenazim/Ashkenazim.haplotypecaller.ped", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.csi", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.tbi", - "test/Ashkenazim/NA24143.vardict.ped", - "test/Ashkenazim/NA24143.vardict.vcf.gz", - "test/Ashkenazim/NA24143.vardict.vcf.gz.csi", - "test/Ashkenazim/NA24143.vardict.vcf.gz.tbi", - "test/Ashkenazim/NA24149.vardict.ped", - "test/Ashkenazim/NA24149.vardict.vcf.gz", - "test/Ashkenazim/NA24149.vardict.vcf.gz.csi", - 
"test/Ashkenazim/NA24149.vardict.vcf.gz.tbi", - "test/Ashkenazim/NA24385.vardict.ped", - "test/Ashkenazim/NA24385.vardict.vcf.gz", - "test/Ashkenazim/NA24385.vardict.vcf.gz.csi", - "test/Ashkenazim/NA24385.vardict.vcf.gz.tbi", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.bcftools_stats.txt", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.somalier.html", - "test/Ashkenazim/reports/NA24143.vardict.bcftools_stats.txt", - "test/Ashkenazim/reports/NA24143.vardict.somalier.html", - "test/Ashkenazim/reports/NA24149.vardict.bcftools_stats.txt", - "test/Ashkenazim/reports/NA24149.vardict.somalier.html", - "test/Ashkenazim/reports/NA24385.vardict.bcftools_stats.txt", - "test/Ashkenazim/reports/NA24385.vardict.somalier.html" + "Ashkenazim/NA24143__/NA24143.bed", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24149__/NA24149.bed", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24385__/NA24385.bed", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.bed", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.ped", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz.tbi", + "Ashkenazim/output__/NA24143.vardict.ped", + "Ashkenazim/output__/NA24143.vardict.vcf.gz", + "Ashkenazim/output__/NA24143.vardict.vcf.gz.tbi", + "Ashkenazim/output__/NA24149.vardict.ped", + "Ashkenazim/output__/NA24149.vardict.vcf.gz", + "Ashkenazim/output__/NA24149.vardict.vcf.gz.tbi", + "Ashkenazim/output__/NA24385.vardict.ped", + "Ashkenazim/output__/NA24385.vardict.vcf.gz", + "Ashkenazim/output__/NA24385.vardict.vcf.gz.tbi", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.html", + "Ashkenazim/qc__/NA24143.vardict.bcftools_stats.txt", + "Ashkenazim/qc__/NA24143.vardict.html", + "Ashkenazim/qc__/NA24149.vardict.bcftools_stats.txt", + "Ashkenazim/qc__/NA24149.vardict.html", + "Ashkenazim/qc__/NA24385.vardict.bcftools_stats.txt", + "Ashkenazim/qc__/NA24385.vardict.html", + "_/multiqc_report.html", + "_/samplesheet.csv" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T11:56:32.019963111" + "timestamp": "2024-11-13T14:09:45.604865258" }, - "vardict": { + "pipeline_callers - vardict": { "content": [ [ - - ], - [ - "NA24143/NA24143.bed", - "NA24143/reports/NA24143.global.dist.txt", - "NA24143/reports/NA24143.summary.txt", - "NA24149/NA24149.bed", - "NA24149/reports/NA24149.global.dist.txt", - "NA24149/reports/NA24149.summary.txt", - "NA24385/NA24385.bed", - "NA24385/reports/NA24385.global.dist.txt", - "NA24385/reports/NA24385.summary.txt", - "multiqc/multiqc_plots", - "multiqc/multiqc_report.html", - "samplesheet.csv", - "test/Ashkenazim/NA24143.vardict.ped", - "test/Ashkenazim/NA24143.vardict.vcf.gz", - "test/Ashkenazim/NA24143.vardict.vcf.gz.csi", - "test/Ashkenazim/NA24143.vardict.vcf.gz.tbi", - "test/Ashkenazim/NA24149.vardict.ped", - "test/Ashkenazim/NA24149.vardict.vcf.gz", - 
"test/Ashkenazim/NA24149.vardict.vcf.gz.csi", - "test/Ashkenazim/NA24149.vardict.vcf.gz.tbi", - "test/Ashkenazim/NA24385.vardict.ped", - "test/Ashkenazim/NA24385.vardict.vcf.gz", - "test/Ashkenazim/NA24385.vardict.vcf.gz.csi", - "test/Ashkenazim/NA24385.vardict.vcf.gz.tbi", - "test/Ashkenazim/reports/NA24143.vardict.bcftools_stats.txt", - "test/Ashkenazim/reports/NA24143.vardict.somalier.html", - "test/Ashkenazim/reports/NA24149.vardict.bcftools_stats.txt", - "test/Ashkenazim/reports/NA24149.vardict.somalier.html", - "test/Ashkenazim/reports/NA24385.vardict.bcftools_stats.txt", - "test/Ashkenazim/reports/NA24385.vardict.somalier.html" + "Ashkenazim/NA24143__/NA24143.bed", + "Ashkenazim/NA24149__/NA24149.bed", + "Ashkenazim/NA24385__/NA24385.bed", + "Ashkenazim/output__/NA24143.vardict.ped", + "Ashkenazim/output__/NA24143.vardict.vcf.gz", + "Ashkenazim/output__/NA24143.vardict.vcf.gz.tbi", + "Ashkenazim/output__/NA24149.vardict.ped", + "Ashkenazim/output__/NA24149.vardict.vcf.gz", + "Ashkenazim/output__/NA24149.vardict.vcf.gz.tbi", + "Ashkenazim/output__/NA24385.vardict.ped", + "Ashkenazim/output__/NA24385.vardict.vcf.gz", + "Ashkenazim/output__/NA24385.vardict.vcf.gz.tbi", + "Ashkenazim/qc__/NA24143.vardict.bcftools_stats.txt", + "Ashkenazim/qc__/NA24143.vardict.html", + "Ashkenazim/qc__/NA24149.vardict.bcftools_stats.txt", + "Ashkenazim/qc__/NA24149.vardict.html", + "Ashkenazim/qc__/NA24385.vardict.bcftools_stats.txt", + "Ashkenazim/qc__/NA24385.vardict.html", + "_/multiqc_report.html", + "_/samplesheet.csv" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T11:54:10.944847435" + "timestamp": "2024-11-13T14:02:38.547566464" } } \ No newline at end of file diff --git a/tests/pipeline/default/main.nf.test b/tests/pipeline/default/main.nf.test index 31b1a025..6f112d0d 100644 --- a/tests/pipeline/default/main.nf.test +++ b/tests/pipeline/default/main.nf.test @@ -19,14 +19,19 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, + { assert !workflow.stdout }, { assert snapshot( - workflow.stdout, path("${outputDir}") .list() - .findAll { !it.toString().endsWith("pipeline_info") } .collect { getRecursiveFileNames(it, "${outputDir}") } .flatten() - ).match("default") } + .findAll { + !(it.contains("/execution_") || it.contains("/params_") || it.contains("/pipeline_")) + } + .collect { + it.replace(getDynamicOutputName(), "_") + } + ).match() } ) } @@ -40,3 +45,9 @@ def getRecursiveFileNames(fileOrDir, outputDir) { } return fileOrDir.toString().replace("${outputDir}/", "") } + +def getDynamicOutputName() { + def Map nfcoreYaml = new groovy.yaml.YamlSlurper().parseText(file(".nf-core.yml").text) + def date = new java.text.SimpleDateFormat("yyyy_MM_dd").format(new Date()) + return "v${nfcoreYaml.template.version.replace('.', '_')}_${date}" as String +} diff --git a/tests/pipeline/default/main.nf.test.snap b/tests/pipeline/default/main.nf.test.snap index ab43dc3b..3bf2a32c 100644 --- a/tests/pipeline/default/main.nf.test.snap +++ b/tests/pipeline/default/main.nf.test.snap @@ -1,47 +1,33 @@ { - "default": { + "pipeline_default - default": { "content": [ [ - - ], - [ - "NA24143/NA24143.bed", - "NA24143/NA24143.haplotypecaller.g.vcf.gz", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.csi", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.tbi", - "NA24143/reports/NA24143.global.dist.txt", - "NA24143/reports/NA24143.haplotypecaller.bcftools_stats.txt", - "NA24143/reports/NA24143.summary.txt", - 
"NA24149/NA24149.bed", - "NA24149/NA24149.haplotypecaller.g.vcf.gz", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.csi", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.tbi", - "NA24149/reports/NA24149.global.dist.txt", - "NA24149/reports/NA24149.haplotypecaller.bcftools_stats.txt", - "NA24149/reports/NA24149.summary.txt", - "NA24385/NA24385.bed", - "NA24385/NA24385.haplotypecaller.g.vcf.gz", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.csi", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.tbi", - "NA24385/reports/NA24385.global.dist.txt", - "NA24385/reports/NA24385.haplotypecaller.bcftools_stats.txt", - "NA24385/reports/NA24385.summary.txt", - "multiqc/multiqc_plots", - "multiqc/multiqc_report.html", - "samplesheet.csv", - "test/Ashkenazim/Ashkenazim.bed", - "test/Ashkenazim/Ashkenazim.haplotypecaller.ped", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.csi", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.tbi", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.bcftools_stats.txt", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.somalier.html" + "Ashkenazim/NA24143__/NA24143.bed", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24149__/NA24149.bed", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24385__/NA24385.bed", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.bed", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.ped", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz.tbi", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.html", + "_/multiqc_report.html", + "_/samplesheet.csv" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T11:40:36.127517277" + "timestamp": "2024-11-13T13:55:37.532795185" } } \ No newline at end of file diff --git a/tests/pipeline/gvcfs/main.nf.test b/tests/pipeline/gvcfs/main.nf.test index 19ec78f9..694288ff 100644 --- a/tests/pipeline/gvcfs/main.nf.test +++ b/tests/pipeline/gvcfs/main.nf.test @@ -20,14 +20,19 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, + { assert !workflow.stdout }, { assert snapshot( - workflow.stdout, path("${outputDir}") .list() - .findAll { !it.toString().endsWith("pipeline_info") } .collect { getRecursiveFileNames(it, "${outputDir}") } .flatten() - ).match("gvcfs") } + .findAll { + !(it.contains("/execution_") || it.contains("/params_") || it.contains("/pipeline_")) + } + .collect { + it.replace(getDynamicOutputName(), "_") + } + ).match() } ) } @@ -41,3 +46,9 @@ def getRecursiveFileNames(fileOrDir, outputDir) { } return fileOrDir.toString().replace("${outputDir}/", "") } + +def getDynamicOutputName() { + def Map nfcoreYaml = new groovy.yaml.YamlSlurper().parseText(file(".nf-core.yml").text) + def date = new java.text.SimpleDateFormat("yyyy_MM_dd").format(new Date()) + return "v${nfcoreYaml.template.version.replace('.', '_')}_${date}" 
as String +} diff --git a/tests/pipeline/gvcfs/main.nf.test.snap b/tests/pipeline/gvcfs/main.nf.test.snap index 3b3b89d8..2cd80e35 100644 --- a/tests/pipeline/gvcfs/main.nf.test.snap +++ b/tests/pipeline/gvcfs/main.nf.test.snap @@ -1,26 +1,21 @@ { - "gvcfs": { + "pipeline_default - gvcfs": { "content": [ [ - - ], - [ - "multiqc/multiqc_plots", - "multiqc/multiqc_report.html", - "samplesheet.csv", - "test/Ashkenazim/Ashkenazim.bed", - "test/Ashkenazim/Ashkenazim.haplotypecaller.ped", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.csi", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.tbi", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.bcftools_stats.txt", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.somalier.html" + "Ashkenazim/output__/Ashkenazim.haplotypecaller.bed", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.ped", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz.tbi", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.html", + "_/multiqc_report.html", + "_/samplesheet.csv" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T11:41:32.108980116" + "timestamp": "2024-11-13T14:14:20.958812996" } } \ No newline at end of file diff --git a/tests/pipeline/variations/main.nf.test b/tests/pipeline/variations/main.nf.test index 738ec677..837a84be 100644 --- a/tests/pipeline/variations/main.nf.test +++ b/tests/pipeline/variations/main.nf.test @@ -20,14 +20,19 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, + { assert !workflow.stdout }, { assert snapshot( - workflow.stdout, path("${outputDir}") .list() - .findAll { !it.toString().endsWith("pipeline_info") } .collect { getRecursiveFileNames(it, "${outputDir}") } .flatten() - ).match("annotate") } + .findAll { + !(it.contains("/execution_") || it.contains("/params_") || it.contains("/pipeline_")) + } + .collect { + it.replace(getDynamicOutputName(), "_") + } + ).match() } ) } @@ -46,14 +51,19 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, + { assert !workflow.stdout }, { assert snapshot( - workflow.stdout, path("${outputDir}") .list() - .findAll { !it.toString().endsWith("pipeline_info") } .collect { getRecursiveFileNames(it, "${outputDir}") } .flatten() - ).match("annotate + vcfanno") } + .findAll { + !(it.contains("/execution_") || it.contains("/params_") || it.contains("/pipeline_")) + } + .collect { + it.replace(getDynamicOutputName(), "_") + } + ).match() } ) } @@ -71,14 +81,19 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, + { assert !workflow.stdout }, { assert snapshot( - workflow.stdout, path("${outputDir}") .list() - .findAll { !it.toString().endsWith("pipeline_info") } .collect { getRecursiveFileNames(it, "${outputDir}") } .flatten() - ).match("filter") } + .findAll { + !(it.contains("/execution_") || it.contains("/params_") || it.contains("/pipeline_")) + } + .collect { + it.replace(getDynamicOutputName(), "_") + } + ).match() } ) } @@ -96,14 +111,19 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, + { assert !workflow.stdout }, { assert snapshot( - workflow.stdout, path("${outputDir}") .list() - .findAll { !it.toString().endsWith("pipeline_info") } .collect { getRecursiveFileNames(it, "${outputDir}") } .flatten() - ).match("only_call") } + .findAll { + 
!(it.contains("/execution_") || it.contains("/params_") || it.contains("/pipeline_")) + } + .collect { + it.replace(getDynamicOutputName(), "_") + } + ).match() } ) } @@ -122,14 +142,19 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, + { assert !workflow.stdout }, { assert snapshot( - workflow.stdout, path("${outputDir}") .list() - .findAll { !it.toString().endsWith("pipeline_info") } .collect { getRecursiveFileNames(it, "${outputDir}") } .flatten() - ).match("only_merge") } + .findAll { + !(it.contains("/execution_") || it.contains("/params_") || it.contains("/pipeline_")) + } + .collect { + it.replace(getDynamicOutputName(), "_") + } + ).match() } ) } @@ -147,14 +172,19 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, + { assert !workflow.stdout }, { assert snapshot( - workflow.stdout, path("${outputDir}") .list() - .findAll { !it.toString().endsWith("pipeline_info") } .collect { getRecursiveFileNames(it, "${outputDir}") } .flatten() - ).match("automap") } + .findAll { + !(it.contains("/execution_") || it.contains("/params_") || it.contains("/pipeline_")) + } + .collect { + it.replace(getDynamicOutputName(), "_") + } + ).match() } ) } @@ -167,3 +197,9 @@ def getRecursiveFileNames(fileOrDir, outputDir) { } return fileOrDir.toString().replace("${outputDir}/", "") } + +def getDynamicOutputName() { + def Map nfcoreYaml = new groovy.yaml.YamlSlurper().parseText(file(".nf-core.yml").text) + def date = new java.text.SimpleDateFormat("yyyy_MM_dd").format(new Date()) + return "v${nfcoreYaml.template.version.replace('.', '_')}_${date}" as String +} diff --git a/tests/pipeline/variations/main.nf.test.snap b/tests/pipeline/variations/main.nf.test.snap index 348a2942..95999b13 100644 --- a/tests/pipeline/variations/main.nf.test.snap +++ b/tests/pipeline/variations/main.nf.test.snap @@ -1,272 +1,189 @@ { - "filter": { + "pipeline_variations - only_merge": { "content": [ [ - - ], - [ - "NA24143/NA24143.bed", - "NA24143/NA24143.haplotypecaller.g.vcf.gz", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.csi", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.tbi", - "NA24143/reports/NA24143.global.dist.txt", - "NA24143/reports/NA24143.haplotypecaller.bcftools_stats.txt", - "NA24143/reports/NA24143.summary.txt", - "NA24149/NA24149.bed", - "NA24149/NA24149.haplotypecaller.g.vcf.gz", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.csi", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.tbi", - "NA24149/reports/NA24149.global.dist.txt", - "NA24149/reports/NA24149.haplotypecaller.bcftools_stats.txt", - "NA24149/reports/NA24149.summary.txt", - "NA24385/NA24385.bed", - "NA24385/NA24385.haplotypecaller.g.vcf.gz", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.csi", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.tbi", - "NA24385/reports/NA24385.global.dist.txt", - "NA24385/reports/NA24385.haplotypecaller.bcftools_stats.txt", - "NA24385/reports/NA24385.summary.txt", - "multiqc/multiqc_plots", - "multiqc/multiqc_report.html", - "samplesheet.csv", - "test/Ashkenazim/Ashkenazim.bed", - "test/Ashkenazim/Ashkenazim.haplotypecaller.ped", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.csi", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.tbi", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.bcftools_stats.txt", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.somalier.html" + "Ashkenazim/NA24143__/NA24143.bed", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.bcftools_stats.txt", + 
"Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24149__/NA24149.bed", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24385__/NA24385.bed", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/output__/Ashkenazim_haplotypecaller_genomicsdb", + "_/multiqc_report.html", + "_/samplesheet.csv" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T11:45:02.491962487" + "timestamp": "2024-11-13T14:28:35.043452694" }, - "only_call": { + "pipeline_variations - filter": { "content": [ [ - - ], - [ - "NA24143/NA24143.bed", - "NA24143/NA24143.haplotypecaller.g.vcf.gz", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.csi", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.tbi", - "NA24143/reports/NA24143.global.dist.txt", - "NA24143/reports/NA24143.haplotypecaller.bcftools_stats.txt", - "NA24143/reports/NA24143.summary.txt", - "NA24149/NA24149.bed", - "NA24149/NA24149.haplotypecaller.g.vcf.gz", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.csi", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.tbi", - "NA24149/reports/NA24149.global.dist.txt", - "NA24149/reports/NA24149.haplotypecaller.bcftools_stats.txt", - "NA24149/reports/NA24149.summary.txt", - "NA24385/NA24385.bed", - "NA24385/NA24385.haplotypecaller.g.vcf.gz", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.csi", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.tbi", - "NA24385/reports/NA24385.global.dist.txt", - "NA24385/reports/NA24385.haplotypecaller.bcftools_stats.txt", - "NA24385/reports/NA24385.summary.txt", - "multiqc/multiqc_plots", - "multiqc/multiqc_report.html", - "samplesheet.csv" + "Ashkenazim/NA24143__/NA24143.bed", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24149__/NA24149.bed", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24385__/NA24385.bed", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.bed", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.ped", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz.tbi", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.html", + "_/multiqc_report.html", + "_/samplesheet.csv" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T11:45:47.266012698" + "timestamp": "2024-11-13T14:24:24.556080575" }, - "annotate + vcfanno": { + "pipeline_variations - annotate + vcfanno": { "content": [ [ - - ], - [ - "NA24143/NA24143.bed", - "NA24143/NA24143.haplotypecaller.g.vcf.gz", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.csi", - 
"NA24143/NA24143.haplotypecaller.g.vcf.gz.tbi", - "NA24143/reports/NA24143.global.dist.txt", - "NA24143/reports/NA24143.haplotypecaller.bcftools_stats.txt", - "NA24143/reports/NA24143.summary.txt", - "NA24149/NA24149.bed", - "NA24149/NA24149.haplotypecaller.g.vcf.gz", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.csi", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.tbi", - "NA24149/reports/NA24149.global.dist.txt", - "NA24149/reports/NA24149.haplotypecaller.bcftools_stats.txt", - "NA24149/reports/NA24149.summary.txt", - "NA24385/NA24385.bed", - "NA24385/NA24385.haplotypecaller.g.vcf.gz", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.csi", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.tbi", - "NA24385/reports/NA24385.global.dist.txt", - "NA24385/reports/NA24385.haplotypecaller.bcftools_stats.txt", - "NA24385/reports/NA24385.summary.txt", - "multiqc/multiqc_plots", - "multiqc/multiqc_report.html", - "samplesheet.csv", - "test/Ashkenazim/Ashkenazim.bed", - "test/Ashkenazim/Ashkenazim.haplotypecaller.ped", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.csi", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.gzi", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.tbi", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.bcftools_stats.txt", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.somalier.html" + "Ashkenazim/NA24143__/NA24143.bed", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24149__/NA24149.bed", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24385__/NA24385.bed", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.bed", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.ped", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz.tbi", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.html", + "_/multiqc_report.html", + "_/samplesheet.csv" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T11:43:47.539727235" + "timestamp": "2024-11-13T14:21:15.80188879" }, - "automap": { + "pipeline_variations - annotate": { "content": [ [ - - ], - [ - "NA24143/NA24143.bed", - "NA24143/NA24143.haplotypecaller.g.vcf.gz", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.csi", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.tbi", - "NA24143/reports/NA24143.global.dist.txt", - "NA24143/reports/NA24143.haplotypecaller.bcftools_stats.txt", - "NA24143/reports/NA24143.summary.txt", - "NA24149/NA24149.bed", - "NA24149/NA24149.haplotypecaller.g.vcf.gz", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.csi", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.tbi", - "NA24149/reports/NA24149.global.dist.txt", - "NA24149/reports/NA24149.haplotypecaller.bcftools_stats.txt", - "NA24149/reports/NA24149.summary.txt", - "NA24385/NA24385.bed", - "NA24385/NA24385.haplotypecaller.g.vcf.gz", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.csi", - 
"NA24385/NA24385.haplotypecaller.g.vcf.gz.tbi", - "NA24385/reports/NA24385.global.dist.txt", - "NA24385/reports/NA24385.haplotypecaller.bcftools_stats.txt", - "NA24385/reports/NA24385.summary.txt", - "multiqc/multiqc_plots", - "multiqc/multiqc_report.html", - "samplesheet.csv", - "test/Ashkenazim/Ashkenazim.bed", - "test/Ashkenazim/Ashkenazim.haplotypecaller.ped", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.csi", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.tbi", - "test/Ashkenazim/automap_haplotypecaller/sample1/sample1.HomRegions.cmgg_bio.tsv", - "test/Ashkenazim/automap_haplotypecaller/sample1/sample1.HomRegions.pdf", - "test/Ashkenazim/automap_haplotypecaller/sample1/sample1.HomRegions.strict.cmgg_bio.tsv", - "test/Ashkenazim/automap_haplotypecaller/sample1/sample1.HomRegions.tsv", - "test/Ashkenazim/automap_haplotypecaller/sample2/sample2.HomRegions.cmgg_bio.tsv", - "test/Ashkenazim/automap_haplotypecaller/sample2/sample2.HomRegions.pdf", - "test/Ashkenazim/automap_haplotypecaller/sample2/sample2.HomRegions.strict.cmgg_bio.tsv", - "test/Ashkenazim/automap_haplotypecaller/sample2/sample2.HomRegions.tsv", - "test/Ashkenazim/automap_haplotypecaller/sample3/sample3.HomRegions.cmgg_bio.tsv", - "test/Ashkenazim/automap_haplotypecaller/sample3/sample3.HomRegions.pdf", - "test/Ashkenazim/automap_haplotypecaller/sample3/sample3.HomRegions.strict.cmgg_bio.tsv", - "test/Ashkenazim/automap_haplotypecaller/sample3/sample3.HomRegions.tsv", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.bcftools_stats.txt", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.somalier.html" + "Ashkenazim/NA24143__/NA24143.bed", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24149__/NA24149.bed", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24385__/NA24385.bed", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.bed", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.ped", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz.tbi", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.html", + "_/multiqc_report.html", + "_/samplesheet.csv" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T11:47:48.972236882" + "timestamp": "2024-11-13T14:17:45.975261786" }, - "only_merge": { + "pipeline_variations - only_call": { "content": [ [ - - ], - [ - "NA24143/NA24143.bed", - "NA24143/NA24143.haplotypecaller.g.vcf.gz", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.csi", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.tbi", - "NA24143/reports/NA24143.global.dist.txt", - "NA24143/reports/NA24143.haplotypecaller.bcftools_stats.txt", - "NA24143/reports/NA24143.summary.txt", - "NA24149/NA24149.bed", - "NA24149/NA24149.haplotypecaller.g.vcf.gz", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.csi", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.tbi", - 
"NA24149/reports/NA24149.global.dist.txt", - "NA24149/reports/NA24149.haplotypecaller.bcftools_stats.txt", - "NA24149/reports/NA24149.summary.txt", - "NA24385/NA24385.bed", - "NA24385/NA24385.haplotypecaller.g.vcf.gz", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.csi", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.tbi", - "NA24385/reports/NA24385.global.dist.txt", - "NA24385/reports/NA24385.haplotypecaller.bcftools_stats.txt", - "NA24385/reports/NA24385.summary.txt", - "multiqc/multiqc_plots", - "multiqc/multiqc_report.html", - "samplesheet.csv", - "test/Ashkenazim/genomicsdb_Ashkenazim" + "Ashkenazim/NA24143__/NA24143.bed", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24149__/NA24149.bed", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24385__/NA24385.bed", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz.tbi", + "_/multiqc_report.html", + "_/samplesheet.csv" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T11:46:37.403799074" + "timestamp": "2024-11-13T14:26:32.219983438" }, - "annotate": { + "pipeline_variations - automap": { "content": [ [ - - ], - [ - "NA24143/NA24143.bed", - "NA24143/NA24143.haplotypecaller.g.vcf.gz", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.csi", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.tbi", - "NA24143/reports/NA24143.global.dist.txt", - "NA24143/reports/NA24143.haplotypecaller.bcftools_stats.txt", - "NA24143/reports/NA24143.summary.txt", - "NA24149/NA24149.bed", - "NA24149/NA24149.haplotypecaller.g.vcf.gz", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.csi", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.tbi", - "NA24149/reports/NA24149.global.dist.txt", - "NA24149/reports/NA24149.haplotypecaller.bcftools_stats.txt", - "NA24149/reports/NA24149.summary.txt", - "NA24385/NA24385.bed", - "NA24385/NA24385.haplotypecaller.g.vcf.gz", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.csi", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.tbi", - "NA24385/reports/NA24385.global.dist.txt", - "NA24385/reports/NA24385.haplotypecaller.bcftools_stats.txt", - "NA24385/reports/NA24385.summary.txt", - "multiqc/multiqc_plots", - "multiqc/multiqc_report.html", - "samplesheet.csv", - "test/Ashkenazim/Ashkenazim.bed", - "test/Ashkenazim/Ashkenazim.haplotypecaller.ped", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.csi", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.tbi", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.bcftools_stats.txt", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.somalier.html" + "Ashkenazim/NA24143__/NA24143.bed", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24149__/NA24149.bed", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz.tbi", + 
"Ashkenazim/NA24385__/NA24385.bed", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.bed", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.ped", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz.tbi", + "Ashkenazim/output__/automap/haplotypecaller/sample1/sample1.HomRegions.cmgg_bio.tsv", + "Ashkenazim/output__/automap/haplotypecaller/sample1/sample1.HomRegions.pdf", + "Ashkenazim/output__/automap/haplotypecaller/sample1/sample1.HomRegions.strict.cmgg_bio.tsv", + "Ashkenazim/output__/automap/haplotypecaller/sample1/sample1.HomRegions.tsv", + "Ashkenazim/output__/automap/haplotypecaller/sample2/sample2.HomRegions.cmgg_bio.tsv", + "Ashkenazim/output__/automap/haplotypecaller/sample2/sample2.HomRegions.pdf", + "Ashkenazim/output__/automap/haplotypecaller/sample2/sample2.HomRegions.strict.cmgg_bio.tsv", + "Ashkenazim/output__/automap/haplotypecaller/sample2/sample2.HomRegions.tsv", + "Ashkenazim/output__/automap/haplotypecaller/sample3/sample3.HomRegions.cmgg_bio.tsv", + "Ashkenazim/output__/automap/haplotypecaller/sample3/sample3.HomRegions.pdf", + "Ashkenazim/output__/automap/haplotypecaller/sample3/sample3.HomRegions.strict.cmgg_bio.tsv", + "Ashkenazim/output__/automap/haplotypecaller/sample3/sample3.HomRegions.tsv", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.html", + "_/multiqc_report.html", + "_/samplesheet.csv" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T11:42:39.793880467" + "timestamp": "2024-11-13T14:31:35.104677632" } } \ No newline at end of file diff --git a/tests/pipeline/variations2/main.nf.test b/tests/pipeline/variations2/main.nf.test index 58ad40a6..d61fe35b 100644 --- a/tests/pipeline/variations2/main.nf.test +++ b/tests/pipeline/variations2/main.nf.test @@ -20,14 +20,19 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, + { assert !workflow.stdout }, { assert snapshot( - workflow.stdout, path("${outputDir}") .list() - .findAll { !it.toString().endsWith("pipeline_info") } .collect { getRecursiveFileNames(it, "${outputDir}") } .flatten() - ).match("normalize") } + .findAll { + !(it.contains("/execution_") || it.contains("/params_") || it.contains("/pipeline_")) + } + .collect { + it.replace(getDynamicOutputName(), "_") + } + ).match() } ) } @@ -45,14 +50,19 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, + { assert !workflow.stdout }, { assert snapshot( - workflow.stdout, path("${outputDir}") .list() - .findAll { !it.toString().endsWith("pipeline_info") } .collect { getRecursiveFileNames(it, "${outputDir}") } .flatten() - ).match("updio") } + .findAll { + !(it.contains("/execution_") || it.contains("/params_") || it.contains("/pipeline_")) + } + .collect { + it.replace(getDynamicOutputName(), "_") + } + ).match() } ) } @@ -70,14 +80,19 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, + { assert !workflow.stdout }, { assert snapshot( - workflow.stdout, path("${outputDir}") .list() - .findAll { !it.toString().endsWith("pipeline_info") } .collect { getRecursiveFileNames(it, "${outputDir}") } .flatten() - ).match("gemini") } + .findAll { + !(it.contains("/execution_") || it.contains("/params_") || 
it.contains("/pipeline_")) + } + .collect { + it.replace(getDynamicOutputName(), "_") + } + ).match() } ) } @@ -95,14 +110,19 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, + { assert !workflow.stdout }, { assert snapshot( - workflow.stdout, path("${outputDir}") .list() - .findAll { !it.toString().endsWith("pipeline_info") } .collect { getRecursiveFileNames(it, "${outputDir}") } .flatten() - ).match("validate") } + .findAll { + !(it.contains("/execution_") || it.contains("/params_") || it.contains("/pipeline_")) + } + .collect { + it.replace(getDynamicOutputName(), "_") + } + ).match() } ) } @@ -120,14 +140,19 @@ nextflow_pipeline { then { assertAll( { assert workflow.success }, + { assert !workflow.stdout }, { assert snapshot( - workflow.stdout, path("${outputDir}") .list() - .findAll { !it.toString().endsWith("pipeline_info") } .collect { getRecursiveFileNames(it, "${outputDir}") } .flatten() - ).match("add_ped") } + .findAll { + !(it.contains("/execution_") || it.contains("/params_") || it.contains("/pipeline_")) + } + .collect { + it.replace(getDynamicOutputName(), "_") + } + ).match() } ) } @@ -141,3 +166,9 @@ def getRecursiveFileNames(fileOrDir, outputDir) { } return fileOrDir.toString().replace("${outputDir}/", "") } + +def getDynamicOutputName() { + def Map nfcoreYaml = new groovy.yaml.YamlSlurper().parseText(file(".nf-core.yml").text) + def date = new java.text.SimpleDateFormat("yyyy_MM_dd").format(new Date()) + return "v${nfcoreYaml.template.version.replace('.', '_')}_${date}" as String +} diff --git a/tests/pipeline/variations2/main.nf.test.snap b/tests/pipeline/variations2/main.nf.test.snap index 053dd3eb..e0053c6b 100644 --- a/tests/pipeline/variations2/main.nf.test.snap +++ b/tests/pipeline/variations2/main.nf.test.snap @@ -1,285 +1,215 @@ { - "gemini": { + "pipeline_variations - add_ped": { "content": [ [ - - ], - [ - "NA24143/NA24143.bed", - "NA24143/NA24143.haplotypecaller.g.vcf.gz", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.csi", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.tbi", - "NA24143/reports/NA24143.global.dist.txt", - "NA24143/reports/NA24143.haplotypecaller.bcftools_stats.txt", - "NA24143/reports/NA24143.summary.txt", - "NA24149/NA24149.bed", - "NA24149/NA24149.haplotypecaller.g.vcf.gz", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.csi", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.tbi", - "NA24149/reports/NA24149.global.dist.txt", - "NA24149/reports/NA24149.haplotypecaller.bcftools_stats.txt", - "NA24149/reports/NA24149.summary.txt", - "NA24385/NA24385.bed", - "NA24385/NA24385.haplotypecaller.g.vcf.gz", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.csi", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.tbi", - "NA24385/reports/NA24385.global.dist.txt", - "NA24385/reports/NA24385.haplotypecaller.bcftools_stats.txt", - "NA24385/reports/NA24385.summary.txt", - "multiqc/multiqc_plots", - "multiqc/multiqc_report.html", - "samplesheet.csv", - "test/Ashkenazim/Ashkenazim.bed", - "test/Ashkenazim/Ashkenazim.haplotypecaller.db", - "test/Ashkenazim/Ashkenazim.haplotypecaller.ped", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.csi", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.tbi", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.bcftools_stats.txt", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.somalier.html" + "Ashkenazim/NA24143__/NA24143.bed", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz", + 
"Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24149__/NA24149.bed", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24385__/NA24385.bed", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.bed", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.ped", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz.tbi", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.html", + "_/multiqc_report.html", + "_/samplesheet.csv" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T11:59:55.764681555" + "timestamp": "2024-11-13T14:48:10.189725508" }, - "normalize": { + "pipeline_variations - normalize": { "content": [ [ - - ], - [ - "NA24143/NA24143.bed", - "NA24143/NA24143.haplotypecaller.g.vcf.gz", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.csi", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.tbi", - "NA24143/reports/NA24143.global.dist.txt", - "NA24143/reports/NA24143.haplotypecaller.bcftools_stats.txt", - "NA24143/reports/NA24143.summary.txt", - "NA24149/NA24149.bed", - "NA24149/NA24149.haplotypecaller.g.vcf.gz", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.csi", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.tbi", - "NA24149/reports/NA24149.global.dist.txt", - "NA24149/reports/NA24149.haplotypecaller.bcftools_stats.txt", - "NA24149/reports/NA24149.summary.txt", - "NA24385/NA24385.bed", - "NA24385/NA24385.haplotypecaller.g.vcf.gz", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.csi", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.tbi", - "NA24385/reports/NA24385.global.dist.txt", - "NA24385/reports/NA24385.haplotypecaller.bcftools_stats.txt", - "NA24385/reports/NA24385.summary.txt", - "multiqc/multiqc_plots", - "multiqc/multiqc_report.html", - "samplesheet.csv", - "test/Ashkenazim/Ashkenazim.bed", - "test/Ashkenazim/Ashkenazim.haplotypecaller.ped", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.csi", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.tbi", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.bcftools_stats.txt", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.somalier.html" + "Ashkenazim/NA24143__/NA24143.bed", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24149__/NA24149.bed", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24385__/NA24385.bed", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.bed", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.ped", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz", + 
"Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz.tbi", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.html", + "_/multiqc_report.html", + "_/samplesheet.csv" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T11:57:36.725019238" + "timestamp": "2024-11-13T14:34:50.228561252" }, - "updio": { + "pipeline_variations - updio": { "content": [ [ - - ], - [ - "NA24143/NA24143.bed", - "NA24143/NA24143.haplotypecaller.g.vcf.gz", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.csi", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.tbi", - "NA24143/reports/NA24143.global.dist.txt", - "NA24143/reports/NA24143.haplotypecaller.bcftools_stats.txt", - "NA24143/reports/NA24143.summary.txt", - "NA24149/NA24149.bed", - "NA24149/NA24149.haplotypecaller.g.vcf.gz", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.csi", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.tbi", - "NA24149/reports/NA24149.global.dist.txt", - "NA24149/reports/NA24149.haplotypecaller.bcftools_stats.txt", - "NA24149/reports/NA24149.summary.txt", - "NA24385/NA24385.bed", - "NA24385/NA24385.haplotypecaller.g.vcf.gz", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.csi", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.tbi", - "NA24385/reports/NA24385.global.dist.txt", - "NA24385/reports/NA24385.haplotypecaller.bcftools_stats.txt", - "NA24385/reports/NA24385.summary.txt", - "multiqc/multiqc_plots", - "multiqc/multiqc_report.html", - "samplesheet.csv", - "test/Ashkenazim/Ashkenazim.bed", - "test/Ashkenazim/Ashkenazim.haplotypecaller.ped", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.csi", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.tbi", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.bcftools_stats.txt", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.somalier.html" + "Ashkenazim/NA24143__/NA24143.bed", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24149__/NA24149.bed", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24385__/NA24385.bed", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.bed", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.ped", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz.tbi", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.html", + "_/multiqc_report.html", + "_/samplesheet.csv" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T11:58:51.377552309" + "timestamp": "2024-11-13T14:38:05.475938734" }, - "add_ped": { + "pipeline_variations - gemini": { "content": [ [ - - ], - [ - "NA24143/NA24143.bed", - "NA24143/NA24143.haplotypecaller.g.vcf.gz", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.csi", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.tbi", - "NA24143/reports/NA24143.global.dist.txt", - 
"NA24143/reports/NA24143.haplotypecaller.bcftools_stats.txt", - "NA24143/reports/NA24143.summary.txt", - "NA24149/NA24149.bed", - "NA24149/NA24149.haplotypecaller.g.vcf.gz", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.csi", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.tbi", - "NA24149/reports/NA24149.global.dist.txt", - "NA24149/reports/NA24149.haplotypecaller.bcftools_stats.txt", - "NA24149/reports/NA24149.summary.txt", - "NA24385/NA24385.bed", - "NA24385/NA24385.haplotypecaller.g.vcf.gz", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.csi", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.tbi", - "NA24385/reports/NA24385.global.dist.txt", - "NA24385/reports/NA24385.haplotypecaller.bcftools_stats.txt", - "NA24385/reports/NA24385.summary.txt", - "multiqc/multiqc_plots", - "multiqc/multiqc_report.html", - "samplesheet.csv", - "test/Ashkenazim/Ashkenazim.bed", - "test/Ashkenazim/Ashkenazim.haplotypecaller.ped", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.csi", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.tbi", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.bcftools_stats.txt", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.somalier.html" + "Ashkenazim/NA24143__/NA24143.bed", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24149__/NA24149.bed", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24385__/NA24385.bed", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.bed", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.db", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.ped", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz.tbi", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.html", + "_/multiqc_report.html", + "_/samplesheet.csv" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T12:02:31.850254513" + "timestamp": "2024-11-13T14:41:11.213707091" }, - "validate": { + "pipeline_variations - validate": { "content": [ [ - - ], - [ - "NA24143/NA24143.bed", - "NA24143/NA24143.haplotypecaller.g.vcf.gz", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.csi", - "NA24143/NA24143.haplotypecaller.g.vcf.gz.tbi", - "NA24143/reports/NA24143.global.dist.txt", - "NA24143/reports/NA24143.haplotypecaller.bcftools_stats.txt", - "NA24143/reports/NA24143.summary.txt", - "NA24143/validation/haplotypecaller/NA24143.fn.vcf.gz", - "NA24143/validation/haplotypecaller/NA24143.fn.vcf.gz.tbi", - "NA24143/validation/haplotypecaller/NA24143.fp.vcf.gz", - "NA24143/validation/haplotypecaller/NA24143.fp.vcf.gz.tbi", - "NA24143/validation/haplotypecaller/NA24143.non_snp.png", - "NA24143/validation/haplotypecaller/NA24143.non_snp.svg", - "NA24143/validation/haplotypecaller/NA24143.non_snp_roc.tsv.gz", - "NA24143/validation/haplotypecaller/NA24143.phasing.txt", - "NA24143/validation/haplotypecaller/NA24143.snp.png", - 
"NA24143/validation/haplotypecaller/NA24143.snp.svg", - "NA24143/validation/haplotypecaller/NA24143.snp_roc.tsv.gz", - "NA24143/validation/haplotypecaller/NA24143.summary.txt", - "NA24143/validation/haplotypecaller/NA24143.tp-baseline.vcf.gz", - "NA24143/validation/haplotypecaller/NA24143.tp-baseline.vcf.gz.tbi", - "NA24143/validation/haplotypecaller/NA24143.tp.vcf.gz", - "NA24143/validation/haplotypecaller/NA24143.tp.vcf.gz.tbi", - "NA24143/validation/haplotypecaller/NA24143.weighted.png", - "NA24143/validation/haplotypecaller/NA24143.weighted.svg", - "NA24143/validation/haplotypecaller/NA24143.weighted_roc.tsv.gz", - "NA24149/NA24149.bed", - "NA24149/NA24149.haplotypecaller.g.vcf.gz", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.csi", - "NA24149/NA24149.haplotypecaller.g.vcf.gz.tbi", - "NA24149/reports/NA24149.global.dist.txt", - "NA24149/reports/NA24149.haplotypecaller.bcftools_stats.txt", - "NA24149/reports/NA24149.summary.txt", - "NA24149/validation/haplotypecaller/NA24149.fn.vcf.gz", - "NA24149/validation/haplotypecaller/NA24149.fn.vcf.gz.tbi", - "NA24149/validation/haplotypecaller/NA24149.fp.vcf.gz", - "NA24149/validation/haplotypecaller/NA24149.fp.vcf.gz.tbi", - "NA24149/validation/haplotypecaller/NA24149.non_snp.png", - "NA24149/validation/haplotypecaller/NA24149.non_snp.svg", - "NA24149/validation/haplotypecaller/NA24149.non_snp_roc.tsv.gz", - "NA24149/validation/haplotypecaller/NA24149.phasing.txt", - "NA24149/validation/haplotypecaller/NA24149.snp.png", - "NA24149/validation/haplotypecaller/NA24149.snp.svg", - "NA24149/validation/haplotypecaller/NA24149.snp_roc.tsv.gz", - "NA24149/validation/haplotypecaller/NA24149.summary.txt", - "NA24149/validation/haplotypecaller/NA24149.tp-baseline.vcf.gz", - "NA24149/validation/haplotypecaller/NA24149.tp-baseline.vcf.gz.tbi", - "NA24149/validation/haplotypecaller/NA24149.tp.vcf.gz", - "NA24149/validation/haplotypecaller/NA24149.tp.vcf.gz.tbi", - "NA24149/validation/haplotypecaller/NA24149.weighted.png", - "NA24149/validation/haplotypecaller/NA24149.weighted.svg", - "NA24149/validation/haplotypecaller/NA24149.weighted_roc.tsv.gz", - "NA24385/NA24385.bed", - "NA24385/NA24385.haplotypecaller.g.vcf.gz", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.csi", - "NA24385/NA24385.haplotypecaller.g.vcf.gz.tbi", - "NA24385/reports/NA24385.global.dist.txt", - "NA24385/reports/NA24385.haplotypecaller.bcftools_stats.txt", - "NA24385/reports/NA24385.summary.txt", - "NA24385/validation/haplotypecaller/NA24385.fn.vcf.gz", - "NA24385/validation/haplotypecaller/NA24385.fn.vcf.gz.tbi", - "NA24385/validation/haplotypecaller/NA24385.fp.vcf.gz", - "NA24385/validation/haplotypecaller/NA24385.fp.vcf.gz.tbi", - "NA24385/validation/haplotypecaller/NA24385.non_snp.png", - "NA24385/validation/haplotypecaller/NA24385.non_snp.svg", - "NA24385/validation/haplotypecaller/NA24385.non_snp_roc.tsv.gz", - "NA24385/validation/haplotypecaller/NA24385.phasing.txt", - "NA24385/validation/haplotypecaller/NA24385.snp.png", - "NA24385/validation/haplotypecaller/NA24385.snp.svg", - "NA24385/validation/haplotypecaller/NA24385.snp_roc.tsv.gz", - "NA24385/validation/haplotypecaller/NA24385.summary.txt", - "NA24385/validation/haplotypecaller/NA24385.tp-baseline.vcf.gz", - "NA24385/validation/haplotypecaller/NA24385.tp-baseline.vcf.gz.tbi", - "NA24385/validation/haplotypecaller/NA24385.tp.vcf.gz", - "NA24385/validation/haplotypecaller/NA24385.tp.vcf.gz.tbi", - "NA24385/validation/haplotypecaller/NA24385.weighted.png", - "NA24385/validation/haplotypecaller/NA24385.weighted.svg", - 
"NA24385/validation/haplotypecaller/NA24385.weighted_roc.tsv.gz", - "multiqc/multiqc_plots", - "multiqc/multiqc_report.html", - "samplesheet.csv", - "test/Ashkenazim/Ashkenazim.bed", - "test/Ashkenazim/Ashkenazim.haplotypecaller.ped", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.csi", - "test/Ashkenazim/Ashkenazim.haplotypecaller.vcf.gz.tbi", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.bcftools_stats.txt", - "test/Ashkenazim/reports/Ashkenazim.haplotypecaller.somalier.html" + "Ashkenazim/NA24143__/NA24143.bed", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24143__/NA24143.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.fn.vcf.gz", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.fn.vcf.gz.tbi", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.fp.vcf.gz", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.fp.vcf.gz.tbi", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.non_snp.png", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.non_snp.svg", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.non_snp_roc.tsv.gz", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.phasing.txt", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.snp.png", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.snp.svg", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.snp_roc.tsv.gz", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.summary.txt", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.tp-baseline.vcf.gz", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.tp-baseline.vcf.gz.tbi", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.tp.vcf.gz", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.tp.vcf.gz.tbi", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.weighted.png", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.weighted.svg", + "Ashkenazim/NA24143__/validation/haplotypecaller/NA24143.weighted_roc.tsv.gz", + "Ashkenazim/NA24149__/NA24149.bed", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24149__/NA24149.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.fn.vcf.gz", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.fn.vcf.gz.tbi", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.fp.vcf.gz", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.fp.vcf.gz.tbi", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.non_snp.png", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.non_snp.svg", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.non_snp_roc.tsv.gz", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.phasing.txt", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.snp.png", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.snp.svg", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.snp_roc.tsv.gz", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.summary.txt", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.tp-baseline.vcf.gz", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.tp-baseline.vcf.gz.tbi", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.tp.vcf.gz", + 
"Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.tp.vcf.gz.tbi", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.weighted.png", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.weighted.svg", + "Ashkenazim/NA24149__/validation/haplotypecaller/NA24149.weighted_roc.tsv.gz", + "Ashkenazim/NA24385__/NA24385.bed", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz", + "Ashkenazim/NA24385__/NA24385.haplotypecaller.g.vcf.gz.tbi", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.fn.vcf.gz", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.fn.vcf.gz.tbi", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.fp.vcf.gz", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.fp.vcf.gz.tbi", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.non_snp.png", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.non_snp.svg", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.non_snp_roc.tsv.gz", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.phasing.txt", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.snp.png", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.snp.svg", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.snp_roc.tsv.gz", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.summary.txt", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.tp-baseline.vcf.gz", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.tp-baseline.vcf.gz.tbi", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.tp.vcf.gz", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.tp.vcf.gz.tbi", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.weighted.png", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.weighted.svg", + "Ashkenazim/NA24385__/validation/haplotypecaller/NA24385.weighted_roc.tsv.gz", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.bed", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.ped", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz", + "Ashkenazim/output__/Ashkenazim.haplotypecaller.vcf.gz.tbi", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.bcftools_stats.txt", + "Ashkenazim/qc__/Ashkenazim.haplotypecaller.html", + "_/multiqc_report.html", + "_/samplesheet.csv" ] ], "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" + "nf-test": "0.9.1", + "nextflow": "24.10.0" }, - "timestamp": "2024-09-05T12:01:21.850991249" + "timestamp": "2024-11-13T14:44:54.946386845" } } \ No newline at end of file diff --git a/tests/subworkflows/local/cram_call_genotype_gatk4/main.nf.test b/tests/subworkflows/local/cram_call_genotype_gatk4/main.nf.test deleted file mode 100644 index 37eca413..00000000 --- a/tests/subworkflows/local/cram_call_genotype_gatk4/main.nf.test +++ /dev/null @@ -1,495 +0,0 @@ -nextflow_workflow { - - name "Test Workflow CRAM_CALL_GENOTYPE_GATK4" - script "subworkflows/local/cram_call_genotype_gatk4/main.nf" - workflow "CRAM_CALL_GENOTYPE_GATK4" - - tag "subworkflows" - tag "subworkflows_local" - tag "cram_call_genotype_gatk4" - tag "cram_call_gatk4" // This is also tested here - tag "gvcf_joint_genotype_gatk4" // This is also tested here - tag "vcf_filter_bcftools" // This is also tested here - tag "vcf_concat_bcftools" // This is also tested here - - test("cram_call_genotype_gatk4 - default - crams") { - - when { - params { - callers = "haplotypecaller" - } - workflow { - """ - input[0] = Channel.of([ - [id:"NA24143.00001", 
sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], - file(params.cram1, checkIfExists:true), - file(params.crai1, checkIfExists:true), - file(params.split1, checkIfExists:true) - ],[ - [id:"NA24143.00002", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], - file(params.cram1, checkIfExists:true), - file(params.crai1, checkIfExists:true), - file(params.split2, checkIfExists:true) - ],[ - [id:"NA24143.00003", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], - file(params.cram1, checkIfExists:true), - file(params.crai1, checkIfExists:true), - file(params.split3, checkIfExists:true) - ]) - input[1] = Channel.empty() - input[2] = Channel.value([ - [id:"fasta"], - file(params.fasta, checkIfExists:true) - ]) - input[3] = Channel.value([ - [id:"fai"], - file(params.fai, checkIfExists:true) - ]) - input[4] = Channel.value([ - [id:"dict"], - file(params.dict, checkIfExists:true) - ]) - input[5] = Channel.value([ - [id:"strtablefile"], - file(params.strtablefile, checkIfExists:true) - ]) - input[6] = [[],[]] - input[7] = [[],[]] - input[8] = false - input[9] = false - input[10] = false - input[11] = false - input[12] = 2 - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert snapshot( - workflow.out.vcfs.collect { it.collect { it instanceof Map ? it : file(it).name } }, - workflow.out.reports - ).match("default - crams") } - ) - } - - } - - test("cram_call_genotype_gatk4 - default - gvcfs") { - - when { - params { - callers = "haplotypecaller" - } - workflow { - """ - input[0] = Channel.empty() - input[1] = Channel.of([ - [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143"], - file(params.gvcf1, checkIfExists:true), - file(params.gtbi1, checkIfExists:true) - ]) - input[2] = Channel.value([ - [id:"fasta"], - file(params.fasta, checkIfExists:true) - ]) - input[3] = Channel.value([ - [id:"fai"], - file(params.fai, checkIfExists:true) - ]) - input[4] = Channel.value([ - [id:"dict"], - file(params.dict, checkIfExists:true) - ]) - input[5] = Channel.value([ - [id:"strtablefile"], - file(params.strtablefile, checkIfExists:true) - ]) - input[6] = [[],[]] - input[7] = [[],[]] - input[8] = false - input[9] = false - input[10] = false - input[11] = false - input[12] = 2 - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert snapshot( - workflow.out.vcfs.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, - workflow.out.reports - ).match("default - gvcfs") } - ) - } - - } - - test("cram_call_genotype_gatk4 - default - family") { - - when { - params { - callers = "haplotypecaller" - } - workflow { - """ - input[0] = Channel.of([ - [id:"NA24835.00001", sample:"NA24835", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385", split_count:3], - file(params.cram3, checkIfExists:true), - file(params.crai3, checkIfExists:true), - file(params.split1, checkIfExists:true) - ],[ - [id:"NA24835.00002", sample:"NA24835", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385", split_count:3], - file(params.cram3, checkIfExists:true), - file(params.crai3, checkIfExists:true), - file(params.split2, checkIfExists:true) - ],[ - [id:"NA24835.00003", sample:"NA24835", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385", split_count:3], - file(params.cram3, checkIfExists:true), - file(params.crai3, checkIfExists:true), - file(params.split3, checkIfExists:true) - ]) - input[1] = Channel.of([ - [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385"], - file(params.gvcf1, checkIfExists:true), - file(params.gtbi1, checkIfExists:true) - ],[ - [id:"NA24149", sample:"NA24149", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385"], - file(params.gvcf2, checkIfExists:true), - file(params.gtbi2, checkIfExists:true) - ]) - input[2] = Channel.value([ - [id:"fasta"], - file(params.fasta, checkIfExists:true) - ]) - input[3] = Channel.value([ - [id:"fai"], - file(params.fai, checkIfExists:true) - ]) - input[4] = Channel.value([ - [id:"dict"], - file(params.dict, checkIfExists:true) - ]) - input[5] = Channel.value([ - [id:"strtablefile"], - file(params.strtablefile, checkIfExists:true) - ]) - input[6] = [[],[]] - input[7] = [[],[]] - input[8] = false - input[9] = false - input[10] = false - input[11] = false - input[12] = 2 - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert snapshot( - workflow.out.vcfs.collect { it.collect { it instanceof Map ? 
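// These family variants combine both entry points: split CRAMs for one sample
// go in through input[0], while ready-made GVCFs for the remaining family
// members are supplied through input[1], so calling and joint genotyping are
// exercised together in a single run.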
it : file(it).name } }, - workflow.out.reports - ).match("default - family") } - ) - } - - } - - test("cram_call_genotype_gatk4 - filter - family") { - - when { - params { - callers = "haplotypecaller" - filter = true - } - workflow { - """ - input[0] = Channel.of([ - [id:"NA24835.00001", sample:"NA24835", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385", split_count:3], - file(params.cram3, checkIfExists:true), - file(params.crai3, checkIfExists:true), - file(params.split1, checkIfExists:true) - ],[ - [id:"NA24835.00002", sample:"NA24835", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385", split_count:3], - file(params.cram3, checkIfExists:true), - file(params.crai3, checkIfExists:true), - file(params.split2, checkIfExists:true) - ],[ - [id:"NA24835.00003", sample:"NA24835", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385", split_count:3], - file(params.cram3, checkIfExists:true), - file(params.crai3, checkIfExists:true), - file(params.split3, checkIfExists:true) - ]) - input[1] = Channel.of([ - [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385"], - file(params.gvcf1, checkIfExists:true), - file(params.gtbi1, checkIfExists:true) - ],[ - [id:"NA24149", sample:"NA24149", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385"], - file(params.gvcf2, checkIfExists:true), - file(params.gtbi2, checkIfExists:true) - ]) - input[2] = Channel.value([ - [id:"fasta"], - file(params.fasta, checkIfExists:true) - ]) - input[3] = Channel.value([ - [id:"fai"], - file(params.fai, checkIfExists:true) - ]) - input[4] = Channel.value([ - [id:"dict"], - file(params.dict, checkIfExists:true) - ]) - input[5] = Channel.value([ - [id:"strtablefile"], - file(params.strtablefile, checkIfExists:true) - ]) - input[6] = [[],[]] - input[7] = [[],[]] - input[8] = false - input[9] = false - input[10] = false - input[11] = true - input[12] = 2 - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert snapshot( - workflow.out.vcfs.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, - workflow.out.reports - ).match("filter - family") } - ) - } - - } - - test("cram_call_genotype_gatk4 - only_call - family") { - - when { - params { - callers = "haplotypecaller" - only_call = true - } - workflow { - """ - input[0] = Channel.of([ - [id:"NA24835.00001", sample:"NA24835", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385", split_count:3], - file(params.cram3, checkIfExists:true), - file(params.crai3, checkIfExists:true), - file(params.split1, checkIfExists:true) - ],[ - [id:"NA24835.00002", sample:"NA24835", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385", split_count:3], - file(params.cram3, checkIfExists:true), - file(params.crai3, checkIfExists:true), - file(params.split2, checkIfExists:true) - ],[ - [id:"NA24835.00003", sample:"NA24835", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385", split_count:3], - file(params.cram3, checkIfExists:true), - file(params.crai3, checkIfExists:true), - file(params.split3, checkIfExists:true) - ]) - input[1] = Channel.of([ - [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385"], - file(params.gvcf1, checkIfExists:true), - file(params.gtbi1, checkIfExists:true) - ],[ - [id:"NA24149", sample:"NA24149", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385"], - file(params.gvcf2, checkIfExists:true), - file(params.gtbi2, checkIfExists:true) - ]) - input[2] = Channel.value([ - [id:"fasta"], - file(params.fasta, checkIfExists:true) - ]) - input[3] = Channel.value([ - [id:"fai"], - file(params.fai, checkIfExists:true) - ]) - input[4] = Channel.value([ - [id:"dict"], - file(params.dict, checkIfExists:true) - ]) - input[5] = Channel.value([ - [id:"strtablefile"], - file(params.strtablefile, checkIfExists:true) - ]) - input[6] = [[],[]] - input[7] = [[],[]] - input[8] = false - input[9] = true - input[10] = false - input[11] = false - input[12] = 2 - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert snapshot( - workflow.out.vcfs.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, - workflow.out.reports - ).match("only_call - family") } - ) - } - - } - - test("cram_call_genotype_gatk4 - only_merge - family") { - - when { - params { - callers = "haplotypecaller" - only_merge = true - } - workflow { - """ - input[0] = Channel.of([ - [id:"NA24835.00001", sample:"NA24835", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385", split_count:3], - file(params.cram3, checkIfExists:true), - file(params.crai3, checkIfExists:true), - file(params.split1, checkIfExists:true) - ],[ - [id:"NA24835.00002", sample:"NA24835", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385", split_count:3], - file(params.cram3, checkIfExists:true), - file(params.crai3, checkIfExists:true), - file(params.split2, checkIfExists:true) - ],[ - [id:"NA24835.00003", sample:"NA24835", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385", split_count:3], - file(params.cram3, checkIfExists:true), - file(params.crai3, checkIfExists:true), - file(params.split3, checkIfExists:true) - ]) - input[1] = Channel.of([ - [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385"], - file(params.gvcf1, checkIfExists:true), - file(params.gtbi1, checkIfExists:true) - ],[ - [id:"NA24149", sample:"NA24149", family:"Ashkenazim", family_samples:"NA24143,NA24149,NA24385"], - file(params.gvcf2, checkIfExists:true), - file(params.gtbi2, checkIfExists:true) - ]) - input[2] = Channel.value([ - [id:"fasta"], - file(params.fasta, checkIfExists:true) - ]) - input[3] = Channel.value([ - [id:"fai"], - file(params.fai, checkIfExists:true) - ]) - input[4] = Channel.value([ - [id:"dict"], - file(params.dict, checkIfExists:true) - ]) - input[5] = Channel.value([ - [id:"strtablefile"], - file(params.strtablefile, checkIfExists:true) - ]) - input[6] = [[],[]] - input[7] = [[],[]] - input[8] = false - input[9] = false - input[10] = true - input[11] = false - input[12] = 2 - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert snapshot( - workflow.out.vcfs.collect { it.collect { it instanceof Map ? 
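// Judging from the single-toggle variants above, the trailing positional inputs
// map onto the pipeline flags: input[9] mirrors params.only_call, input[10]
// params.only_merge and input[11] params.filter; input[8] stays false and
// input[12] stays 2 in every test here.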
it : file(it).name } }, - workflow.out.reports - ).match("only_merge - family") } - ) - } - - } - - test("cram_call_genotype_gatk4 - default - sample + family") { - - when { - params { - callers = "haplotypecaller" - } - workflow { - """ - input[0] = Channel.of([ - [id:"NA24835.00001", sample:"NA24835", family:"Ashkenazim", family_samples:"NA24149,NA24385", split_count:3], - file(params.cram3, checkIfExists:true), - file(params.crai3, checkIfExists:true), - file(params.split1, checkIfExists:true) - ],[ - [id:"NA24835.00002", sample:"NA24835", family:"Ashkenazim", family_samples:"NA24149,NA24385", split_count:3], - file(params.cram3, checkIfExists:true), - file(params.crai3, checkIfExists:true), - file(params.split2, checkIfExists:true) - ],[ - [id:"NA24835.00003", sample:"NA24835", family:"Ashkenazim", family_samples:"NA24149,NA24385", split_count:3], - file(params.cram3, checkIfExists:true), - file(params.crai3, checkIfExists:true), - file(params.split3, checkIfExists:true) - ]) - input[1] = Channel.of([ - [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143"], - file(params.gvcf1, checkIfExists:true), - file(params.gtbi1, checkIfExists:true) - ],[ - [id:"NA24149", sample:"NA24149", family:"NA24149", family_samples:"NA24149,NA24385"], - file(params.gvcf2, checkIfExists:true), - file(params.gtbi2, checkIfExists:true) - ]) - input[2] = Channel.value([ - [id:"fasta"], - file(params.fasta, checkIfExists:true) - ]) - input[3] = Channel.value([ - [id:"fai"], - file(params.fai, checkIfExists:true) - ]) - input[4] = Channel.value([ - [id:"dict"], - file(params.dict, checkIfExists:true) - ]) - input[5] = Channel.value([ - [id:"strtablefile"], - file(params.strtablefile, checkIfExists:true) - ]) - input[6] = [[],[]] - input[7] = [[],[]] - input[8] = false - input[9] = false - input[10] = false - input[11] = false - input[12] = 2 - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert snapshot( - workflow.out.vcfs.collect { it.collect { it instanceof Map ? 
it : file(it).name } }, - workflow.out.reports - ).match("only_merge - sample + family") } - ) - } - - } - -} diff --git a/tests/subworkflows/local/cram_call_genotype_gatk4/main.nf.test.snap b/tests/subworkflows/local/cram_call_genotype_gatk4/main.nf.test.snap deleted file mode 100644 index a0aec425..00000000 --- a/tests/subworkflows/local/cram_call_genotype_gatk4/main.nf.test.snap +++ /dev/null @@ -1,164 +0,0 @@ -{ - "only_merge - family": { - "content": [ - [ - - ], - [ - [ - "NA24835.haplotypecaller.bcftools_stats.txt:md5,5f42bee02b2bd0d2af2954292ec3b422" - ] - ] - ], - "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" - }, - "timestamp": "2024-09-05T11:20:45.005084818" - }, - "default - family": { - "content": [ - [ - [ - { - "family": "Ashkenazim", - "family_samples": "NA24143,NA24149,NA24385", - "caller": "haplotypecaller", - "id": "Ashkenazim" - }, - "Ashkenazim.haplotypecaller.vcf.gz", - "Ashkenazim.haplotypecaller.vcf.gz.tbi" - ] - ], - [ - [ - "NA24835.haplotypecaller.bcftools_stats.txt:md5,5f42bee02b2bd0d2af2954292ec3b422" - ] - ] - ], - "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" - }, - "timestamp": "2024-09-05T17:35:02.44674969" - }, - "filter - family": { - "content": [ - [ - [ - { - "family": "Ashkenazim", - "family_samples": "NA24143,NA24149,NA24385", - "caller": "haplotypecaller", - "id": "Ashkenazim" - }, - "Ashkenazim.haplotypecaller.vcf.gz", - "Ashkenazim.haplotypecaller.vcf.gz.tbi" - ] - ], - [ - [ - "NA24835.haplotypecaller.bcftools_stats.txt:md5,5f42bee02b2bd0d2af2954292ec3b422" - ] - ] - ], - "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" - }, - "timestamp": "2024-09-05T17:35:46.768542501" - }, - "default - gvcfs": { - "content": [ - [ - [ - { - "family": "Ashkenazim", - "family_samples": "NA24143", - "caller": "haplotypecaller", - "id": "Ashkenazim" - }, - "Ashkenazim.haplotypecaller.vcf.gz", - "Ashkenazim.haplotypecaller.vcf.gz.tbi" - ] - ], - [ - - ] - ], - "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" - }, - "timestamp": "2024-09-05T17:34:17.846266913" - }, - "default - crams": { - "content": [ - [ - [ - { - "family": "Ashkenazim", - "family_samples": "NA24143", - "caller": "haplotypecaller", - "id": "Ashkenazim" - }, - "Ashkenazim.haplotypecaller.vcf.gz", - "Ashkenazim.haplotypecaller.vcf.gz.tbi" - ] - ], - [ - [ - "NA24143.haplotypecaller.bcftools_stats.txt:md5,09b4e7674e0f5b98b1e548df3002250e" - ] - ] - ], - "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" - }, - "timestamp": "2024-09-05T17:33:52.045772718" - }, - "only_call - family": { - "content": [ - [ - - ], - [ - [ - "NA24835.haplotypecaller.bcftools_stats.txt:md5,5f42bee02b2bd0d2af2954292ec3b422" - ] - ] - ], - "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" - }, - "timestamp": "2024-09-05T11:20:04.714403906" - }, - "only_merge - sample + family": { - "content": [ - [ - [ - { - "family": "Ashkenazim", - "family_samples": "NA24143", - "caller": "haplotypecaller", - "id": "Ashkenazim" - }, - "Ashkenazim.haplotypecaller.vcf.gz", - "Ashkenazim.haplotypecaller.vcf.gz.tbi" - ] - ], - [ - [ - "NA24835.haplotypecaller.bcftools_stats.txt:md5,5f42bee02b2bd0d2af2954292ec3b422" - ] - ] - ], - "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" - }, - "timestamp": "2024-09-06T14:26:29.249708339" - } -} \ No newline at end of file diff --git a/tests/subworkflows/local/cram_call_vardictjava/main.nf.test b/tests/subworkflows/local/cram_call_vardictjava/main.nf.test deleted file mode 100644 index 05798304..00000000 --- 
a/tests/subworkflows/local/cram_call_vardictjava/main.nf.test +++ /dev/null @@ -1,201 +0,0 @@ -nextflow_workflow { - - name "Test Workflow CRAM_CALL_VARDICTJAVA" - script "subworkflows/local/cram_call_vardictjava/main.nf" - workflow "CRAM_CALL_VARDICTJAVA" - - tag "subworkflows" - tag "subworkflows_local" - tag "cram_call_vardictjava" - tag "vcf_concat_bcftools" - tag "vcf_filter_bcftools" - - test("cram_call_vardictjava - default") { - - - when { - params { - callers = "vardict" - } - workflow { - """ - input[0] = Channel.of([ - [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143"], - file(params.cram1, checkIfExists:true), - file(params.crai1, checkIfExists:true) - ]) - input[1] = Channel.of([ - [id:"NA24143.00001", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], - file(params.cram1, checkIfExists:true), - file(params.crai1, checkIfExists:true), - file(params.split1, checkIfExists:true) - ],[ - [id:"NA24143.00002", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], - file(params.cram1, checkIfExists:true), - file(params.crai1, checkIfExists:true), - file(params.split2, checkIfExists:true) - ],[ - [id:"NA24143.00003", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], - file(params.cram1, checkIfExists:true), - file(params.crai1, checkIfExists:true), - file(params.split3, checkIfExists:true) - ]) - input[2] = Channel.value([ - [id:"fasta"], - file(params.fasta, checkIfExists:true) - ]) - input[3] = Channel.value([ - [id:"fai"], - file(params.fai, checkIfExists:true) - ]) - input[4] = [[],[]] - input[5] = [[],[]] - input[6] = false - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert snapshot( - workflow.out.vcfs.collect { it.collect { it instanceof Map ? it : file(it).name } } - ).match("default") } - ) - } - - } - - test("cram_call_vardictjava - filter") { - - - when { - params { - filter = true - callers = "vardict" - } - workflow { - """ - input[0] = Channel.of([ - [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143"], - file(params.cram1, checkIfExists:true), - file(params.crai1, checkIfExists:true) - ]) - input[1] = Channel.of([ - [id:"NA24143.00001", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], - file(params.cram1, checkIfExists:true), - file(params.crai1, checkIfExists:true), - file(params.split1, checkIfExists:true) - ],[ - [id:"NA24143.00002", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], - file(params.cram1, checkIfExists:true), - file(params.crai1, checkIfExists:true), - file(params.split2, checkIfExists:true) - ],[ - [id:"NA24143.00003", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143", split_count:3], - file(params.cram1, checkIfExists:true), - file(params.crai1, checkIfExists:true), - file(params.split3, checkIfExists:true) - ]) - input[2] = Channel.value([ - [id:"fasta"], - file(params.fasta, checkIfExists:true) - ]) - input[3] = Channel.value([ - [id:"fai"], - file(params.fai, checkIfExists:true) - ]) - input[4] = [[],[]] - input[5] = [[],[]] - input[6] = true - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert snapshot( - workflow.out.vcfs.collect { it.collect { it instanceof Map ? 
it : file(it).name } } - ).match("filter") } - ) - } - - } - - test("cram_call_vardictjava - family") { - // The family should not be merged here - - when { - params { - callers = "vardict" - } - workflow { - """ - input[0] = Channel.of([ - [id:"NA24143", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143,NA24149"], - file(params.cram1, checkIfExists:true), - file(params.crai1, checkIfExists:true) - ],[ - [id:"NA24149", sample:"NA24149", family:"Ashkenazim", family_samples:"NA24143,NA24149"], - file(params.cram2, checkIfExists:true), - file(params.crai2, checkIfExists:true) - ]) - input[1] = Channel.of([ - [id:"NA24143.00001", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143,NA24149", split_count:3], - file(params.cram1, checkIfExists:true), - file(params.crai1, checkIfExists:true), - file(params.split1, checkIfExists:true) - ],[ - [id:"NA24143.00002", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143,NA24149", split_count:3], - file(params.cram1, checkIfExists:true), - file(params.crai1, checkIfExists:true), - file(params.split2, checkIfExists:true) - ],[ - [id:"NA24143.00003", sample:"NA24143", family:"Ashkenazim", family_samples:"NA24143,NA24149", split_count:3], - file(params.cram1, checkIfExists:true), - file(params.crai1, checkIfExists:true), - file(params.split3, checkIfExists:true) - ],[ - [id:"NA24149.00001", sample:"NA24149", family:"Ashkenazim", family_samples:"NA24143,NA24149", split_count:3], - file(params.cram2, checkIfExists:true), - file(params.crai2, checkIfExists:true), - file(params.split1, checkIfExists:true) - ],[ - [id:"NA24149.00002", sample:"NA24149", family:"Ashkenazim", family_samples:"NA24143,NA24149", split_count:3], - file(params.cram2, checkIfExists:true), - file(params.crai2, checkIfExists:true), - file(params.split2, checkIfExists:true) - ],[ - [id:"NA24149.00003", sample:"NA24149", family:"Ashkenazim", family_samples:"NA24143,NA24149", split_count:3], - file(params.cram2, checkIfExists:true), - file(params.crai2, checkIfExists:true), - file(params.split3, checkIfExists:true) - ]) - input[2] = Channel.value([ - [id:"fasta"], - file(params.fasta, checkIfExists:true) - ]) - input[3] = Channel.value([ - [id:"fai"], - file(params.fai, checkIfExists:true) - ]) - input[4] = [[],[]] - input[5] = [[],[]] - input[6] = false - """ - } - } - - then { - assertAll( - { assert workflow.success }, - { assert snapshot( - workflow.out.vcfs.collect { it.collect { it instanceof Map ? 
it : file(it).name } } - ).match("family") } - ) - } - - } - -} diff --git a/tests/subworkflows/local/cram_call_vardictjava/main.nf.test.snap b/tests/subworkflows/local/cram_call_vardictjava/main.nf.test.snap deleted file mode 100644 index f2057a7b..00000000 --- a/tests/subworkflows/local/cram_call_vardictjava/main.nf.test.snap +++ /dev/null @@ -1,83 +0,0 @@ -{ - "filter": { - "content": [ - [ - [ - { - "id": "NA24143", - "sample": "NA24143", - "family": "Ashkenazim", - "family_samples": "NA24143", - "caller": "vardict", - "samples": "NA24143" - }, - "NA24143.vardict.vcf.gz", - "NA24143.vardict.vcf.gz.tbi" - ] - ] - ], - "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" - }, - "timestamp": "2024-09-05T17:38:01.461442987" - }, - "default": { - "content": [ - [ - [ - { - "id": "NA24143", - "sample": "NA24143", - "family": "Ashkenazim", - "family_samples": "NA24143", - "caller": "vardict", - "samples": "NA24143" - }, - "NA24143.vardict.vcf.gz", - "NA24143.vardict.vcf.gz.tbi" - ] - ] - ], - "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" - }, - "timestamp": "2024-09-05T17:37:40.178107639" - }, - "family": { - "content": [ - [ - [ - { - "id": "NA24143", - "sample": "NA24143", - "family": "Ashkenazim", - "family_samples": "NA24143,NA24149", - "caller": "vardict", - "samples": "NA24143" - }, - "NA24143.vardict.vcf.gz", - "NA24143.vardict.vcf.gz.tbi" - ], - [ - { - "id": "NA24149", - "sample": "NA24149", - "family": "Ashkenazim", - "family_samples": "NA24143,NA24149", - "caller": "vardict", - "samples": "NA24149" - }, - "NA24149.vardict.vcf.gz", - "NA24149.vardict.vcf.gz.tbi" - ] - ] - ], - "meta": { - "nf-test": "0.9.0", - "nextflow": "24.04.4" - }, - "timestamp": "2024-09-05T17:38:29.203806206" - } -} \ No newline at end of file diff --git a/workflows/germline.nf b/workflows/germline.nf index fe3e04b6..4eafdef3 100644 --- a/workflows/germline.nf +++ b/workflows/germline.nf @@ -17,14 +17,17 @@ include { methodsDescriptionText } from '../subworkflows/local/utils_ include { CRAM_PREPARE_SAMTOOLS_BEDTOOLS } from '../subworkflows/local/cram_prepare_samtools_bedtools/main' include { INPUT_SPLIT_BEDTOOLS } from '../subworkflows/local/input_split_bedtools/main' -include { CRAM_CALL_GENOTYPE_GATK4 } from '../subworkflows/local/cram_call_genotype_gatk4/main' -include { CRAM_CALL_VARDICTJAVA } from '../subworkflows/local/cram_call_vardictjava/main' +include { CRAM_CALL_GATK4 } from '../subworkflows/local/cram_call_gatk4/main' +include { GVCF_JOINT_GENOTYPE_GATK4 } from '../subworkflows/local/gvcf_joint_genotype_gatk4/main' +include { BAM_CALL_ELPREP } from '../subworkflows/local/bam_call_elprep/main' +include { BAM_CALL_VARDICTJAVA } from '../subworkflows/local/bam_call_vardictjava/main' include { VCF_EXTRACT_RELATE_SOMALIER } from '../subworkflows/local/vcf_extract_relate_somalier/main' include { VCF_PED_RTGTOOLS } from '../subworkflows/local/vcf_ped_rtgtools/main' include { VCF_ANNOTATION } from '../subworkflows/local/vcf_annotation/main' include { VCF_VALIDATE_SMALL_VARIANTS } from '../subworkflows/local/vcf_validate_small_variants/main' include { VCF_UPD_UPDIO } from '../subworkflows/local/vcf_upd_updio/main' include { VCF_ROH_AUTOMAP } from '../subworkflows/local/vcf_roh_automap/main' +include { VCF_FILTER_BCFTOOLS } from '../subworkflows/local/vcf_filter_bcftools/main' /* ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ @@ -34,6 +37,7 @@ include { VCF_ROH_AUTOMAP } from '../subworkflows/local/vcf_ro include { SAMTOOLS_FAIDX as FAIDX } from 
'../modules/nf-core/samtools/faidx/main' include { GATK4_CREATESEQUENCEDICTIONARY as CREATESEQUENCEDICTIONARY } from '../modules/nf-core/gatk4/createsequencedictionary/main' +include { ELPREP_FASTATOELFASTA } from '../modules/nf-core/elprep/fastatoelfasta/main' include { GATK4_COMPOSESTRTABLEFILE as COMPOSESTRTABLEFILE } from '../modules/nf-core/gatk4/composestrtablefile/main' include { RTGTOOLS_FORMAT } from '../modules/nf-core/rtgtools/format/main' include { UNTAR } from '../modules/nf-core/untar/main' @@ -67,6 +71,7 @@ workflow GERMLINE { fasta // string: path to the reference fasta fai // string: path to the index of the reference fasta dict // string: path to the sequence dictionary file + elfasta // string: path to the elfasta reference file strtablefile // string: path to the strtable file sdf // string: path to the SDF directory dbsnp // string: path to the DBSNP VCF file @@ -97,6 +102,7 @@ workflow GERMLINE { automap_panel // string: path to the Automap panel file outdir // string: path to the output directory pedFiles // map: a map that has the family ID as key and a PED file as value + elsites // string: path to the elsites file for elprep // Boolean inputs dragstr // boolean: create a dragstr model and use it for haplotypecaller @@ -127,43 +133,46 @@ workflow GERMLINE { main: - ch_versions = Channel.empty() - ch_reports = Channel.empty() - ch_multiqc_files = Channel.empty() + def ch_versions = Channel.empty() + def ch_reports = Channel.empty() + def ch_multiqc_files = Channel.empty() // // Importing and convert the input files passed through the parameters to channels // - ch_fasta_ready = Channel.fromPath(fasta).map{ fasta_file -> [[id:"reference"], fasta_file] }.collect() - ch_fai = fai ? Channel.fromPath(fai).map{ fai_file -> [[id:"reference"], fai_file] }.collect() : null - ch_dict = dict ? Channel.fromPath(dict).map{ dict_file -> [[id:"reference"], dict_file] }.collect() : null - ch_strtablefile = strtablefile ? Channel.fromPath(strtablefile).map{ str_file -> [[id:"reference"], str_file] }.collect() : null - ch_sdf = sdf ? Channel.fromPath(sdf).map { sdf_file -> [[id:'reference'], sdf_file] }.collect() : null + def ch_fasta_ready = Channel.fromPath(fasta).map{ fasta_file -> [[id:"reference"], fasta_file] }.collect() + def ch_fai = fai ? Channel.fromPath(fai).map{ fai_file -> [[id:"reference"], fai_file] }.collect() : null + def ch_dict = dict ? Channel.fromPath(dict).map{ dict_file -> [[id:"reference"], dict_file] }.collect() : null + def ch_elfasta = elfasta ? Channel.fromPath(elfasta).map { elfasta_file -> [[id:"reference"], elfasta_file]}.collect() : null + def ch_strtablefile = strtablefile ? Channel.fromPath(strtablefile).map{ str_file -> [[id:"reference"], str_file] }.collect() : null + def ch_sdf = sdf ? Channel.fromPath(sdf).map { sdf_file -> [[id:'reference'], sdf_file] }.collect() : null - ch_default_roi = roi ? Channel.fromPath(roi).collect() : [] + def ch_default_roi = roi ? Channel.fromPath(roi).collect() : [] - ch_dbsnp_ready = dbsnp ? Channel.fromPath(dbsnp).collect { dbsnp_file -> [[id:"dbsnp"], dbsnp_file] } : [[],[]] - ch_dbsnp_tbi = dbsnp_tbi ? Channel.fromPath(dbsnp_tbi).collect { dbsnp_file -> [[id:"dbsnp"], dbsnp_file] } : [[],[]] + def ch_dbsnp_ready = dbsnp ? Channel.fromPath(dbsnp).collect { dbsnp_file -> [[id:"dbsnp"], dbsnp_file] } : [[],[]] + def ch_dbsnp_tbi = dbsnp_tbi ? Channel.fromPath(dbsnp_tbi).collect { dbsnp_file -> [[id:"dbsnp"], dbsnp_file] } : [[],[]] - ch_somalier_sites = somalier_sites ? 
Channel.fromPath(somalier_sites).collect { sites_file -> [[id:"somalier_sites"], sites_file] } : [[],[]] + def ch_somalier_sites = somalier_sites ? Channel.fromPath(somalier_sites).collect { sites_file -> [[id:"somalier_sites"], sites_file] } : [[],[]] - ch_vep_cache = vep_cache ? Channel.fromPath(vep_cache).collect() : [] + def ch_vep_cache = vep_cache ? Channel.fromPath(vep_cache).collect() : [] - ch_vcfanno_config = vcfanno_config ? Channel.fromPath(vcfanno_config).collect() : [] - ch_vcfanno_lua = vcfanno_lua ? Channel.fromPath(vcfanno_lua).collect() : [] - ch_vcfanno_resources = vcfanno_resources ? Channel.of(vcfanno_resources.split(";")).collect{ res -> file(res, checkIfExists:true) } : [] + def ch_vcfanno_config = vcfanno_config ? Channel.fromPath(vcfanno_config).collect() : [] + def ch_vcfanno_lua = vcfanno_lua ? Channel.fromPath(vcfanno_lua).collect() : [] + def ch_vcfanno_resources = vcfanno_resources ? Channel.of(vcfanno_resources.split(";")).collect{ res -> file(res, checkIfExists:true) } : [] - ch_updio_common_cnvs = updio_common_cnvs ? Channel.fromPath(updio_common_cnvs).map{ common_cnvs -> [[id:'updio_cnv'], common_cnvs] } : [[],[]] + def ch_updio_common_cnvs = updio_common_cnvs ? Channel.fromPath(updio_common_cnvs).map{ common_cnvs -> [[id:'updio_cnv'], common_cnvs] } : [[],[]] - ch_automap_repeats = automap_repeats ? Channel.fromPath(automap_repeats).map{ repeats -> [[id:"repeats"], repeats] }.collect() : [] - ch_automap_panel = automap_panel ? Channel.fromPath(automap_panel).map{ panel -> [[id:"automap_panel"], panel] }.collect() : [[],[]] + def ch_automap_repeats = automap_repeats ? Channel.fromPath(automap_repeats).map{ repeats -> [[id:"repeats"], repeats] }.collect() : [] + def ch_automap_panel = automap_panel ? Channel.fromPath(automap_panel).map{ panel -> [[id:"automap_panel"], panel] }.collect() : [[],[]] + + def ch_elsites = elsites ? 
Channel.fromPath(elsites).map{ elsites_file -> [[id:'elsites'], elsites_file] }.collect() : [[],[]] // // Check for the presence of EnsemblVEP plugins that use extra files // - ch_vep_extra_files = [] + def ch_vep_extra_files = [] if(annotate){ // Check if all dbnsfp files are given @@ -219,24 +228,20 @@ workflow GERMLINE { // // DBSNP index - if (ch_dbsnp_ready && !ch_dbsnp_tbi) { + def ch_dbsnp_tbi_ready = Channel.empty() + if (ch_dbsnp_ready != [[],[]] && ch_dbsnp_tbi == [[],[]]) { TABIX_DBSNP( - ch_dbsnp_ready.map { dbnsp -> [[id:'dbsnp'], dbsnp] } + ch_dbsnp_ready ) ch_versions = ch_versions.mix(TABIX_DBSNP.out.versions) - TABIX_DBSNP.out.tbi - .map{ meta, tbi -> - [ tbi ] - } - .collect() - .set { ch_dbsnp_tbi_ready } + ch_dbsnp_tbi_ready = TABIX_DBSNP.out.tbi.collect() } else { ch_dbsnp_tbi_ready = ch_dbsnp_tbi } // Reference fasta index - ch_fai_ready = Channel.empty() + def ch_fai_ready = Channel.empty() if (!ch_fai) { FAIDX( ch_fasta_ready, @@ -244,31 +249,40 @@ workflow GERMLINE { ) ch_versions = ch_versions.mix(FAIDX.out.versions) - FAIDX.out.fai + ch_fai_ready = FAIDX.out.fai .collect() - .dump(tag:'fasta_fai', pretty:true) - .set { ch_fai_ready } } else { - ch_fai.set { ch_fai_ready } + ch_fai_ready = ch_fai } // Reference sequence dictionary - ch_dict_ready = Channel.empty() + def ch_dict_ready = Channel.empty() if (!ch_dict) { CREATESEQUENCEDICTIONARY( ch_fasta_ready ) ch_versions = ch_versions.mix(CREATESEQUENCEDICTIONARY.out.versions) - CREATESEQUENCEDICTIONARY.out.dict + ch_dict_ready = CREATESEQUENCEDICTIONARY.out.dict .collect() - .dump(tag:'dict', pretty:true) - .set { ch_dict_ready } } else { - ch_dict.set { ch_dict_ready } + ch_dict_ready = ch_dict + } + + def ch_elfasta_ready = Channel.empty() + def elprep_used = callers.contains("elprep") + if (!ch_elfasta && elprep_used) { + ELPREP_FASTATOELFASTA( + ch_fasta_ready + ) + ch_versions = ch_versions.mix(ELPREP_FASTATOELFASTA.out.versions) + ch_elfasta_ready = ELPREP_FASTATOELFASTA.out.elfasta + } else { + ch_elfasta_ready = ch_elfasta } // Reference STR table file + def ch_strtablefile_ready = Channel.empty() if (dragstr && !ch_strtablefile) { COMPOSESTRTABLEFILE( ch_fasta_ready, @@ -276,28 +290,21 @@ workflow GERMLINE { ch_dict_ready ) ch_versions = ch_versions.mix(COMPOSESTRTABLEFILE.out.versions) - - COMPOSESTRTABLEFILE.out.str_table - .collect() - .dump(tag:'strtablefile', pretty:true) - .set { ch_strtablefile_ready } + ch_strtablefile_ready = COMPOSESTRTABLEFILE.out.str_table.collect() } else if (dragstr) { - ch_strtablefile.set { ch_strtablefile_ready } + ch_strtablefile_ready = ch_strtablefile } else { ch_strtablefile_ready = [] } // Reference validation SDF + def ch_sdf_ready = Channel.empty() if (validate && !ch_sdf) { RTGTOOLS_FORMAT( ch_fasta_ready.map { meta, fasta_file -> [meta, fasta_file, [], []] } ) ch_versions = ch_versions.mix(RTGTOOLS_FORMAT.out.versions) - - RTGTOOLS_FORMAT.out.sdf - .collect() - .dump(tag:'sdf', pretty:true) - .set { ch_sdf_ready } + ch_sdf_ready = RTGTOOLS_FORMAT.out.sdf.collect() } else if (validate && sdf.endsWith(".tar.gz")) { UNTAR( @@ -305,22 +312,22 @@ workflow GERMLINE { ) ch_versions = ch_versions.mix(UNTAR.out.versions) - UNTAR.out.untar - .dump(tag:'sdf', pretty:true) - .set { ch_sdf_ready } + ch_sdf_ready = UNTAR.out.untar.collect() } else if(validate) { - ch_sdf.set { ch_sdf_ready } + ch_sdf_ready = ch_sdf } else { ch_sdf_ready = [[],[]] } + // VEP annotation cache + def ch_vep_cache_ready = Channel.empty() if (!ch_vep_cache && annotate) { ENSEMBLVEP_DOWNLOAD( 
Channel.of([[id:"vep_cache"], genome == "hg38" ? "GRCh38" : genome, species, vep_cache_version]).collect() ) ch_versions = ch_versions.mix(ENSEMBLVEP_DOWNLOAD.out.versions) - ch_vep_cache_ready = ENSEMBLVEP_DOWNLOAD.out.cache.collect{ meta, cache -> cache } + ch_vep_cache_ready = ENSEMBLVEP_DOWNLOAD.out.cache.collect{ _meta, cache -> cache } } else { ch_vep_cache_ready = ch_vep_cache } @@ -329,8 +336,18 @@ workflow GERMLINE { // Split the input channel into the right channels // - ch_samplesheet + def usedGvcfCallers = callers.intersect(GlobalVariables.gvcfCallers) + + def ch_input = ch_samplesheet .multiMap { meta, cram, crai, gvcf, tbi, roi_file, truth_vcf, truth_tbi, truth_bed -> + // Error checks that were not possible using nf-schema + if (gvcf && usedGvcfCallers.size() >= 2) { + error("GVCF input is not supported for runs that use more than one caller that produces a GVCF output") + } + if (gvcf && validate) { + error("Validation is not supported for GVCF inputs, use CRAM files instead when using `--validate`.") + } + // Divide the input files into their corresponding channel def new_meta = meta + [ type: gvcf && cram ? "gvcf_cram" : gvcf ? "gvcf" : "cram" // Define the type of input data @@ -338,19 +355,23 @@ workflow GERMLINE { def new_meta_validation = meta.subMap(["id", "sample", "family", "duplicate_count"]) + def new_meta_gvcf = meta + if (usedGvcfCallers.size() == 1) { + new_meta_gvcf = meta + [caller:usedGvcfCallers[0]] + } + truth_variants: [new_meta_validation, truth_vcf, truth_tbi, truth_bed] // Optional channel containing the truth VCF, its index and the optional BED file gvcf: [new_meta, gvcf, tbi] // Optional channel containing the GVCFs and their optional indices cram: [new_meta, cram, crai] // Mandatory channel containing the CRAM files and their optional indices roi: [new_meta, roi_file] // Optional channel containing the ROI BED files for WES samples } - .set { ch_input } // // Create the GVCF index if it's missing // - ch_input.gvcf - .filter { meta, gvcf, tbi -> + def ch_gvcf_branch = ch_input.gvcf + .filter { meta, _gvcf, _tbi -> // Filter out samples that have no GVCF meta.type == "gvcf" || meta.type == "gvcf_cram" } @@ -360,78 +381,114 @@ workflow GERMLINE { tbi: tbi return [ meta, gvcf, tbi ] } - .set { ch_gvcf_branch } TABIX_GVCF( ch_gvcf_branch.no_tbi ) ch_versions = ch_versions.mix(TABIX_GVCF.out.versions) - ch_gvcf_branch.no_tbi + def ch_gvcfs_ready = ch_gvcf_branch.no_tbi .join(TABIX_GVCF.out.tbi, failOnDuplicate:true, failOnMismatch:true) .mix(ch_gvcf_branch.tbi) - .set { ch_gvcfs_ready } + .combine(callers.intersect(GlobalVariables.gvcfCallers)) + .map { meta, gvcf, tbi, caller -> + def new_meta = meta + [caller:caller] + [ new_meta, gvcf, tbi ] + } // // Run sample preparation // + def create_bam_files = callers.intersect(GlobalVariables.bamCallers).size() > 0 // Only create BAM files when needed CRAM_PREPARE_SAMTOOLS_BEDTOOLS( - ch_input.cram.filter { meta, cram, crai -> + ch_input.cram.filter { meta, _cram, _crai -> // Filter out files that already have a called GVCF when only GVCF callers are used meta.type == "cram" || (meta.type == "gvcf_cram" && callers - GlobalVariables.gvcfCallers) }, - ch_input.roi.filter { meta, roi_file -> + ch_input.roi.filter { meta, _roi_file -> // Filter out files that already have a called GVCF when only GVCF callers are used meta.type == "cram" || (meta.type == "gvcf_cram" && callers - GlobalVariables.gvcfCallers) }, ch_fasta_ready, ch_fai_ready, - ch_default_roi + ch_default_roi, + create_bam_files ) ch_versions = 
ch_versions.mix(CRAM_PREPARE_SAMTOOLS_BEDTOOLS.out.versions) + def ch_single_beds = CRAM_PREPARE_SAMTOOLS_BEDTOOLS.out.ready_beds // // Split the BED files // + def ch_split_cram_bam = Channel.empty() + if(create_bam_files) { + ch_split_cram_bam = CRAM_PREPARE_SAMTOOLS_BEDTOOLS.out.ready_crams + .join(CRAM_PREPARE_SAMTOOLS_BEDTOOLS.out.ready_bams, failOnDuplicate:true, failOnMismatch:true) + } else { + ch_split_cram_bam = CRAM_PREPARE_SAMTOOLS_BEDTOOLS.out.ready_crams + } + INPUT_SPLIT_BEDTOOLS( - CRAM_PREPARE_SAMTOOLS_BEDTOOLS.out.ready_beds.map { meta, bed -> + ch_single_beds.map { meta, bed -> [meta, bed, scatter_count] }, - CRAM_PREPARE_SAMTOOLS_BEDTOOLS.out.ready_crams + ch_split_cram_bam ) ch_versions = ch_versions.mix(INPUT_SPLIT_BEDTOOLS.out.versions) - ch_calls = Channel.empty() + def ch_caller_inputs = INPUT_SPLIT_BEDTOOLS.out.split + .multiMap { meta, cram, crai, bam=[], bai=[], bed -> + cram: [meta, cram, crai, bed] + bam: [meta, bam, bai, bed] + } + def ch_calls = Channel.empty() + def ch_gvcf_reports = Channel.empty() if("haplotypecaller" in callers) { // // Call variants with GATK4 HaplotypeCaller // - CRAM_CALL_GENOTYPE_GATK4( - INPUT_SPLIT_BEDTOOLS.out.split.filter { meta, cram, crai, bed -> + CRAM_CALL_GATK4( + ch_caller_inputs.cram.filter { meta, _cram, _crai, _bed -> // Filter out the entries that already have a GVCF meta.type == "cram" }, - ch_gvcfs_ready, ch_fasta_ready, ch_fai_ready, ch_dict_ready, ch_strtablefile_ready, ch_dbsnp_ready, ch_dbsnp_tbi_ready, - dragstr, - only_call, - only_merge, - filter, - scatter_count + dragstr ) - ch_versions = ch_versions.mix(CRAM_CALL_GENOTYPE_GATK4.out.versions) - ch_reports = ch_reports.mix(CRAM_CALL_GENOTYPE_GATK4.out.reports) + ch_gvcfs_ready = ch_gvcfs_ready.mix(CRAM_CALL_GATK4.out.gvcfs) + ch_versions = ch_versions.mix(CRAM_CALL_GATK4.out.versions) + ch_reports = ch_reports.mix(CRAM_CALL_GATK4.out.reports.map { _meta, report -> report }) + ch_gvcf_reports = ch_gvcf_reports.mix(CRAM_CALL_GATK4.out.reports) + } - ch_calls = ch_calls.mix(CRAM_CALL_GENOTYPE_GATK4.out.vcfs) + if("elprep" in callers) { + // + // Call variants with Elprep + // + + BAM_CALL_ELPREP( + ch_caller_inputs.bam.filter { meta, _bam, _bai, _bed -> + // Filter out the entries that already have a GVCF + meta.type == "cram" + }, + ch_elfasta_ready, + ch_elsites, + ch_dbsnp_ready, + ch_dbsnp_tbi_ready + ) + ch_gvcfs_ready = ch_gvcfs_ready.mix(BAM_CALL_ELPREP.out.gvcfs) + ch_versions = ch_versions.mix(BAM_CALL_ELPREP.out.versions) + ch_reports = ch_reports.mix(BAM_CALL_ELPREP.out.reports.map { _meta, report -> report }) + ch_gvcf_reports = ch_gvcf_reports.mix(BAM_CALL_ELPREP.out.reports) } @@ -440,69 +497,102 @@ workflow GERMLINE { // Call variants with VarDict // - CRAM_CALL_VARDICTJAVA( - CRAM_PREPARE_SAMTOOLS_BEDTOOLS.out.ready_crams, - INPUT_SPLIT_BEDTOOLS.out.split, + BAM_CALL_VARDICTJAVA( + ch_caller_inputs.bam, ch_fasta_ready, ch_fai_ready, ch_dbsnp_ready, - ch_dbsnp_tbi_ready, - filter + ch_dbsnp_tbi_ready ) - ch_versions = ch_versions.mix(CRAM_CALL_VARDICTJAVA.out.versions) + ch_versions = ch_versions.mix(BAM_CALL_VARDICTJAVA.out.versions) - ch_calls = ch_calls.mix(CRAM_CALL_VARDICTJAVA.out.vcfs) + ch_calls = ch_calls.mix(BAM_CALL_VARDICTJAVA.out.vcfs) } - ch_calls - .map { meta, vcf, tbi -> - def new_meta = meta - meta.subMap(["type", "vardict_min_af"]) - [ new_meta, vcf, tbi ] - } - .set { ch_called_variants } - - BCFTOOLS_STATS( - ch_called_variants, - [[],[]], - [[],[]], - [[],[]], - [[],[]], - [[],[]] + // Stop pipeline execution when only calls 
should happen + def ch_gvcfs_final = ch_gvcfs_ready.filter { !only_call } + + GVCF_JOINT_GENOTYPE_GATK4( + ch_gvcfs_final, + ch_fasta_ready, + ch_fai_ready, + ch_dict_ready, + ch_dbsnp_ready, + ch_dbsnp_tbi_ready, + only_merge, + scatter_count ) - ch_versions = ch_versions.mix(BCFTOOLS_STATS.out.versions.first()) - ch_reports = ch_reports.mix(BCFTOOLS_STATS.out.stats.collect { meta, report -> report }) + ch_versions = ch_versions.mix(GVCF_JOINT_GENOTYPE_GATK4.out.versions) + ch_calls = ch_calls.mix(GVCF_JOINT_GENOTYPE_GATK4.out.vcfs) + def ch_joint_beds = GVCF_JOINT_GENOTYPE_GATK4.out.beds + def ch_final_genomicsdb = GVCF_JOINT_GENOTYPE_GATK4.out.genomicsdb + + def ch_final_vcfs = Channel.empty() + def ch_final_dbs = Channel.empty() + def ch_final_peds = Channel.empty() + def ch_final_reports = Channel.empty() + def ch_final_automap = Channel.empty() + def ch_final_updio = Channel.empty() + def ch_final_validation = Channel.empty() + + if (!only_call && !only_merge) { + def ch_called_variants = ch_calls + .map { meta, vcf, tbi -> + def new_meta = meta - meta.subMap(["type", "vardict_min_af"]) + [ new_meta, vcf, tbi ] + } - ch_normalized_variants = Channel.empty() - if(normalize) { - BCFTOOLS_NORM( + BCFTOOLS_STATS( ch_called_variants, - ch_fasta_ready, + [[],[]], + [[],[]], + [[],[]], + [[],[]], + [[],[]] ) - ch_versions = ch_versions.mix(BCFTOOLS_NORM.out.versions.first()) + ch_versions = ch_versions.mix(BCFTOOLS_STATS.out.versions.first()) + ch_final_reports = BCFTOOLS_STATS.out.stats + ch_reports = ch_reports.mix(ch_final_reports.collect { _meta, report -> report }) + + def ch_filtered_variants = Channel.empty() + if(filter) { + VCF_FILTER_BCFTOOLS( + ch_called_variants, + true + ) + ch_versions = ch_versions.mix(VCF_FILTER_BCFTOOLS.out.versions) + ch_filtered_variants = VCF_FILTER_BCFTOOLS.out.vcfs + } else { + ch_filtered_variants = ch_called_variants + } - TABIX_NORMALIZE( - BCFTOOLS_NORM.out.vcf - ) - ch_versions = ch_versions.mix(TABIX_NORMALIZE.out.versions.first()) + def ch_normalized_variants = Channel.empty() + if(normalize) { + BCFTOOLS_NORM( + ch_filtered_variants, + ch_fasta_ready, + ) + ch_versions = ch_versions.mix(BCFTOOLS_NORM.out.versions.first()) - BCFTOOLS_NORM.out.vcf - .join(TABIX_NORMALIZE.out.tbi, failOnDuplicate:true, failOnMismatch:true) - .set { ch_normalized_variants } - } else { - ch_called_variants.set { ch_normalized_variants } - } + TABIX_NORMALIZE( + BCFTOOLS_NORM.out.vcf + ) + ch_versions = ch_versions.mix(TABIX_NORMALIZE.out.versions.first()) - if(!only_merge && !only_call) { + ch_normalized_variants = BCFTOOLS_NORM.out.vcf + .join(TABIX_NORMALIZE.out.tbi, failOnDuplicate:true, failOnMismatch:true) + } else { + ch_normalized_variants = ch_filtered_variants + } // // Preprocess the PED channel // - ch_normalized_variants - .map { meta, vcf, tbi -> + def ch_somalier_input = ch_normalized_variants + .map { meta, _vcf, _tbi -> [ meta, pedFiles.containsKey(meta.family) ? 
pedFiles[meta.family] : [] ] } - .set { ch_somalier_input } // // Run relation tests with somalier @@ -516,40 +606,41 @@ workflow GERMLINE { ch_somalier_input ) ch_versions = ch_versions.mix(VCF_EXTRACT_RELATE_SOMALIER.out.versions) + ch_final_peds = VCF_EXTRACT_RELATE_SOMALIER.out.peds + ch_final_reports = ch_final_reports.mix(VCF_EXTRACT_RELATE_SOMALIER.out.html) + ch_reports = ch_reports.mix(VCF_EXTRACT_RELATE_SOMALIER.out.pairs_tsv.map { _meta, report -> report }) + ch_reports = ch_reports.mix(VCF_EXTRACT_RELATE_SOMALIER.out.samples_tsv.map { _meta, report -> report }) // // Add PED headers to the VCFs // - ch_ped_vcfs = Channel.empty() + def ch_ped_vcfs = Channel.empty() if(add_ped){ VCF_PED_RTGTOOLS( ch_normalized_variants, - VCF_EXTRACT_RELATE_SOMALIER.out.peds + ch_final_peds ) ch_versions = ch_versions.mix(VCF_PED_RTGTOOLS.out.versions) - VCF_PED_RTGTOOLS.out.ped_vcfs - .set { ch_ped_vcfs } + ch_ped_vcfs = VCF_PED_RTGTOOLS.out.ped_vcfs } else { - ch_normalized_variants - .map { meta, vcf, tbi=[] -> + ch_ped_vcfs = ch_normalized_variants + .map { meta, vcf, _tbi=[] -> [ meta, vcf ] } - .set { ch_ped_vcfs } } // // Annotation of the variants and creation of Gemini-compatible database files // - ch_annotation_output = Channel.empty() + def ch_annotation_output = Channel.empty() if (annotate) { VCF_ANNOTATION( ch_ped_vcfs, ch_fasta_ready, - ch_fai_ready, ch_vep_cache_ready, ch_vep_extra_files, ch_vcfanno_config, @@ -564,13 +655,11 @@ workflow GERMLINE { ch_versions = ch_versions.mix(VCF_ANNOTATION.out.versions) ch_reports = ch_reports.mix(VCF_ANNOTATION.out.reports) - VCF_ANNOTATION.out.annotated_vcfs.set { ch_annotation_output } + ch_annotation_output = VCF_ANNOTATION.out.annotated_vcfs } else { - ch_ped_vcfs.set { ch_annotation_output } + ch_annotation_output = ch_ped_vcfs } - ch_annotation_output.dump(tag:'annotation_output', pretty:true) - // // Tabix the resulting VCF // @@ -580,17 +669,15 @@ workflow GERMLINE { ) ch_versions = ch_versions.mix(TABIX_FINAL.out.versions.first()) - ch_annotation_output + ch_final_vcfs = ch_annotation_output .join(TABIX_FINAL.out.tbi, failOnDuplicate:true, failOnMismatch:true) - .set { ch_final_vcfs } // // Validate the found variants // if (validate){ - - ch_input.truth_variants + def ch_truths_input = ch_input.truth_variants .map { meta, vcf, tbi, bed -> def new_meta = meta - meta.subMap("duplicate_count") [ groupKey(new_meta, meta.duplicate_count), vcf, tbi, bed ] @@ -603,16 +690,15 @@ workflow GERMLINE { def one_bed = bed.find { bed_file -> bed_file != [] } ?: [] [ meta, one_vcf, one_tbi, one_bed ] } - .branch { meta, vcf, tbi, bed -> + .branch { _meta, vcf, tbi, _bed -> no_vcf: !vcf tbi: tbi no_tbi: !tbi } - .set { ch_truths_input } // Create truth VCF indices if none were given TABIX_TRUTH( - ch_truths_input.no_tbi.map { meta, vcf, tbi, bed -> + ch_truths_input.no_tbi.map { meta, vcf, _tbi, _bed -> [ meta, vcf ] } ) @@ -620,7 +706,7 @@ workflow GERMLINE { ch_truths_input.no_tbi .join(TABIX_TRUTH.out.tbi, failOnDuplicate:true, failOnMismatch:true) - .map { meta, vcf, empty, bed, tbi -> + .map { meta, vcf, _empty, bed, tbi -> [ meta, vcf, tbi, bed ] } .mix(ch_truths_input.tbi) @@ -630,9 +716,9 @@ workflow GERMLINE { def new_meta = meta + [caller: caller] [ new_meta, vcf, tbi, bed ] } - .set { ch_truths } + .set { ch_truths } // Set needs to be used here due to some Nextflow bug - ch_final_vcfs + def ch_validation_input = ch_final_vcfs .map { meta, vcf, tbi -> def new_meta = meta - meta.subMap("family_samples") [ new_meta, vcf, tbi, 
meta.family_samples.tokenize(",") ] @@ -648,24 +734,59 @@ workflow GERMLINE { [ new_meta, vcf, tbi ] } .join(ch_truths, failOnMismatch:true, failOnDuplicate:true) - .filter { meta, vcf, tbi, truth_vcf, truth_tbi, truth_bed -> + .filter { _meta, _vcf, _tbi, truth_vcf, _truth_tbi, _truth_bed -> // Filter out all samples that have no truth VCF truth_vcf != [] } .multiMap { meta, vcf, tbi, truth_vcf, truth_tbi, truth_bed -> vcfs: [meta, vcf, tbi, truth_vcf, truth_tbi] - bed: [meta, truth_bed, []] + bed: [meta, truth_bed] } - .set { ch_validation_input } + + ch_single_beds + .combine(callers) + .map { meta, bed, caller -> + def new_meta = [ + id:meta.id, + sample:meta.sample, + family:meta.family, + caller:caller + ] + [ new_meta, bed ] + } + .join(ch_validation_input.bed, failOnMismatch:true, failOnDuplicate:true) + .map { meta, regions, truth -> + [ meta, truth, regions ] + } + .set { ch_validation_regions } // Set needs to be used here due to some Nextflow bug VCF_VALIDATE_SMALL_VARIANTS( ch_validation_input.vcfs, - ch_validation_input.bed, - ch_fasta_ready, - ch_fai_ready, + ch_validation_regions, ch_sdf_ready.collect() ) ch_versions = ch_versions.mix(VCF_VALIDATE_SMALL_VARIANTS.out.versions) + + ch_final_validation = VCF_VALIDATE_SMALL_VARIANTS.out.vcfeval_true_positive_vcf.mix( + VCF_VALIDATE_SMALL_VARIANTS.out.vcfeval_true_positive_vcf_tbi, + VCF_VALIDATE_SMALL_VARIANTS.out.vcfeval_false_negative_vcf, + VCF_VALIDATE_SMALL_VARIANTS.out.vcfeval_false_negative_vcf_tbi, + VCF_VALIDATE_SMALL_VARIANTS.out.vcfeval_false_positive_vcf, + VCF_VALIDATE_SMALL_VARIANTS.out.vcfeval_false_positive_vcf_tbi, + VCF_VALIDATE_SMALL_VARIANTS.out.vcfeval_true_positive_baseline_vcf, + VCF_VALIDATE_SMALL_VARIANTS.out.vcfeval_true_positive_baseline_vcf_tbi, + VCF_VALIDATE_SMALL_VARIANTS.out.vcfeval_summary, + VCF_VALIDATE_SMALL_VARIANTS.out.vcfeval_phasing, + VCF_VALIDATE_SMALL_VARIANTS.out.vcfeval_snp_roc, + VCF_VALIDATE_SMALL_VARIANTS.out.vcfeval_non_snp_roc, + VCF_VALIDATE_SMALL_VARIANTS.out.vcfeval_weighted_roc, + VCF_VALIDATE_SMALL_VARIANTS.out.rtgtools_snp_png_rocplot, + VCF_VALIDATE_SMALL_VARIANTS.out.rtgtools_non_snp_png_rocplot, + VCF_VALIDATE_SMALL_VARIANTS.out.rtgtools_weighted_png_rocplot, + VCF_VALIDATE_SMALL_VARIANTS.out.rtgtools_snp_svg_rocplot, + VCF_VALIDATE_SMALL_VARIANTS.out.rtgtools_non_snp_svg_rocplot, + VCF_VALIDATE_SMALL_VARIANTS.out.rtgtools_weighted_svg_rocplot + ) } // @@ -673,20 +794,17 @@ workflow GERMLINE { // if(gemini){ - CustomChannelOperators.joinOnKeys( - ch_final_vcfs.map { meta, vcf, tbi -> [ meta, vcf ]}, - VCF_EXTRACT_RELATE_SOMALIER.out.peds, - ['id', 'family', 'family_samples'] - ) - .dump(tag:'vcf2db_input', pretty:true) - .set { ch_vcf2db_input } + def ch_vcf2db_input = CustomChannelOperators.joinOnKeys( + ch_final_vcfs.map { meta, vcf, _tbi -> [ meta, vcf ]}, + ch_final_peds, + ['id', 'family', 'family_samples'] + ) VCF2DB( ch_vcf2db_input ) ch_versions = ch_versions.mix(VCF2DB.out.versions.first()) - - VCF2DB.out.db.dump(tag:'vcf2db_output', pretty:true) + ch_final_dbs = VCF2DB.out.db } // @@ -696,10 +814,11 @@ workflow GERMLINE { if(updio) { VCF_UPD_UPDIO( ch_final_vcfs, - VCF_EXTRACT_RELATE_SOMALIER.out.peds, + ch_final_peds, ch_updio_common_cnvs ) - ch_versions = ch_versions.mix(VCF_UPD_UPDIO.out.versions.first()) + ch_versions = ch_versions.mix(VCF_UPD_UPDIO.out.versions) + ch_final_updio = VCF_UPD_UPDIO.out.updio } // @@ -713,50 +832,52 @@ workflow GERMLINE { ch_automap_panel, genome ) - ch_versions = ch_versions.mix(VCF_ROH_AUTOMAP.out.versions.first()) + 
ch_versions = ch_versions.mix(VCF_ROH_AUTOMAP.out.versions) + ch_final_automap = VCF_ROH_AUTOMAP.out.automap } } // // Collate and save software versions // - softwareVersionsToYAML(ch_versions) - .collectFile(storeDir: "${outdir}/pipeline_info", name: 'nf_core_pipeline_software_mqc_versions.yml', sort: true, newLine: true) - .set { ch_collated_versions } + def ch_collated_versions = softwareVersionsToYAML(ch_versions) + .collectFile( + storeDir: "${outdir}/${params.unique_out}", + name: '' + 'pipeline_software_' + 'mqc_' + 'versions.yml', + sort: true, + newLine: true + ) // // Perform multiQC on all QC data // - ch_multiqc_config = Channel.fromPath( - "$projectDir/assets/multiqc_config.yml", checkIfExists: true) - ch_multiqc_custom_config = multiqc_config ? - Channel.fromPath(multiqc_config, checkIfExists: true) : - Channel.empty() - ch_multiqc_logo = multiqc_logo ? - Channel.fromPath(multiqc_logo, checkIfExists: true) : - Channel.empty() - - summary_params = paramsSummaryMap( - workflow, parameters_schema: "nextflow_schema.json") - ch_workflow_summary = Channel.value(paramsSummaryMultiqc(summary_params)) - - ch_multiqc_custom_methods_description = multiqc_methods_description ? - file(multiqc_methods_description, checkIfExists: true) : - file("$projectDir/assets/methods_description_template.yml", checkIfExists: true) - ch_methods_description = Channel.value( - methodsDescriptionText(ch_multiqc_custom_methods_description)) - - ch_multiqc_files = ch_multiqc_files.mix( - ch_workflow_summary.collectFile(name: 'workflow_summary_mqc.yaml')) - ch_multiqc_files = ch_multiqc_files.mix(ch_collated_versions) - ch_multiqc_files = ch_multiqc_files.mix( - ch_methods_description.collectFile( - name: 'methods_description_mqc.yaml', - sort: false - ) - ) - ch_multiqc_files = ch_multiqc_files.mix(ch_reports) + def ch_multiqc_config = Channel.fromPath( + "$projectDir/assets/multiqc_config.yml", checkIfExists: true) + def ch_multiqc_custom_config = multiqc_config ? + Channel.fromPath(multiqc_config, checkIfExists: true) : + Channel.empty() + def ch_multiqc_logo = multiqc_logo ? + Channel.fromPath(multiqc_logo, checkIfExists: true) : + Channel.empty() + + def summary_params = paramsSummaryMap(workflow, parameters_schema: "nextflow_schema.json") + def ch_workflow_summary = Channel.value(paramsSummaryMultiqc(summary_params)) + def ch_multiqc_custom_methods_description = multiqc_methods_description ? 
+ file(multiqc_methods_description, checkIfExists: true) : + file("$projectDir/assets/methods_description_template.yml", checkIfExists: true) + def ch_methods_description = Channel.value(methodsDescriptionText(ch_multiqc_custom_methods_description)) + + ch_multiqc_files = ch_multiqc_files.mix( + ch_workflow_summary.collectFile(name: 'workflow_summary_mqc.yaml'), + ch_collated_versions, + ch_methods_description.collectFile( + name: 'methods_description_mqc.yaml', + sort: false + ), + ch_reports + ) + MULTIQC ( ch_multiqc_files.collect(), @@ -768,8 +889,21 @@ workflow GERMLINE { ) emit: - multiqc_report = MULTIQC.out.report.toList() // channel: /path/to/multiqc_report.html - versions = ch_versions // channel: [ path(versions.yml) ] + gvcfs = ch_gvcfs_ready // channel: [ val(meta), path(gvcf), path(tbi) ] + genomicsdb = ch_final_genomicsdb // channel: [ val(meta), path(genomicsdb) ] + vcfs = ch_final_vcfs // channel: [ val(meta), path(vcf), path(tbi) ] + gemini = ch_final_dbs // channel: [ val(meta), path(db) ] + peds = ch_final_peds // channel: [ val(meta), path(ped) ] + single_beds = ch_single_beds // channel: [ val(meta), path(bed) ] + joint_beds = ch_joint_beds // channel: [ val(meta), path(bed) ] + final_reports = ch_final_reports // channel: [ val(meta), path(report) ] + gvcf_reports = ch_gvcf_reports // channel: [ val(meta), path(report) ] + automap = ch_final_automap // channel: [ val(meta), path(automap) ] + updio = ch_final_updio // channel: [ val(meta), path(updio) ] + validation = ch_final_validation // channel: [ val(meta), path(file) ] + multiqc_report = MULTIQC.out.report.toList() // channel: /path/to/multiqc_report.html + multiqc_data = MULTIQC.out.data // channel: /path/to/multiqc_data + versions = ch_versions // channel: [ path(versions.yml) ] } /*