grp-bork/MultiPath

MultiPath workflow

Developed by the Bork Group in collaboration with UFZ
Raise an issue or contact us

See our other Software & Services
The development of this workflow was supported by NFDI4Microbiota.

Description

The MultiPath workflow integrates multi-omics data from microbial species. The workflow was originally developed at the Helmholtz Centre for Environmental Research GmbH (UFZ) within the UC-Multi use case for NFDI4Microbiota. MultiPath is a Nextflow port developed at EMBL Heidelberg, powered by the independent nevermore workflow component library.

Citation

This workflow: DOI


Overview

MultiPath Workflow Schema


Requirements

We recommend running MultiPath with Docker/Singularity. By default, it uses the biocontainers versions of its dependencies (with the exception of bwa/samtools, see below).

Essential/Mandatory

  • Unicycler
  • prodigal
  • salmon
  • carveme
  • memote (note that Nextflow versions >= 23 have problems with the memote container, see Usage)
  • seqtk

Optional

  • bbmap (bbduk, reformat)
  • kraken2
  • sortmeRNA
  • FastQC
  • MultiQC

Usage

Cloud-based Workflow Manager (CloWM)

This workflow will be available on the CloWM platform (coming soon).

Command-Line Interface (CLI)

A workflow run is controlled by environment-specific parameters (see run.config) and study-specific parameters (see params.yml). The parameters in params.yml can also be specified on the command line.

You can clone this repository from GitHub and run it as follows:

git clone https://github.com/grp-bork/MultiPath.git
nextflow run /path/to/MultiPath [-resume] -c /path/to/run.config -params-file /path/to/params.yml

Alternatively, you can have Nextflow pull it from GitHub and run it from the $HOME/.nextflow directory:

nextflow run grp-bork/MultiPath [-resume] -c /path/to/run.config -params-file /path/to/params.yml

Input files

FASTQ files are supported and can be either uncompressed (but shouldn't be!) or compressed with gzip or bzip2. Sample data must be arranged in one directory per sample.
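
If you want to check how your input files are compressed before starting a run, the following is a minimal sketch of such a check. It is not part of MultiPath; the function names `compression_from_header` and `detect_compression` are made up for this example. It relies only on the standard leading bytes of gzip (`1f 8b`) and bzip2 (`BZh`) files, and it does not verify that the content is valid FASTQ.

```python
# Illustrative helper, not part of MultiPath: identify gzip/bzip2 files
# by their leading "magic" bytes instead of trusting the file extension.

GZIP_MAGIC = b"\x1f\x8b"   # first two bytes of every gzip member
BZIP2_MAGIC = b"BZh"       # first three bytes of every bzip2 stream


def compression_from_header(header: bytes) -> str:
    """Classify a file from its first few bytes."""
    if header.startswith(GZIP_MAGIC):
        return "gzip"
    if header.startswith(BZIP2_MAGIC):
        return "bzip2"
    # Anything else is reported as uncompressed; the content is not validated.
    return "uncompressed"


def detect_compression(path: str) -> str:
    """Read the first three bytes of `path` and classify them."""
    with open(path, "rb") as handle:
        return compression_from_header(handle.read(3))
```

Calling `detect_compression` on each file in a sample directory shows which files are still uncompressed and could be gzipped before the run.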

Per-sample input directories

All files in a sample directory will be associated with the name of the sample folder. Paired-end mate files need to have matching prefixes. Mates 1 and 2 can be specified with the suffixes _[12], _R[12], .[12], .R[12]. Lane IDs or other read ID modifiers have to precede the mate identifier. Files whose names contain none of these patterns are treated as single-end. Samples consisting of both single-end and paired-end files are assumed to be paired-end, with all single-end files treated as orphans (quality control survivors).
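
The naming rules above can be sketched in a few lines of Python. This is one reading of the rules, not the code MultiPath itself uses. The function names `classify` and `summarise_sample` are hypothetical, and the accepted file extensions (`.fastq`/`.fq`, optionally followed by `.gz` or `.bz2`) are an assumption, since this README does not list them.

```python
import re
from collections import defaultdict

# Assumption: the README does not list the accepted extensions.
EXT_RE = re.compile(r"\.(fastq|fq)(\.(gz|bz2))?$", re.IGNORECASE)

# Mate suffixes named above: _1/_2, _R1/_R2, .1/.2, .R1/.R2,
# located directly before the file extension.
MATE_RE = re.compile(r"^(?P<prefix>.+)[._]R?(?P<mate>[12])$")


def classify(filename):
    """Return (prefix, mate); mate is 1, 2, or None for single-end files."""
    stem = EXT_RE.sub("", filename)
    match = MATE_RE.match(stem)
    if match:
        return match.group("prefix"), int(match.group("mate"))
    return stem, None


def summarise_sample(filenames):
    """Group the files of one sample directory into mate pairs and singles."""
    pairs = defaultdict(dict)
    singles = []
    for name in filenames:
        prefix, mate = classify(name)
        if mate is None:
            singles.append(name)
        else:
            pairs[prefix][mate] = name
    # A sample with any mate files is treated as paired-end;
    # its single-end files are then orphans.
    library = "paired" if pairs else "single"
    return {"library": library, "pairs": dict(pairs), "singles": singles}
```

For example, `classify("sampleA_L001_R1.fastq.gz")` gives the prefix `sampleA_L001` and mate `1`. Because the lane ID precedes the mate identifier, `sampleA_L001_R2.fastq.gz` ends up with the same prefix and is paired with it.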