A collection of Singularity recipes for bioinformatics pipelines.

Grelot/bioinfo_singularity_recipes


bioinfo_singularity_recipes

Singularity recipes I cooked for bioinformatics pipelines

Pierre-Edouard Guerin, 2019-2021


We provide ready-to-run Singularity containers.


1. Install Singularity

See https://www.sylabs.io/docs/ for instructions to install Singularity.
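
Before pulling any container, it can help to confirm that Singularity is actually reachable from your shell. The snippet below is a minimal sketch; `have_singularity` is a hypothetical helper name introduced here, not something shipped with this repository.

```shell
# Return 0 if a `singularity` command is available, 1 otherwise.
# (have_singularity is an illustrative helper, not part of Singularity.)
have_singularity() {
    command -v singularity >/dev/null 2>&1
}

if have_singularity; then
    # Print the installed version.
    singularity --version
else
    echo "singularity not found on PATH; install it before pulling the containers" >&2
fi
```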

2. Obitools

  • The OBITools package 1.0 is a set of programs specifically designed for analyzing NGS data in a DNA metabarcoding context, taking into account taxonomic information.
  • ecoPrimers 1.0.1 is a software that finds primers from a set of sequences.
  • ecoPCR 0.5 simulates PCR in silico.
  • EMBOSS is "The European Molecular Biology Open Software Suite". It is a free, open-source software analysis package specially developed for the needs of the molecular biology user community.

2.1 Download the Obitools container

singularity pull --name obitools.simg shub://Grelot/bioinfo_singularity_recipes:obitools

Alternatively, if you're using the Montpellier Bioinformatics Biodiversity platform, download this custom container:

singularity pull --name obitools.simg shub://Grelot/bioinfo_singularity_recipes:obitoolsmbb
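
If you rerun setup scripts often, you may want to skip the download when the image file is already there. The following is a sketch under that assumption; `pull_if_missing` is a hypothetical helper, and it only checks that the file exists, not that it is complete or up to date.

```shell
# pull_if_missing IMAGE URI: download URI to IMAGE unless IMAGE already exists.
# (Illustrative helper; it does not verify the integrity of an existing file.)
pull_if_missing() {
    image="$1"
    uri="$2"
    if [ -f "$image" ]; then
        echo "$image already present, skipping download"
    else
        singularity pull --name "$image" "$uri"
    fi
}

# Example call (needs network access and Singularity):
# pull_if_missing obitools.simg shub://Grelot/bioinfo_singularity_recipes:obitools
```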

2.2 Run the Obitools container

singularity run obitools.simg

It should output:

Opening container...ubuntu xenial: OBITOOLS, ecoPRIMERS, ecoPCR, EMBOSS

2.3 Execute some programs from the container

## OBITOOLS: illuminapairedend 
singularity exec obitools.simg illuminapairedend --help
## OBITOOLS: ngsfilter
singularity exec obitools.simg ngsfilter --help
## OBITOOLS: obigrep
singularity exec obitools.simg obigrep --help
## OBITOOLS: obiclean
singularity exec obitools.simg obiclean --help
## OBITOOLS: ecotag
singularity exec obitools.simg ecotag --help
## ecoPCR
singularity exec obitools.simg ecoPCR --help
## ecoPrimers
singularity exec obitools.simg ecoPrimers --help
## seqret
singularity exec obitools.simg seqret --help
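
The individual `--help` calls above can be wrapped in a loop to get a quick pass/fail summary. This is only a sketch: `check_tools` is a hypothetical helper, and it assumes each program exits with status 0 when asked for help, which may not hold for every tool.

```shell
# check_tools IMAGE TOOL...: run "TOOL --help" inside IMAGE for each tool
# and report whether the command exited successfully.
# Assumption: a zero exit status on --help means the tool is usable.
check_tools() {
    image="$1"
    shift
    for tool in "$@"; do
        if singularity exec "$image" "$tool" --help >/dev/null 2>&1; then
            echo "OK      $tool"
        else
            echo "FAILED  $tool"
        fi
    done
}

check_tools obitools.simg illuminapairedend ngsfilter obigrep obiclean \
    ecotag ecoPCR ecoPrimers seqret
```

A `FAILED` line means only that the help call returned a non-zero status (or that Singularity or the image is missing); run the command by hand to see the actual message.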

3. Useful programs for eDNA analysis

  • vsearch 2.13.4 supports de novo and reference based chimera detection, clustering, full-length and prefix dereplication, rereplication, reverse complementation, masking, all-vs-all pairwise global alignment, exact and global alignment searching, shuffling, subsampling and sorting. It also supports FASTQ file analysis, filtering, conversion and merging of paired-end reads.
  • pear 0.9.11 is an ultrafast, memory-efficient and highly accurate paired-end read merger.
  • fastq-join 1.3.1 joins two paired-end reads on the overlapping ends.
  • pandaseq 2.11 aligns Illumina reads, optionally with PCR primers embedded in the sequence, and reconstructs an overlapping sequence.
  • jellyfish 2.2.6 reads FASTA and multi-FASTA files containing DNA sequences. It outputs its k-mer counts.
  • casper 0.8.2 (Context-Aware Scheme for Paired-End Read) is a state-of-the-art paired-end read merging tool.
  • FLASh 1.2.11 (Fast Length Adjustment of SHort reads) is a very fast and accurate software tool to merge paired-end reads from next-generation sequencing experiments.
  • fastq-multx 1.3.1 demultiplexes a FASTQ file. It can auto-determine barcode IDs based on a master set of fields.
  • cutadapt 2.3 removes adapter sequences from high-throughput sequencing reads.
  • SWARM 2.2.2 is a clustering method for amplicon-based studies.
  • Reaper 13.274 is a program for demultiplexing, trimming and filtering short read sequencing data. It can handle barcodes, trim adapter sequences, strip low quality bases and low complexity sequence, and has many more features.
  • TAGcleaner 0.16 detects and trims tag sequences from sequence data.
  • Flexbar 3.0.3 preprocesses high-throughput sequencing data efficiently. It demultiplexes barcoded runs and removes adapter sequences. Several adapter removal presets for Illumina libraries are included.
  • usearch 11.0.667 offers search and clustering algorithms that are often orders of magnitude faster than BLAST.
  • deML 1.0 demultiplexes Illumina sequences.
  • NGmerge merges paired-end reads and removes adapters.
  • FASTP is an ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)

3.1 Download the eDNA analysis container

singularity pull --name ednatools.simg shub://Grelot/bioinfo_singularity_recipes:ednatools

Alternatively, if you're using the Montpellier Bioinformatics Biodiversity platform, download this custom container:

singularity pull --name ednatools.simg shub://Grelot/bioinfo_singularity_recipes:ednatoolsmbb

3.2 Run the eDNA analysis container

singularity run ednatools.simg

It should output:

Opening container...ubuntu beaver: vsearch, PEAR, fastq-join, pandaseq, jellyfish, casper, FLASH, fastq-multx, cutadapt, SWARM, REAPER, tally, minion, swan, tagCleaner, flexbar, usearch, deML, trimmomatic, prinseq, NGmerge, FASTP

3.3 Execute some programs from the container

## vsearch
singularity exec ednatools.simg vsearch -h
## pear
singularity exec ednatools.simg pear -h
## pandaseq
singularity exec ednatools.simg pandaseq -h
## casper
singularity exec ednatools.simg casper -h
## FLASh
singularity exec ednatools.simg flash -h
## fastq-multx
singularity exec ednatools.simg fastq-multx -h
## fastq-join
singularity exec ednatools.simg fastq-join -h
## cutadapt
singularity exec ednatools.simg cutadapt -h
## SWARM
singularity exec ednatools.simg swarm -h
## Reaper
singularity exec ednatools.simg reaper -h
singularity exec ednatools.simg tally -h
singularity exec ednatools.simg minion -h
## TAGcleaner
singularity exec ednatools.simg tagcleaner -h
## Flexbar
singularity exec ednatools.simg flexbar -h
## usearch
singularity exec ednatools.simg usearch
## deML
singularity exec ednatools.simg deML -h
## prinseq
singularity exec ednatools.simg perl /prinseq-lite-0.20.4/prinseq-lite.pl -h
## NGmerge
singularity exec ednatools.simg NGmerge -h
## FASTP
singularity exec ednatools.simg fastp -h
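
Typing `singularity exec ednatools.simg` in front of every command gets repetitive. One possible shortcut, sketched below, is to define small shell wrapper functions. The names `EDNATOOLS_IMAGE` and `in_ednatools` are hypothetical and chosen here for illustration; only three tools are wrapped, and others can be added the same way.

```shell
# Path to the container image; override by exporting EDNATOOLS_IMAGE.
EDNATOOLS_IMAGE="${EDNATOOLS_IMAGE:-ednatools.simg}"

# Run any command inside the eDNA tools container.
in_ednatools() {
    singularity exec "$EDNATOOLS_IMAGE" "$@"
}

# Per-tool wrappers: calling `vsearch ...` now runs vsearch in the container.
vsearch()  { in_ednatools vsearch "$@"; }
cutadapt() { in_ednatools cutadapt "$@"; }
fastp()    { in_ednatools fastp "$@"; }
```

After sourcing these definitions, `vsearch -h` behaves like `singularity exec ednatools.simg vsearch -h`. Note that the wrappers shadow any copy of the same tool installed on the host for the current shell session.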

4. R for metabarcoding analysis

This recipe was written thanks to RPACIB.

R with useful packages for metabarcoding analysis

  • R 3.6.0
  • R packages: tidyverse, rlang, dada2, seqRFLP, phyloseq

Download the container

singularity pull --name ednaR.simg shub://Grelot/bioinfo_singularity_recipes:ednar

Run R from the container

singularity shell ednaR.simg
R

or

singularity exec ednaR.simg R
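
To check that the listed packages load without opening an interactive session, something like the following may work. It is a sketch that assumes `Rscript` is available inside the container (it normally ships with R); `check_r_packages` is a hypothetical helper name.

```shell
# check_r_packages IMAGE PKG...: try to load each R package inside IMAGE.
# Rscript exits with a non-zero status when library() fails.
check_r_packages() {
    image="$1"
    shift
    for pkg in "$@"; do
        if singularity exec "$image" Rscript -e "library($pkg)" >/dev/null 2>&1; then
            echo "OK      $pkg"
        else
            echo "MISSING $pkg"
        fi
    done
}

check_r_packages ednaR.simg tidyverse rlang dada2 seqRFLP phyloseq
```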

5. Grinder

Grinder is a versatile open-source bioinformatic tool to create simulated omic shotgun and amplicon sequence libraries for all main sequencing platforms.

Download the container and run Grinder from the container

singularity pull --name grinder.simg shub://Grelot/bioinfo_singularity_recipes:grindermbb
singularity exec grinder.simg grinder -h

6. JAMP

JAMP is a modular metabarcoding pipeline, integrating different functions from USEARCH, VSEARCH, CUTADAPT and other programs. The pipeline is run as an R package and automatically generates the needed folders and summary statistics.

Download the container

singularity pull --name jamp.simg shub://Grelot/bioinfo_singularity_recipes:jamp

Check dependencies

## open a shell inside the container
singularity shell jamp.simg
## check version of required programs
usearch --version
vsearch --version
cutadapt --version 
R --version
## start R session
R
## check libraries inside R session
library(JAMP)
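
The same dependency checks can be run in one non-interactive command, which is convenient in job scripts. This sketch assumes `sh` and `Rscript` exist inside the container and that each `--version` call exits with status 0; `check_jamp` is a hypothetical helper name.

```shell
# check_jamp [IMAGE]: run all dependency checks inside the JAMP container.
# Returns 0 only if every check succeeds.
check_jamp() {
    image="${1:-jamp.simg}"
    singularity exec "$image" sh -c '
        usearch --version &&
        vsearch --version &&
        cutadapt --version &&
        R --version &&
        Rscript -e "library(JAMP)"
    '
}

if check_jamp jamp.simg >/dev/null 2>&1; then
    echo "JAMP container: all checks passed"
else
    echo "JAMP container: at least one check failed"
fi
```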