This is the updated pipeline for APHA's processing of Mycobacterium bovis WGS data. BovTB-nf is designed to process a batch (one or more samples) of paired-end FASTQ files generated on an Illumina sequencer. It first removes duplicate reads from the dataset (FastUniq), then trims the unique reads based on base-call quality and the presence of adapters (Trimmomatic). The trimmed reads are then mapped to the M. bovis AF2122 reference genome and variants are called (bwa/samtools/bcftools).
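As a rough illustration of the stages described above, the following sketch prints (it does not run) one possible command sequence for a single sample. It is not taken from the BovTB-nf source: the file names, the `trimmomatic.jar` and `adapters.fasta` paths, and the trimming thresholds are placeholder assumptions, and the real pipeline may use different options.

```shell
#!/bin/sh
# Print (without running) an illustrative per-sample command sequence.
# ASSUMPTION: all file names, the jar path, the adapter file and the
# trimming thresholds are placeholders, not the values used by BovTB-nf.
print_steps() {
    s=$1      # hypothetical sample prefix, e.g. "sample1"
    ref=$2    # hypothetical reference FASTA, e.g. "AF2122.fasta"
    cat <<EOF
# 1. Deduplicate read pairs (FastUniq takes a list file naming both uncompressed FASTQ files)
printf '%s\n' ${s}_R1.fastq ${s}_R2.fastq > ${s}.list
fastuniq -i ${s}.list -o ${s}_uniq_R1.fastq -p ${s}_uniq_R2.fastq
# 2. Trim adapters and low-quality bases
java -jar trimmomatic.jar PE ${s}_uniq_R1.fastq ${s}_uniq_R2.fastq ${s}_trim_R1.fastq ${s}_unpaired_R1.fastq ${s}_trim_R2.fastq ${s}_unpaired_R2.fastq ILLUMINACLIP:adapters.fasta:2:30:10 SLIDINGWINDOW:4:20 MINLEN:36
# 3. Map to the reference, then sort and index the alignments
bwa index ${ref}
bwa mem ${ref} ${s}_trim_R1.fastq ${s}_trim_R2.fastq | samtools sort -o ${s}.bam -
samtools index ${s}.bam
# 4. Call variants
bcftools mpileup -f ${ref} ${s}.bam | bcftools call -mv -Oz -o ${s}.vcf.gz
EOF
}

print_steps sample1 AF2122.fasta
```

Printing the commands rather than executing them keeps the sketch safe to run on a machine without the tools installed. In practice Nextflow handles this orchestration for every sample in the batch.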
It is built to run with Nextflow and relies mostly on standard bioinformatics tools. The external dependencies are:
- FastUniq
- Trimmomatic
- bwa
- samtools and bcftools
- bedtools
- Kraken2 (and database)
- Bracken
Nextflow itself is also a prerequisite and should be installed as described in the Nextflow documentation.
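A quick way to see which prerequisites are already available is to look for each executable on your PATH. The sketch below is a minimal check, not part of the pipeline. The executable names are assumptions (Trimmomatic in particular is often installed as a jar file rather than a `trimmomatic` command), and tools whose locations are set in the Nextflow config file need not be on PATH at all.

```shell
#!/bin/sh
# Report which of the named executables cannot be found on PATH.
# Prints one missing name per line; returns 1 if anything is missing.
missing_tools() {
    status=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
            status=1
        fi
    done
    return "$status"
}

# ASSUMPTION: these executable names may differ on your installation.
if missing=$(missing_tools nextflow fastuniq trimmomatic bwa samtools bcftools bedtools kraken2 bracken); then
    echo "All tools found on PATH"
else
    echo "Not found on PATH:" $missing
fi
```

The Kraken2 database is a separate download and is not covered by this check.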
If you have the dependencies installed, the pipeline can be run by simply typing:
nextflow run ellisrichardj/BovTB-nf
Alternatively, clone the repository:
git clone
If required, the repository includes a simple script for installing the dependencies, which will also update the Nextflow config file with their locations.
Alternatively, the pipeline can be run in an Ubuntu image on Docker.
To build the image:
docker build /PATH/TO/REPO/ -t bov-tb
To run a Docker container in bash:
docker run --rm -it bov-tb
In its simplest form, just run the Nextflow process from the directory containing the FASTQ files:
cd /path/to/Data
nextflow run BovTB-nf
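Because the pipeline expects paired-end data, it can help to confirm that every forward-read file in the data directory has a matching reverse-read file before launching. The sketch below is one way to do that. It assumes Illumina-style file names containing `_R1` and `_R2` and ending in `.fastq.gz`; the pattern BovTB-nf actually uses to find its input may differ, so treat `list_pairs` as a hypothetical helper.

```shell
#!/bin/sh
# List sample prefixes in a directory that have both an R1 and an R2 file.
# ASSUMPTION: files are named <sample>_R1<rest>.fastq.gz and
# <sample>_R2<rest>.fastq.gz, e.g. A_S1_R1_001.fastq.gz / A_S1_R2_001.fastq.gz.
list_pairs() {
    dir=${1:-.}
    for r1 in "$dir"/*_R1*.fastq.gz; do
        [ -e "$r1" ] || continue            # the glob matched nothing
        name=$(basename "$r1")
        sample=${name%%_R1*}                # text before the first "_R1"
        rest=${name#"$sample"_R1}           # text after it
        r2="$dir/${sample}_R2${rest}"
        if [ -e "$r2" ]; then
            echo "$sample"
        else
            echo "warning: no R2 file for $sample" >&2
        fi
    done
}

# Check the current directory, i.e. the one containing the FASTQ files.
list_pairs .
```

Complete pairs are printed on standard output and unpaired files are reported on standard error, so the two can be captured separately.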