nxtrim: Software to remove Nextera Mate Pair adapters and categorise reads according to the orientation implied by the adapter location. This software is not commercially supported.
Boost: we use Boost 1.55.0, but most recent versions are probably fine.
git clone git@github.com:sequencing/NxTrim.git
cd NxTrim
You will also need to point the BOOST_ROOT environment variable at your Boost installation, e.g.:
export BOOST_ROOT=/your/boost/installation
If Boost is installed globally, then
export BOOST_ROOT=/usr/lib/
should work.
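Before building, it can help to confirm that the variable is actually set and points at a real directory. The helper below is a minimal sketch; `check_boost_root` is a hypothetical name and is not part of NxTrim or its build.

```shell
# Hypothetical helper (not part of NxTrim): verify that BOOST_ROOT is set
# and refers to an existing directory before attempting a build.
check_boost_root() {
    if [ -z "${BOOST_ROOT:-}" ]; then
        echo "BOOST_ROOT is not set" >&2
        return 1
    fi
    if [ ! -d "$BOOST_ROOT" ]; then
        echo "BOOST_ROOT=$BOOST_ROOT is not a directory" >&2
        return 1
    fi
    echo "Using Boost from $BOOST_ROOT"
}
```

Calling `check_boost_root` after the `export` line prints the path in use, or returns a non-zero status with a message on stderr.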
####Usage
Trim the data:
nxtrim -1 sample_R1.fastq.gz -2 sample_R2.fastq.gz -O sample
Assemble with Velvet:
velveth output_dir 55 -short -fastq.gz sample.se.fastq.gz -shortPaired2 -fastq.gz sample.pe.fastq.gz -shortPaired3 -fastq.gz sample.mp.fastq.gz -shortPaired4 -fastq.gz sample.unknown.fastq.gz
velvetg output_dir -exp_cov auto -cov_cutoff auto -shortMatePaired4 yes
Assemble with SPAdes:
cat sample.mp.fastq.gz sample.unknown.fastq.gz > sample.allmp.fastq.gz
spades.py -k 21,33,55,77 -t 4 --pe1-s sample.se.fastq.gz --pe2-12 sample.pe.fastq.gz --hqmp3-12 sample.allmp.fastq.gz --hqmp3-fr -o output_dir
Note that we concatenate the unknown and mp libraries for SPAdes. This command is suitable for 2x151bp data; if you have 2x251bp data, use -k 21,33,55,77,127 instead.
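Plain `cat` is enough for the concatenation step because a gzip file may consist of several compressed members placed back to back, and decompressors read them as one stream. The sketch below demonstrates this on two throwaway files; `concat_gz_demo` and the file names are hypothetical and used only for illustration.

```shell
# Illustration only: show that concatenated gzip files decompress to the
# concatenation of their contents. Names here are hypothetical.
concat_gz_demo() {
    workdir=$(mktemp -d)
    printf 'first\n'  | gzip -c > "$workdir/a.gz"
    printf 'second\n' | gzip -c > "$workdir/b.gz"
    # Join the two compressed files without decompressing them.
    cat "$workdir/a.gz" "$workdir/b.gz" > "$workdir/ab.gz"
    # Decompress the joined file; both members are emitted in order.
    gzip -dc "$workdir/ab.gz"
    rm -rf "$workdir"
}
```

Running `concat_gz_demo` prints the contents of the first file followed by the contents of the second.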
The default behaviour expects raw fastq files from a Nextera Mate-Pair library kit in Reverse-Forward orientation. Based on the location of the Nextera adapter sequence (if detected), nxtrim produces four different "virtual libraries":
- mp: read pairs that are large insert-size mate-pairs
- pe: read pairs that are short insert-size paired-end reads
- se: single reads
- unknown: a library of read pairs that are mostly large-insert mate-pairs, but possibly contain a small proportion of paired-end contaminants
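A quick way to see how the input was split across these libraries is to count the reads in each output file. The sketch below assumes the output naming used in the usage commands (`<prefix>.<library>.fastq.gz`) and standard four-line FASTQ records; `count_reads` and `summarise_libraries` are hypothetical helpers, not nxtrim commands. It counts individual reads, so a file that holds both reads of each pair reports twice its number of pairs.

```shell
# Hypothetical helper: count reads in a gzipped FASTQ file,
# assuming exactly four lines per record.
count_reads() {
    gzip -dc "$1" | awk 'END { print NR / 4 }'
}

# Hypothetical helper: print a tab-separated read count for each virtual
# library under the given output prefix (e.g. "sample").
summarise_libraries() {
    prefix=$1
    for lib in mp pe se unknown; do
        file="$prefix.$lib.fastq.gz"
        if [ -f "$file" ]; then
            printf '%s\t%s\n' "$lib" "$(count_reads "$file")"
        else
            printf '%s\t%s\n' "$lib" "missing"
        fi
    done
}
```

For the usage example, `summarise_libraries sample` would print one line per library.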
The trimmer will reverse-complement the reads so that the resulting libraries are in Forward-Reverse orientation. This reverse-complementing can be disabled via the --norc flag.
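For readers unfamiliar with the operation, reverse-complementing a read means reversing its base order and swapping each base for its complement (A with T, C with G). The function below is an illustration of that operation on a single sequence, not nxtrim's implementation; `revcomp` is a hypothetical name, and the sketch assumes the `rev` utility is available.

```shell
# Illustration only: reverse-complement a single DNA sequence.
# Bases other than A, C, G and T (e.g. N) are left unchanged.
revcomp() {
    printf '%s\n' "$1" | rev | tr 'ACGTacgt' 'TGCAtgca'
}
```

For example, `revcomp ACCGT` reverses the sequence to `TGCCA` and then complements it to `ACGGT`.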
####Example data:
