
New convert_reads workflow for large files

@oleraj released this 11 Feb 18:14

This point release adds a new script, merge_tally.pl, which replaces the functionality of merge_tally_overlapping_regions.pl and also makes it possible to run convert_reads on very large files containing millions of reads. The procedure is to split the fasta file into separate files (roughly 5,000-10,000 reads per file works well), run convert_reads_to_amino_acid.pl on each file, and then merge the output with merge_tally.pl, which recreates the nucleotide, codon, amino acid, and merged tally files. (See the tutorial for example commands.)
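The split step can be done with any tool that chunks a FASTA file by record count. Below is a minimal, hypothetical Python sketch of that step only; the function name, chunk size, and file names are illustrative and not part of the pipeline. Each resulting chunk would then be run through convert_reads_to_amino_acid.pl and the outputs combined with merge_tally.pl as described in the tutorial.

```python
# Minimal sketch: split a FASTA file into chunks of ~5,000 reads each.
# Assumes a well-formed FASTA input that begins with a ">" header line.
def split_fasta(fasta_path, reads_per_chunk=5000, out_prefix="chunk"):
    chunk_idx, read_count, out = 0, 0, None
    with open(fasta_path) as fh:
        for line in fh:
            if line.startswith(">"):
                # Start a new chunk file every reads_per_chunk records.
                if read_count % reads_per_chunk == 0:
                    if out:
                        out.close()
                    chunk_idx += 1
                    out = open(f"{out_prefix}_{chunk_idx:04d}.fasta", "w")
                read_count += 1
            out.write(line)
    if out:
        out.close()
    return chunk_idx

if __name__ == "__main__":
    n = split_fasta("reads.fasta", reads_per_chunk=5000)
    print(f"Wrote {n} chunk files")
```

Keeping chunks in this size range keeps the per-file memory and runtime of convert_reads_to_amino_acid.pl manageable while producing a number of intermediate files that merge_tally.pl can combine in one pass.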

As part of this change, the convert_reads script was refactored to move several subroutines into a module, primerid.pm, so that they are accessible to merge_tally.pl.