Release Plass Release 3-764a3 · soedinglab/plass

Changes since Release 2-c7e35:

At a glance: significant further development of the nucleotide assembler, reduced hard disk requirements for the protein assembler, and many bug fixes.

Updated the MMseqs2 submodule and adapted Plass to several upstream MMseqs2 changes.

Breaking Changes

added reverse complement treatment for nucleotide sequences (plass nuclassemble)
introduced --kmer-per-seq-scale parameter so that good hits of long sequences are not missed: the number of extracted k-mers can now be scaled by a user-defined factor multiplied by the sequence length
changed scoring mode for alignment calculation (--rescore-mode 3)
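
To illustrate the idea behind --kmer-per-seq-scale, here is a minimal sketch of the arithmetic. It assumes the length-dependent term is added to a fixed per-sequence base count; the function name kmers_to_extract, the base value, and the scale value are hypothetical and not taken from the Plass source.

```python
def kmers_to_extract(seq_len: int, base_kmers: int, scale: float) -> int:
    """Hypothetical helper: k-mer budget for a single sequence."""
    # Assumption: budget = fixed base count + scale * sequence length,
    # so longer sequences contribute proportionally more k-mers.
    return base_kmers + int(scale * seq_len)


# Illustrative values only: a short read versus a long contig.
short_read = kmers_to_extract(seq_len=100, base_kmers=60, scale=0.5)
long_contig = kmers_to_extract(seq_len=5000, base_kmers=60, scale=0.5)
print(short_read, long_contig)
```

With a scale of 0 the budget stays constant regardless of length, which is the situation in which hits on long sequences could be missed.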

Features

added stdin support, e.g.: cat reads.fas | plass assemble stdin asm tmp
reduced hard disk requirements by roughly a factor of 12 (--delete-tmp-inc)
added a first, still experimental version of a cycle detector to avoid over-extension in nucleotide assembly
introduced a new header format, which is now consistent for the protein and nucleotide assemblers:
<uniq ID> len:<len> cycle:<0|1> (the cycle field is optional, for the nucleotide case)
introduced new logic for sequences with repeated k-mers: sequences with more than N repeated k-mers are no longer excluded from the assembly entirely; instead, only their repeated k-mers are ignored in the kmermatcher phase. Replaced the --skip-n-repeat parameter with --ignore-multi-kmer
overlaps are still sorted by ScorePerColumn, but the bit score was replaced by the raw score so that the value scales correctly with the overlap length
introduced --min-contig-len parameter to set the minimum length of assembled contigs written to the output (for nucleotide assembly)
added redundancy reduction (for nucleotide assembly) by clustering sequences based on a user-defined threshold (--clust-thr, default 0.97)
Dockerfile now uses Debian slim instead of alpine
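
The new header format listed above can be read back with a few lines of Python. This is a sketch, not part of Plass: it assumes the fields are separated by whitespace as shown in the format line, and the names ContigHeader and parse_header as well as the example IDs are hypothetical.

```python
from typing import NamedTuple, Optional


class ContigHeader(NamedTuple):
    uid: str
    length: int
    cycle: Optional[bool]  # None when the optional cycle field is absent


def parse_header(line: str) -> ContigHeader:
    """Parse '<uniq ID> len:<len> cycle:<0|1>' (cycle is optional)."""
    # Drop a leading FASTA '>' if present, then split on whitespace.
    fields = line.strip().lstrip(">").split()
    uid = fields[0]
    # Remaining tokens are key:value pairs such as len:1500 or cycle:1.
    kv = dict(tok.split(":", 1) for tok in fields[1:])
    cycle = kv["cycle"] == "1" if "cycle" in kv else None
    return ContigHeader(uid, int(kv["len"]), cycle)


nucl = parse_header(">42 len:1500 cycle:1")  # nucleotide-style header
prot = parse_header(">7 len:300")            # header without cycle field
print(nucl, prot)
```

Returning None for a missing cycle field keeps "not reported" distinct from "reported as not cyclic".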

Bugs

fixed problems in the first iteration of the protein assembler
fixed problems with start and stop codons occurring in the transition from protein alignments to nucleotide alignments and alignment offset calculation
split the file existence check in workflows into individual checks to avoid repeated linking problems
fixed bug in the reverse complement calculation for N's in nucleotide sequences
fixed several problems with long sequences in the kmermatcher phase
fixed broken compilation without zlib
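
For context on the reverse complement fix, the sketch below shows one common convention for ambiguous bases: N is its own complement, so it stays N and only changes position. This is an illustration in Python and not the Plass implementation; the function name reverse_complement is an assumption.

```python
# Translation table: each base maps to its complement; N maps to itself.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq: str) -> str:
    """Complement every base, then reverse the sequence."""
    return seq.translate(COMPLEMENT)[::-1]


print(reverse_complement("ACGTN"))
```

Applying the function twice returns the original sequence, which is a quick sanity check for any reverse complement routine.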
