Changes:
- Even if --exclude_relatives is not specified (default), the top specific variant combination will now be reported.
- During CDR3 selection for a UMI, the sequence similarity algorithm will dynamically switch from Levenshtein distance to Hamming distance if the number of distinct CDR3 sequences exceeds 5. In the testing data, this reduced the total runtime from 4.5 hours to 1 hour.
- For CDR3 nucleotide sequences constructed from a consensus sequence, if there are ambiguous codons (e.g. XXG), the sequence will not be translated.
- The --verbosity argument now works as intended.
- Deprecated functions have been wiped from the codebase.
- The console output for filter_queries and recover_cdr3s has been clarified and neatened.
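
The switch between distance metrics described in the changes above could look roughly like the sketch below. It is illustrative only and is not the project's actual code: the names `MAX_DISTINCT_FOR_LEVENSHTEIN`, `levenshtein`, `hamming` and `choose_distance` are hypothetical, and the handling of unequal-length sequences under Hamming distance (each extra position counts as a mismatch) is an assumption, since the notes do not say how that case is treated. Only the threshold of 5 distinct CDR3 sequences comes from the notes.

```python
from typing import Callable, Sequence

# Hypothetical constant name; the value 5 is the threshold stated in the notes.
MAX_DISTINCT_FOR_LEVENSHTEIN = 5


def levenshtein(a: str, b: str) -> int:
    """Edit distance (insertions, deletions, substitutions), row-by-row DP."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string as the DP row to save memory
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def hamming(a: str, b: str) -> int:
    """Number of mismatched positions.

    Assumption: any difference in length is added as extra mismatches,
    because classic Hamming distance is undefined for unequal lengths.
    """
    mismatches = sum(x != y for x, y in zip(a, b))
    return mismatches + abs(len(a) - len(b))


def choose_distance(cdr3s: Sequence[str]) -> Callable[[str, str], int]:
    """Pick Hamming when more than 5 distinct CDR3s are present, else Levenshtein."""
    if len(set(cdr3s)) > MAX_DISTINCT_FOR_LEVENSHTEIN:
        return hamming
    return levenshtein
```

Hamming distance costs one pass over a pair of sequences, while the dynamic-programming Levenshtein computation scales with the product of their lengths. That difference is a plausible source of the runtime reduction, though the notes do not state the cause.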
Outstanding work:
- For alignment.py, the transformation of reads from the single-end FASTQ to barcode and biological FASTQs needs to be rewritten in Python.