The format is based on Keep a Changelog and this project adheres to Semantic Versioning.
Initial release of nf-core/kmermaid, created with the nf-core template.
- Add option to use Dayhoff encoding for sourmash.
- Add `bam2fasta` process to kmermaid pipeline and flags involved.
- Add `extract_coding` and `peptide_bloom_filter` process and flags involved.
- Add `track_abundance` feature to keep track of hashed kmer frequency.
- Add social preview image
- Add `fastp` process for trimming reads
- Add option to use compressed `.tgz` file containing output from 10X Genomics' `cellranger count` outputs, including `possorted_genome_bam.bam` and `barcodes.tsv` files
- Add samtools_fastq_unaligned and samtools_fastq_aligned process for converting bam to per cell barcode fastq
- Add version printing for sencha, bam2fasta, and sourmash in Dockerfile, update versions in environment.yml
- For processes translate, sourmash compute add cpus=1 as they are only serial (#107)
- Add `sourmash sig merge` for aligned/unaligned signatures from bam files, and add `--skip_sig_merge` option to turn it off
- Add `--protein_fastas` option for creating sketches of already-translated protein sequences
- Add `--skip_compare` option to skip `sourmash_compare_sketches` process
- Add merging of aligned/unaligned parts of single-cell data (#117)
- Add renamed package dependency orpheum (used to be known as sencha)
- Increase CPUs in `high_memory_long` profile from 1 to 10
- Rename splitkmer to `split_kmer`
- Remove `one_signature_per_record` flag and add bam2fasta count_umis_percell and make_fastqs_percell instead of bam2fasta sharding method
- Use ripgrep instead of bam2fasta to make per-cell fastq, which will hopefully make resuming long-running pipelines on bams much faster
- Make sure `samtools_fastq_aligned` outputs ALL aligned reads, regardless of mapping quality or primary alignment status
- Add `--skip_compute` option to skip `sourmash_compute_sketch_*`
- Used `.combine()` instead of `each` to do cartesian product of all possible molecules, ksizes, and sketch values
- Do `sourmash compute` on all input ksizes, and all peptide molecule types, at once to save disk reading/writing efforts
- Updated to sencha=1.0.3 to fix a bug causing memory errors, possibly with the numpy array on unique filenames (PR #96 on orpheum)
- Add option to write non-coding nucleotide sequences fasta files while doing sencha translate
- Don't save translate csvs and jsons by default, add separate `--save_translate_json` and `--save_translate_csv`
- Updated `sencha translate` default parameters to be `--ksize 8 --jaccard-threshold 0.05` because those were the most successful
- Update renaming of `khtools` commands to `sencha`
- Fix the use of `skip_multiqc` flag condition with `if` and not `when`
- Removed ability to specify multiple `--scaled` or `--num-hashes` values to enable merging of signatures