Releases: vanheeringen-lab/seq2science

Release v0.7.1

10 Feb 10:48

Automated preprocessing of Next-Generation Sequencing data, including full (sc)ATAC-seq, ChIP-seq, and (sc)RNA-seq workflows.


Fixed

  • issue with broad peaks and upset plots

Release v0.7.0

02 Feb 13:44

The biggest change is that we revert to snakemake 5.18, since later snakemake versions cause instability.


Added

  • upset plot as QC for peak calling. It should give a first impression of how peaks are distributed between samples/conditions.


Changed

  • downgraded the snakemake backend as snakemake 6+ is unstable for us.


Fixed

  • corrupt environment creation with libreadline for edgeR normalization.
  • subsampling crashing because of bad syntax.

Release v0.6.1

17 Dec 12:50


Fixed

  • corrupt environment creation with libcrypto in combination with strandedness rule

Release v0.6.0

11 Dec 11:10

Release 0.6.0 is a mix of bug fixes, small changes, and bigger stuff. Most importantly:

  • added a deseq2science command to do differential expression analysis on user-supplied tables with seq2science settings
  • ADT quantification is now possible for single-cell RNA-seq
  • snakemake library updated, giving seq2science a new-ish look :)

The full changes are listed below:


Added

  • added generic stats about the assembly to the MultiQC report, which might help pinpoint problems with the assembly used.
  • added the slop parameter to the config.yaml of atac-seq and chip-seq workflows, so that it is more visible.
  • added support for seurat object export and merging for kb workflow.
  • added support for CITE-seq-count for ADT quantification
  • added the option to downsample to a specific number of reads.
  • new deseq2science command


Changed

  • Seq2science now makes a separate blacklist file per blacklist option (encode & mitochondria), so that e.g. RNA-seq and ATAC-seq workflows can be run in parallel and don't conflict on the blacklist.
  • error messages no longer show the full traceback, which (hopefully) makes it clearer what is going wrong.
  • The effective genome size is now calculated per read length instead of per sample. When multiple samples have similar read lengths, this reduces the computational burden considerably.
  • samtools environment updated to version 1.14
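
The per-read-length calculation described above amounts to caching on read length rather than on sample name. A minimal sketch of that idea, assuming a hypothetical `effective_genome_size` function that stands in for the real (expensive) computation; the function names and the toy k-mer count are illustrative and are not taken from seq2science itself:

```python
from functools import lru_cache

# Counts how often the expensive step actually runs (illustration only).
calls = {"n": 0}


@lru_cache(maxsize=None)
def effective_genome_size(genome: str, read_length: int) -> int:
    """Hypothetical stand-in for an expensive computation.

    As a toy placeholder it counts the distinct k-mers of length
    read_length in the genome string.
    """
    calls["n"] += 1
    last_start = len(genome) - read_length + 1
    return len({genome[i:i + read_length] for i in range(last_start)})


def sizes_per_sample(genome: str, sample_read_lengths: dict) -> dict:
    """Look up the size for every sample; the cache key is the read length."""
    return {
        sample: effective_genome_size(genome, read_length)
        for sample, read_length in sample_read_lengths.items()
    }
```

With samples `{"a": 4, "b": 4, "c": 5}` the expensive function would run twice (once per distinct read length) instead of three times.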


Fixed

  • config option slop is now passed along to each rule using it
  • edge-case where local samples are in the cache, but not present in the fastq_dir
  • bug with differential peak/gene expression across multiple assemblies
  • bug with kb ref not creating index for non-velocity analysis
  • bug with count import in read_kb_counts.R for technical replicates and meta-data handling
  • deseq2 ordering in multiqc report
  • issue with slop not being used for the final count table
  • bug with onehot peaks not reporting the sample names as columns

Release v0.5.6

19 Oct 15:11

Automated preprocessing of Next-Generation Sequencing data, including full (sc)ATAC-seq, ChIP-seq, and RNA-seq workflows.


Added

  • MA plot, volcano plot, and PCA plots added to QC report for deseq2 related workflows


Changed

  • updated salmon & tximeta versions
  • colors for DESeq2 distance plots "fixed"
  • updated bwa-mem2 version and reduced the expected memory usage of bwa-mem2 to 40GB
  • seq2science now uses snakemake-minimal


Fixed

  • stranded bigwigs are no longer inverted (forward containing reverse reads and vice-versa).
  • fix in rename_sample preventing the inversion of R1 and R2 FASTQs.
  • bug with parsing cli for explanations
  • show/hide buttons for treps are now actually created in the multiqc report
  • fixes in deseq2/utils.R
    • the samples.tsv will now work with only 2 columns
    • the samples.tsv column names will be stripped of excess whitespace, similar to the config.
  • ATAC-seq pipeline removing the final bams, keeping the unsorted one
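
The samples.tsv handling mentioned in the deseq2/utils.R fixes above (column names stripped of excess whitespace, files with only 2 columns accepted) can be illustrated as follows. The actual fix lives in R; this is a Python sketch of the same idea, and the `read_samples` helper and the example column names are assumptions made for illustration:

```python
import csv
import io


def read_samples(handle) -> list:
    """Read a tab-separated samples table.

    Hypothetical helper: strips excess whitespace from the column names
    and works for any number of columns, including only two.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = [name.strip() for name in next(reader)]
    return [dict(zip(header, row)) for row in reader if row]


# A two-column table whose second column name carries stray spaces.
example = io.StringIO("sample\t assembly \nS1\thg38\nS2\thg38\n")
rows = read_samples(example)
```

Here `rows[0]` is keyed by `"assembly"` rather than `" assembly "`, so later lookups by column name do not depend on how the header was typed.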

Release v0.5.5

01 Sep 11:38


Changed

  • duplicate read marking now happens before sieving and does not remove any reads. Removal of duplicate reads is now controlled with the remove_dups flag in the config.
  • changed option heatmap_deeptools_options to deeptools_heatmap_options
  • Updated sra tools and parallel fastq-dump versions
  • Updated genomepy version
  • Gene annotations are no longer gzipped and ungzipped. This should reduce rerunning.


Fixed

  • rerunning being triggered too easily by input order
  • issue with qc plots and broad peaks
  • magic with prefetch not having the same output location on all machines
  • issue with explain having duplicate lines

Release v0.5.4

07 Jul 08:46


Added

  • added support for kb-python kite workflow


Changed

  • kb count output validation
  • optional barcodefile argument for scRNA-seq workflow
  • MultiQC updated to newest version
  • updated kb-python version

Release v0.5.3

03 Jun 13:12


Added

  • DESeq2 blind sample distance & correlation cluster heatmaps for RNA-, ATAC-, and ChIP-seq counts
    • find them annotated in the MultiQC when running >1 sample


Changed

  • "biological_replicate" and "technical_replicate" renamed to *"_replicates" (matches between samples.tsv & config.yaml)
  • fixed bug with seq2science making a {output.allsizes} file
  • Changed explain to use 'passive style'
  • Genrich peak calling defaults
    • Doesn't remove PCR duplicates anymore (best to do with markduplicates)
    • Changed extsize to 200 to be similar to macs settings
    • Turned off tn5 shift, since that is done by seq2science


Fixed

  • depend less on local genomes (only when data is unavailable online)
  • trackhub explanation was missing, added
  • bug with broad peaks and qc that could not be made

Release v0.5.2

11 May 09:00


Added

  • added rule for scRNA post-processing R Markdown for plate/droplet based scRNA protocols (experimental)
  • added explanation for kb_seurat_pp rule
  • heatmap of N random peaks to the multiqc report in the end


Fixed

  • removed a warning about genome.fa.sizes already existing because it had been downloaded beforehand (it's removed in between)
  • genomepy's provider status checking not being used.

Release v0.5.1

01 Apr 17:46


Added

  • added CLI functionality to the deseq2.R script (try it with Rscript /path/to/deseq2.R --help!)
  • --force flag to seq2science init to automatically overwrite existing samples.tsv and config.yaml
  • local fastqs with Illumina's '_100' are now recognized
  • added the workflow explanation to the multiqc report


Changed

  • config checks: all keys converted to lower case & duplicate keys throw an exception
  • MultiQC updated to v1.10
  • Link to seq2science log instead of snakemake log in final message
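
The config checks listed above (keys converted to lower case, duplicate keys raising an exception) could look roughly like this. This is a sketch under assumptions, not the seq2science implementation: the `normalize_config_keys` name is hypothetical, only top-level keys are handled, and a "duplicate" is taken to mean two keys that collide once lower-cased:

```python
def normalize_config_keys(config: dict) -> dict:
    """Lower-case all top-level keys and refuse ambiguous configs.

    Hypothetical helper: raises ValueError when two keys become
    identical after lower-casing (for example "Slop" and "slop").
    """
    normalized = {}
    for key, value in config.items():
        lowered = str(key).lower()
        if lowered in normalized:
            raise ValueError(f"duplicate config key after lower-casing: {lowered!r}")
        normalized[lowered] = value
    return normalized
```

Raising instead of silently keeping one of the two values makes a typo in the config visible before the workflow starts.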


Fixed

  • Issue when filtering a combination of single-end and paired-end reads on template length
  • explain functionality testing
  • scATAC can properly use SE fastqs
  • scRNA can use fqexts other than R1/R2
  • fastq renaming works again
  • added missing schemas to extended docs


  • Bug with edgeR.upperquartile normalization. It now sets everything to NaN, so the pipeline finishes successfully.