Home
seQuoia is a framework for running diverse omics workflows, primarily focused on QC, processing and analysis of bacterial genomic sequencing sets. It was initially developed for QC of genomic sequencing data and referred to as seQc.
Six workflows have been developed so far, and the framework makes it easy to build new ones by reusing the available modules.
The six ready-to-use workflows are:
- Basic - quick sequencing and sample quality assessment.
- LSARP Genomics - genomics assessment pipeline developed for ResistanceDB. A multi-institute effort led by collaborators at the University of Calgary and Calgary Lab Services.
- Hybrid Assembly - pipeline to create hybrid genome assemblies using Illumina short-read + ONT long-read sequencing data.
- Metagenomic - pipeline for processing metagenomic / meta-transcriptomic data.
- Pilon - pipeline for Pilon- and bwa-based variant calling against a reference genome.
- GAEMR - pipeline for assessing assembly using GAEMR.
Apart from the Hybrid Assembly workflow, all workflows support only short-read Illumina data.
With the current list of available modules, seQuoia requires eight separate conda environments. Admittedly, seQuoia is not the easiest suite to install, and the current version will not be actively supported going forward. The primary intent of the software is to share our workflows with the scientific community. However, once installation succeeds on a server system, incorporating the OOP framework through pip installation and creating custom workflows should be straightforward.
Over the last couple of years, similar frameworks/pipelines/workflows for processing bacterial genomics data have been created by other groups:
- Bactopia: https://github.com/bactopia/bactopia
- ASA3P: https://github.com/oschwengers/asap
- Nullarbor: https://github.com/tseemann/nullarbor
- TORMES: https://github.com/nmquijada/tormes
Importantly, seQuoia is not built to be run on cloud-based platforms, while many of these fantastic alternative frameworks are. So if running workflows in cloud environments is essential, or if one of them simply fits your needs better, we suggest you check out these great alternative frameworks for bacterial genomics.
That said, seQuoia has certain features which we think provide some nice perks:
- OOP framework - This allows users to create their own simple workflows in Python 3 by creating, processing, and analyzing FASTQ, Alignment (bam/sam), and Assembly objects. Check out this page showcasing the framework in greater detail!
- Layers of Modularity - There are three layers to seQuoia: (1) the OOP framework (most granular), (2) tasks and modules (comprehensive, largely standalone steps), which in turn make up (3) workflows.
- Multiple workflows - Some of the alternative pipelines feature a single configurable workflow; seQuoia, however, offers six different workflows for various needs, from simple QC to hybrid assembly to a full genomics pipeline.
- Easy, flexible, and precise workflow customization - Workflow parameters can be configured per run (batch of samples) or for each sample within a run using two different types of configuration files.
- No Docker/Singularity - Docker containers have tons of advantages and vastly simplify running pipelines on cloud infrastructure. However, on some systems, they are still considered a security concern.
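The per-run versus per-sample customization mentioned in the list above can be pictured as layered parameter lookup. The following is only a minimal sketch of that idea in Python 3 and does not reflect seQuoia's actual configuration file format or API: the function `resolve_parameters`, the `WORKFLOW_DEFAULTS` dictionary, and all parameter and sample names are hypothetical and chosen purely for illustration.

```python
from collections import ChainMap

# Hypothetical workflow defaults; these are NOT seQuoia's real parameter names.
WORKFLOW_DEFAULTS = {"trim_reads": True, "min_read_length": 50, "threads": 1}


def resolve_parameters(sample, run_config, sample_config):
    """Return the effective parameters for one sample.

    Lookup precedence (highest first): per-sample overrides,
    run-level settings, workflow defaults.
    """
    per_sample = sample_config.get(sample, {})
    return dict(ChainMap(per_sample, run_config, WORKFLOW_DEFAULTS))


# Run-level settings apply to every sample in the batch (hypothetical values).
run_config = {"threads": 8}
# Per-sample settings override the run-level ones for that sample only.
sample_config = {"sampleB": {"min_read_length": 100}}

for sample in ("sampleA", "sampleB"):
    print(sample, resolve_parameters(sample, run_config, sample_config))
```

In this sketch, `sampleA` inherits the run-level settings unchanged, while `sampleB` keeps them but applies its own `min_read_length`. How seQuoia itself reads and merges its two configuration file types is described in the project's own documentation.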