RajLabMSSM/snakeSV

snakeSV: Flexible framework for large-scale SV discovery

snakeSV is an integrated Snakemake pipeline for complete structural variant (SV) analysis. It includes pre- and post-processing steps designed for large-scale studies. The pipeline takes as input one BAM file per sample, a reference genome file (.FASTA), and a configuration file in YAML format. Optionally, users can also supply custom annotation files in BED format for SV interpretation, and VCF files with structural variants to be genotyped in addition to the discovery set.

Pipeline Schematic


Getting Started

The easiest way to install snakeSV is through Bioconda.

Install snakeSV in a separate environment (named "snakesv_env") with the commands:

conda create -n snakesv_env -c conda-forge -c bioconda snakesv
conda activate snakesv_env # Command to activate the environment. To deactivate use "conda deactivate"
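
Before going further, it can help to confirm that the activated environment exposes the expected commands. The snippet below is a minimal sketch, assuming the Bioconda package places the `snakeSV` launcher (the command used in the test run) on your PATH; `check_tool` is a hypothetical helper written for this example and is not part of snakeSV.

```shell
# Hypothetical helper: report whether a command is available on PATH.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found: $1"
    else
        echo "missing: $1"
        return 1
    fi
}

# Check the tools this README relies on; "|| true" keeps the script going
# so that both results are printed even if one tool is missing.
check_tool conda || true
check_tool snakeSV || true
```

If `snakeSV` is reported as missing, the environment is probably not activated in the current shell.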

After installing, you can check that everything works by running the pipeline on the included example data set.

# First create a folder to run the test
mkdir snakesv_test
cd snakesv_test

# Run snakeSV on the example data
snakeSV --test_run

You can also try the installation and small test runs in Google Cloud Shell.


For detailed configuration and input instructions, see the wiki pages. We also provide two case studies that illustrate the use of customized annotations and genotyping with a panel of SVs discovered from long reads.

Reference

Vialle, R.A., Raj, T. (2022). snakeSV: Flexible Framework for Large-Scale SV Discovery. In: Proukakis, C. (eds) Genomic Structural Variants in Nervous System Disorders. Neuromethods, vol 182. Humana, New York, NY.