scRIApipe

single-cell RNA Isoform Analysis Pipeline

Authors: Vivek Bhardwaj (@vivekbhr) and Thijs Makaske

DAG (see workflow_dag.png)

Quick Installation

From Conda:

conda create -n scria -c vivekbhr -c bioconda -c conda-forge scriapipe

From GitHub:

conda create -n scria python pip
conda activate scria && pip install snakemake
pip install git+https://github.com/vivekbhr/scRIApipe.git@master

Configure paths

The semi-permanent parameters of the workflow can be configured with a YAML file (config.yaml). Download here

After the conda install, move to the folder where you want to run the workflow and prepare the config.yaml:

cd <your output dir>
vim config.yaml ## now modify the paths as per your requirements

The workflow needs

path to a GTF file (preferably pre-filter the GTF to remove pseudogene and low confidence annotations)
UCSC ID of the genome

cDNA fasta and GTF can be downloaded here. The UCSC ID is, for example, "mm10" (mouse) or "hg38" (human).
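The GTF pre-filtering suggested above could be sketched roughly as follows. This is only an illustration, not part of scRIApipe: the function name filter_gtf is hypothetical, and it assumes that biotypes are stored in gene_type/transcript_type (GENCODE style) or gene_biotype/transcript_biotype (Ensembl style) attributes. What counts as a low-confidence annotation depends on the annotation source, so that part is not handled here.

```shell
# Hypothetical helper: drop pseudogene records from a GTF read on stdin.
# Assumption: biotypes appear as gene_type / gene_biotype (or the
# transcript_* equivalents) in the attribute column.
# Header lines starting with '#' do not match and pass through unchanged.
filter_gtf() {
    grep -v -E '(gene|transcript)_(bio)?type "[^"]*pseudogene[^"]*"'
}
```

Possible usage: filter_gtf < annotation.gtf > annotation.filtered.gtf, then point config.yaml at the filtered file.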

Additional parameters for the workflow can be accessed via --help

scRIA --help

Test-drive the workflow

## inside the output dir
scRIA -i <fastq_folder> -o . -c <your>config.yaml -j <jobs> -s ' -np'

4. Submission parameters

Running on HPC Cluster

In the workflow command above, -j is the number of parallel jobs you want to run, and -cl means submit to the cluster (the default is to run locally). Therefore, if you wish to run the workflow on a cluster, simply run it with the -cl option on the submission node.
Cluster configuration, such as memory and the cluster submission command, is placed in cluster_config.yaml and can be modified to suit the user's internal infrastructure.

To find the currently active cluster-config file, run the following:

conda activate scria
which scRIA

This shows the path to the scRIA binary, which usually looks like:

/path/to/miniconda3/envs/scria/bin/scRIA

You can find the cluster-config here:

/path/to/miniconda3/envs/scria/lib/python<version>/site-packages/scriapipe/cluster_config.yaml
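The lookup above can be wrapped in a small helper so the Python version does not have to be filled in by hand. This is a sketch under the assumption that the installation follows the layout shown above (the executable in <env>/bin and the package under <env>/lib/python<version>/site-packages/scriapipe); the function name find_cluster_config is hypothetical.

```shell
# Hypothetical helper: print the cluster_config.yaml that belongs to a
# given scRIA executable.
# $1 is the path to the executable, e.g. the output of `which scRIA`.
# Two dirname calls go from <env>/bin/scRIA up to <env>.
find_cluster_config() {
    find "$(dirname "$(dirname "$1")")/lib" \
        -path '*/site-packages/scriapipe/cluster_config.yaml' -type f
}
```

Possible usage, with the scria environment active: find_cluster_config "$(which scRIA)"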

Dry-run

To test what the workflow would do without running anything, use the option -s ' -np'

Memory errors

Index building needs >40G of memory. If the workflow fails and logs/velocity_index.err says something like std::bad_alloc, increase the memory in the file cluster_config.yaml in the scRIA folder.
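A quick way to check whether a failed run hit this problem is to search the index log for the allocation error. The helper below is a sketch only: the function name has_alloc_failure is hypothetical, and the default path is the log file named above, relative to the directory the workflow was run in.

```shell
# Hypothetical helper: exit 0 if the given log reports a C++ allocation
# failure, exit 1 otherwise.
# $1 is the log file; defaults to the index log mentioned in the text.
has_alloc_failure() {
    grep -q -E 'std::bad_?alloc' "${1:-logs/velocity_index.err}"
}
```

Possible usage: has_alloc_failure && echo "increase the memory in cluster_config.yaml"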

Other technical Notes

After running the pipeline, log files are stored in the /log/ directory and the workflow top-level log is in the scRIA.log file.
Currently the -o option is not very flexible, and the pipeline works only when it is executed in the output directory.
Use the -t argument to specify a local directory for temp files. The default is the /tmp/ folder, which might have little space on a cluster (unless tmpspace is specified in cluster_config.yaml).
Manual interruption of the workflow: a simple Ctrl+C is enough to cancel/interrupt the workflow. However, in some cases re-running the workflow after an interruption might fail with the message "Locked working directory". In that case, run the workflow with -s ' --unlock' once.

Output

Major outputs of the workflow are:

Transcript compatibility counts (TCC) in <outdir>/transcripts_quant/<sample>/eq_counts/tcc.mtx
Gene counts in <outdir>/transcripts_quant/<sample>/gene_counts/gene.mtx
RNA velocity output in folder <outdir>/velocity_output (normal/filtered loom files, velocity plots)
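To sanity-check the count matrices listed above without loading them into an analysis tool, the matrix dimensions can be read from the file header. The sketch below assumes the .mtx files are in MatrixMarket coordinate format, where the first line that is not a '%' comment holds the number of rows, columns and non-zero entries; the function name mtx_dims is hypothetical, and which dimension corresponds to cells is not assumed here.

```shell
# Hypothetical helper: print "rows cols nonzeros" for a MatrixMarket
# coordinate file given as $1.
# Lines starting with '%' (banner and comments) and empty lines are skipped;
# the first remaining line is the size line.
mtx_dims() {
    awk '!/^%/ && NF { print $1, $2, $3; exit }' "$1"
}
```

Possible usage: mtx_dims <outdir>/transcripts_quant/<sample>/gene_counts/gene.mtx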
