CNVand is a Snakemake workflow for CNV analysis, tailored for preparing data used by the CNVizard CNV visualization tool. Given a set of BAM and VCF files, it uses CNVkit and AnnotSV to analyze and annotate copy number variations.
To configure this pipeline, modify the config under config/config.yaml as needed. Detailed explanations for each setting are provided within the file.
Add samples to the pipeline by completing config/samplesheet.tsv. Each sample should be associated with a path to the corresponding BAM and VCF file.
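Before starting a run, it can help to check that every sample in the samplesheet has both paths filled in and that the files exist. The following is a minimal sketch, not part of CNVand; the column names `sample`, `bam` and `vcf` are assumptions for illustration, so use the names documented in config/README.md.

```python
import csv
from pathlib import Path

# Assumed column names (hypothetical); adjust to the actual samplesheet header.
REQUIRED_COLUMNS = ("sample", "bam", "vcf")


def check_samplesheet(lines, check_files=True):
    """Return a list of problems found in a tab-separated samplesheet.

    `lines` is any iterable of text lines, e.g. an open file handle.
    """
    reader = csv.DictReader(lines, delimiter="\t")
    header = reader.fieldnames or []
    missing = [col for col in REQUIRED_COLUMNS if col not in header]
    if missing:
        return [f"missing column(s): {', '.join(missing)}"]

    problems = []
    seen = set()
    # Data rows start on line 2 because line 1 is the header.
    for line_no, row in enumerate(reader, start=2):
        name = (row["sample"] or "").strip()
        if not name:
            problems.append(f"line {line_no}: empty sample name")
        elif name in seen:
            problems.append(f"line {line_no}: duplicate sample '{name}'")
        seen.add(name)
        for col in ("bam", "vcf"):
            path = (row[col] or "").strip()
            if not path:
                problems.append(f"line {line_no}: no {col} path for '{name}'")
            elif check_files and not Path(path).is_file():
                problems.append(f"line {line_no}: {col} file not found: {path}")
    return problems
```

To use it, open config/samplesheet.tsv with `open(..., newline="")`, pass the handle to `check_samplesheet`, and print the returned messages; an empty list means no problems were found.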
For detailed instructions on how to configure CNVand, see config/README.md.
To use CNVand, some external reference files are needed alongside your sample data.
For cnvkit_fix to work, you need to specify a reference genome in the config file. Take care to use the same reference file for your entire workflow!
For AnnotSV to work, the annotation files must be downloaded separately and be referenced in the config file under the respective key. For human annotations, this can be done here. In case this link is not working, check the original AnnotSV repository for updates on how to obtain the annotations.
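Because the reference genome and the AnnotSV annotations are obtained separately, a missing or mistyped path is an easy mistake to make. The sketch below only checks that each configured path exists. The labels and paths are placeholders, and the actual config keys are the ones described in config/config.yaml.

```python
from pathlib import Path


def missing_references(paths):
    """Return the labels whose path is empty or does not exist on disk."""
    return [
        label
        for label, path in paths.items()
        if not path or not Path(path).exists()
    ]


# Placeholder labels and paths (assumptions); copy the real values from your config.
references = {
    "reference genome": "/path/to/reference.fa",
    "AnnotSV annotations": "/path/to/annotsv_annotations",
}

for label in missing_references(references):
    print(f"not found: {label}")
```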
CNVand can be executed using mamba environments or a pre-built Docker container.
For a one-click installation, snakedeploy can be used. For further information, see the entry for CNVand in the Snakemake Workflow Catalog.
This workflow can also be set up manually with the provided environment file. Install Snakemake and its dependencies using the command:
mamba env create -f environment.yml
Then activate the newly created environment with:
mamba activate cnvand
Now configure the pipeline and download the needed annotation and reference files. When everything is set up, execute the pipeline with:
snakemake --cores all --use-conda
Generate a comprehensive execution report by running:
snakemake --report report.zip
CNVand can also be used inside a Docker container. To do so, first pull the Docker image with:
docker pull ghcr.io/ihggm-aachen/cnvand:latest
Then run the container with the bind mounts needed in your setup:
docker run -it -v /path/to/your/data:/data ghcr.io/ihggm-aachen/cnvand:latest /bin/bash
This command opens an interactive shell inside the Docker container. Once inside the container, you are placed in the /cnvand directory. From there, run the pipeline once you have set an appropriate configuration:
snakemake --cores all --use-conda
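A typical setup needs more than one bind mount, for example one for sample data and one for reference files. As a rough sketch, the helper below assembles the `docker run` command shown earlier with one `-v` option per mount. The function name and any container paths other than /data are hypothetical.

```python
import shlex

# Image name taken from the docker pull command in this README.
IMAGE = "ghcr.io/ihggm-aachen/cnvand:latest"


def docker_run_command(mounts, image=IMAGE, shell="/bin/bash"):
    """Build an interactive `docker run` command string.

    `mounts` maps host directories to container directories; each entry
    becomes one `-v host:container` bind mount.
    """
    args = ["docker", "run", "-it"]
    for host_dir, container_dir in mounts.items():
        args += ["-v", f"{host_dir}:{container_dir}"]
    args += [image, shell]
    # shlex.join quotes arguments that contain spaces or shell metacharacters.
    return shlex.join(args)


print(docker_run_command({"/path/to/your/data": "/data"}))
```

With the single mount above, this prints the same command as the one given in the Docker instructions. Additional entries in the dictionary add further `-v` options, and paths inside the container must match what you reference in the config.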
We welcome contributions to improve CNVand. Please see our CONTRIBUTING.md for details on how to get started.
We are committed to fostering an open and welcoming environment. Please see our CODE_OF_CONDUCT.md for our community guidelines.
Detailed documentation for the workflow can be found in workflow/documentation.md.
To ensure the pipeline runs correctly, we have set up both unit and integration tests. Unit tests are generated from successful workflow runs, and integration tests are configured to run the entire workflow with test data.
The integration test can be run using the data and config provided. Remember to download the correct reference and annotations yourself (GRCh38 in the case of the bundled NIST data) and adjust your local paths as necessary!
Run the unit tests with:
pytest -v .tests/unit
This will check for the correct CNVand output per rule.
This project is licensed under the MIT License. See the LICENSE file for details.