Variant calling pipeline for genomic data analysis
The pipeline requires the following software:
- Python3 - version 3.4.1
- Trimmomatic - version 0.36
- Bowtie2 - version 2.2.9
- Picard tools - version 2.6.0
- GATK - version 3.4
Reference genomes can be downloaded from Illumina iGenomes.
Use the following commands to download and prepare a test dataset from the NIST sample NA12878:
wget ftp://ftp-trace.ncbi.nih.gov/giab/ftp/data/NA12878/Garvan_NA12878_HG001_HiSeq_Exome/NIST7035_TAAGGCGA_L001_R1_001.fastq.gz
wget ftp://ftp-trace.ncbi.nih.gov/giab/ftp/data/NA12878/Garvan_NA12878_HG001_HiSeq_Exome/NIST7035_TAAGGCGA_L001_R2_001.fastq.gz
gunzip NIST7035_TAAGGCGA_L001_R1_001.fastq.gz
gunzip NIST7035_TAAGGCGA_L001_R2_001.fastq.gz
head -100000 NIST7035_TAAGGCGA_L001_R1_001.fastq > test_r1.fastq
head -100000 NIST7035_TAAGGCGA_L001_R2_001.fastq > test_r2.fastq
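The `head -100000` commands above keep the first 100000 lines of each file, which is 25,000 reads at four lines per FASTQ record. As an alternative, the same subset can be taken directly from the compressed download without running `gunzip` first. The following is a minimal sketch, not part of the pipeline itself; the function name `head_fastq_gz` is hypothetical, and it assumes well-formed four-line FASTQ records.

```python
import gzip
from itertools import islice

# A FASTQ record spans 4 lines: header, sequence, '+' separator, qualities.
LINES_PER_READ = 4


def head_fastq_gz(src_path, dst_path, n_reads):
    """Copy the first n_reads records of a gzipped FASTQ to a plain-text file.

    Hypothetical helper for illustration. Returns the number of complete
    reads written, which is smaller than n_reads if the input is shorter.
    """
    n_lines = 0
    with gzip.open(src_path, "rt") as src, open(dst_path, "w") as dst:
        # islice stops after n_reads * 4 lines, so the rest of the
        # archive is never decompressed.
        for line in islice(src, n_reads * LINES_PER_READ):
            dst.write(line)
            n_lines += 1
    return n_lines // LINES_PER_READ
```

Under these assumptions, calling `head_fastq_gz("NIST7035_TAAGGCGA_L001_R1_001.fastq.gz", "test_r1.fastq", 25000)` and the same for the R2 file should give the same `test_r1.fastq` and `test_r2.fastq` as the shell commands. Taking the same number of reads from both files keeps the read pairs in sync.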
To access help, use the following command:
python3 ahcg_pipeline.py -h