Workflow for SNV calling on tumor-only targeted NGS data
Currently, there are two preprocessing steps on the BAM files (eventually these could be moved to a 'post-alignment' processing workflow):
- target region filter
- LoFreq indelquality calculation
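
The two preprocessing steps could be scripted roughly as follows. This is only a sketch: it assumes the target region filter is done with `samtools view -L` and that indel qualities are added with `lofreq indelqual --dindel`. The tools and options actually used in this workflow are not stated here, and all file names are hypothetical.

```python
import shlex


def preprocessing_commands(bam, targets_bed, reference_fasta):
    """Return the two preprocessing commands as argument lists (nothing is executed)."""
    stem = bam[:-4] if bam.endswith(".bam") else bam
    on_target = stem + ".ontarget.bam"
    with_indelqual = stem + ".ontarget.indelqual.bam"

    # Assumption: keep only reads overlapping the target regions (BED file).
    target_filter = ["samtools", "view", "-b", "-L", targets_bed, "-o", on_target, bam]

    # Assumption: Dindel-style indel qualities; --dindel requires the reference.
    add_indelqual = [
        "lofreq", "indelqual", "--dindel",
        "-f", reference_fasta,
        "-o", with_indelqual,
        on_target,
    ]
    return [target_filter, add_indelqual]


if __name__ == "__main__":
    # Hypothetical file names, printed for inspection only.
    for cmd in preprocessing_commands("sample.bam", "targets.bed", "ref.fa"):
        print(shlex.join(cmd))
```

Each returned list can be passed to `subprocess.run(cmd, check=True)` once the tool choice has been confirmed.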
The workflow itself includes the following steps:
- LoFreq variant calling
  followed by 'lofreq filter', based on NGS quality metrics, including:
  - minimal coverage: --min-cov 20
  - cap coverage at depth: --max-depth 1000
  - minimal mapping quality: --min-mq 30
  - minimal base quality: --min-bq (30 / 20)
  - minimal alt base quality: --min-alt-bq (30 / 20)
  - significance: --sig 0.01
  - minimal VAF: --af-min 0.05
  - strand bias p-value: --sb-alpha 0.05
  - strand bias for indels: --sb-incl-indels
  - SNV / indel quality thresholds: --snvqual-thresh 77 --indelqual-thresh 61
  - TODO: homopolymer filter for LoFreq: 'HRUN= ' in VCF
- Apply LoFreq Panel of Normals (PON) blacklist
  - removes all artifactual/germline calls found in the PON dataset (TODO: determine what cutoff to use, e.g. filter out all calls present in at least 2 PON cases)
- Variant Annotation
- Variant Discrimination
  - distinguish SNP/somatic
    - optionally: among all somatic calls, distinguish driver/passenger (using MutSig2CV)
  - distinguish functional/non-functional somatic
    - optionally: from all functional somatic SNVs, select only those in significant driver genes, based on the list of significant driver genes from the MutSig2CV output
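
The PON blacklist and variant discrimination steps above can be sketched as follows. This is an illustration under stated assumptions, not the workflow's actual implementation: variants are keyed by (chrom, pos, ref, alt); the cutoff of 2 PON cases is the candidate value from the TODO, not a settled choice; the annotation fields `gene`, `is_somatic` and `is_functional` are hypothetical names; and the driver gene set is assumed to have been extracted from the MutSig2CV output beforehand.

```python
from collections import Counter


def build_pon_blacklist(pon_calls_per_case, min_cases=2):
    """Return variant keys called in at least `min_cases` PON cases.

    pon_calls_per_case: one iterable of (chrom, pos, ref, alt) keys per PON case.
    min_cases=2 is the candidate cutoff from the TODO, not a settled value.
    """
    counts = Counter()
    for calls in pon_calls_per_case:
        counts.update(set(calls))  # count each variant at most once per case
    return {key for key, n in counts.items() if n >= min_cases}


def remove_blacklisted(variants, blacklist):
    """Drop variants whose key is on the PON blacklist."""
    return [v for v in variants if v["key"] not in blacklist]


def select_variants(variants, driver_genes=None):
    """Keep functional somatic variants, optionally restricted to driver genes.

    is_somatic, is_functional and gene are hypothetical annotation fields;
    driver_genes is assumed to come from the MutSig2CV output.
    """
    kept = [v for v in variants if v["is_somatic"] and v["is_functional"]]
    if driver_genes is not None:
        kept = [v for v in kept if v["gene"] in driver_genes]
    return kept


if __name__ == "__main__":
    # Toy data with made-up positions and gene names.
    pon = [
        [("chr1", 100, "A", "G"), ("chr2", 50, "C", "T")],
        [("chr1", 100, "A", "G")],
        [("chr3", 7, "G", "A")],
    ]
    tumor = [
        {"key": ("chr1", 100, "A", "G"), "gene": "GENE_A", "is_somatic": True, "is_functional": True},
        {"key": ("chr7", 55, "T", "G"), "gene": "GENE_B", "is_somatic": True, "is_functional": True},
        {"key": ("chr9", 12, "C", "A"), "gene": "GENE_C", "is_somatic": False, "is_functional": True},
    ]
    blacklist = build_pon_blacklist(pon)
    calls = remove_blacklisted(tumor, blacklist)
    print([v["gene"] for v in select_variants(calls, driver_genes={"GENE_B"})])
```

Counting each variant once per PON case keeps duplicate records within a single case from inflating the count, so the cutoff really means "seen in at least N distinct PON cases".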