Snippy: no output - identification of 0 variants and empty csv file #580

Open
ZoseJapata opened this issue Jan 25, 2024 · 4 comments

@ZoseJapata

Hello,
I am running Snippy 4.6.0, installed via miniconda3 on Linux, and the output reports 0 variants for my samples compared to the reference genome.
The resulting snps.csv is empty.
Does anyone know what to do?
Thank you if you're able to help!!

I've attached the run log and the snps.log file:
--- Run log file:
[11:46:25] This is snippy 4.6.0
[11:46:25] Written by Torsten Seemann
[11:46:25] Obtained from https://github.com/tseemann/snippy
[11:46:25] Detected operating system: linux
[11:46:25] Enabling bundled linux tools.
[11:46:25] Found bwa - /old_Users/[--]/miniconda3/envs/snippy/bin/bwa
[11:46:25] Found bcftools - /old_Users/[--]/miniconda3/envs/snippy/bin/bcftools
[11:46:25] Found samtools - /old_Users/[--]/miniconda3/envs/snippy/bin/samtools
[11:46:25] Found java - /old_Users/[--]/miniconda3/envs/snippy/bin/java
[11:46:25] Found snpEff - /old_Users/[--]/miniconda3/envs/snippy/bin/snpEff
[11:46:25] Found samclip - /old_Users/[--]/miniconda3/envs/snippy/bin/samclip
[11:46:25] Found seqtk - /old_Users/[--]/miniconda3/envs/snippy/bin/seqtk
[11:46:25] Found parallel - /old_Users/[--]/miniconda3/envs/snippy/bin/parallel
[11:46:25] Found freebayes - /old_Users/[--]/miniconda3/envs/snippy/bin/freebayes
[11:46:25] Found freebayes-parallel - /old_Users/[--]/miniconda3/envs/snippy/bin/freebayes-parallel
[11:46:25] Found fasta_generate_regions.py - /old_Users/[--]/miniconda3/envs/snippy/bin/fasta_generate_regions.py
[11:46:25] Found vcfstreamsort - /old_Users/[--]/miniconda3/envs/snippy/bin/vcfstreamsort
[11:46:25] Found vcfuniq - /old_Users/[--]/miniconda3/envs/snippy/bin/vcfuniq
[11:46:25] Found vcffirstheader - /old_Users/[--]/miniconda3/envs/snippy/bin/vcffirstheader
[11:46:25] Found gzip - /usr/bin/gzip
[11:46:25] Found vt - /old_Users/[--]/miniconda3/envs/snippy/bin/vt
[11:46:25] Found snippy-vcf_to_tab - /old_Users/[--]/miniconda3/envs/snippy/bin/snippy-vcf_to_tab
[11:46:25] Found snippy-vcf_report - /old_Users/[--]/miniconda3/envs/snippy/bin/snippy-vcf_report
[11:46:27] Checking version: samtools --version is >= 1.7 - ok, have 1.19
[11:46:29] Checking version: bcftools --version is >= 1.7 - ok, have 1.19
[11:46:30] Checking version: freebayes --version is >= 1.1 - ok, have 1.3.6
[11:46:36] Checking version: snpEff -version is >= 4.3 - ok, have 5.0
[11:46:37] Checking version: bwa is >= 0.7.12 - ok, have 0.7.17
[11:46:37] Using reference: /old_Users/[--]/downgraded_Snippy_version/NC_007795_NCTC8325.gb
[11:46:37] Treating reference as 'genbank' format.
[11:46:37] Will use 8 CPU cores.
[11:46:37] Using read file: /old_Users/[--]/downgraded_Snippy_version/[sample]-IA-M06434-210331_S30_L001_R1_001.fastq
[11:46:37] Using read file: /old_Users/[--]/downgraded_Snippy_version/[sample]-IA-M06434-210331_S30_L001_R2_001.fastq
[11:46:37] Creating folder: snp_test
[11:46:37] Changing working directory: snp_test
[11:46:37] Creating reference folder: reference
[11:46:37] Extracting FASTA and GFF from reference.
[11:46:39] Wrote 1 sequences to ref.fa
[11:46:39] Wrote 2844 features to ref.gff
[11:46:39] Creating reference/snpeff.config
[11:46:39] Freebayes will process 15 chunks of 191231 bp, 8 chunks at a time.
[11:46:39] Using BAM RG (Read Group) ID: snp_test
[11:46:39] Running: samtools faidx reference/ref.fa 2>> snps.log
[11:46:39] Running: bwa index reference/ref.fa 2>> snps.log
[11:46:40] Running: mkdir -p reference/genomes && cp -f reference/ref.fa reference/genomes/ref.fa 2>> snps.log
[11:46:40] Running: ln -sf reference/ref.fa . 2>> snps.log
[11:46:40] Running: ln -sf reference/ref.fa.fai . 2>> snps.log
[11:46:40] Running: mkdir -p reference/ref && gzip -c reference/ref.gff > reference/ref/genes.gff.gz 2>> snps.log
[11:46:40] Running: snpEff build -c reference/snpeff.config -dataDir . -gff3 ref 2>> snps.log
[11:46:44] Running: bwa mem -Y -M -R '@RG\tID:snp_test\tSM:snp_test' -t 8 reference/ref.fa /old_Users/[--]/downgraded_Snippy_version/[sample]-IA-M06434-210331_S30_L001_R1_001.fastq /old_Users/[--]/downgraded_Snippy_version/[sample]-IA-M06434-210331_S30_L001_R2_001.fastq | samclip --max 10 --ref reference/ref.fa.fai | samtools sort -n -l 0 -T /localscratch/583419.1.[-] --threads 3 -m 2000M | samtools fixmate -m --threads 3 - - | samtools sort -l 0 -T /localscratch/583419.1.[-] --threads 3 -m 2000M | samtools markdup -T /localscratch/583419.1.[-] --threads 3 -r -s - - > snps.bam 2>> snps.log
[M::bwa_idx_load_from_disk] read 0 ALT contigs
[samclip] samclip 0.4.0 by Torsten Seemann (@torstenseemann)
[samclip] Loading: reference/ref.fa.fai
[samclip] Found 1 sequences in reference/ref.fa.fai
[M::process] read 289906 sequences (80000551 bp)...
[M::process] read 289176 sequences (80000175 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 132467, 31, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (268, 342, 430)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 754)
[M::mem_pestat] mean and std.dev: (351.22, 126.79)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 916)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (939, 1593, 3844)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 9654)
[M::mem_pestat] mean and std.dev: (1935.81, 1451.15)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 12559)
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_pestat] skip orientation RF
[M::mem_process_seqs] Processed 289906 reads in 36.382 CPU sec, 4.595 real sec
[samclip] Processed 100000 records...
[samclip] Processed 200000 records...
[M::process] read 288790 sequences (80000471 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 131816, 24, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (269, 342, 429)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 749)
[M::mem_pestat] mean and std.dev: (351.39, 125.60)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 909)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (391, 1048, 1415)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 3463)
[M::mem_pestat] mean and std.dev: (862.41, 546.18)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 4487)
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_pestat] skip orientation RF
[M::mem_process_seqs] Processed 289176 reads in 38.864 CPU sec, 5.133 real sec
[samclip] Processed 300000 records...
[samclip] Processed 400000 records...
[samclip] Processed 500000 records...
[M::process] read 53090 sequences (14767427 bp)...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 131644, 21, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (270, 344, 433)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 759)
[M::mem_pestat] mean and std.dev: (354.05, 127.24)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 922)
[M::mem_pestat] analyzing insert size distribution for orientation RF...
[M::mem_pestat] (25, 50, 75) percentile: (403, 1075, 2134)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 5596)
[M::mem_pestat] mean and std.dev: (1531.71, 1386.03)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 7327)
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_pestat] skip orientation RF
[M::mem_process_seqs] Processed 288790 reads in 39.789 CPU sec, 5.230 real sec
[samclip] Processed 600000 records...
[samclip] Processed 700000 records...
[M::mem_pestat] # candidate unique pairs for (FF, FR, RF, RR): (0, 24204, 2, 0)
[M::mem_pestat] skip orientation FF as there are not enough pairs
[M::mem_pestat] analyzing insert size distribution for orientation FR...
[M::mem_pestat] (25, 50, 75) percentile: (272, 345, 433)
[M::mem_pestat] low and high boundaries for computing mean and std.dev: (1, 755)
[M::mem_pestat] mean and std.dev: (354.67, 125.71)
[M::mem_pestat] low and high boundaries for proper pairs: (1, 916)
[M::mem_pestat] skip orientation RF as there are not enough pairs
[M::mem_pestat] skip orientation RR as there are not enough pairs
[M::mem_process_seqs] Processed 53090 reads in 8.279 CPU sec, 1.163 real sec
[samclip] Processed 800000 records...
[samclip] Processed 900000 records...
[main] Version: 0.7.17-r1188
[main] CMD: bwa mem -Y -M -R @RG\tID:snp_test\tSM:snp_test -t 8 reference/ref.fa /old_Users/[--]/downgraded_Snippy_version/[sample]-IA-M06434-210331_S30_L001_R1_001.fastq /old_Users/[--]/downgraded_Snippy_version/[sample]-IA-M06434-210331_S30_L001_R2_001.fastq
[samclip] Total SAM records 922936, removed 46250, allowed 17108, passed 876686
[samclip] Header contained 3 lines
[samclip] Done.

[main] Real time: 18.665 sec; CPU: 124.138 sec
[bam_sort_core] merging from 0 files and 3 in-memory blocks...
[bam_sort_core] merging from 0 files and 3 in-memory blocks...
[11:47:13] Running: samtools index snps.bam 2>> snps.log
[11:47:13] Running: fasta_generate_regions.py reference/ref.fa.fai 191231 > reference/ref.txt 2>> snps.log
[11:47:14] Running: freebayes-parallel reference/ref.txt 8 -p 2 -P 0 -C 2 -F 0.05 --min-coverage 10 --min-repeat-entropy 1.0 -q 13 -m 60 --strict-vcf -f reference/ref.fa snps.bam > snps.raw.vcf 2>> snps.log
[11:47:23] Running: bcftools view --include 'FMT/GT="1/1" && QUAL>=100 && FMT/DP>=10 && (FMT/AO)/(FMT/DP)>=0' snps.raw.vcf | vt normalize -r reference/ref.fa - | bcftools annotate --remove '^INFO/TYPE,^INFO/DP,^INFO/RO,^INFO/AO,^INFO/AB,^FORMAT/GT,^FORMAT/DP,^FORMAT/RO,^FORMAT/AO,^FORMAT/QR,^FORMAT/QA,^FORMAT/GL' > snps.filt.vcf 2>> snps.log
normalize v0.5

options: input VCF file -
[o] output VCF file -
[w] sorting window size 10000
[n] no fail on reference inconsistency for non SNPs false
[q] quiet false
[d] debug false
[r] reference FASTA file reference/ref.fa

[variant_manip.cpp:75 is_ref_consistent] Variant is not consistent: NC_007795:20306-20325 - TTAAAGGTAAAGGTAAAGGT(REF) vs (FASTA)
[normalize.cpp:178 normalize] Normalization not performed due to inconsistent reference sequences. (use -n option to relax this)
[11:47:24] Running: snpEff ann -noLog -noStats -no-downstream -no-upstream -no-utr -c reference/snpeff.config -dataDir . ref snps.filt.vcf > snps.vcf 2>> snps.log
[11:47:26] Running: /old_Users/[--]/miniconda3/envs/snippy/bin/snippy-vcf_to_tab --gff reference/ref.gff --ref reference/ref.fa --vcf snps.vcf > snps.tab 2>> snps.log
[11:47:30] Running: /old_Users/[--]/miniconda3/envs/snippy/bin/snippy-vcf_extract_subs snps.filt.vcf > snps.subs.vcf 2>> snps.log
[11:47:31] Running: bcftools convert -Oz -o snps.vcf.gz snps.vcf 2>> snps.log
[11:47:31] Running: bcftools index -f snps.vcf.gz 2>> snps.log
[11:47:31] Running: bcftools consensus --sample snp_test -f reference/ref.fa -o snps.consensus.fa snps.vcf.gz 2>> snps.log
[11:47:31] Running: bcftools convert -Oz -o snps.subs.vcf.gz snps.subs.vcf 2>> snps.log
[11:47:31] Running: bcftools index -f snps.subs.vcf.gz 2>> snps.log
[11:47:31] Running: bcftools consensus --sample snp_test -f reference/ref.fa -o snps.consensus.subs.fa snps.subs.vcf.gz 2>> snps.log
[11:47:31] Running: rm -f snps.subs.vcf.gz snps.subs.vcf.gz.csi snps.subs.vcf.gz.tbi 2>> snps.log
[11:47:31] Running: samtools view -h -q 60 snps.bam | samtools sort -l 0 -T /localscratch/583419.1.[-] --threads 3 -m 2000M > /localscratch/583419.1.[-]/snippy.41947.Q60.bam 2>> snps.log
[11:47:34] Running: samtools index /localscratch/583419.1.[-]/snippy.41947.Q60.bam 2>> snps.log
[11:47:35] Running: /old_Users/[--]/miniconda3/envs/snippy/bin/snippy-vcf_report --cpus 8 --bam /localscratch/583419.1.[-]/snippy.41947.Q60.bam --ref reference/ref.fa --vcf snps.vcf > snps.report.txt 2>> snps.log
[11:47:35] Running: rm -f /localscratch/583419.1.[-]/snippy.41947.Q60.bam /localscratch/583419.1.[-]/snippy.41947.Q60.bam.bai 2>> snps.log
[11:47:35] Generating reference aligned/masked FASTA relative to reference: snps.aligned.fa
[11:47:39] Marked 1552 heterozygous sites with 'n'
[11:47:39] Creating extra output files: BED GFF CSV TXT HTML
[11:47:39] Identified 0 variants.
[11:47:39] Result folder: snp_test
[11:47:39] Result files:
[11:47:39] * snp_test/snps.aligned.fa
[11:47:39] * snp_test/snps.bam
[11:47:39] * snp_test/snps.bam.bai
[11:47:39] * snp_test/snps.bed
[11:47:39] * snp_test/snps.consensus.fa
[11:47:39] * snp_test/snps.consensus.subs.fa
[11:47:39] * snp_test/snps.csv
[11:47:39] * snp_test/snps.filt.vcf
[11:47:39] * snp_test/snps.gff
[11:47:39] * snp_test/snps.html
[11:47:39] * snp_test/snps.log
[11:47:39] * snp_test/snps.raw.vcf
[11:47:39] * snp_test/snps.report.txt
[11:47:39] * snp_test/snps.subs.vcf
[11:47:39] * snp_test/snps.tab
[11:47:39] * snp_test/snps.txt
[11:47:39] * snp_test/snps.vcf
[11:47:39] * snp_test/snps.vcf.gz
[11:47:39] * snp_test/snps.vcf.gz.csi
[11:47:39] Walltime used: 1 minute, 14 seconds
[11:47:39] Have a suggestion? Tell me at https://github.com/tseemann/snippy/issues
[11:47:39] Done.
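
The run log above prints a `vt normalize` message ("Normalization not performed due to inconsistent reference sequences") shortly before Snippy reports 0 variants. Messages like this are easy to miss in a long log, so here is a minimal sketch for pulling them out of a saved run log. The function name `find_suspect_lines` and the marker list are illustrative assumptions, not part of Snippy; the markers are simply phrases that appear in the logs in this post.

```python
# Hypothetical helper (not part of Snippy): flag log lines that contain
# phrases seen in this post's logs and that may deserve a closer look.
SUSPECT_MARKERS = (
    "not consistent",               # vt: "Variant is not consistent"
    "Normalization not performed",  # vt: normalization was skipped
    "WARNING",                      # e.g. the snpEff warning in snps.log
    "Identified 0 variants",        # Snippy's final summary line
)


def find_suspect_lines(log_text):
    """Return (line_number, line) pairs for lines containing any marker."""
    hits = []
    for number, line in enumerate(log_text.splitlines(), start=1):
        if any(marker in line for marker in SUSPECT_MARKERS):
            hits.append((number, line.strip()))
    return hits
```

For example, calling `find_suspect_lines(open("run.log").read())` on a saved copy of the run log (the file name `run.log` is assumed) would return the flagged lines together with their positions in the log.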

--- snps.log file:

echo snippy 4.6.0

cd /old_Users/[--]/downgraded_Snippy_version

/Users/[--]/miniconda3/envs/snippy/bin/snippy --outdir snp_test -report --ref NC_007795_NCTC8325.gb --R1 [sample]-IA-M06434-210331_S30_L001_R1_001.fastq --R2 [sample]-IA-M06434-210331_S30_L001_R2_001.fastq

samtools faidx reference/ref.fa

bwa index reference/ref.fa

[bwa_index] Pack FASTA... 0.04 sec
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 0.59 seconds elapse.
[bwa_index] Update BWT... 0.01 sec
[bwa_index] Pack forward-only FASTA... 0.01 sec
[bwa_index] Construct SA from BWT and Occ... 0.31 sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa index reference/ref.fa
[main] Real time: 1.057 sec; CPU: 0.972 sec

mkdir -p reference/genomes && cp -f reference/ref.fa reference/genomes/ref.fa

ln -sf reference/ref.fa .

ln -sf reference/ref.fa.fai .

mkdir -p reference/ref && gzip -c reference/ref.gff > reference/ref/genes.gff.gz

snpEff build -c reference/snpeff.config -dataDir . -gff3 ref

WARNING: All frames are zero! This seems rather odd, please check that 'frame' information in your 'genes' file is accurate.

bwa mem -Y -M -R '@RG\tID:snp_test\tSM:snp_test' -t 8 reference/ref.fa /old_Users/[--]/downgraded_Snippy_version/[sample]-IA-M06434-210331_S30_L001_R1_001.fastq /old_Users/[--]/downgraded_Snippy_version/[sample]-IA-M06434-210331_S30_L001_R2_001.fastq | samclip --max 10 --ref reference/ref.fa.fai | samtools sort -n -l 0 -T /localscratch/583419.1.[-] --threads 3 -m 2000M | samtools fixmate -m --threads 3 - - | samtools sort -l 0 -T /localscratch/583419.1.[-] --threads 3 -m 2000M | samtools markdup -T /localscratch/583419.1.[-] --threads 3 -r -s - - > snps.bam

COMMAND: samtools markdup -T /localscratch/583419.1.[-] --threads 3 -r -s - -
READ: 876686
WRITTEN: 854971
EXCLUDED: 49845
EXAMINED: 826841
PAIRED: 805794
SINGLE: 21047
DUPLICATE PAIR: 10578
DUPLICATE SINGLE: 11137
DUPLICATE PAIR OPTICAL: 0
DUPLICATE SINGLE OPTICAL: 0
DUPLICATE NON PRIMARY: 0
DUPLICATE NON PRIMARY OPTICAL: 0
DUPLICATE PRIMARY TOTAL: 21715
DUPLICATE TOTAL: 21715
ESTIMATED_LIBRARY_SIZE: 15211027

samtools index snps.bam

fasta_generate_regions.py reference/ref.fa.fai 191231 > reference/ref.txt

freebayes-parallel reference/ref.txt 8 -p 2 -P 0 -C 2 -F 0.05 --min-coverage 10 --min-repeat-entropy 1.0 -q 13 -m 60 --strict-vcf -f reference/ref.fa snps.bam > snps.raw.vcf

bcftools view --include 'FMT/GT="1/1" && QUAL>=100 && FMT/DP>=10 && (FMT/AO)/(FMT/DP)>=0' snps.raw.vcf | vt normalize -r reference/ref.fa - | bcftools annotate --remove '^INFO/TYPE,^INFO/DP,^INFO/RO,^INFO/AO,^INFO/AB,^FORMAT/GT,^FORMAT/DP,^FORMAT/RO,^FORMAT/AO,^FORMAT/QR,^FORMAT/QA,^FORMAT/GL' > snps.filt.vcf

snpEff ann -noLog -noStats -no-downstream -no-upstream -no-utr -c reference/snpeff.config -dataDir . ref snps.filt.vcf > snps.vcf

/old_Users/[--]/miniconda3/envs/snippy/bin/snippy-vcf_to_tab --gff reference/ref.gff --ref reference/ref.fa --vcf snps.vcf > snps.tab

Loading reference: reference/ref.fa
Loaded 1 sequences.
Loading features: reference/ref.gff
Parsing variants: snps.vcf
Converted 0 SNPs to TAB format.

/old_Users/[--]/miniconda3/envs/snippy/bin/snippy-vcf_extract_subs snps.filt.vcf > snps.subs.vcf

bcftools convert -Oz -o snps.vcf.gz snps.vcf

bcftools index -f snps.vcf.gz

bcftools consensus --sample snp_test -f reference/ref.fa -o snps.consensus.fa snps.vcf.gz

Applied 0 variants

bcftools convert -Oz -o snps.subs.vcf.gz snps.subs.vcf

bcftools index -f snps.subs.vcf.gz

bcftools consensus --sample snp_test -f reference/ref.fa -o snps.consensus.subs.fa snps.subs.vcf.gz

Applied 0 variants

rm -f snps.subs.vcf.gz snps.subs.vcf.gz.csi snps.subs.vcf.gz.tbi

samtools view -h -q 60 snps.bam | samtools sort -l 0 -T /localscratch/583419.1.[-] --threads 3 -m 2000M > /localscratch/583419.1.[-]/snippy.41947.Q60.bam

[bam_sort_core] merging from 0 files and 3 in-memory blocks...

samtools index /localscratch/583419.1.[-]/snippy.41947.Q60.bam

/old_Users/[--]/miniconda3/envs/snippy/bin/snippy-vcf_report --cpus 8 --bam /localscratch/583419.1.[-]/snippy.41947.Q60.bam --ref reference/ref.fa --vcf snps.vcf > snps.report.txt

Parsing: snps.vcf
Running: parallel -j 8 -k -a /localscratch/583419.1.[-]/lPSTn1wTLR

rm -f /localscratch/583419.1.[-]/snippy.41947.Q60.bam /localscratch/583419.1.[-]/snippy.41947.Q60.bam.bai
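
Both logs show 0 variants from `snps.filt.vcf` onwards, but they do not say how many records freebayes wrote to `snps.raw.vcf`. One way to narrow down where the variants disappear is to count the data records in each intermediate VCF. The following is a minimal sketch under that assumption; `count_vcf_records` and `count_records_in_files` are hypothetical helper names, and the file paths are the ones listed under "Result files" in the run log.

```python
def count_vcf_records(vcf_text):
    """Count data lines in VCF text: non-empty lines not starting with '#'."""
    return sum(
        1
        for line in vcf_text.splitlines()
        if line.strip() and not line.startswith("#")
    )


def count_records_in_files(paths):
    """Map each uncompressed VCF path to its number of data records."""
    counts = {}
    for path in paths:
        with open(path) as handle:
            counts[path] = count_vcf_records(handle.read())
    return counts
```

A possible call would be `count_records_in_files(["snp_test/snps.raw.vcf", "snp_test/snps.filt.vcf", "snp_test/snps.vcf"])`. If `snps.raw.vcf` contains records while `snps.filt.vcf` contains none, that would suggest the loss happens in the `bcftools view | vt normalize | bcftools annotate` step rather than during alignment or variant calling.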

@LuXX6666

LuXX6666 commented Aug 7, 2024

Have you solved the problem? I have run into the same issue.

@ZoseJapata
Author

@LuXX6666 No, I haven't been able to solve the problem.
I still can't get Snippy to produce any variants!

I have used the Bactopia Snippy workflow successfully a few times (it runs a little slower), if you want to try that.

@HenriqueDaSilvaVieira

I had the same problem. I tried different command lines and different species, but the problem remains: 0 variants compared to the reference genome.

@Sl4PP

Sl4PP commented Aug 11, 2024

Hi, I encountered the same problem. The solution in this comment here helped me resolve it.

Basically, you need to downgrade one of the tools that is part of Snippy. I hope this helps you as well.
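
Since the fix involves changing the version of a bundled tool, it can help to record which versions a given run actually used. The run log in the first post prints them in its "Checking version" lines. Below is a minimal sketch for extracting them; the regular expression is an assumption based only on the line format shown in that log, and `tool_versions` is a hypothetical name.

```python
import re

# Assumed line format, taken from the run log in this thread, e.g.:
# [11:46:27] Checking version: samtools --version is >= 1.7 - ok, have 1.19
VERSION_LINE = re.compile(
    r"Checking version: (?P<tool>\S+).*? is >= (?P<minimum>\S+)"
    r" - ok, have (?P<found>\S+)"
)


def tool_versions(log_text):
    """Map tool name to (minimum required, version found) from a run log."""
    versions = {}
    for line in log_text.splitlines():
        match = VERSION_LINE.search(line)
        if match:
            versions[match.group("tool")] = (
                match.group("minimum"),
                match.group("found"),
            )
    return versions
```

Tools that have no "Checking version" line in the log (vt, for instance) will not appear in the result; for those, `conda list` in the Snippy environment shows the installed versions.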
