-
Notifications
You must be signed in to change notification settings - Fork 113
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
samtools excluded most reads #574
Comments
Any change this is amplicon data? If so, samtools just removed a lot of your reads because they're marked as pcr duplicates. I don't think snippy is the right tool for you. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Hi tseemann,
I am use snippy 4.6.0, and I am wondering why samtools excluded most of the reads. In the end, I got no variant (I am expecting variant), and when I calculate the read depth per position (using samtools depth), I only have about 1-2 read per position (I am expecting at least 50+ reads). There is a samtools markdup warning but I am not sure if it is related.
Thanks,
Dorothy
Here is the complete log file:
echo snippy 4.6.0
cd /Users/tyl205/Documents/Ssonnei_read_download_fastq_18
/Users/tyl205/snippy/bin/snippy --outdir SRR10874279_ctg1_metG_snippy --ref /Users/tyl205/Documents/plasmids/SRR5024067_2034_1251_Bottom_ctg1_metG_2652578to2654612.fasta --R1 SRR10874279_1_trimmed.fastq.gz --R2 SRR10874279_2_trimmed.fastq.gz
samtools faidx reference/ref.fa
bwa index reference/ref.fa
[bwa_index] Pack FASTA... 0.00 sec
[bwa_index] Construct BWT for the packed sequence...
[bwa_index] 0.00 seconds elapse.
[bwa_index] Update BWT... 0.00 sec
[bwa_index] Pack forward-only FASTA... 0.00 sec
[bwa_index] Construct SA from BWT and Occ... 0.00 sec
[main] Version: 0.7.17-r1188
[main] CMD: bwa index reference/ref.fa
[main] Real time: 0.006 sec; CPU: 0.003 sec
mkdir -p reference/genomes && cp -f reference/ref.fa reference/genomes/ref.fa
ln -sf reference/ref.fa .
ln -sf reference/ref.fa.fai .
mkdir -p reference/ref && gzip -c reference/ref.gff > reference/ref/genes.gff.gz
bwa mem -Y -M -R '@rg\tID:SRR10874279_ctg1_metG_snippy\tSM:SRR10874279_ctg1_metG_snippy' -t 8 reference/ref.fa /Users/tyl205/Documents/Ssonnei_read_download_fastq_18/SRR10874279_1_trimmed.fastq.gz /Users/tyl205/Documents/Ssonnei_read_download_fastq_18/SRR10874279_2_trimmed.fastq.gz | samclip --max 10 --ref reference/ref.fa.fai | samtools sort -n -l 0 -T /private/var/folders/5j/6qfs2qpx3s777t4pc0yntz380000gq/T --threads 3 -m 2000M | samtools fixmate -m --threads 3 - - | samtools sort -l 0 -T /private/var/folders/5j/6qfs2qpx3s777t4pc0yntz380000gq/T --threads 3 -m 2000M | samtools markdup -T /private/var/folders/5j/6qfs2qpx3s777t4pc0yntz380000gq/T --threads 3 -r -s - - > snps.bam
samtools markdup: warning, unable to calculate estimated library size. Read pairs 12 should be greater than duplicate pairs 0, which should both be non zero.
COMMAND: samtools markdup -T /private/var/folders/5j/6qfs2qpx3s777t4pc0yntz380000gq/T --threads 3 -r -s - -
READ: 34700
WRITTEN: 34700
EXCLUDED: 34670
EXAMINED: 30
PAIRED: 24
SINGLE: 6
DUPLICATE PAIR: 0
DUPLICATE SINGLE: 0
DUPLICATE PAIR OPTICAL: 0
DUPLICATE SINGLE OPTICAL: 0
DUPLICATE NON PRIMARY: 0
DUPLICATE NON PRIMARY OPTICAL: 0
DUPLICATE PRIMARY TOTAL: 0
DUPLICATE TOTAL: 0
ESTIMATED_LIBRARY_SIZE: 0
samtools index snps.bam
fasta_generate_regions.py reference/ref.fa.fai 1000 > reference/ref.txt
freebayes-parallel reference/ref.txt 8 -p 2 -P 0 -C 2 -F 0.05 --min-coverage 10 --min-repeat-entropy 1.0 -q 13 -m 60 --strict-vcf -f reference/ref.fa snps.bam > snps.raw.vcf
bcftools view --include 'FMT/GT="1/1" && QUAL>=100 && FMT/DP>=10 && (FMT/AO)/(FMT/DP)>=0' snps.raw.vcf | vt normalize -r reference/ref.fa - | bcftools annotate --remove '^INFO/TYPE,^INFO/DP,^INFO/RO,^INFO/AO,^INFO/AB,^FORMAT/GT,^FORMAT/DP,^FORMAT/RO,^FORMAT/AO,^FORMAT/QR,^FORMAT/QA,^FORMAT/GL' > snps.filt.vcf
cp snps.filt.vcf snps.vcf
/Users/tyl205/snippy/bin/snippy-vcf_to_tab --gff reference/ref.gff --ref reference/ref.fa --vcf snps.vcf > snps.tab
Loading reference: reference/ref.fa
Loaded 1 sequences.
Loading features: reference/ref.gff
Parsing variants: snps.vcf
Converted 0 SNPs to TAB format.
/Users/tyl205/snippy/bin/snippy-vcf_extract_subs snps.filt.vcf > snps.subs.vcf
bcftools convert -Oz -o snps.vcf.gz snps.vcf
bcftools index -f snps.vcf.gz
bcftools consensus --sample SRR10874279_ctg1_metG_snippy -f reference/ref.fa -o snps.consensus.fa snps.vcf.gz
Applied 0 variants
bcftools convert -Oz -o snps.subs.vcf.gz snps.subs.vcf
bcftools index -f snps.subs.vcf.gz
bcftools consensus --sample SRR10874279_ctg1_metG_snippy -f reference/ref.fa -o snps.consensus.subs.fa snps.subs.vcf.gz
Applied 0 variants
rm -f snps.subs.vcf.gz snps.subs.vcf.gz.csi snps.subs.vcf.gz.tbi
The text was updated successfully, but these errors were encountered: