Using Minimap2 to remove host reads. Unmapped read number is higher than input read number #1228

Open
s4251484 opened this issue Jul 10, 2024 · 0 comments


s4251484 commented Jul 10, 2024

Hi there

I am using minimap2 to align microbiome reads against the host genome so that I can remove the host reads and use the unmapped reads for downstream microbiome analysis.

However, the number of unmapped reads is higher than the number of input reads.

Below are the commands that I ran:

minimap2 -t 24 -ax map-ont $host $infolder/$filename > $outfolder/$idname.sam

samtools view -@ 24 -f 4 $outfolder/$idname.sam > $outfolder/$idname-nonhost.bam

samtools sort --threads 24 $outfolder/$idname-nonhost.bam > $outfolder/$idname-nonhost-sorted.bam

bamToFastq -i $outfolder/$idname-nonhost-sorted.bam -fq $outfolder/$idname-nonhost.fastq

Here is the output as reported by NanoPlot:

| Sample ID | Input reads (K) | Input bases (Mb) | Unmapped reads (K) | Unmapped bases (Mb) | Unmapped reads (% of input) | Unmapped bases (% of input) |
|---|---|---|---|---|---|---|
| 3815 | 1514.504 | 2114.67753 | 1836.906 | 2919.44312 | 121.29% | 138.06% |
| 3816 | 1815.924 | 2024.97881 | 1479.988 | 2250.97226 | 81.50% | 111.16% |
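
For reference, the percentage columns are just the unmapped value divided by the input value. A minimal Python sketch that recomputes them from the NanoPlot figures above (the dictionary layout and the `percent` helper are only illustrative names, not part of any tool):

```python
# Figures copied from the NanoPlot table above (reads in thousands, bases in Mb).
samples = {
    "3815": {"in_reads": 1514.504, "in_bases": 2114.67753,
             "un_reads": 1836.906, "un_bases": 2919.44312},
    "3816": {"in_reads": 1815.924, "in_bases": 2024.97881,
             "un_reads": 1479.988, "un_bases": 2250.97226},
}


def percent(part, whole):
    """Return part as a percentage of whole, rounded to two decimals."""
    return round(100.0 * part / whole, 2)


# Map each sample ID to (unmapped reads %, unmapped bases %) relative to input.
ratios = {
    sid: (percent(v["un_reads"], v["in_reads"]),
          percent(v["un_bases"], v["in_bases"]))
    for sid, v in samples.items()
}

for sid, (reads_pct, bases_pct) in ratios.items():
    print(f"{sid}: reads {reads_pct}%, bases {bases_pct}%")
```

Any value above 100% means the "unmapped" output contains more than the input did, which should not be possible if every input read appears at most once in the output.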

I would appreciate it if you could explain why this happens, and whether there is a way to separate the host reads so that the unmapped (potentially microbiome) output contains each input read at most once. Thank you!
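
One possible cross-check that does not depend on NanoPlot is to count records and distinct read IDs in both the input FASTQ and the non-host FASTQ. The following is a minimal sketch, assuming uncompressed, strictly four-line FASTQ records; `fastq_read_ids` and `summarize` are hypothetical helper names, and the sample records are made up for illustration.

```python
from collections import Counter


def fastq_read_ids(lines):
    """Yield the read ID (first token after '@') of each 4-line FASTQ record."""
    for i, line in enumerate(lines):
        if i % 4 == 0:  # header line of a strictly four-line record
            yield line[1:].split()[0]


def summarize(lines):
    """Count total records, distinct read IDs, and IDs seen more than once."""
    counts = Counter(fastq_read_ids(lines))
    return {
        "records": sum(counts.values()),
        "unique_ids": len(counts),
        "duplicated_ids": sum(1 for n in counts.values() if n > 1),
    }


# Made-up example: three records, with read 'r1' appearing twice.
example = [
    "@r1 runid=x\n", "ACGT\n", "+\n", "!!!!\n",
    "@r2 runid=x\n", "GGCC\n", "+\n", "!!!!\n",
    "@r1 runid=x\n", "ACGT\n", "+\n", "!!!!\n",
]
summary = summarize(example)
print(summary)
```

For real files the same function can be applied to an open file handle, for example `with open(path) as fh: summarize(fh)`. If `records` exceeds `unique_ids` in the non-host FASTQ, some read IDs were written more than once.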
