Hello everybody,
I have 10 MiSeq runs, each of which included 5 negative controls and 30 samples. I want to analyze them all together with Qiime2; however, the decontam step has to be done for each run individually. Therefore, I tried the "batch" option of the "isContaminant" command. However, the resulting table, which shows whether each ASV is classified as a contaminant, no longer distinguishes between the runs and sums up the prevalence of the ASVs across the different runs.
For example:
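I have the runs R1 and R2, which were both analyzed with Qiime2 and resulted in the phyloseq object physeq1. I first created a data frame df listing the sample names and run IDs, created a named vector V1 from it, and then ran "isContaminant" with the "batch" option. The result "contam" looks like this: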
>head(contam)
freq prev p.freq p.prev p contaminant
6f3f68e5c8e2a11b388ddbbea9fa182d 0.052955868 49 NA 0.4278075 0.4278075 TRUE
3ebe761bfb1238c87195d431f41bf976 0.010071447 29 NA 0.6080978 0.6080978 FALSE
e8386d3a307c208c4b9f0a756259cd6b 0.006279860 11 NA 0.6898396 0.6898396 FALSE
6de4a253e36f3d4e6a2e3acb26a0c030 0.022417636 45 NA 0.9721925 0.9721925 FALSE
ad4cba5280fb47cbebea01e7031af61c 0.007089322 28 NA 0.8013751 0.8013751 FALSE
651d9a773d5fc2e2b20411b9d7c28e0b 0.018931103 50 NA 0.9908327 0.9908327 FALSE
So, the batch information isn't included anymore. I checked one ASV that is present in both R1 and R2: its prevalence in the "contam" table is the sum over both runs, unlike when I run the analysis separately for each run. The problem is that this ASV is detected as a contaminant in R1 but not in R2. So it shouldn't be deleted from all runs, only from the samples that belong to R1.
I wasn't able to find more details about the "batch" function and how I could modify it so that I can run "prune_taxa" per run in the end.
I hope I was able to explain my problem and that someone has a solution to that.
Thank you in advance!
Martinique
So, the batch information isn't included anymore. I checked for one ASV that is included in R1 and R2 and the prevalence of this ASV is summed up in the "contam" table in comparison to when I run the analysis separately for each run. The problem is, that in R1 this ASV is detected as contamination, in R2 it is not. So it shouldn't be deleted from all runs, only from samples that are included in R1.
If this is the functionality you want, you will have to run decontam and ASV removal per batch "by hand", as you outlined here. There is no automated way to perform per-batch decontamination in the package, so you'll need to do some simple R looping.
That said, I would probably not do this. If a contaminant is identified and removed in one batch, it should typically be removed from all batches, so as to keep a consistent set of non-contaminant ASVs in the potentially detected universe in all your batches. That is, treating contaminants consistently across batches is usually preferable to handling them on a per-batch basis.
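For readers who still want the per-batch behavior, the "simple R looping" mentioned above might look roughly like the sketch below. It is untested and rests on assumptions that are not in the original post: `decontam_per_run` is a hypothetical helper name, and the sample data of `physeq1` is assumed to contain a run ID column called `Run` and a logical column `is.neg` that is TRUE for negative controls. The prevalence method is used because the posted output shows only prevalence-based scores.

```r
library(phyloseq)
library(decontam)

# Hypothetical helper: run decontam separately within each run and remove
# each run's contaminants only from that run's samples.
# Assumed (not from the original post): `run_var` names a sample_data column
# holding the run ID, and `neg_var` names a logical column that is TRUE for
# negative controls.
decontam_per_run <- function(ps, run_var = "Run", neg_var = "is.neg") {
  run_ids <- as.character(get_variable(ps, run_var))

  cleaned <- lapply(unique(run_ids), function(r) {
    # Keep only the samples (and controls) of this run
    ps_run <- prune_samples(sample_names(ps)[run_ids == r], ps)
    # Drop ASVs that do not occur in this run at all
    ps_run <- prune_taxa(taxa_sums(ps_run) > 0, ps_run)

    # Prevalence-based classification using this run's negative controls
    contam <- isContaminant(ps_run, method = "prevalence", neg = neg_var)

    # Keep ASVs explicitly classified as non-contaminants in this run
    keep <- rownames(contam)[which(!contam$contaminant)]
    prune_taxa(keep, ps_run)
  })

  # Recombine the per-run objects; an ASV removed in one run but kept in
  # another ends up with zero counts in the samples of the run it was removed from
  do.call(merge_phyloseq, cleaned)
}

physeq_clean <- decontam_per_run(physeq1)
```

The negative controls are still present in `physeq_clean` and would need to be removed separately before downstream analysis. The caveat above still applies: after this procedure the set of ASVs that can be detected differs between runs.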
Thank you for your answer.
I'm working with low-template samples whose microbial communities are very likely to be highly diverse. So we assume that what is a contaminant in one run might actually belong to the microbial community in another run. That's why it's important for us to run this analysis separately. But it's good to know that the "batch" option isn't meant to achieve this, so thank you!