JayalalKJ changed the title from "prevalence Method in Decontam not Identifying contaminants with a single control sample" to "prevalence method in Decontam not Identifying contaminants with a single control sample" on Oct 30, 2024
decontam-prevalence is not an appropriate method when you have only one negative control sample. It relies on repeated observation of contaminants across multiple negative controls. We recommend a minimum of 5; see our original paper for more on that: https://doi.org/10.1186/s40168-018-0605-2
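To illustrate why a single control gives a presence/absence test so little to work with, here is a minimal base-R sketch. It uses a one-sided Fisher exact test as a stand-in, which is an assumption made for illustration and not necessarily decontam's exact scoring; `n_true`, `k_true`, and the 20-sample design are hypothetical.

```r
# Illustration only: one-sided Fisher exact test on presence/absence counts.
# This is a stand-in (assumption), not decontam's exact computation.
n_true <- 20    # hypothetical number of true samples
k_true <- 0:19  # number of true samples in which a feature is present

# Case 1: a single negative control, and the feature is present in it.
p_single_control <- sapply(k_true, function(k) {
  # rows: control / true samples; columns: present / absent
  tab <- matrix(c(1, k, 0, n_true - k), nrow = 2,
                dimnames = list(c("control", "true"), c("present", "absent")))
  fisher.test(tab, alternative = "greater")$p.value
})

# Case 2: five negative controls, and the feature is present in all of them.
p_five_controls <- sapply(k_true, function(k) {
  tab <- matrix(c(5, k, 0, n_true - k), nrow = 2,
                dimnames = list(c("control", "true"), c("present", "absent")))
  fisher.test(tab, alternative = "greater")$p.value
})

# Side-by-side comparison across feature prevalence in the true samples.
print(data.frame(k_true,
                 single = round(p_single_control, 4),
                 five   = signif(p_five_controls, 3)))
```

Under this stand-in test, the single-control p-value equals `(k_true + 1) / (n_true + 1)`, so it can never drop below 1/21 (about 0.048) in this design and simply tracks how common the feature is overall. That may be why the p-values look evenly spread. With five controls the same test can produce much smaller p-values for features that are rare in true samples.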
Hi, I have one control sample, and the prevalence method in Decontam is not effectively identifying contaminants. The p-values seem distributed evenly, and the isContaminant() function isn’t marking many sequences as contaminants even with an aggressive threshold. I need advice on how to proceed or any alternative approaches or changes to the following code.
Attempted so far: the prevalence method with an aggressive threshold (threshold = 0.3).

library(phyloseq)
library(decontam)
library(ggplot2)

# Identify the control samples
sample_data(physeq)$is.neg <- sample_data(physeq)$Sample_or_Control == "Control Sample"

# Identify contaminants using the prevalence method with an aggressive threshold
contamdf.prev <- isContaminant(physeq, method = "prevalence", neg = "is.neg", threshold = 0.3)
table(contamdf.prev$contaminant)

# Convert to presence/absence to compare prevalence in true samples vs negative controls
ps.pa <- transform_sample_counts(physeq, function(abund) 1 * (abund > 0))
ps.pa.neg <- prune_samples(sample_data(ps.pa)$Sample_or_Control == "Control Sample", ps.pa)
ps.pa.pos <- prune_samples(sample_data(ps.pa)$Sample_or_Control == "True Sample", ps.pa)

# Create a data frame for visualization
df.pa <- data.frame(pa.pos = taxa_sums(ps.pa.pos), pa.neg = taxa_sums(ps.pa.neg),
                    contaminant = contamdf.prev$contaminant)

# Plot the prevalence of taxa in true samples vs negative controls
ggplot(data = df.pa, aes(x = pa.neg, y = pa.pos, color = contaminant)) +
  geom_point() +
  xlab("Prevalence (Negative Controls)") +
  ylab("Prevalence (True Samples)")

# Prune contaminants
physeq_clean <- prune_taxa(!contamdf.prev$contaminant, physeq)