Cluster to OTU, not ASVs? #1966

Open
mya-darsan opened this issue Jun 2, 2024 · 3 comments


@mya-darsan

mya-darsan commented Jun 2, 2024

I am currently working with a data set that I want to cluster into OTUs rather than ASVs in R. Is there a way to do this with DADA2? (I have already gone through the whole pipeline and obtained ASVs, but I now know I need OTUs.)

@benjjneb
Owner

benjjneb commented Jun 4, 2024

DADA2 does not do OTU clustering. You can further cluster the denoised ASVs into OTUs. DECIPHER is a package in R that will help you do that, and there are other alternatives outside of R that can provide OTU clustering.
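
As a rough illustration of the suggestion above, the sketch below clusters ASV sequences and then collapses a DADA2-style sequence table into an OTU table. It rests on several assumptions that are not from this thread: that a recent DECIPHER release providing `Clusterize()` is installed along with Biostrings, that a 3% dissimilarity cutoff is wanted, and that each OTU should be named after its most abundant member ASV. The helpers `cluster_asvs()` and `collapse_to_otus()` are hypothetical names, and the toy table only stands in for a real `seqtab.nochim`.

```r
# Hypothetical helper: returns one cluster id per ASV sequence.
# Assumes Biostrings and a DECIPHER version that provides Clusterize().
cluster_asvs <- function(seqs, cutoff = 0.03, nproc = 1) {
  dna <- Biostrings::DNAStringSet(seqs)
  res <- DECIPHER::Clusterize(dna, cutoff = cutoff,
                              processors = nproc, verbose = FALSE)
  res$cluster
}

# Hypothetical helper (base R only): sums the ASV columns that share a
# cluster id and names each OTU by its most abundant member sequence.
collapse_to_otus <- function(seqtab, otu_id) {
  stopifnot(length(otu_id) == ncol(seqtab))
  # rowsum() groups rows, so transpose to ASVs-by-samples and back.
  otu_tab <- t(rowsum(t(seqtab), group = otu_id))
  totals <- colSums(seqtab)
  groups <- split(seq_along(otu_id), otu_id)
  rep_seq <- vapply(groups, function(i) {
    colnames(seqtab)[i[which.max(totals[i])]]
  }, character(1))
  colnames(otu_tab) <- unname(rep_seq[colnames(otu_tab)])
  otu_tab
}

# Toy stand-in for seqtab.nochim: samples as rows, sequences as column names.
seqtab <- matrix(c(10L, 0L, 3L, 5L, 0L, 7L), nrow = 2,
                 dimnames = list(c("S1", "S2"),
                                 c("ACGTACGT", "ACGTACGA", "TTTTGGGG")))

# With real data: otu_id <- cluster_asvs(colnames(seqtab.nochim))
otu_id <- c(1L, 1L, 2L)  # toy assignments for illustration only
otu_tab <- collapse_to_otus(seqtab, otu_id)
```

Keeping sequences as the column names means the result still has the shape of a DADA2 sequence table, which may make downstream steps simpler.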

@mya-darsan
Author

Got it, thank you! I was able to cluster into OTUs via DECIPHER, but I am now having trouble assigning taxonomy.

The problem appears after using my seqtab.nochim (from DADA2) to cluster OTUs via DECIPHER. My merged sequence table now looks like this (I removed the OTU1, OTU2, etc. labels and kept the sequences as the column names):
[Screenshot of the merged sequence table, 2024-06-12]

I tried to assign taxonomy with this code:
taxa <- assignTaxonomy(merged_seqtab2, unite.ref, multithread = TRUE, tryRC = TRUE)

I am receiving the following error:
Error in h(simpleError(msg, call)) :
error in evaluating the argument 'x' in selecting a method for function 'reverseComplement': key 53 (char '5') not in lookup table
In addition: Warning message:
In assignTaxonomy(merged_seqtab2, unite.ref, multithread = TRUE, :
Some sequences were shorter than 50 nts and will not receive a taxonomic classification.

I have checked that none of the sequences are shorter than 50 nt and that they consist only of A, T, G, and C. I even converted them to a vector.

@benjjneb
Owner

assignTaxonomy is getting confused by the merged_seqtab2 object and is extracting the sample names instead of the sequences. You could try to reformat merged_seqtab2 to look just like a dada2 sequence table (integer matrix with samples as rows, columns as ASVs, and columns named by sequence), but it would be easier to just pull out the sequence vector and give that to assignTaxonomy. So something like sq <- colnames(merged_seqtab2); assignTaxonomy(sq, ...).

You can also apply DADA2's internal ACGT checking with dada2:::C_isACGT(sq).
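
A minimal base R sketch of the "pull out the sequence vector" approach follows. The helper names `looks_like_dna()` and `extract_sequences()` are hypothetical, and the toy `merged_seqtab2` only imitates the table described in this thread; the real object's layout is an assumption. The check looks at both row and column names, since sample names such as "Sample5" ending up where sequences are expected would be consistent with the `char '5'` error above.

```r
# Hypothetical helper: TRUE only if every entry is a non-empty ACGT string.
looks_like_dna <- function(x) {
  length(x) > 0 && all(grepl("^[ACGT]+$", x))
}

# Hypothetical helper: find the dimension whose names are DNA sequences.
extract_sequences <- function(tab) {
  if (looks_like_dna(colnames(tab))) return(colnames(tab))
  if (looks_like_dna(rownames(tab))) return(rownames(tab))  # transposed table
  stop("Neither row names nor column names look like DNA sequences")
}

# Toy stand-in for merged_seqtab2 (real sequences would be much longer).
merged_seqtab2 <- data.frame(
  ACGTACGTAC = c(13L, 5L),
  TTTTGGGGCC = c(0L, 7L),
  row.names = c("Sample5", "Sample6")
)

sq <- extract_sequences(merged_seqtab2)

# Sequences under 50 nt trigger the warning quoted earlier in this thread.
too_short <- sq[nchar(sq) < 50]

# With dada2 installed and a reference file at hand, the call from the
# thread would then take the plain character vector:
# taxa <- dada2::assignTaxonomy(sq, unite.ref, multithread = TRUE, tryRC = TRUE)
```

Passing a plain character vector avoids any ambiguity about which dimension of the table holds the sequences.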
