How to get the full taxa of predicted hosts #13
Comments
Hi, Thanks for using our tools. The full taxonomy can be found in the file "prokaryote.csv". You can also look it up with the ETE3 toolkit: http://etetoolkit.org/docs/latest/tutorial/tutorial_ncbitaxonomy.html Best,
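For anyone taking the ETE3 route, here is a minimal sketch of going from a predicted host species name to its full NCBI lineage. It assumes the host names reported by Cherry match NCBI scientific names exactly, which has not been checked here; `full_lineage` and `demo` are hypothetical helper names, not part of phabox. The `ncbi` argument is expected to be an `ete3.NCBITaxa` instance.

```python
def full_lineage(species_name, ncbi):
    """Return [(rank, name), ...] from root down to the species.

    `ncbi` is expected to be an ete3.NCBITaxa instance.
    Returns an empty list when the name is not found in the NCBI taxonomy.
    """
    name_to_taxids = ncbi.get_name_translator([species_name])
    if species_name not in name_to_taxids:
        return []
    # A name can map to several taxids; this sketch simply takes the first.
    taxid = name_to_taxids[species_name][0]
    lineage = ncbi.get_lineage(taxid)           # list of taxids, root first
    names = ncbi.get_taxid_translator(lineage)  # taxid -> scientific name
    ranks = ncbi.get_rank(lineage)              # taxid -> rank
    return [(ranks[t], names[t]) for t in lineage]


def demo():
    # Imported lazily: the first NCBITaxa() call downloads the NCBI
    # taxonomy dump and builds a local database, which takes a while.
    from ete3 import NCBITaxa

    ncbi = NCBITaxa()
    for rank, name in full_lineage("Escherichia coli", ncbi):
        print(rank, name, sep="\t")
```

Calling `demo()` prints one rank/name pair per line. Names that return an empty list are worth checking by hand, since they may be synonyms or strain-level labels.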
Thanks Jiayu, One more suggestion: gtdbtk has updated its database, which uses a new taxonomic naming system. Would it be possible to update phabox so that the phage-host names are assigned in line with gtdbtk (such as v2.3, release 214)?
Good to know. We will consider updating to the new gtdbtk in the near future. (I am sorry, we have deadlines to catch up on at the moment.) If you are in a hurry, you can convert the results yourself. We provide a script that converts the current results from NCBI taxa to GTDB (in the GTDB folder). If there is a table that aligns GTDB with gtdbtk, the conversion can be done easily. Or, if you are willing to share such a table with us, it would help us write a script for that. Best,
Thanks, Jiayu, Here is the taxa table (v2.3.0 release 214):
Hi there, I am sorry, but it seems the provided taxa table cannot be used to convert RefSeq into GTDB. Many of the sequences in the RefSeq taxa have no corresponding entry in the provided file (many accession mappings are missing). However, while checking this I found that the scripts I provided in the GTDB folder had some problems in the previous release. I have fixed them and added a readme file. I hope this helps you convert the RefSeq taxa into the GTDB taxa you want. The GTDB taxa used there are downloaded from the official website. Best,
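For readers who want to check how complete a mapping table is before converting, here is a minimal sketch. It assumes the table has already been loaded into a plain dict from accession to GTDB lineage string; the function name `split_by_coverage` and the accessions in the usage example are hypothetical, and this is not the script from the GTDB folder.

```python
def split_by_coverage(accessions, accession_to_gtdb):
    """Split accessions into those with a GTDB lineage and those without.

    accessions: iterable of accession strings from the RefSeq-based results.
    accession_to_gtdb: dict mapping accession -> GTDB lineage string.
    Returns (mapped, missing): a dict of found lineages and a list of
    accessions that have no entry in the table.
    """
    mapped = {}
    missing = []
    for acc in accessions:
        lineage = accession_to_gtdb.get(acc)
        if lineage is None:
            missing.append(acc)
        else:
            mapped[acc] = lineage
    return mapped, missing


# Usage with made-up placeholder data (not real accessions or lineages).
table = {"ACC_A": "d__Bacteria;p__ExamplePhylum"}
mapped, missing = split_by_coverage(["ACC_A", "ACC_B"], table)
print(f"{len(mapped)} mapped, {len(missing)} missing: {missing}")
```

If the missing list is a large share of the input, the table is probably not a suitable source for the conversion.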
Hey developer,
I noticed that Cherry reports the hosts of phage sequences. However, it only lists the species names. How can I get the full taxonomy? Is there a taxa list?
Thanks.
PS: there are two files (virus.csv & prokaryote.csv) in the './database/cherry' folder. Which one should I use?