-
That's a really good question. For public users, most of the details are at https://www.onezoom.org/data_sources.html. For more technical users, sources for the data are noted in a comment at the top of the .phy files. Here, for example, it's this file: In this case it's noted that we are using the arrangement from Bininda-Emonds et al. (https://doi.org/10.1038/nature05634) for most of the placental mammals except the primates & colugos. So we aren't taking the OpenTree topologies into account for this part of the OneZoom tree.
Unfortunately, since this annotation is done by hand, the exact details are often missing. I've just looked at the Bininda-Emonds tree (https://static-content.springer.com/esm/art%3A10.1038%2Fnature05634/MediaObjects/41586_2007_BFnature05634_MOESM362_ESM.txt) and it looks like all of Tupaia is in a big polytomy. It could be that we tweaked it by hand and forgot to annotate the files, or it could be that we took a slightly different tree from the publicly published one in that paper, as this is one of the files from the original OneZoom mammal tree (@jrosindell might know a bit better here).
A clue can be found if you search for those species in the .phy file. You'll find them embedded like this:
It looks like there are a lot of 0-length internal branches, which implies that most of the polytomies have been arbitrarily broken (apart from the pairing of Tupaia_palawanensis & Tupaia_moellendorffi). Some of the files from the original OneZoom had arbitrarily broken polytomies like this. I don't know where the palawanensis/moellendorffi grouping comes from, sorry. It's a bit ugly that the polytomy breaking is hand-coded in the file: this should probably be removed (nevertheless, we do need to make sure that the polytomies are always resolved in the same way every time we run the tree-building script: this is handled in the code, I think).
Zero-length branches will appear as a polytomy when switching to the "polytomy" view on OneZoom: So hopefully the final outcome for the user in this case is to note that it is a polytomy. This is also shown in the traditional (bifurcating) tree view, because zero-length internal branches cause that part of the tree to be shown in a lighter grey colour, with no highlighted internal nodes:
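As a rough illustration of how arbitrarily broken polytomies can be spotted in a Newick or .phy string, here is a minimal Python sketch that counts internal branches with a length of exactly zero. It is not OneZoom's own code: the function name, the regular expression and the example tree are all made up for illustration, and it assumes unquoted labels and plain decimal branch lengths (no scientific notation).

```python
import re

# Matches the closing ")" of an internal node, an optional node label, and a
# branch length equal to zero, e.g. "):0", ")label:0.0" or "):0.000000".
ZERO_INTERNAL = re.compile(r"\)[^(),:;]*:0(?:\.0*)?(?=[,);])")


def count_zero_length_internal(newick: str) -> int:
    """Count internal branches whose length is exactly zero."""
    return len(ZERO_INTERNAL.findall(newick))


# Hypothetical example: A, B and C form a polytomy that was broken arbitrarily,
# leaving one zero-length internal branch above the (A, B) pair.
example = "(((A:1,B:1):0,C:1):5,D:6);"
print(count_zero_length_internal(example))  # prints 1
```

A non-zero count does not prove that a polytomy was broken by hand, but it is a quick hint that part of the topology carries no real branch-length information.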
-
Ideally, of course, we would find a Tupaia phylogeny online that broke at least some of those polytomies, and input it into the OpenTree curator (https://tree.opentreeoflife.org/curator). There's an mtDNA and 16S rRNA tree for a few of them in this paper: https://www.nature.com/articles/s41598-022-04907-7
A tree based on 13 protein-coding genes is in their supplementary material too, but as usual (and highly annoyingly) they don't actually provide the Newick file for download in the data availability section, so if we wanted to input that paper into OpenTree we would need to digitise it by hand (I would tend to use the protein-coding tree, I think). There might be other published trees of Tupaia elsewhere too; I haven't looked. I sometimes do that for areas of particular interest, but usually hit problems like the one above that stop the trees from being easily incorporated automatically into OneZoom, or even input into the OpenTree curator.
-
Thanks @hyanwong for the detailed explanation! I'm realizing that there is a lot of great material on the OneZoom site that I have not read, so I definitely need to do this. I also did not realize that there was a polytomy view. I just tried it and it's fantastic! Tupaia link for my own reference.
I assume this is to preserve tree stability and not confuse users, as any resolution is technically correct. So if we were to come up with logic to auto-resolve, the constraint would be:
As for the mtDNA/rRNA article, it's indeed unfortunate that they don't include a Newick. At least they have a diagram which makes it easy to come up with a Newick. Maybe I'll try doing that for fun, and to learn about the EoL submission process.
-
This is what I came up with based on that diagram:
But now I just realized that they have another diagram which has more species, but doesn't have proper date ranges (most branch lengths are 1). Maybe it's a matter of combining the two in a clever way? Well, this is obviously a complex problem, and I'm way out of my league :)
-
Actually, I probably got the palawanensis/moellendorffi grouping from this paper, and forgot to record it: https://www.sciencedirect.com/science/article/abs/pii/S1055790311002156
That claims to have a phylogeny of all the species (pasted here as it is behind a paywall):
Again, no computer-readable trees, although there is a NEXUS file of the data (without trees in it) that will allow you to run your own analysis.
-
Indeed, I can see that it would be an enormous task given the scope of the problem. For example, I analyzed Mammalia for genera with >= 50 polytomies and found a few (from the OpenTree Newick):
For Tupaia, that latest study you found is indeed much more complete! I'm surprised that these things are behind paywalls, as I naively expected most things in the science world to be wide open. For what it's worth, I turned it into a newick, which I will put here for the record, whether we use it or not. I added indentation for readability. A few notes:
Rendering using http://etetoolkit.org/treeview/: This leaves the following uncategorized species, for the record:
Anyway, I will leave it to you to decide whether it's worth contributing to OpenTree. I'm happy to do it, if only as a learning exercise.
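For reference, one rough way to look for polytomies in a Newick string is to count the children of every internal node and report those with more than two. The Python sketch below does that with a simple stack; it is not necessarily how the Mammalia analysis mentioned above was done, the function name is made up, and it assumes unquoted labels that contain no parentheses or commas.

```python
def polytomy_sizes(newick: str) -> list[int]:
    """Return the child count of every node that has more than two children."""
    open_counts = []  # children seen so far for each currently open "("
    sizes = []
    for ch in newick:
        if ch == "(":
            open_counts.append(1)   # a new internal node starts with one child
        elif ch == "," and open_counts:
            open_counts[-1] += 1    # a comma adds a sibling to the innermost node
        elif ch == ")":
            n = open_counts.pop()   # the innermost node is now complete
            if n > 2:
                sizes.append(n)
    return sizes


# Hypothetical example: one 4-way polytomy next to a fully resolved pair.
print(polytomy_sizes("((A,B,C,D),(E,F));"))  # prints [4]
```

Grouping the reported nodes by genus would still need the taxon labels, which this sketch ignores.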
-
Hmmm, looking at OpenTree, Scandentia is closer to the Glires than to the Primates, so the way I placed the outgroups is probably going to clash (I followed the OneZoom model). Anyway, easy to adjust to match theirs.
-
Thanks for all the feedback, Yan! I think I understand all the needed changes for the tree, and I'll make that a weekend project. Thanks for also explaining the reasoning behind
By 'both trees', do you mean both the first one I had derived from this diagram (with just 6 Tupaia species), and the much more complete one from the paywalled article? As far as I can see, the second one is a superset of the first (i.e. with no clashes), which makes the first redundant. But I may be misunderstanding.
For the branch lengths, one challenge is that the study only shows the time scale, not the branch lengths (as I understand it, this is called an ultrametric tree or dendrogram). So I need to convert that to a tree with branch lengths (an "additive tree"). One possible plan for doing that in a less error-prone way is to author a transitional tree that directly matches the dendrogram, and then write code to convert it into an additive tree. e.g. if A split from B 7 MYA, and C split from A&B 20 MYA, I'd write:
And the code (e.g. using dendropy) would turn that into a proper newick tree:
The first is a clear abuse of the newick branch length syntax, but no one would see that one. Hope I'm making sense! :)
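A minimal Python sketch of that conversion, under stated assumptions: the transitional tree stores each internal node's age (in MYA) in the branch-length slot, tips are extant (age 0), and labels are unquoted. The input string `((A,B):7,C):20;` is an assumed encoding of the A/B/C example, not something taken from the thread, and a small hand-written parser stands in for dendropy so that the snippet has no dependencies. The function names are hypothetical.

```python
import re


def parse(newick):
    """Parse a small Newick subset (unquoted labels) into nested dicts."""
    tokens = re.findall(r"[(),;]|[^(),;]+", newick.replace(" ", ""))
    pos = 0

    def node():
        nonlocal pos
        children = []
        if tokens[pos] == "(":
            pos += 1                      # skip "("
            children.append(node())
            while tokens[pos] == ",":
                pos += 1                  # skip ","
                children.append(node())
            pos += 1                      # skip ")"
        label = ""
        if pos < len(tokens) and tokens[pos] not in {"(", ")", ",", ";"}:
            label = tokens[pos]
            pos += 1
        name, _, age = label.partition(":")
        # Tips carry no number in the transitional format, so their age is 0.
        return {"name": name, "age": float(age) if age else 0.0, "children": children}

    return node()


def to_additive(n, parent_age=None):
    """Write a node back out, using (parent age - own age) as branch length."""
    length = "" if parent_age is None else f":{parent_age - n['age']:g}"
    if n["children"]:
        inner = ",".join(to_additive(c, n["age"]) for c in n["children"])
        return f"({inner}){n['name']}{length}"
    return f"{n['name']}{length}"


def ages_to_lengths(newick):
    """Convert a node-age ("transitional") tree into a tree with branch lengths."""
    return to_additive(parse(newick)) + ";"


print(ages_to_lengths("((A,B):7,C):20;"))  # prints ((A:7,B:7):13,C:20);
```

The root gets no branch length because it has no parent age to subtract from; a stem length would have to be supplied separately if it were needed.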
-
Ok, here is the next iteration, with the following changes:
Here is the resulting newick:
Which looks like this (I switched to https://beta.phylo.io/ for rendering): Questions:
-
Alright, I get to start the first discussion! 😄
We briefly discussed this, and you probably answered it, but I didn't retain everything. I'm trying to better understand how OneZoom handles polytomies.
Taking for example the genus Tupaia (treeshrews), EoL has it as a pure polytomy:
While OneZoom's bespoke tree has a complex structure (extracted from my generated AllLife_full_tree.phy):
My question is whether this resolution is happening because: