Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[for later, just curious] Question about Ancestral State Reconstruction #41

Open
lutteropp opened this issue Feb 4, 2021 · 10 comments
Open

Comments

@lutteropp
Copy link
Owner

lutteropp commented Feb 4, 2021

How is the ancestral state defined in networks?

If we have the ancestral states for all displayed trees, how do we get the ancestral states for the network? Is it just a weighted average of the per-displayed-tree ancestral states?

@lutteropp lutteropp changed the title [for later,just curious] Question about Ancestral State Reconstruction [for later, just curious] Question about Ancestral State Reconstruction Feb 4, 2021
@stamatak
Copy link
Collaborator

stamatak commented Feb 4, 2021 via email

@lutteropp
Copy link
Owner Author

lutteropp commented Feb 4, 2021

That sounds even better. I've taken a look at the ancestral state computation for pll_utree_t, adapting this to the network displayed trees looks pretty straight-forward.

If we find a way how to choose an arc insertion move based on per-displayed-tree ancestral states, I expect the coding+testing work to be manageable (about 2 weeks?).

@lutteropp
Copy link
Owner Author

I don't see yet how the ancestral states for the displayed trees of a r-reticulation network help in finding the arc insertion move to take in order to get to (r+1) reticulations. But anyway, it is a question to explore for later.

@lutteropp
Copy link
Owner Author

lutteropp commented Feb 4, 2021

Found the old notes: #30 (comment)

The idea is to find two nodes whose ancestral states have the smallest distance, but are located in different places in the network. Makes sense...

@stamatak
Copy link
Collaborator

stamatak commented Feb 4, 2021 via email

@lutteropp
Copy link
Owner Author

Sounds like we can simply reuse the fake-node-approach here. (adding the fake node as the second child with brlen zero, the fake node having all-equal ancestral state probabilities)

The fake-node-approach works well for likelihood computation already, and saved lots of ugly special case handling there.

@lutteropp
Copy link
Owner Author

lutteropp commented Feb 16, 2021

This should work:

  • for each network node u, we store a set AS(u) of ancestral states (one per displayed tree)
  • the ancestral distance anc_dist(u, v) between two network nodes u and v is the minimum over the distances {dist((u, s1), (v, s2)) for s1 in AS(u) and s2 in AS(v)}.
  • we do an arc insertion at a pair of nodes {u,v} where anc_dist(u,v) is maximal.

What we still need to figure out: Given two ancestral states s1 and s2 at nodes u and v, we need to define dist((u, s1), (v, s2)) as some function of

  • dist(s1,s2) (being the distance between two ancestral states), and
  • dist(u,v), being the number of edges on a shortest path from u to v in the network.

We want dist((u,s1),(v,s2)) to be large if dist(s1, s2) is small and dist(u,v) is large.

@lutteropp
Copy link
Owner Author

lutteropp commented Feb 16, 2021

A possible definition for dist((u,s1),(v,s2)) would be:

(1.0 - dist(s1,s2)) * dist(u,v).

Here, I assume that dist(s1, s2) is in range [0,1]. If it is not, we can normalize it by using the minimum and maximum value encountered in the network.

@lutteropp
Copy link
Owner Author

It is still unclear to me whether dist(u,v) should be the number of edges on a edge-minimal path between u and v in the network, or whether the sum of edge weights fits better.

I tend towards using the number of edges though, because we will do branch length optimization after inserting an arc anyway...

@stamatak
Copy link
Collaborator

stamatak commented Feb 16, 2021 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants