Choice of number of hybridizations

Model selection tools are necessary to estimate the number of hybridizations (h). We use the profile of the log pseudolikelihood score as a function of h. A sharp improvement is expected until h reaches the best value, and a slower, roughly linear improvement thereafter.

Unfortunately, there is no clear-cut criterion to determine what constitutes a large and 'significant' improvement in the pseudolikelihood score. Even with a regular likelihood, AIC and BIC are not quite appropriate, because they are not meant to accommodate the exploding number of models when h gets bigger (the network space grows very fast). That being said, the pattern of score improvement can be used heuristically.

The negative log pseudolikelihood score is an attribute of the network object: net.loglik. The lower the better.

using Gadfly
# net0, net1, net2: networks estimated with h = 0, 1 and 2
scores = [net0.loglik, net1.loglik, net2.loglik]
plot(x=collect(0:2), y=scores, Geom.point, Geom.line)
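
To read the same pattern numerically, the drop in score gained by each extra hybridization can be listed next to the plot. This is only a minimal sketch: the scores below are made-up numbers for illustration, not results from any data set, and the variable name `drops` is ours. With real networks, `scores` would be built from the `loglik` attributes as above.

```julia
# Hypothetical scores for h = 0, 1, 2, 3 (made-up values; lower is better).
# In practice: scores = [net0.loglik, net1.loglik, ...]
scores = [120.0, 45.0, 40.0, 36.0]

# drops[i] is the decrease in score when going from h = i-1 to h = i
drops = -diff(scores)

for h in 1:length(drops)
    println("h = ", h - 1, " -> ", h, ": score decreases by ", drops[h])
end
```

With these illustrative numbers, the first hybridization brings a much larger decrease than the later ones, which is the kind of pattern that would heuristically point to h = 1. There is no formal threshold here; the drops are only meant to be inspected together with the plot.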

Below are examples from 2 different data sets.

Next: perform and summarize a bootstrap analysis
