Gene Trees: MrBayes
We want to analyze each of the 30 loci with MrBayes. First, make sure you have MrBayes installed:
$ which mb
/Users/ane/bin/mb
Next, choose the settings for MrBayes (substitution model, prior for branch lengths, etc.) and save them in a MrBayes block. The block below uses the HKY model, 100,000 generations, 2 chains (1 cold and 1 heated) and 2 independent runs. These settings were chosen so that the analysis runs quickly during this tutorial; for a real data set, different settings should be chosen (such as 1 million generations, 3 chains and 3 runs).
$ cat ../scripts/mbblock.txt
begin mrbayes;
set nowarnings=yes;
set autoclose=yes;
lset nst=2;
mcmcp ngen=100000 burninfrac=.25 samplefreq=50 printfreq=10000 [increase these for real]
diagnfreq=10000 nruns=2 nchains=2 temp=0.40 swapfreq=10; [increase for real analysis]
mcmc;
sumt;
end;
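As a rough check on what these settings produce, the arithmetic below works out how many trees each run samples and how many remain after burn-in. The variable names are illustrative only. The extra tree for the starting state (generation 0) and the rounding down of the burn-in count are assumptions about MrBayes behavior, so treat the numbers as approximate.

```shell
# Values copied from the mcmcp line of the MrBayes block above.
ngen=100000
samplefreq=50
burnin_percent=25   # burninfrac=.25 written as a percentage

# One tree is saved every samplefreq generations.
# Assumption: the starting tree (generation 0) is saved too, hence the +1.
sampled=$(( ngen / samplefreq + 1 ))

# Assumption: the number of burn-in trees is rounded down.
discarded=$(( sampled * burnin_percent / 100 ))
kept=$(( sampled - discarded ))

echo "trees sampled per run: $sampled"
echo "discarded as burn-in:  $discarded"
echo "kept per run:          $kept"
```

With the tutorial settings this comes to about 2,000 sampled trees per run, of which roughly 1,500 are kept. Raising ngen to 1 million with the same samplefreq would multiply both numbers by ten.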
We are ready to analyze all loci with MrBayes:
$ ../scripts/mb.pl input/1_seqgen.tar.gz -m ../scripts/mbblock.txt -o mb-output
Script was called as follows:
perl mb.pl input/1_seqgen.tar.gz -m ../scripts/mbblock.txt -o mb-output
Appending MrBayes block to each gene... done.
Job server successfully created.
Analyses complete: 30/30.
All connections closed.
Total execution time: 46 seconds.
If a cluster is available with different machines, analyses can be parallelized across machines (not just across nodes of the same machine) by adding an option --machine-file hosts.txt, where hosts.txt is a simple text file listing the machines available to use, in the format user_name@machine_address.
This file might look like this:
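A minimal sketch with placeholder entries: user_name and the machineN.example.edu addresses are hypothetical, and the one-entry-per-line layout is an assumption based on the format described above. Substitute your own accounts and machines.

```shell
# Write a hosts file in the user_name@machine_address format.
# Placeholder entries only: replace them with real accounts and machines.
# Assumption: one machine per line.
cat > hosts.txt <<'EOF'
user_name@machine1.example.edu
user_name@machine2.example.edu
user_name@machine3.example.edu
EOF

# The file is then passed to the script with the option described above:
#   ../scripts/mb.pl input/1_seqgen.tar.gz -m ../scripts/mbblock.txt \
#       -o mb-output --machine-file hosts.txt
```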
The script created a new directory named mb-output (like we asked above), which contains a compressed tarball of all MrBayes output: mb-output/1_seqgen.mb.tar
$ ls
input mb-output
$ ls mb-output/
1_seqgen.mb.tar 1_seqgen.tar.gz
$ tar -tf mb-output/1_seqgen.mb.tar
1_seqgen12.nex.tar.gz
1_seqgen11.nex.tar.gz
1_seqgen10.nex.tar.gz
...
1_seqgen7.nex.tar.gz
1_seqgen8.nex.tar.gz
1_seqgen9.nex.tar.gz
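To look inside one locus, the outer tarball and then the per-locus archive both need to be unpacked. The sketch below is one way to do it. The helper name extract_locus is hypothetical, and the destination directory mb-output/1_seqgen.mb is an assumption chosen to match the path shown in the next listing; the layout inside each per-locus archive may differ from what this produces.

```shell
# Hypothetical helper: unpack one locus from the MrBayes output tarball.
#   $1 = outer tarball,             e.g. mb-output/1_seqgen.mb.tar
#   $2 = locus archive inside it,   e.g. 1_seqgen1.nex.tar.gz
#   $3 = directory to unpack into,  e.g. mb-output/1_seqgen.mb (assumed)
extract_locus() {
    outer=$1
    locus=$2
    dest=$3
    mkdir -p "$dest" || return 1
    # Pull only the requested per-locus archive out of the outer tarball.
    tar -xf "$outer" -C "$dest" "$locus" || return 1
    # Unpack that gzip-compressed archive into the same directory.
    tar -xzf "$dest/$locus" -C "$dest" || return 1
}

# Run only if the tutorial output is present in the current directory.
if [ -f mb-output/1_seqgen.mb.tar ]; then
    extract_locus mb-output/1_seqgen.mb.tar 1_seqgen1.nex.tar.gz mb-output/1_seqgen.mb
    ls mb-output/1_seqgen.mb
fi
```

Extracting a single member by name keeps the other 29 archives packed until they are needed.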
Decompressing and looking into the result file for the first locus, we find a bunch of output, including the log from MrBayes (useful to track down bugs, if any) and the sample of trees from each run (*.t), which will serve as input for BUCKy.
$ ls mb-output/1_seqgen.mb/1_seqgen1.nex
1_seqgen1.nex.ckp 1_seqgen1.nex.mcmc 1_seqgen1.nex.run2.t 1_seqgen1.nex.vstat
1_seqgen1.nex.ckp~ 1_seqgen1.nex.parts 1_seqgen1.nex.run2.p
1_seqgen1.nex.con.tre 1_seqgen1.nex.run1.p 1_seqgen1.nex.trprobs
1_seqgen1.nex.log 1_seqgen1.nex.run1.t 1_seqgen1.nex.tstat
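A quick sanity check before moving on is to count the sampled trees in each run's .t file. This sketch assumes that every sampled tree sits on its own line beginning with "tree gen." (the label can differ between MrBayes versions), and count_trees is a hypothetical helper name.

```shell
# Hypothetical helper: count the sampled trees in one MrBayes .t file.
# Assumption: each sampled tree is on a line starting with "tree gen.".
count_trees() {
    # grep -c prints the number of matching lines.
    grep -cE '^[[:space:]]*tree gen\.' "$1"
}

# Report the count for each run of the first locus, if the files exist.
for f in mb-output/1_seqgen.mb/1_seqgen1.nex/*.run?.t; do
    [ -f "$f" ] || continue
    echo "$f: $(count_trees "$f") trees"
done
```

If the two runs report different counts, or far fewer trees than the ngen and samplefreq settings imply, the MrBayes log for that locus is the place to look.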
Next: combining gene tree samples to get concordance factors with BUCKy.