Computation of Transfer Bootstrap Extra Information #70

Open
wants to merge 31 commits into
base: dev
d4feb52
updated submodule
lutteropp May 13, 2019
7618f86
added extra tbe output
lutteropp Jun 12, 2019
92e14cb
fixed compile errors
lutteropp Jun 12, 2019
4b5d3be
updated submodule
lutteropp Jun 12, 2019
7991296
implemented basic tbe extra table and tbe extra array output
lutteropp Jun 12, 2019
641c26e
adapted to this weird cutoff value from booster
lutteropp Jun 12, 2019
64a5810
uncommented some code
lutteropp Jun 12, 2019
674a310
fixed dynamic casts
lutteropp Jun 12, 2019
160f2d2
added dirty TBE extra tree output
lutteropp Jun 12, 2019
c48a749
fixed extra table output and tbe computation call
lutteropp Jun 12, 2019
7ff2000
fixed tbe extra tree output
lutteropp Jun 12, 2019
b87c136
changed extra taxa table
lutteropp Jun 12, 2019
56b988b
updated submodule
lutteropp Jun 17, 2019
a974b0a
updated submodule and fixed tbe extra outputs
lutteropp Jun 17, 2019
f9ba4f9
made tbe extra table output more like in booster
lutteropp Jun 17, 2019
537e7bb
tbe extra table output now also prints support values
lutteropp Jun 18, 2019
7ae199f
updated submodule
lutteropp Jun 18, 2019
cfce896
updated submodule
lutteropp Jun 18, 2019
318c1d2
updated submodule
lutteropp Jun 21, 2019
2f5ad4b
updated submodule, added OpenMP support
lutteropp Jul 1, 2019
15c6fad
updated submodule
lutteropp Jul 1, 2019
0243b31
added setting number of OpenMP threads with --threads option
lutteropp Jul 4, 2019
b5d64dd
added missing initialization of extra_info
lutteropp Jul 8, 2019
fb57023
updated submodule
lutteropp Jul 8, 2019
6b4e181
slightly faster table output
lutteropp Jul 8, 2019
35d2af9
updated submodule
lutteropp Jul 8, 2019
b3ed825
removed unnecessary rebuilds of tip_label_list in output of tbe extr…
lutteropp Jul 9, 2019
757e9be
faster output of zeros in tbe extra table
lutteropp Jul 9, 2019
b5411e1
updated README
lutteropp Jul 26, 2019
980842f
added number of threads
lutteropp Jul 26, 2019
d104506
Update README.md
lutteropp Sep 5, 2019
2 changes: 1 addition & 1 deletion .gitmodules
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
[submodule "libs/pll-modules"]
path = libs/pll-modules
- url = https://github.com/ddarriba/pll-modules.git
+ url = https://github.com/lutteropp/pll-modules.git
[submodule "libs/terraphast"]
path = libs/terraphast
url = https://github.com/amkozlov/terraphast-one
8 changes: 4 additions & 4 deletions CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -39,10 +39,10 @@ endif()
#set(CMAKE_CXX_EXTENSIONS OFF)

# set these flags globally for all subprojects (libpll etc.)
- set (CMAKE_CXX_FLAGS_DEBUG "-O3 -g" CACHE INTERNAL "")
- set (CMAKE_CXX_FLAGS_RELEASE "-O3" CACHE INTERNAL "")
- set (CMAKE_C_FLAGS_DEBUG "-O3 -g" CACHE INTERNAL "")
- set (CMAKE_C_FLAGS_RELEASE "-O3" CACHE INTERNAL "")
+ set (CMAKE_CXX_FLAGS_DEBUG "-O3 -g -fopenmp" CACHE INTERNAL "")
+ set (CMAKE_CXX_FLAGS_RELEASE "-O3 -fopenmp" CACHE INTERNAL "")
+ set (CMAKE_C_FLAGS_DEBUG "-O3 -g -fopenmp" CACHE INTERNAL "")
+ set (CMAKE_C_FLAGS_RELEASE "-O3 -fopenmp" CACHE INTERNAL "")

project (raxml-ng C CXX)

Expand Down
90 changes: 9 additions & 81 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,17 +1,18 @@
# RAxML Next Generation

[![Build Status](https://www.travis-ci.org/amkozlov/raxml-ng.svg?branch=master)](https://www.travis-ci.org/amkozlov/raxml-ng) [![DOI](https://zenodo.org/badge/75947982.svg)](https://zenodo.org/badge/latestdoi/75947982) [![License](https://img.shields.io/badge/license-AGPL-blue.svg)](http://www.gnu.org/licenses/agpl-3.0.en.html)
# RAxML Next Generation with parallel computation of Transfer Bootstrap Expectation (TBE) scores and computation of TBE extra information

## Introduction

RAxML-NG is a phylogenetic tree inference tool which uses maximum-likelihood (ML) optimality criterion. Its search heuristic is based on iteratively performing a series of Subtree Pruning and Regrafting (SPR) moves, which allows to quickly navigate to the best-known ML tree. RAxML-NG is a successor of RAxML (Stamatakis 2014) and leverages the highly optimized likelihood computation implemented in [*libpll*](https://github.com/xflouris/libpll) (Flouri et al. 2014).
Greetings! If you are not here for parallel computation of TBE support or computation of the TBE extra information, we strongly advise you to use the most up-to-date version of RAxML-NG instead. It can be found here: https://github.com/amkozlov/raxml-ng

RAxML-NG offers improvements in speed, flexibility and user-friendliness over the previous RAxML versions. It also implements some of the features previously available in ExaML (Kozlov et al. 2015), including checkpointing and efficient load balancing for partitioned alignments (Kobert et al. 2014).
## Computing TBE support with extra information

RAxML-NG is currently under active development, and the mid-term goal is to have most functionality of RAxML 8.x covered.
You can see some of the planned features [here](https://github.com/amkozlov/raxml-ng/issues).
The following example call computes TBE support scores together with the extra table and the extra array, using a cutoff value of 0.3 and 10 threads:
```
./raxml-ng --support --tree REF.nw --bs-trees BS.nw --bs-metric TBE --extra tbe_extra_table,tbe_extra_array,tbe-cutoff{0.3} --threads 10
```

Documentation: [github wiki](https://github.com/amkozlov/raxml-ng/wiki)
If you don't understand what this is, please read our paper at https://www.biorxiv.org/content/10.1101/734848v2
<!-- (TODO: Insert DOI) and its supplementary text (TODO: Insert DOI). -->

## Installation instructions

Expand Down Expand Up @@ -65,77 +66,4 @@ cmake -DSTATIC_BUILD=ON -DENABLE_RAXML_SIMD=OFF -DENABLE_PLLMOD_SIMD=OFF ..
make
```

## Documentation and Support

Documentation can be found in the [github wiki](https://github.com/amkozlov/raxml-ng/wiki).
For a quick start, please check out the [hands-on tutorial](https://github.com/amkozlov/raxml-ng/wiki/Tutorial).

Also please check the online help with `raxml-ng -h`.

If still in doubt, please feel free to post to the [RAxML google group](https://groups.google.com/forum/#!forum/raxml).

## Usage examples

1. Perform single tree inference on DNA alignment
(random starting tree, general time-reversible model, ML estimate of substitution rates and
nucleotide frequencies, discrete GAMMA model of rate heterogeneity with 4 categories):

`./raxml-ng --msa testDNA.fa --model GTR+G`

2. Perform an all-in-one analysis (ML tree search + non-parametric bootstrap)
(10 randomized parsimony starting trees, fixed empirical substitution matrix (LG),
empirical aminoacid frequencies from alignment, 8 discrete GAMMA categories,
200 bootstrap replicates):

`./raxml-ng --all --msa testAA.fa --model LG+G8+F --tree pars{10} --bs-trees 200`


3. Optimize branch lengths and free model parameters on a fixed topology
(using multiple partitions with proportional branch lengths)

`./raxml-ng --evaluate --msa testAA.fa --model partitions.txt --tree test.tree --brlen scaled`

4. Map support values from existing set of replicate trees:

`./raxml-ng --support --tree bestML.tree --bs-trees bootstraps.tree`

## License and citation

The code is currently licensed under the GNU Affero General Public License version 3.

When using RAxML-NG, please cite [this preprint](https://www.biorxiv.org/content/early/2018/10/18/447110):

Alexey M. Kozlov, Diego Darriba, Tom&aacute;&scaron; Flouri, Benoit Morel, and Alexandros Stamatakis (2018)
**RAxML-NG: A fast, scalable, and user-friendly tool for maximum likelihood phylogenetic inference.**
*bioRxiv.*
doi:[10.1101/447110](https://doi.org/10.1101/447110)

## The team

* Alexey Kozlov
* Alexandros Stamatakis
* Diego Darriba
* Tom&aacute;&scaron; Flouri
* Benoit Morel

## References

* Stamatakis A. (2014)
**RAxML version 8: a tool for phylogenetic analysis and post-analysis of large phylogenies.**
*Bioinformatics*, 30(9): 1312-1313.
doi:[10.1093/bioinformatics/btu033](http://dx.doi.org/10.1093/bioinformatics/btu033)

* Flouri T., Izquierdo-Carrasco F., Darriba D., Aberer AJ, Nguyen LT, Minh BQ, von Haeseler A., Stamatakis A. (2014)
**The Phylogenetic Likelihood Library.**
*Systematic Biology*, 64(2): 356-362.
doi:[10.1093/sysbio/syu084](http://dx.doi.org/10.1093/sysbio/syu084)

* Kozlov A.M., Aberer A.J., Stamatakis A. (2015)
**ExaML version 3: a tool for phylogenomic analyses on supercomputers.**
*Bioinformatics (2015) 31 (15): 2577-2579.*
doi:[10.1093/bioinformatics/btv184](https://doi.org/10.1093/bioinformatics/btv184)

* Kobert K., Flouri T., Aberer A., Stamatakis A. (2014)
**The divisible load balance problem and its application to phylogenetic inference.**
*Brown D., Morgenstern B., editors. (eds.) Algorithms in Bioinformatics, Vol. 8701 of Lecture Notes in Computer Science. Springer, Berlin, pp. 204–216*

2 changes: 1 addition & 1 deletion libs/pll-modules
16 changes: 16 additions & 0 deletions src/CommandLineParser.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -6,6 +6,10 @@
#include <thread>
#endif

#if defined(_OPENMP)
#include <omp.h>
#endif

using namespace std;

static struct option long_options[] =
Expand Down Expand Up @@ -737,6 +741,14 @@ void CommandLineParser::parse_options(int argc, char** argv, Options &opts)
opts.tbe_naive = true;
else if (eopt == "tbe-nature")
opts.tbe_naive = false;
else if (sscanf(eopt.c_str(), "tbe-cutoff{%lf}", &opts.tbe_extra_cutoff) == 1)
void();
else if (eopt == "tbe_extra_table")
opts.tbe_extra_table = true;
else if (eopt == "tbe_extra_array")
opts.tbe_extra_array = true;
else if (eopt == "tbe_extra_tree")
opts.tbe_extra_tree = true;
else
throw InvalidOptionValueException("Unknown extra option: " + string(optarg));
}
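The branch chain above can be tried in isolation. The following is a minimal sketch, not the PR's code: `TbeExtraOptions` and `parse_tbe_extra` are hypothetical names, the comma splitting is an assumption about how the `--extra` value is tokenized, and only the option strings and the default cutoff of 0.3 are taken from the diff. One detail worth noting is that `sscanf` returns the number of successful conversions, so the closing `}` of `tbe-cutoff{...}` is not actually verified.

```cpp
#include <cstdio>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the TBE-related fields added to Options.
struct TbeExtraOptions
{
  double cutoff = 0.3;  // same default as tbe_extra_cutoff in Options.hpp
  bool table = false;
  bool array = false;
  bool tree = false;
};

// Parse a comma-separated --extra value such as
// "tbe_extra_table,tbe_extra_array,tbe-cutoff{0.3}".
// Assumption: tokens are separated by commas and carry no whitespace.
TbeExtraOptions parse_tbe_extra(const std::string& arg)
{
  TbeExtraOptions opts;
  std::istringstream stream(arg);
  std::string eopt;

  while (std::getline(stream, eopt, ','))
  {
    double cutoff = 0.;
    // sscanf returns 1 as soon as %lf is converted; a missing '}' goes unnoticed.
    if (std::sscanf(eopt.c_str(), "tbe-cutoff{%lf}", &cutoff) == 1)
      opts.cutoff = cutoff;
    else if (eopt == "tbe_extra_table")
      opts.table = true;
    else if (eopt == "tbe_extra_array")
      opts.array = true;
    else if (eopt == "tbe_extra_tree")
      opts.tree = true;
    else
      throw std::invalid_argument("Unknown extra option: " + eopt);
  }

  return opts;
}
```

Calling `parse_tbe_extra("tbe_extra_table,tbe-cutoff{0.25}")` would enable the table output and set the cutoff to 0.25, while a misspelled token throws instead of being silently ignored.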
Expand Down Expand Up @@ -839,6 +851,10 @@ void CommandLineParser::parse_options(int argc, char** argv, Options &opts)
}
}

#if defined(_OPENMP)
omp_set_num_threads(opts.num_threads);
#endif

if (c != -1)
exit(EXIT_FAILURE);

Expand Down
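The `_OPENMP` guards in this file keep the parser compilable when OpenMP is disabled. A minimal sketch of the same pattern follows; `configure_threads` and `parallel_sum` are hypothetical helpers for illustration, and the `num_threads > 0` check is an assumption that the PR itself does not make.

```cpp
#include <vector>

#if defined(_OPENMP)
#include <omp.h>
#endif

// Set the OpenMP team size when OpenMP is enabled; otherwise do nothing.
// Returns the maximum number of threads a parallel region may use
// (always 1 when compiled without OpenMP).
int configure_threads(int num_threads)
{
#if defined(_OPENMP)
  if (num_threads > 0)  // assumption: non-positive values mean "leave default"
    omp_set_num_threads(num_threads);
  return omp_get_max_threads();
#else
  (void) num_threads;   // unused without OpenMP
  return 1;
#endif
}

// Sum a vector with an OpenMP reduction. Without -fopenmp the pragma is
// ignored and the loop simply runs serially.
double parallel_sum(const std::vector<double>& values)
{
  double total = 0.;
  const long n = static_cast<long>(values.size());

#pragma omp parallel for reduction(+:total)
  for (long i = 0; i < n; ++i)
    total += values[i];

  return total;
}
```

Because the guard covers both the include and the call, the same source builds with and without `-fopenmp`, which is the property the CMake flag change relies on.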
3 changes: 3 additions & 0 deletions src/Options.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -30,6 +30,9 @@ void Options::set_default_outfiles()
set_default_outfile(outfile_names.support_tree, "support");
set_default_outfile(outfile_names.fbp_support_tree, "supportFBP");
set_default_outfile(outfile_names.tbe_support_tree, "supportTBE");
set_default_outfile(outfile_names.tbe_extra_table, "tbeExtraTable");
set_default_outfile(outfile_names.tbe_extra_array, "tbeExtraArray");
set_default_outfile(outfile_names.tbe_extra_tree, "tbeExtraTree");
set_default_outfile(outfile_names.terrace, "terrace");
set_default_outfile(outfile_names.binary_msa, "rba");
set_default_outfile(outfile_names.bootstrap_msa, "bootstrapMSA");
Expand Down
11 changes: 11 additions & 0 deletions src/Options.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -17,6 +17,9 @@ struct OutputFileNames
std::string bootstrap_trees;
std::string support_tree;
std::string tbe_support_tree;
std::string tbe_extra_table;
std::string tbe_extra_array;
std::string tbe_extra_tree;
std::string fbp_support_tree;
std::string terrace;
std::string binary_msa;
Expand All @@ -40,6 +43,7 @@ class Options
spr_cutoff(1.0),
brlen_linkage(PLLMOD_COMMON_BRLEN_SCALED), brlen_opt_method(PLLMOD_OPT_BLO_NEWTON_FAST),
brlen_min(RAXML_BRLEN_MIN), brlen_max(RAXML_BRLEN_MAX),
tbe_extra_cutoff(0.3), tbe_extra_table(false), tbe_extra_array(false), tbe_extra_tree(false),
num_searches(1), terrace_maxsize(100),
num_bootstraps(1000), bootstop_criterion(BootstopCriterion::none), bootstop_cutoff(0.03),
bootstop_interval(RAXML_BOOTSTOP_INTERVAL), bootstop_permutations(RAXML_BOOTSTOP_PERMUTES),
Expand Down Expand Up @@ -82,6 +86,10 @@ class Options
int brlen_opt_method;
double brlen_min;
double brlen_max;
double tbe_extra_cutoff;
bool tbe_extra_table;
bool tbe_extra_array;
bool tbe_extra_tree;

unsigned int num_searches;
unsigned long long terrace_maxsize;
Expand Down Expand Up @@ -135,6 +143,9 @@ class Options
std::string bootstrap_partition_file() const;
const std::string rfdist_file() const { return outfile_names.rfdist; }
const std::string cons_tree_file() const { return outfile_names.cons_tree + consense_type_name(); }
const std::string& tbe_extra_table_file() const { return outfile_names.tbe_extra_table; }
const std::string& tbe_extra_array_file() const { return outfile_names.tbe_extra_array; }
const std::string& tbe_extra_tree_file() const { return outfile_names.tbe_extra_tree; }

const std::string asr_tree_file() const { return outfile_names.asr_tree; }
const std::string asr_probs_file() const { return outfile_names.asr_probs; }
Expand Down
11 changes: 11 additions & 0 deletions src/ParallelContext.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -51,6 +51,17 @@ class ParallelContext
static void mpi_gather_custom(std::function<int(void*,int)> prepare_send_cb,
std::function<void(void*,int)> process_recv_cb);

// WARNING: This function is like a private parking lot. Only the transfer bootstrap computation is allowed to use it. Wrongdoers will be punished. PLEASE REFACTOR ME.
static void pll_lock(bool b)
{
static std::mutex mtx;
if (b) {
mtx.lock();
} else {
mtx.unlock();
}
}

static bool master() { return proc_id() == 0; }
static bool master_rank() { return _rank_id == 0; }
static bool master_thread() { return _thread_id == 0; }
Expand Down
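The comment on `pll_lock` already asks for a refactor. One possible direction, sketched here under assumptions, is a small RAII guard so that the unlock cannot be forgotten on an early return or an exception. `PllLockGuard` and `guarded_increment` are hypothetical names and not part of the PR.

```cpp
#include <mutex>

// Hypothetical RAII replacement for ParallelContext::pll_lock(bool).
// Constructing the guard locks a process-wide mutex; destroying it unlocks.
class PllLockGuard
{
public:
  PllLockGuard() : _lock(shared_mutex()) {}

private:
  // Function-local static: initialized once, thread-safe since C++11.
  static std::mutex& shared_mutex()
  {
    static std::mutex mtx;
    return mtx;
  }

  std::lock_guard<std::mutex> _lock;
};

// Example critical section: the mutex is released when 'guard' goes out of
// scope, whichever way the function exits.
int guarded_increment(int& counter)
{
  PllLockGuard guard;
  return ++counter;
}
```

With this shape a caller cannot pass the wrong boolean or skip the matching `pll_lock(false)`, at the cost of requiring the critical section to fit into one scope.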
4 changes: 2 additions & 2 deletions src/bootstrap/BootstrapTree.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@
BootstrapTree::BootstrapTree (const Tree& tree) : SupportTree(tree)
{
assert(num_splits() > 0);
- _node_split_map.resize(num_splits());
+ _split_node_map.resize(num_splits());

/* extract reference tree splits and add them into hashtable */
add_tree(pll_utree_root());
Expand All @@ -18,7 +18,7 @@ BootstrapTree::~BootstrapTree ()
void BootstrapTree::add_tree(const pll_unode_t& root)
{
bool ref_tree = (_num_bs_trees == 0);
- pll_unode_t ** node_split_map = ref_tree ? _node_split_map.data() : nullptr;
+ pll_unode_t ** node_split_map = ref_tree ? _split_node_map.data() : nullptr;
int update_only = ref_tree ? 0 : 1;
doubleVector support(num_splits(), ref_tree ? 0. : 1.);

Expand Down
4 changes: 2 additions & 2 deletions src/bootstrap/ConsensusTree.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -70,13 +70,13 @@ bool ConsensusTree::compute_support()
pll_utree(_num_tips, *cons_tree->tree);

/* map pll_unodes to splits */
- _node_split_map.resize(_pll_utree->inner_count);
+ _split_node_map.resize(_pll_utree->inner_count);
_support.resize(_pll_utree->inner_count);
for (unsigned int i = 0; i < _pll_utree->inner_count; ++i)
{
auto node = _pll_utree->nodes[_pll_utree->tip_count + i];
assert(node->data);
- _node_split_map[i] = node;
+ _split_node_map[i] = node;
_support[i] = ((pll_consensus_data_t *) node->data)->support;
}

Expand Down
6 changes: 5 additions & 1 deletion src/bootstrap/SupportTree.cpp
Original file line number Diff line number Diff line change
Expand Up @@ -121,9 +121,13 @@ void SupportTree::draw_support(bool support_in_pct)
// printf("node_id %d, split_id %d\n", _node_split_map[i]->node_index, i);
// printf("\n\n");

- pll_unode_t ** node_map = _node_split_map.empty() ? nullptr : _node_split_map.data();
+ pll_unode_t ** node_map = _split_node_map.empty() ? nullptr : _split_node_map.data();
pllmod_utree_draw_support(_pll_utree.get(), _support.data(), node_map,
support_in_pct ? support_fmt_pct : support_fmt_prop);

LOG_DEBUG_TS << "Done!" << endl << endl;
}

const doubleVector& SupportTree::get_support() const {
return _support;
}
4 changes: 3 additions & 1 deletion src/bootstrap/SupportTree.hpp
Original file line number Diff line number Diff line change
Expand Up @@ -18,6 +18,8 @@ class SupportTree : public Tree

void draw_support(bool support_in_pct = true);

const doubleVector& get_support() const;

protected:
PllSplitSharedPtr extract_splits_from_tree(const pll_unode_t& root,
pll_unode_t ** node_split_map);
Expand All @@ -34,7 +36,7 @@ class SupportTree : public Tree
protected:
size_t _num_bs_trees;
bitv_hashtable_t* _pll_splits_hash;
- std::vector<pll_unode_t*> _node_split_map;
+ std::vector<pll_unode_t*> _split_node_map;
doubleVector _support;
};

Expand Down
49 changes: 42 additions & 7 deletions src/bootstrap/TransferBootstrapTree.cpp
Original file line number Diff line number Diff line change
@@ -1,25 +1,60 @@
#include "TransferBootstrapTree.hpp"

- TransferBootstrapTree::TransferBootstrapTree(const Tree& tree, bool naive) :
- SupportTree (tree), _split_info(nullptr), _naive_method(naive)
+ /*typedef unsigned int TBEFlags;
+
+ const unsigned int TBE_DO_TABLE = 1;
+ const unsigned int TBE_DO_ARRAY = 2;
+ const unsigned int TBE_DO_OTHER = 4;
+
+ TransferBootstrapTree bstree(tree, true, 1., TBE_DO_TABLE | TBE_DO_OTHER);
+
+ TransferBootstrapTree bstree(tree, true, 1., false, true, false);
+ */
+
+ TransferBootstrapTree::TransferBootstrapTree(const Tree& tree, bool naive, double tbe_cutoff, bool doTable, bool doArray, bool doTree) :
+ SupportTree (tree), _split_info(nullptr), _extra_info(nullptr), _naive_method(naive)
{
assert(num_splits() > 0);
- _node_split_map.resize(num_splits());
+ _split_node_map.resize(num_splits());

/* extract reference tree splits and add them into hashtable */
add_tree(pll_utree_root());

if (!_naive_method)
{
_split_info = pllmod_utree_tbe_nature_init((pll_unode_t*) &pll_utree_root(), _num_tips,
- (const pll_unode_t**) _node_split_map.data());
+ (const pll_unode_t**) _split_node_map.data());
if (doTable || doArray || doTree) {
_extra_info = pllmod_tbe_extra_info_create(num_splits(), _num_tips, tbe_cutoff, doTable, doArray, doTree);
}
}
}
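The commented-out `TBEFlags` block at the top of this file hints at replacing the three boolean constructor arguments with a bitmask. A minimal sketch of that idea follows. The helper functions are hypothetical, and `TBE_DO_TREE` is an assumed name for the flag the comment calls `TBE_DO_OTHER`; nothing here is part of the merged interface.

```cpp
typedef unsigned int TBEFlags;

// Flag values follow the commented-out draft in TransferBootstrapTree.cpp.
constexpr TBEFlags TBE_DO_TABLE = 1;
constexpr TBEFlags TBE_DO_ARRAY = 2;
constexpr TBEFlags TBE_DO_TREE  = 4;  // assumption: the draft names this TBE_DO_OTHER

// True if 'flag' is set in 'flags'.
constexpr bool tbe_flag_set(TBEFlags flags, TBEFlags flag)
{
  return (flags & flag) != 0;
}

// True if any extra output was requested; mirrors the
// (doTable || doArray || doTree) check in the constructor.
constexpr bool tbe_any_extra(TBEFlags flags)
{
  return (flags & (TBE_DO_TABLE | TBE_DO_ARRAY | TBE_DO_TREE)) != 0;
}

// Bridge from the three booleans used by the current constructor.
constexpr TBEFlags make_tbe_flags(bool doTable, bool doArray, bool doTree)
{
  return (doTable ? TBE_DO_TABLE : 0u) |
         (doArray ? TBE_DO_ARRAY : 0u) |
         (doTree  ? TBE_DO_TREE  : 0u);
}
```

A call such as `make_tbe_flags(true, false, true)` reads less clearly than `TBE_DO_TABLE | TBE_DO_TREE`, which is the main argument for the bitmask: the call site names what it enables instead of relying on argument order.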

const std::vector<pll_unode_t*> TransferBootstrapTree::get_split_node_map() const {
return _split_node_map;
}

pllmod_tbe_extra_info_t* TransferBootstrapTree::get_extra_info() const {
return _extra_info;
}

pllmod_tbe_split_info_t* TransferBootstrapTree::get_split_info() const {
return _split_info;
}

/*
void TransferBootstrapTree::collect_support() {
SupportTree::collect_support();
// do the postprocessing of extra info
}
*/

TransferBootstrapTree::~TransferBootstrapTree()
{
if (_split_info)
free(_split_info);
if (_extra_info)
pllmod_tbe_extra_info_destroy(_extra_info, num_splits());
}
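The destructor above releases `_extra_info` by hand and has to pass `num_splits()` to the destroy function. A `std::unique_ptr` with a stateful deleter is one way to tie that cleanup to the pointer itself. The sketch below uses hypothetical stand-ins (`extra_info_t`, `extra_info_create`, `extra_info_destroy`) rather than the real pll-modules functions, whose exact behavior is not shown in this diff.

```cpp
#include <cstdlib>
#include <memory>

// Hypothetical stand-in for pllmod_tbe_extra_info_t.
struct extra_info_t
{
  double* per_split;  // one value per reference split
};

// Counts live objects so the cleanup can be observed in a test.
int live_extra_infos = 0;

// Hypothetical stand-in for the create function; returns nullptr on failure.
extra_info_t* extra_info_create(unsigned int num_splits)
{
  extra_info_t* info = static_cast<extra_info_t*>(std::malloc(sizeof(extra_info_t)));
  if (!info)
    return nullptr;
  info->per_split = static_cast<double*>(std::calloc(num_splits, sizeof(double)));
  if (!info->per_split)
  {
    std::free(info);
    return nullptr;
  }
  ++live_extra_infos;
  return info;
}

// Hypothetical stand-in for the destroy function, which also takes the split count.
void extra_info_destroy(extra_info_t* info, unsigned int /* num_splits */)
{
  std::free(info->per_split);
  std::free(info);
  --live_extra_infos;
}

// The deleter remembers the split count needed by the destroy call.
struct ExtraInfoDeleter
{
  unsigned int num_splits;
  void operator()(extra_info_t* info) const { extra_info_destroy(info, num_splits); }
};

typedef std::unique_ptr<extra_info_t, ExtraInfoDeleter> ExtraInfoPtr;

// unique_ptr never invokes the deleter on a null pointer, so a failed
// allocation needs no special handling here.
ExtraInfoPtr make_extra_info(unsigned int num_splits)
{
  return ExtraInfoPtr(extra_info_create(num_splits), ExtraInfoDeleter{num_splits});
}
```

Holding the member as an `ExtraInfoPtr` would make the explicit `if (_extra_info)` branch in the destructor unnecessary and would also make the class non-copyable by default, which avoids accidental double frees.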

void TransferBootstrapTree::add_tree(const pll_unode_t& root)
Expand All @@ -29,7 +64,7 @@ void TransferBootstrapTree::add_tree(const pll_unode_t& root)

if (ref_tree)
{
- _ref_splits = extract_splits_from_tree(root, _node_split_map.data());
+ _ref_splits = extract_splits_from_tree(root, _split_node_map.data());

add_splits_to_hashtable(_ref_splits, support, 0);
}
Expand All @@ -44,8 +79,8 @@ void TransferBootstrapTree::add_tree(const pll_unode_t& root)
pllmod_utree_tbe_naive(_ref_splits.get(), splits.get(), _num_tips, support.data());
else
{
- pllmod_utree_tbe_nature(_ref_splits.get(), splits.get(), (pll_unode_t*) &root,
- _num_tips, support.data(), _split_info);
+ pllmod_utree_tbe_nature_extra(_ref_splits.get(), splits.get(), (pll_unode_t*) &root,
+ _num_tips, support.data(), _split_info, _extra_info);
}

add_splits_to_hashtable(_ref_splits, support, 1);
Expand Down