19 Dubious Ways to Compute the Marginal Likelihood of a Phylogenetic Tree Topology.

Fourment, M; Magee, AF; Whidden, C; Bilge, A; Matsen, FA; Minin, VN

Publication Type: Journal Article
Citation: Syst Biol, 2020, 69 (2), pp. 209 - 220
Issue Date: 2020-03-01

This item is open access.

Full metadata record

Field	Value	Language
dc.contributor.author	Fourment, M https://orcid.org/0000-0001-8153-9822	en_US
dc.contributor.author	Magee, AF	en_US
dc.contributor.author	Whidden, C	en_US
dc.contributor.author	Bilge, A	en_US
dc.contributor.author	Matsen, FA	en_US
dc.contributor.author	Minin, VN	en_US
dc.date.available	2019-07-02	en_US
dc.date.issued	2020-03-01	en_US
dc.identifier.citation	Syst Biol, 2020, 69 (2), pp. 209 - 220	en_US
dc.identifier.uri	http://hdl.handle.net/10453/136720
dc.description.abstract	The marginal likelihood of a model is a key quantity for assessing the evidence provided by the data in support of a model. The marginal likelihood is the normalizing constant for the posterior density, obtained by integrating the product of the likelihood and the prior with respect to model parameters. Thus, the computational burden of computing the marginal likelihood scales with the dimension of the parameter space. In phylogenetics, where we work with tree topologies that are high-dimensional models, standard approaches to computing marginal likelihoods are very slow. Here, we study methods to quickly compute the marginal likelihood of a single fixed tree topology. We benchmark the speed and accuracy of 19 different methods to compute the marginal likelihood of phylogenetic topologies on a suite of real data sets under the JC69 model. These methods include several new ones that we develop explicitly to solve this problem, as well as existing algorithms that we apply to phylogenetic models for the first time. Altogether, our results show that the accuracy of these methods varies widely, and that accuracy does not necessarily correlate with computational burden. Our newly developed methods are orders of magnitude faster than standard approaches, and in some cases, their accuracy rivals the best established estimators.	en_US
dc.language	eng	en_US
dc.relation.ispartof	Syst Biol	en_US
dc.relation.isbasedon	10.1093/sysbio/syz046	en_US
dc.subject.classification	Evolutionary Biology	en_US
dc.title	19 Dubious Ways to Compute the Marginal Likelihood of a Phylogenetic Tree Topology.	en_US
dc.type	Journal Article
utslib.citation.volume	2	en_US
utslib.citation.volume	69	en_US
utslib.location.activity	England	en_US
utslib.for	0603 Evolutionary Biology	en_US
utslib.for	0604 Genetics	en_US
pubs.embargo.period	Not known	en_US
pubs.organisational-group	/University of Technology Sydney
pubs.organisational-group	/University of Technology Sydney/Faculty of Science
pubs.organisational-group	/University of Technology Sydney/Strength - ithree - Institute of Infection, Immunity and Innovation
utslib.copyright.status	open_access
pubs.issue	2	en_US
pubs.publication-status	Published	en_US
pubs.volume	69	en_US

Abstract:

The marginal likelihood of a model is a key quantity for assessing the evidence provided by the data in support of a model. The marginal likelihood is the normalizing constant for the posterior density, obtained by integrating the product of the likelihood and the prior with respect to model parameters. Thus, the computational burden of computing the marginal likelihood scales with the dimension of the parameter space. In phylogenetics, where we work with tree topologies that are high-dimensional models, standard approaches to computing marginal likelihoods are very slow. Here, we study methods to quickly compute the marginal likelihood of a single fixed tree topology. We benchmark the speed and accuracy of 19 different methods to compute the marginal likelihood of phylogenetic topologies on a suite of real data sets under the JC69 model. These methods include several new ones that we develop explicitly to solve this problem, as well as existing algorithms that we apply to phylogenetic models for the first time. Altogether, our results show that the accuracy of these methods varies widely, and that accuracy does not necessarily correlate with computational burden. Our newly developed methods are orders of magnitude faster than standard approaches, and in some cases, their accuracy rivals the best established estimators.
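
The integral described in the abstract, the likelihood times the prior integrated over the model parameters, can be illustrated on a toy problem where the answer is known in closed form. The sketch below is not taken from the paper and does not involve trees or the JC69 model: it assumes a one-parameter normal model (unit-variance observations, standard normal prior on the mean), and the function names and data values are hypothetical. It compares a naive Monte Carlo estimate, which averages the likelihood over draws from the prior, with the exact value.

```python
import math
import random


def log_likelihood(theta, data):
    """Log-likelihood of i.i.d. Normal(theta, 1) observations (toy assumption)."""
    return sum(-0.5 * math.log(2 * math.pi) - 0.5 * (y - theta) ** 2 for y in data)


def exact_log_marginal(data):
    """Closed-form log marginal likelihood under a Normal(0, 1) prior on theta.

    Marginally, the data are multivariate normal with mean 0 and covariance
    I + 1 1^T, whose determinant is 1 + n.
    """
    n = len(data)
    s = sum(data)
    ss = sum(y * y for y in data)
    return (
        -0.5 * n * math.log(2 * math.pi)
        - 0.5 * math.log(1 + n)
        - 0.5 * (ss - s * s / (1 + n))
    )


def naive_mc_log_marginal(data, num_draws=20000, seed=1):
    """Estimate the log marginal likelihood by averaging the likelihood over prior draws."""
    rng = random.Random(seed)
    log_terms = [log_likelihood(rng.gauss(0.0, 1.0), data) for _ in range(num_draws)]
    # Log-sum-exp keeps the average numerically stable when likelihoods are tiny.
    m = max(log_terms)
    return m + math.log(sum(math.exp(t - m) for t in log_terms)) - math.log(num_draws)


# Hypothetical data, chosen only for illustration.
data = [0.3, -0.1, 0.8, 0.5, 0.2]
print("exact   :", exact_log_marginal(data))
print("naive MC:", naive_mc_log_marginal(data))
```

With a single parameter the prior and posterior overlap well, so sampling from the prior works here. As the number of parameters grows, prior draws land in the high-likelihood region less and less often, which is one way to see why the abstract ties computational burden to the dimension of the parameter space.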

Please use this identifier to cite or link to this item:

http://hdl.handle.net/10453/136720