Automatic Differentiation is no Panacea for Phylogenetic Gradient Computation.
- Publisher: OXFORD UNIV PRESS
- Publication Type: Journal Article
- Citation: Genome Biol Evol, 2023, 15, (6), pp. evad099
- Issue Date: 2023-06-01
Open Access
This item is open access.
Full metadata record
Field | Value | Language |
---|---|---|
dc.contributor.author | Fourment, M | |
dc.contributor.author | Swanepoel, CJ | |
dc.contributor.author | Galloway, JG | |
dc.contributor.author | Ji, X | |
dc.contributor.author | Gangavarapu, K | |
dc.contributor.author | Suchard, MA | |
dc.contributor.author | Matsen IV, FA | |
dc.contributor.editor | Williams, T | |
dc.date.accessioned | 2024-01-11T05:50:33Z | |
dc.date.available | 2023-05-25 | |
dc.date.available | 2024-01-11T05:50:33Z | |
dc.date.issued | 2023-06-01 | |
dc.identifier.citation | Genome Biol Evol, 2023, 15, (6), pp. evad099 | |
dc.identifier.issn | 1759-6653 | |
dc.identifier.uri | http://hdl.handle.net/10453/174321 | |
dc.description.abstract | Gradients of probabilistic model likelihoods with respect to their parameters are essential for modern computational statistics and machine learning. These calculations are readily available for arbitrary models via "automatic differentiation" implemented in general-purpose machine-learning libraries such as TensorFlow and PyTorch. Although these libraries are highly optimized, it is not clear if their general-purpose nature will limit their algorithmic complexity or implementation speed for the phylogenetic case compared to phylogenetics-specific code. In this paper, we compare six gradient implementations of the phylogenetic likelihood functions, in isolation and also as part of a variational inference procedure. We find that although automatic differentiation can scale approximately linearly in tree size, it is much slower than the carefully implemented gradient calculation for tree likelihood and ratio transformation operations. We conclude that a mixed approach combining phylogenetic libraries with machine learning libraries will provide the optimal combination of speed and model flexibility moving forward. | |
dc.format | | |
dc.language | eng | |
dc.publisher | OXFORD UNIV PRESS | |
dc.relation.ispartof | Genome Biol Evol | |
dc.relation.isbasedon | 10.1093/gbe/evad099 | |
dc.rights | info:eu-repo/semantics/openAccess | |
dc.subject | 0601 Biochemistry and Cell Biology, 0603 Evolutionary Biology, 0604 Genetics | |
dc.subject.classification | Developmental Biology | |
dc.subject.classification | 3101 Biochemistry and cell biology | |
dc.subject.classification | 3104 Evolutionary biology | |
dc.subject.classification | 3105 Genetics | |
dc.subject.mesh | Phylogeny | |
dc.subject.mesh | Likelihood Functions | |
dc.subject.mesh | Models, Statistical | |
dc.subject.mesh | Machine Learning | |
dc.subject.mesh | Algorithms | |
dc.title | Automatic Differentiation is no Panacea for Phylogenetic Gradient Computation. | |
dc.type | Journal Article | |
utslib.citation.volume | 15 | |
utslib.location.activity | England | |
utslib.for | 0601 Biochemistry and Cell Biology | |
utslib.for | 0603 Evolutionary Biology | |
utslib.for | 0604 Genetics | |
pubs.organisational-group | /University of Technology Sydney | |
pubs.organisational-group | /University of Technology Sydney/Faculty of Science | |
utslib.copyright.status | open_access | * |
dc.date.updated | 2024-01-11T05:50:32Z | |
pubs.issue | 6 | |
pubs.publication-status | Published | |
pubs.volume | 15 | |
utslib.citation.issue | 6 |
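As the abstract (dc.description.abstract) notes, general-purpose machine-learning libraries such as PyTorch obtain likelihood gradients through automatic differentiation. The following minimal sketch, which is not from the paper, illustrates that idea on the simplest possible case: the gradient of a two-taxon Jukes-Cantor log-likelihood with respect to a single branch length, computed with PyTorch autograd. The function name, site counts, and branch-length value are illustrative assumptions only.

```python
# A minimal sketch (not from the paper): reverse-mode automatic
# differentiation of a toy phylogenetic log-likelihood in PyTorch.
import torch

def jc69_log_likelihood(t, n_same, n_diff):
    """Log-likelihood of a two-taxon alignment under the JC69 model.

    t      -- branch length (evolutionary distance), a torch scalar
    n_same -- number of sites where the two sequences agree
    n_diff -- number of sites where they differ
    """
    p_same = 0.25 + 0.75 * torch.exp(-4.0 * t / 3.0)  # P(no change | t)
    p_diff = 0.25 - 0.25 * torch.exp(-4.0 * t / 3.0)  # P(change to one specific base | t)
    # Each site contributes pi(x) * P(x -> y | t) = 0.25 * p_*
    return n_same * torch.log(0.25 * p_same) + n_diff * torch.log(0.25 * p_diff)

t = torch.tensor(0.1, requires_grad=True)          # branch length parameter
ll = jc69_log_likelihood(t, n_same=90, n_diff=10)  # hypothetical site counts
ll.backward()                                      # autodiff: d(log-likelihood)/dt
print(float(ll), float(t.grad))
```

The paper's comparison concerns full trees rather than this toy case: its finding is that autodiff gradients of this kind scale approximately linearly in tree size but remain much slower than carefully implemented phylogenetics-specific gradient code for the tree likelihood and ratio transformation operations.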
Please use this identifier to cite or link to this item: http://hdl.handle.net/10453/174321