Automatic Differentiation is no Panacea for Phylogenetic Gradient Computation.

Mathieu Fourment, Christiaan J Swanepoel, Jared G Galloway, Xiang Ji, Karthik Gangavarapu, Marc A Suchard, Frederick A Matsen IV
Author Information
  1. Mathieu Fourment: Australian Institute for Microbiology and Infection, University of Technology Sydney, Ultimo, NSW, Australia.
  2. Christiaan J Swanepoel: Centre for Computational Evolution, The University of Auckland, Auckland, New Zealand.
  3. Jared G Galloway: Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA.
  4. Xiang Ji: Department of Mathematics, Tulane University, New Orleans, Louisiana, USA.
  5. Karthik Gangavarapu: Department of Human Genetics, University of California, Los Angeles, California, USA.
  6. Marc A Suchard: Department of Human Genetics, University of California, Los Angeles, California, USA.
  7. Frederick A Matsen IV: Public Health Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, Washington, USA.

Abstract

Gradients of probabilistic model likelihoods with respect to their parameters are essential for modern computational statistics and machine learning. These calculations are readily available for arbitrary models via "automatic differentiation" implemented in general-purpose machine-learning libraries such as TensorFlow and PyTorch. Although these libraries are highly optimized, it is not clear if their general-purpose nature will limit their algorithmic complexity or implementation speed for the phylogenetic case compared to phylogenetics-specific code. In this paper, we compare six gradient implementations of the phylogenetic likelihood functions, in isolation and also as part of a variational inference procedure. We find that although automatic differentiation can scale approximately linearly in tree size, it is much slower than the carefully implemented gradient calculation for tree likelihood and ratio transformation operations. We conclude that a mixed approach combining phylogenetic libraries with machine learning libraries will provide the optimal combination of speed and model flexibility moving forward.
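
The contrast the abstract draws between automatic differentiation and a hand-derived gradient can be illustrated on a toy case. The sketch below is not taken from the paper and does not use any of the libraries it benchmarks. It assumes a two-sequence alignment under the Jukes-Cantor (JC69) model, and it uses forward-mode automatic differentiation with dual numbers in pure Python as a stand-in for the reverse-mode engines in TensorFlow and PyTorch. All names (Dual, jc69_loglik_dual, jc69_grad_analytic, jc69_grad_autodiff) are hypothetical and chosen for illustration only.

```python
import math


class Dual:
    """Dual number val + der*eps, used for forward-mode automatic differentiation."""

    def __init__(self, val, der=0.0):
        self.val = val
        self.der = der

    @staticmethod
    def lift(x):
        # Promote plain numbers to constants (derivative zero).
        return x if isinstance(x, Dual) else Dual(float(x))

    def __add__(self, other):
        other = Dual.lift(other)
        return Dual(self.val + other.val, self.der + other.der)

    __radd__ = __add__

    def __sub__(self, other):
        other = Dual.lift(other)
        return Dual(self.val - other.val, self.der - other.der)

    def __rsub__(self, other):
        return Dual.lift(other) - self

    def __mul__(self, other):
        other = Dual.lift(other)
        # Product rule.
        return Dual(self.val * other.val,
                    self.der * other.val + self.val * other.der)

    __rmul__ = __mul__


def dexp(x):
    """exp for dual numbers (chain rule)."""
    e = math.exp(x.val)
    return Dual(e, e * x.der)


def dlog(x):
    """log for dual numbers (chain rule)."""
    return Dual(math.log(x.val), x.der / x.val)


def jc69_loglik_dual(t, n_sites, n_diff):
    """JC69 log-likelihood of two sequences separated by branch length t.

    t is a Dual; n_diff of the n_sites aligned sites differ.
    """
    e = dexp(t * (-4.0 / 3.0))
    p_same = 0.25 + 0.75 * e   # probability that a site is unchanged
    p_diff = 0.25 - 0.25 * e   # probability of one specific change
    return ((n_sites - n_diff) * dlog(0.25 * p_same)
            + n_diff * dlog(0.25 * p_diff))


def jc69_grad_autodiff(t, n_sites, n_diff):
    """d logL / dt obtained by seeding the dual part of t with 1."""
    return jc69_loglik_dual(Dual(t, 1.0), n_sites, n_diff).der


def jc69_grad_analytic(t, n_sites, n_diff):
    """d logL / dt from the hand-derived closed form."""
    e = math.exp(-4.0 * t / 3.0)
    p_same = 0.25 + 0.75 * e
    p_diff = 0.25 - 0.25 * e
    return (n_sites - n_diff) * (-e) / p_same + n_diff * (e / 3.0) / p_diff


if __name__ == "__main__":
    # Illustrative values only: 100 sites, 10 differences, t = 0.1.
    print(jc69_grad_autodiff(0.1, 100, 10))
    print(jc69_grad_analytic(0.1, 100, 10))
```

Both functions return the same derivative up to floating-point error. The automatic version needs no derivation, while the analytic version skips the bookkeeping that the dual arithmetic performs on every operation. On a single branch that difference is negligible, so this toy says nothing about the timings reported in the paper; it only shows the kind of trade-off being measured.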

Grants

  1. R01 AI153044/NIAID NIH HHS
  2. R01 AI162611/NIAID NIH HHS
  3. Howard Hughes Medical Institute
  4. S10 OD028685/NIH HHS

MeSH Terms

Phylogeny
Likelihood Functions
Models, Statistical
Machine Learning
Algorithms
