Maximum Likelihood Estimation for Unrooted 3-Leaf Trees: An Analytic Solution for the CFN Model.

Max Hill, Sebastien Roch, Jose Israel Rodriguez
Author Information
  1. Max Hill: Department of Mathematics, University of California, Riverside, 900 University Avenue, Riverside, CA, 92521, USA. max.hill1@ucr.edu.
  2. Sebastien Roch: Department of Mathematics, University of Wisconsin-Madison, 480 Lincoln Drive, Madison, WI, 53706-1388, USA.
  3. Jose Israel Rodriguez: Department of Mathematics, University of Wisconsin-Madison, 480 Lincoln Drive, Madison, WI, 53706-1388, USA.

Abstract

Maximum likelihood estimation is among the most widely used methods for inferring phylogenetic trees from sequence data. This paper solves the maximum likelihood problem for 3-leaf trees under the 2-state symmetric mutation model (CFN model). Our main result is a closed-form solution to the maximum likelihood problem for unrooted 3-leaf trees, given generic data; this result characterizes all of the ways that a maximum likelihood estimate can fail to exist for generic data and provides theoretical validation for predictions made in Parks and Goldman (Syst Biol 63(5):798-811, 2014). Our proof uses both classical tools for studying group-based phylogenetic models, such as Hadamard conjugation and reparameterization in terms of Fourier coordinates, and more recent results concerning the semi-algebraic constraints of the CFN model. To put these results into practice, we also give a complete characterization for testing whether data are generic.
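
As a rough illustration of the Fourier-coordinate reparameterization mentioned above, the following sketch computes CFN site-pattern probabilities on a 3-leaf star tree and inverts them in the simplest case, where the observed frequencies lie strictly inside the model. It is not the paper's closed-form solution: the boundary cases and the genericity test are not handled. The function names, the grouping of the eight site patterns into four classes, and the edge parameters theta_i = 1 - 2 p_i (with p_i the probability of a state change along edge i) are assumptions made for this example.

```python
import math

# Hadamard matrix acting on the four site-pattern classes of a 3-leaf tree.
# Class order (an assumption of this sketch):
#   0: all leaves agree, 1: leaf 3 differs, 2: leaf 2 differs, 3: leaf 1 differs.
H = [
    [1,  1,  1,  1],
    [1,  1, -1, -1],
    [1, -1,  1, -1],
    [1, -1, -1,  1],
]


def pattern_probs(theta):
    """Class probabilities for edge parameters theta = (t1, t2, t3)."""
    t1, t2, t3 = theta
    # Fourier coordinates: 1 and the pairwise products of edge parameters.
    q = [1.0, t1 * t2, t1 * t3, t2 * t3]
    return [sum(H[i][j] * q[j] for j in range(4)) / 4.0 for i in range(4)]


def fourier_coords(freqs):
    """Map class frequencies (summing to 1) to Fourier coordinates."""
    return [sum(H[i][j] * freqs[j] for j in range(4)) for i in range(4)]


def interior_estimate(freqs):
    """Return (t1, t2, t3) if freqs are matched exactly by some theta in (0, 1)^3.

    Returns None otherwise; those cases need the paper's full analysis.
    """
    _, q12, q13, q23 = fourier_coords(freqs)
    if min(q12, q13, q23) <= 0:
        return None
    t1 = math.sqrt(q12 * q13 / q23)
    t2 = math.sqrt(q12 * q23 / q13)
    t3 = math.sqrt(q13 * q23 / q12)
    if max(t1, t2, t3) >= 1:
        return None
    return (t1, t2, t3)
```

When the function returns a value, the model probabilities equal the observed frequencies, so the multinomial likelihood is maximized there; a return value of None only signals that this shortcut does not apply.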

References

  1. Mol Biol Evol. 2000 Oct;17(10):1529-41 [PMID: 11018159]
  2. Syst Biol. 2014 Sep;63(5):798-811 [PMID: 24996414]
  3. Syst Biol. 2021 Jun 16;70(4):838-843 [PMID: 33528562]
  4. Bioinformatics. 2014 May 1;30(9):1312-3 [PMID: 24451623]
  5. Cladistics. 2005 Apr;21(2):163-193 [PMID: 34892859]
  6. Bull Math Biol. 2019 Feb;81(2):337-360 [PMID: 30357599]
  7. Proc Natl Acad Sci U S A. 1994 Apr 12;91(8):3339-43 [PMID: 8159749]
  8. IEEE/ACM Trans Comput Biol Bioinform. 2009 Jan-Mar;6(1):89-95 [PMID: 19179701]
  9. PLoS One. 2010 Mar 10;5(3):e9490 [PMID: 20224823]
  10. Theor Popul Biol. 2024 Apr;156:1-4 [PMID: 38184209]
  11. Mol Biol Evol. 2006 Mar;23(3):626-32 [PMID: 16319091]
  12. J Math Biol. 2021 Sep 9;83(3):33 [PMID: 34499233]
  13. Proc Biol Sci. 2000 Jan 22;267(1439):109-16 [PMID: 10687814]
  14. J R Soc Interface. 2016 Oct;13(123): [PMID: 27733697]
  15. Mol Biol Evol. 2007 Aug;24(8):1586-91 [PMID: 17483113]
  16. Mol Biol Evol. 2015 Jan;32(1):268-74 [PMID: 25371430]
  17. Mol Phylogenet Evol. 2004 Nov;33(2):440-51 [PMID: 15336677]
  18. Syst Biol. 2004 Dec;53(6):963-7 [PMID: 15764563]
  19. J Comput Biol. 2005 Mar;12(2):204-28 [PMID: 15767777]

Grants

  1. DMS-1929348/National Science Foundation

MeSH Terms

Likelihood Functions
Phylogeny
Mathematical Concepts
Models, Genetic
Mutation
Algorithms
