Improving Tree Probability Estimation with Stochastic Optimization and Variance Reduction.

Tianyu Xie, Musu Yuan, Minghua Deng, Cheng Zhang
Author Information
  1. Tianyu Xie: School of Mathematical Sciences, Peking University, Beijing, 100871, China.
  2. Musu Yuan: Center for Quantitative Biology, Peking University, Beijing, 100871, China.
  3. Minghua Deng: Center for Quantitative Biology, School of Mathematical Sciences, and Center for Statistical Science, Peking University, Beijing, 100871, China.
  4. Cheng Zhang: School of Mathematical Sciences and Center for Statistical Science, Peking University, Beijing, 100871, China.

Abstract

Probability estimation of tree topologies is one of the fundamental tasks in phylogenetic inference. The recently proposed subsplit Bayesian networks (SBNs) provide a powerful probabilistic graphical model for tree topology probability estimation by properly leveraging the hierarchical structure of phylogenetic trees. However, the expectation maximization (EM) method currently used for learning SBN parameters does not scale to large data sets. In this paper, we introduce several computationally efficient methods for training SBNs and show that variance reduction could be the key to better performance. We also apply variance reduction to improve the optimization of SBN parameters for variational Bayesian phylogenetic inference (VBPI). Extensive synthetic and real data experiments demonstrate that our methods outperform previous baseline methods on tree topology probability estimation and on Bayesian phylogenetic inference using SBNs.

Grants

  1. R01 AI162611/NIAID NIH HHS
