Forcing external constraints on tree inference using ASTRAL.

Maryam Rabiee, Siavash Mirarab
Author Information
  1. Maryam Rabiee: Department of Computer Science and Engineering, UC San Diego, 9500 Gilman Dr, La Jolla, 92093, USA.
  2. Siavash Mirarab: Department of Electrical and Computer Engineering, UC San Diego, 9500 Gilman Dr, La Jolla, 92093, USA. smirarab@ucsd.edu.

Abstract

BACKGROUND: To account for genome-wide discordance among gene trees, several widely-used methods seek to find a species tree with the minimum distance to input gene trees. To efficiently explore the large space of species trees, some of these methods, including ASTRAL, use dynamic programming (DP). The DP paradigm can restrict the search space, and thus, ASTRAL and similar methods use heuristic methods to define a restricted search space. However, arbitrary constraints provided by the user on the output tree cannot be trivially incorporated into such restrictions. The ability to infer trees that honor user-defined constraints is needed for many phylogenetic analyses, but no solution currently exists for constraining the output of ASTRAL.
RESULTS: We introduce methods that enable the ASTRAL dynamic programming to infer constrained trees in an effective and scalable manner. To do so, we adopt a recently developed tree completion algorithm and extend it to allow multifurcating input and output trees. In simulation studies, we show that the approach for honoring constraints is both effective and fast. On real data, we show that constrained searches can help interrogate branches not recovered in the optimal ASTRAL tree to reveal support for alternative hypotheses.
CONCLUSIONS: The new algorithm is added ASTRAL to all user-provided constraints on the species tree.

Keywords

References

  1. J Bioinform Comput Biol. 2013 Oct;11(5):1342005 [PMID: 24131054]
  2. Syst Biol. 2013 Jan 1;62(1):110-20 [PMID: 22949484]
  3. Bioinformatics. 2015 Jun 15;31(12):i44-52 [PMID: 26072508]
  4. Am J Bot. 2004 Oct;91(10):1599-613 [PMID: 21652311]
  5. BMC Evol Biol. 2010 Oct 11;10:302 [PMID: 20937096]
  6. Genome Biol Evol. 2016 Jan 05;8(2):330-44 [PMID: 26733575]
  7. Syst Biol. 2020 Mar 1;69(2):384-391 [PMID: 31290974]
  8. Mol Phylogenet Evol. 2018 Jul;124:122-136 [PMID: 29530498]
  9. Science. 2014 Dec 12;346(6215):1320-31 [PMID: 25504713]
  10. J Biomed Inform. 2006 Feb;39(1):86-102 [PMID: 16243006]
  11. Mol Phylogenet Evol. 2018 May;122:110-115 [PMID: 29421312]
  12. BMC Evol Biol. 2011 Jul 13;11:205 [PMID: 21752249]
  13. Syst Biol. 2019 Mar 1;68(2):365-369 [PMID: 30165689]
  14. Science. 2014 Dec 12;346(6215):1250463 [PMID: 25504728]
  15. Syst Biol. 2016 Mar;65(2):334-44 [PMID: 26526427]
  16. Mol Biol Evol. 1988 Sep;5(5):568-83 [PMID: 3193878]
  17. Bioinformatics. 2017 Mar 1;33(5):631-639 [PMID: 27663499]
  18. Bioinformatics. 2014 Sep 1;30(17):i541-8 [PMID: 25161245]
  19. Algorithms Mol Biol. 2018 Mar 15;13:6 [PMID: 29568323]
  20. Proc Natl Acad Sci U S A. 2001 Aug 14;98(17):9707-12 [PMID: 11504944]
  21. PLoS Comput Biol. 2009 Sep;5(9):e1000501 [PMID: 19749978]
  22. Mol Phylogenet Evol. 2019 Oct;139:106539 [PMID: 31226465]
  23. Mol Phylogenet Evol. 2019 Jan;130:286-296 [PMID: 30393186]
  24. Mol Biol Evol. 2016 Jul;33(7):1654-68 [PMID: 27189547]
  25. Bioinformatics. 2014 Dec 15;30(24):3548-55 [PMID: 25359891]
  26. Nat Ecol Evol. 2017 Jan 13;1(2):20 [PMID: 28812610]
  27. PLoS One. 2010 Mar 10;5(3):e9490 [PMID: 20224823]
  28. BMC Bioinformatics. 2018 May 8;19(Suppl 6):153 [PMID: 29745866]
  29. J Comput Biol. 1996 Summer;3(2):275-88 [PMID: 8811487]
  30. BMC Bioinformatics. 2010 Oct 30;11:538 [PMID: 21034504]
  31. Evolution. 2004 Feb;58(2):404-15 [PMID: 15068356]

MeSH Term

Algorithms
Computer Simulation
Databases, Genetic
Evolution, Molecular
Genetic Speciation
Genomics
Heuristics
Multigene Family
Phylogeny
Research Design
Software

Word Cloud

Created with Highcharts 10.0.0ASTRALtreetreesmethodsconstraintsspeciesspacesearchoutputgenedistanceinputusedynamicprogrammingDPcaninferconstrainedeffectivecompletionalgorithmshowBACKGROUND:accountgenome-widediscordanceamongseveralwidely-usedseekfindminimumefficientlyexplorelargeincludingparadigmrestrictthussimilarheuristicdefinerestrictedHoweverarbitraryprovidedusertriviallyincorporatedrestrictionsabilityhonoruser-definedneededmanyphylogeneticanalysessolutioncurrentlyexistsconstrainingRESULTS:introduceenablescalablemanneradoptrecentlydevelopedextendallowmultifurcatingsimulationstudiesapproachhonoringfastrealdatasearcheshelpinterrogatebranchesrecoveredoptimalrevealsupportalternativehypothesesCONCLUSIONS:newaddeduser-providedForcingexternalinferenceusingConstrainedRFTree

Similar Articles

Cited By