Many-core algorithms for statistical phylogenetics.

Marc A Suchard, Andrew Rambaut
Author Information
  1. Marc A Suchard: Department of Biomathematics, University of California, Los Angeles, CA 90095, USA. msuchard@ucla.edu

Abstract

MOTIVATION: Statistical phylogenetics is computationally intensive, and techniques for parallelization have consequently received considerable attention. Codon-based models allow for independent rates of synonymous and replacement substitutions and have the potential to model the process of protein-coding sequence evolution more adequately, with a resulting increase in phylogenetic accuracy. Unfortunately, because of the large number of codon states, the computational burden has largely thwarted phylogenetic reconstruction under codon models, particularly at the genomic scale. Here, we describe novel algorithms and methods for evaluating phylogenies under arbitrary molecular evolutionary models on graphics processing units (GPUs), making use of the large number of processing cores to efficiently parallelize calculations even for large state-size models.
RESULTS: We implement the approach in an existing Bayesian framework and apply the algorithms to estimating the phylogeny of 62 complete mitochondrial genomes of carnivores under a 60-state codon model. We observe a nearly 90-fold speed increase over an optimized CPU-based computation and a >140-fold increase over the currently available implementation, making this the first practical use of codon models for phylogenetic inference over whole mitochondrial or microorganism genomes.
AVAILABILITY AND IMPLEMENTATION: Source code is provided in BEAGLE: Broad-platform Evolutionary Analysis General Likelihood Evaluator, a cross-platform/processor library for phylogenetic likelihood computation (http://beagle-lib.googlecode.com/). We employ a BEAGLE implementation using the Bayesian phylogenetics framework BEAST (http://beast.bio.ed.ac.uk/).
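
To illustrate why large state-size models are costly and why the calculation parallelizes well, the following is a minimal sketch of the standard Felsenstein pruning update for one internal node of a tree. It is not the authors' GPU kernel or the BEAGLE API; the function names (update_partials, site_likelihoods) and the data layout are assumptions made for illustration only. Each entry of the parent's partial-likelihood array depends on one site and one parent state, so all entries can in principle be computed independently, which is the kind of fine-grained independence a many-core device can exploit.

```python
def update_partials(child1, child2, P1, P2):
    """Combine two children's partial likelihoods at one internal node.

    child1, child2: one vector per site, each of length S (number of states).
    P1, P2: S x S transition probability matrices for the two child branches.
    Hypothetical helper for illustration; plain nested lists, no GPU code.
    """
    n_states = len(P1)
    parent = []
    for site1, site2 in zip(child1, child2):
        row = []
        for i in range(n_states):
            # Each (site, parent state i) entry is independent of all others,
            # so it could be assigned to its own thread on a many-core device.
            s1 = sum(P1[i][j] * site1[j] for j in range(n_states))
            s2 = sum(P2[i][j] * site2[j] for j in range(n_states))
            row.append(s1 * s2)
        parent.append(row)
    return parent


def site_likelihoods(root_partials, frequencies):
    """Weight the root partials by stationary state frequencies, per site."""
    return [sum(f * x for f, x in zip(frequencies, site))
            for site in root_partials]


# Toy two-state example: one site, tips observed in state 0 and state 1.
P = [[0.9, 0.1],
     [0.2, 0.8]]
tip_a = [[1.0, 0.0]]
tip_b = [[0.0, 1.0]]
root = update_partials(tip_a, tip_b, P, P)
likelihoods = site_likelihoods(root, [0.5, 0.5])
```

In this sketch the work per site and per child grows with the square of the number of states: 4 x 4 = 16 multiply-adds for a nucleotide model versus 60 x 60 = 3600 for a 60-state codon model.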

Grants

  1. R01 GM086887/NIGMS NIH HHS

MeSH Terms

Algorithms
Bayes Theorem
Codon
Computational Biology
Evolution, Molecular
Genome, Mitochondrial
Phylogeny

Chemicals

Codon
