PALM: a paralleled and integrated framework for phylogenetic inference with automatic likelihood model selectors.

Shu-Hwa Chen, Sheng-Yao Su, Chen-Zen Lo, Kuei-Hsien Chen, Teng-Jay Huang, Bo-Han Kuo, Chung-Yen Lin
Author Information
  1. Shu-Hwa Chen: Institute of Information Science, Academia Sinica, Taipei, Taiwan.

Abstract

BACKGROUND: Selecting an appropriate substitution model and deriving a tree topology for a given sequence set are essential steps in phylogenetic analysis. However, these time-consuming, computationally intensive tasks require knowledge of substitution model theory and the expertise to run every relevant combination of several separate programs. To ensure a thorough and efficient analysis and to avoid tedious manipulation of multiple programs, this work presents an intuitive framework, the phylogenetic reconstruction with automatic likelihood model selectors (PALM), which combines established, up-to-date algorithms with a best-fit model selection mechanism for seamless phylogenetic analysis.
METHODOLOGY: PALM integrates ClustalW, PhyML, MODELTEST, ProtTest, and several in-house programs, and evaluates the fit of 56 substitution models for nucleotide sequences and 112 substitution models for protein sequences, scoring each under several criteria. The input can be either sequences in FASTA format or a sequence alignment in PHYLIP format. To accelerate maximum likelihood computation and bootstrapping, this work integrates MPICH2/PhyML, PalmMonitor, and the Palm job controller across several multiprocessor machines and adopts a task-parallel approach. In addition, an intuitive, interactive web component, PalmTree, displays and manipulates the output tree, with options for rooting the tree, swapping branches, viewing branch lengths and bootstrap scores, and removing nodes to restart the analysis iteratively.
SIGNIFICANCE: The workflow of PALM is straightforward and coherent. Through a succinct, user-friendly interface, researchers unfamiliar with phylogenetic analysis can submit sequences, retrieve the output, and re-submit a job based on a previous result when sequences are to be deleted or added for phylogenetic reconstruction. PALM infers phylogenetic relationships not only by overcoming the computational difficulty of ML methods but also by providing statistical methods for model selection and bootstrapping. The proposed approach can reduce calculation time, which is particularly relevant for large data sets. PALM can be accessed online at http://palm.iis.sinica.edu.tw.
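
The best-fit model selection step described under METHODOLOGY can be illustrated with a minimal sketch. The abstract does not name the scoring criteria, so this example assumes two widely used ones, the Akaike information criterion (AIC = 2k - 2 ln L) and the Bayesian information criterion (BIC = k ln n - 2 ln L). The `ModelFit` class, the `rank_models` function, and all log-likelihood values are hypothetical and are not taken from PALM.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelFit:
    """Hypothetical record of one fitted substitution model."""
    name: str
    log_likelihood: float  # maximised ln L for this model
    n_params: int          # free substitution-model parameters (k)


def aic(fit: ModelFit) -> float:
    """Akaike information criterion: 2k - 2 ln L (lower is better)."""
    return 2 * fit.n_params - 2 * fit.log_likelihood


def bic(fit: ModelFit, n_sites: int) -> float:
    """Bayesian information criterion: k ln n - 2 ln L (lower is better)."""
    return fit.n_params * math.log(n_sites) - 2 * fit.log_likelihood


def rank_models(fits, n_sites, criterion="AIC"):
    """Return the fits sorted from best (lowest score) to worst."""
    if criterion == "AIC":
        score = aic
    elif criterion == "BIC":
        score = lambda fit: bic(fit, n_sites)
    else:
        raise ValueError(f"unknown criterion: {criterion}")
    return sorted(fits, key=score)


# Illustrative values only; not results reported for PALM.
candidates = [
    ModelFit("JC69", log_likelihood=-5200.0, n_params=0),
    ModelFit("HKY85", log_likelihood=-5100.0, n_params=4),
    ModelFit("GTR+G", log_likelihood=-5098.0, n_params=9),
]

best_by_aic = rank_models(candidates, n_sites=1000, criterion="AIC")[0]
best_by_bic = rank_models(candidates, n_sites=1000, criterion="BIC")[0]
print(best_by_aic.name, best_by_bic.name)
```

In this sketch the parameter count covers only the substitution model. Branch lengths are normally counted as well, but for a fixed tree they add the same number of parameters, and therefore the same penalty, to every candidate, so the ranking is unchanged.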

MeSH Term

Algorithms
Likelihood Functions
Models, Genetic
Phylogeny
