Phylogenetic analysis of gene expression.

Casey W Dunn, Xi Luo, Zhijin Wu
Author Information
  1. Casey W Dunn: Department of Ecology and Evolutionary Biology, Brown University, Providence, RI, USA; Department of Biostatistics and Center for Statistical Sciences, Brown University, Providence, RI 02903, USA.

Abstract

Phylogenetic analyses of gene expression have great potential for addressing a wide range of questions. These analyses will, for example, identify genes that have evolutionary shifts in expression that are correlated with evolutionary changes in morphological, physiological, and developmental characters of interest. This will provide entirely new opportunities to identify genes related to particular phenotypes. There are, however, three key challenges that must be addressed for such studies to realize their potential. First, data on gene expression must be measured from multiple species, some of which may be field-collected, and parameterized in such a way that they can be compared across species. Second, it will be necessary to develop comparative phylogenetic methods suitable for large multidimensional datasets. In most phylogenetic comparative studies to date, the number n of independent observations (independent contrasts) has been greater than the number p of variables (characters). The behavior of comparative methods for these classic problems is now well understood under a wide variety of conditions. In studies of gene expression, and in studies based on other high-throughput tools, the number n of samples is dwarfed by the number p of variables. The estimated covariance matrices will then be singular, which complicates their analysis and interpretation and makes them prone to spurious results. Third, new approaches are needed to investigate the expression of the many genes whose phylogenies are not congruent with species phylogenies due to gene loss, gene duplication, and incomplete lineage sorting. Here we outline general considerations of project design for phylogenetic analyses of gene expression and suggest solutions to these three categories of challenges. These topics are relevant to high-throughput phenotypic data well beyond gene expression.
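
The second challenge, that n is dwarfed by p, can be illustrated numerically. The sketch below is not taken from the article: it assumes the contrasts are simulated as independent standard normal draws, and the function names (`simulate_contrasts`, `covariance_rank`, `max_abs_offdiag_corr`) and the sample sizes are hypothetical choices made only for illustration. It shows that a sample covariance matrix estimated from n observations of p variables has rank at most n - 1, so it is singular whenever p is at least n.

```python
import numpy as np


def simulate_contrasts(n_samples, n_genes, seed=0):
    """Draw an (n_samples x n_genes) matrix of independent standard normals.

    Assumption: this stands in for independent contrasts of expression
    values; no real phylogeny or expression data are involved.
    """
    rng = np.random.default_rng(seed)
    return rng.standard_normal((n_samples, n_genes))


def covariance_rank(x):
    """Return the sample covariance matrix of the columns and its rank."""
    cov = np.cov(x, rowvar=False)  # columns are variables (genes)
    return cov, int(np.linalg.matrix_rank(cov))


def max_abs_offdiag_corr(x):
    """Largest absolute correlation between two distinct columns."""
    corr = np.corrcoef(x, rowvar=False)
    off_diagonal = corr[~np.eye(corr.shape[0], dtype=bool)]
    return float(np.max(np.abs(off_diagonal)))


# Classic setting: many more observations than variables (n > p).
x_classic = simulate_contrasts(n_samples=50, n_genes=5)
cov_classic, rank_classic = covariance_rank(x_classic)

# High-throughput setting: few observations, many variables (n << p).
x_wide = simulate_contrasts(n_samples=8, n_genes=50)
cov_wide, rank_wide = covariance_rank(x_wide)

print("n=50, p=5:  rank", rank_classic, "of", cov_classic.shape[0])
print("n=8,  p=50: rank", rank_wide, "of", cov_wide.shape[0])

# The variables are independent by construction, so any strong
# correlation found here is spurious.
print("max |r|, n=50, p=5: ", max_abs_offdiag_corr(x_classic))
print("max |r|, n=8,  p=50:", max_abs_offdiag_corr(x_wide))
```

In the wide case the covariance matrix cannot be inverted, and the large number of variable pairs relative to the number of observations tends to produce strong correlations by chance alone, which is one way the spurious results mentioned above can arise.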

Grants

  1. P01 AA019072/NIAAA NIH HHS
  2. P30 AI042853/NIAID NIH HHS

MeSH Term

Classification
Gene Expression Regulation
High-Throughput Screening Assays
Models, Genetic
Phylogeny
Sequence Analysis, RNA
Species Specificity
Transcriptome
