Pathway analysis with next-generation sequencing data.

Jinying Zhao, Yun Zhu, Eric Boerwinkle, Momiao Xiong
Author Information
  1. Jinying Zhao: Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA.
  2. Yun Zhu: Department of Epidemiology, Tulane University School of Public Health and Tropical Medicine, New Orleans, LA, USA.
  3. Eric Boerwinkle: Human Genetics Center, Division of Biostatistics, University of Texas Health Science Center at Houston, Houston, TX, USA.
  4. Momiao Xiong: Human Genetics Center, Division of Biostatistics, University of Texas Health Science Center at Houston, Houston, TX, USA.

Abstract

Although pathway analysis methods have been developed and successfully applied to association studies of common variants, statistical methods for pathway-based association analysis of rare variants remain underdeveloped. Many investigators have observed highly inflated false-positive rates and low power in pathway-based tests of association of rare variants. These inflated false-positive rates and low true-positive rates are mainly due to the inability of current methods to account for gametic phase disequilibrium. To overcome these serious limitations, we develop a novel statistic based on smoothed functional principal component analysis (SFPCA) for pathway association tests with next-generation sequencing data. The statistic captures position-level variant information and accounts for gametic phase disequilibrium. Through intensive simulations, we demonstrate that the SFPCA-based statistic has correct type 1 error rates when testing pathway association with rare variants, common variants, or both. We also evaluate the power of the SFPCA-based statistic and 22 existing statistics, and find that the SFPCA-based statistic has much higher power than the existing statistics in all scenarios considered. To further evaluate its performance, we apply the SFPCA-based statistic to pathway analysis of exome sequencing data from the early-onset myocardial infarction (EOMI) project. We identify three pathways significantly associated with EOMI after Bonferroni correction. In addition, our preliminary results show that the SFPCA-based statistic yields much smaller P-values for pathway association than other existing methods.
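The Bonferroni step mentioned in the abstract can be illustrated with a minimal sketch. It assumes each pathway has already been assigned a P-value by some association test; the pathway names and P-values below are hypothetical placeholders, not results from the study, and the function name `bonferroni_significant` is introduced here purely for illustration.

```python
def bonferroni_significant(p_values, alpha=0.05):
    """Return the pathways whose P-value passes a Bonferroni-corrected threshold.

    p_values: dict mapping a pathway name to its P-value (hypothetical input).
    alpha: family-wise significance level (0.05 is an assumption, not from the source).
    """
    n_tests = len(p_values)
    if n_tests == 0:
        return {}
    # Bonferroni: divide the family-wise level by the number of pathways tested.
    threshold = alpha / n_tests
    return {name: p for name, p in p_values.items() if p <= threshold}


# Hypothetical P-values for four made-up pathways.
example_p_values = {
    "pathway_A": 1e-6,
    "pathway_B": 0.01,
    "pathway_C": 0.0002,
    "pathway_D": 0.5,
}

significant = bonferroni_significant(example_p_values)
print(sorted(significant))
```

With four pathways the corrected threshold is 0.05 / 4 = 0.0125, so only pathways with P-values at or below that value are retained. The correction controls the family-wise error rate at the cost of power, which is one reason a statistic that produces smaller P-values for true associations is useful in pathway-level testing.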

Grants

  1. RC2 HL102923/NHLBI NIH HHS
  2. RC2 HL102926/NHLBI NIH HHS
  3. RC2 HL-102926/NHLBI NIH HHS
  4. 1R01AR057120-01/NIAMS NIH HHS
  5. RC2 HL-102923/NHLBI NIH HHS
  6. R01 HL106034/NHLBI NIH HHS
  7. R01 AR057120/NIAMS NIH HHS
  8. RC2 HL-102925/NHLBI NIH HHS
  9. RC2 HL103010/NHLBI NIH HHS
  10. R01 GM104411/NIGMS NIH HHS
  11. 1R01HL106034-01/NHLBI NIH HHS
  12. RC2 HL-102924/NHLBI NIH HHS
  13. RC2 HL102924/NHLBI NIH HHS
  14. RC2 HL-103010/NHLBI NIH HHS
  15. RC2 HL102925/NHLBI NIH HHS

MeSH Terms

Black or African American
Case-Control Studies
Computer Simulation
Databases, Genetic
Exome
Gene Frequency
Genetic Association Studies
High-Throughput Nucleotide Sequencing
Humans
Models, Genetic
Myocardial Infarction
Polymorphism, Single Nucleotide
Principal Component Analysis
Sequence Analysis, DNA
Signal Transduction
Transforming Growth Factor beta
White People

Chemicals

Transforming Growth Factor beta
