On the simultaneous association analysis of large genomic regions: a massive multi-locus association test.

Dandi Qiao, Michael H Cho, Heide Fier, Per S Bakke, Amund Gulsvik, Edwin K Silverman, Christoph Lange
Author Information
  1. Dandi Qiao: Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA, Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA, Department of Genomic Mathematics, University of Bonn, 53113 Bonn, Germany and Department of Thoracic Medicine, Haukeland University Hospital and Section for Respiratory Medicine Institute of Medicine, University of Bergen, 5006 Bergen, Norway.

Abstract

MOTIVATION: For samples of unrelated individuals, we propose a general analysis framework in which hundreds of thousands of genetic loci can be tested simultaneously for association with complex phenotypes. The approach is built on spatial-clustering methodology, assuming that genetic loci that are associated with the target phenotype cluster in certain genomic regions. In contrast to standard methodology for multilocus analysis, which has focused on dimension reduction of the data, our multilocus association-clustering test profits from the availability of large numbers of genetic loci by detecting clusters of loci that are associated with the phenotype.
RESULTS: The approach is computationally fast and powerful, enabling the simultaneous association testing of large genomic regions. Even the entire genome or individual chromosomes can be tested simultaneously. We evaluate the properties of the approach in simulation studies. In an application to a genome-wide association study for chronic obstructive pulmonary disease, we illustrate the practical relevance of the proposed method by simultaneously testing all genotyped loci of the genome-wide association study and by testing each chromosome individually. Our findings suggest that statistical methodology that incorporates spatial-clustering information will be especially useful in whole-genome sequencing studies, in which millions or billions of base pairs are recorded, grouped by genomic regions or genes, and tested jointly for association.
AVAILABILITY AND IMPLEMENTATION: Implementation of the approach is available upon request.
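
The abstract does not give the test statistic, so the following is only a minimal sketch of the general idea of testing for spatial clustering of associated loci. It is not the authors' method. It uses a simple sliding-window scan statistic as a stand-in, with a permutation null. The function names, the `alpha` threshold, and the `window` size are all hypothetical choices made for illustration, and shuffling locus positions ignores linkage disequilibrium, which a real analysis would have to account for.

```python
import random


def max_window_count(hits, window):
    """Largest number of hits in any run of `window` consecutive loci."""
    count = sum(hits[:window])
    best = count
    for i in range(window, len(hits)):
        # Slide the window one locus to the right.
        count += hits[i] - hits[i - window]
        best = max(best, count)
    return best


def clustering_pvalue(pvalues, alpha=0.05, window=20, n_perm=1000, seed=0):
    """Permutation p-value for clustering of nominally significant loci.

    `pvalues` are single-locus association p-values ordered by genomic
    position. Assumption: shuffling positions is an adequate null, which
    treats loci as exchangeable and ignores correlation between neighbours.
    """
    rng = random.Random(seed)
    hits = [1 if p < alpha else 0 for p in pvalues]
    observed = max_window_count(hits, window)
    shuffled = hits[:]
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # destroys spatial structure, keeps hit count
        if max_window_count(shuffled, window) >= observed:
            exceed += 1
    # Add-one correction keeps the estimate strictly positive.
    return observed, (exceed + 1) / (n_perm + 1)
```

Calling `clustering_pvalue(pvalues)` on position-ordered p-values returns the observed maximum window count and its permutation p-value. A small p-value indicates that nominally significant loci are more tightly clustered than expected if their positions were random.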

Grants

  1. R01 HL113264/NHLBI NIH HHS
  2. U01 HL089897/NHLBI NIH HHS
  3. R01 HL089856/NHLBI NIH HHS
  4. U01 HL089856/NHLBI NIH HHS
  5. R01 MH081862/NIMH NIH HHS
  6. R01 HL089897/NHLBI NIH HHS

MeSH Term

Case-Control Studies
Chromosomes, Human
Cluster Analysis
Computer Simulation
Genetic Loci
Genetic Markers
Genetic Predisposition to Disease
Genome, Human
Genome-Wide Association Study
Genomics
Genotype
Humans
Phenotype
Polymorphism, Single Nucleotide
Pulmonary Disease, Chronic Obstructive

Chemicals

Genetic Markers
