On the simultaneous association analysis of large genomic regions: a massive multi-locus association test.
Dandi Qiao, Michael H Cho, Heide Fier, Per S Bakke, Amund Gulsvik, Edwin K Silverman, Christoph Lange
Author Information
Dandi Qiao: Department of Biostatistics, Harvard School of Public Health, 655 Huntington Avenue, Boston, MA 02115, USA, Channing Division of Network Medicine, Department of Medicine, Brigham and Women's Hospital and Harvard Medical School, Boston, MA 02115, USA, Department of Genomic Mathematics, University of Bonn, 53113 Bonn, Germany and Department of Thoracic Medicine, Haukeland University Hospital and Section for Respiratory Medicine Institute of Medicine, University of Bergen, 5006 Bergen, Norway.
MOTIVATION: For samples of unrelated individuals, we propose a general analysis framework in which hundreds of thousands of genetic loci can be tested simultaneously for association with complex phenotypes. The approach is built on spatial-clustering methodology and assumes that genetic loci associated with the target phenotype cluster in certain genomic regions. In contrast to standard methodology for multilocus analysis, which has focused on dimension reduction of the data, our multilocus association-clustering test profits from the availability of large numbers of genetic loci by detecting clusters of loci that are associated with the phenotype.
RESULTS: The approach is computationally fast and powerful, enabling the simultaneous association testing of large genomic regions. Even the entire genome or certain chromosomes can be tested simultaneously. We evaluate the properties of the approach in simulation studies. In an application to a genome-wide association study for chronic obstructive pulmonary disease, we illustrate the practical relevance of the proposed method by simultaneously testing all genotyped loci of the genome-wide association study and by testing each chromosome individually. Our findings suggest that statistical methodology that incorporates spatial-clustering information will be especially useful in whole-genome sequencing studies in which millions or billions of base pairs are recorded and grouped by genomic regions or genes, and are tested jointly for association.
AVAILABILITY AND IMPLEMENTATION: Implementation of the approach is available upon request.
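The clustering assumption described in the abstract can be illustrated with a small sketch. This is not the authors' test, whose details the abstract does not give. It is a generic sliding-window scan with a permutation reference, included only to show what "associated loci cluster in genomic regions" might look like computationally. The function names, the window size, the threshold alpha and the shuffling of per-locus p-values across positions are all assumptions made for illustration, and the shuffle ignores linkage disequilibrium between neighbouring loci.

```python
import random


def max_window_count(pvalues, window, alpha):
    """Largest number of loci with p < alpha in any run of `window` adjacent loci."""
    hits = [1 if p < alpha else 0 for p in pvalues]
    current = sum(hits[:window])
    best = current
    # Slide the window one locus at a time, updating the running count.
    for i in range(window, len(hits)):
        current += hits[i] - hits[i - window]
        best = max(best, current)
    return best


def clustering_pvalue(pvalues, window=10, alpha=0.05, n_perm=1000, seed=0):
    """Permutation p-value for spatial clustering of small per-locus p-values.

    Hypothetical helper: `pvalues` must be ordered by genomic position.
    Shuffling positions keeps the set of p-values fixed and breaks only
    their spatial arrangement (an illustrative assumption).
    """
    rng = random.Random(seed)
    observed = max_window_count(pvalues, window, alpha)
    shuffled = list(pvalues)
    exceed = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_window_count(shuffled, window, alpha) >= observed:
            exceed += 1
    # Add-one correction so the estimate is never exactly zero.
    return observed, (exceed + 1) / (n_perm + 1)


# Toy input: 200 loci, eight small p-values placed next to each other.
clustered = [0.5] * 200
for pos in range(50, 58):
    clustered[pos] = 0.001

# Same eight small p-values, spread evenly along the region.
spread = [0.5] * 200
for pos in range(0, 200, 25):
    spread[pos] = 0.001

print(clustering_pvalue(clustered))
print(clustering_pvalue(spread))
```

In this toy input the two lists contain the same p-values, so any locus-by-locus summary treats them identically. Only the scan over adjacent loci separates the clustered arrangement from the spread one.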