Introduction

Quantifying the distribution of fitness effects among newly arising mutations in the human genome is key to resolving important debates in medical and evolutionary genetics. Here, we present a method for inferring this distribution using single nucleotide polymorphism (SNP) data from a population with non-stationary demographic history (such as that of modern humans). Application of our method to 47,576 coding SNPs found by direct resequencing of 11,404 protein-coding genes in 35 individuals (20 European Americans and 15 African Americans) allows us to assess the relative contribution of demographic and selective effects to patterning amino acid variation in the human genome. We find evidence of an ancient population expansion in the sample with African ancestry and a relatively recent bottleneck in the sample with European ancestry. After accounting for these demographic effects, we find strong evidence for great variability in the selective effects of new amino acid-replacing mutations. In both populations, the patterns of variation are consistent with a leptokurtic distribution of selection coefficients (e.g., gamma or log-normal) peaked near neutrality. Specifically, we predict that 27-29% of amino acid-changing (nonsynonymous) mutations are neutral or nearly neutral (|s|<0.01%), 30-42% are moderately deleterious (0.01%<|s|<1%), and nearly all the remainder are highly deleterious or lethal (|s|>1%). Our results are consistent with 10-20% of amino acid differences between humans and chimpanzees having been fixed by positive selection, with the remainder of the differences being neutral or nearly neutral. Our analysis also predicts that many of the alleles identified via whole-genome association mapping may be selectively neutral or (formerly) positively selected, implying that deleterious genetic variation affecting disease phenotype may be missed by this widely used approach for mapping genes underlying complex traits.
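
As a rough illustration of how a distribution of selection coefficients splits into the three classes named above, the sketch below assumes |s| follows a log-normal distribution and computes the probability mass in each bin (|s|<0.01%, 0.01%<|s|<1%, |s|>1%). The function names and the parameter values `mu` and `sigma` are hypothetical, chosen only for demonstration. They are not estimates from the paper, and this is not the inference method itself.

```python
import math


def lognormal_cdf(x, mu, sigma):
    """P(X <= x) for a log-normal X whose logarithm has mean mu and sd sigma."""
    if x <= 0:
        return 0.0
    z = (math.log(x) - mu) / (sigma * math.sqrt(2.0))
    return 0.5 * math.erfc(-z)


def selection_bins(mu, sigma, lo=1e-4, hi=1e-2):
    """Fraction of mutations in each |s| class.

    lo and hi are the class boundaries as proportions
    (0.01% = 1e-4 and 1% = 1e-2).
    """
    below_lo = lognormal_cdf(lo, mu, sigma)
    below_hi = lognormal_cdf(hi, mu, sigma)
    return {
        "nearly_neutral": below_lo,
        "moderately_deleterious": below_hi - below_lo,
        "highly_deleterious": 1.0 - below_hi,
    }


if __name__ == "__main__":
    # Illustrative parameters only (assumption): median |s| of 0.1%,
    # with a log-scale standard deviation of ln(10).
    bins = selection_bins(mu=math.log(1e-3), sigma=math.log(10.0))
    for name, frac in bins.items():
        print(f"{name}: {frac:.3f}")
```

Swapping `lognormal_cdf` for a gamma cumulative distribution function would give the corresponding partition under the other leptokurtic form mentioned above.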

Publications

  1. Assessing the evolutionary impact of amino acid mutations in the human genome.
    Boyko AR, Williamson SH, Indap AR, Degenhardt JD, Hernandez RD, Lohmueller KE, Adams MD, Schmidt S, Sninsky JJ, Sunyaev SR, White TJ, Nielsen R, Clark AG, Bustamante CD. PLoS Genetics, 2008-05-01.

Credits

  1. Adam R Boyko
    Developer

    Department of Biological Statistics and Computational Biology, Cornell University, United States of America

  2. Scott H Williamson
    Developer

  3. Amit R Indap
    Developer

  4. Jeremiah D Degenhardt
    Developer

  5. Ryan D Hernandez
    Developer

  6. Kirk E Lohmueller
    Developer

  7. Mark D Adams
    Developer

  8. Steffen Schmidt
    Developer

  9. John J Sninsky
    Developer

  10. Shamil R Sunyaev
    Developer

  11. Thomas J White
    Developer

  12. Rasmus Nielsen
    Developer

  13. Andrew G Clark
    Developer

  14. Carlos D Bustamante
    Investigator

Summary

  Accession: BT005033
  Tool Type: Application
  Category: (none listed)
  Platforms: Linux/Unix
  Technologies: C
  User Interface: Terminal Command Line
  Download Count: 0
  Submitted By: Carlos D Bustamante