Machine learning dissection of human accelerated regions in primate neurodevelopment.

Sean Whalen, Fumitaka Inoue, Hane Ryu, Tyler Fair, Eirene Markenscoff-Papadimitriou, Kathleen Keough, Martin Kircher, Beth Martin, Beatriz Alvarado, Orry Elor, Dianne Laboy Cintron, Alex Williams, Md Abul Hassan Samee, Sean Thomas, Robert Krencik, Erik M Ullian, Arnold Kriegstein, John L Rubenstein, Jay Shendure, Alex A Pollen, Nadav Ahituv, Katherine S Pollard

Author Information

  1. Sean Whalen: Gladstone Institutes, San Francisco, CA 94158, USA.
  2. Fumitaka Inoue: Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA.
  3. Hane Ryu: Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA; Pharmaceutical Sciences and Pharmacogenomics Graduate Program, University of California, San Francisco, San Francisco, CA, USA.
  4. Tyler Fair: Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Neurology, University of California, San Francisco, San Francisco, CA 94158, USA.
  5. Eirene Markenscoff-Papadimitriou: Department of Psychiatry, University of California, San Francisco, San Francisco, CA, USA.
  6. Kathleen Keough: Gladstone Institutes, San Francisco, CA 94158, USA; Pharmaceutical Sciences and Pharmacogenomics Graduate Program, University of California, San Francisco, San Francisco, CA, USA.
  7. Martin Kircher: Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Berlin Institute of Health at Charité - Universitätsmedizin Berlin, 10117 Berlin, Germany; Institute of Human Genetics, University Medical Center Schleswig-Holstein, University of Lübeck, 23562 Lübeck, Germany.
  8. Beth Martin: Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA.
  9. Beatriz Alvarado: Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA 94143, USA.
  10. Orry Elor: Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA.
  11. Dianne Laboy Cintron: Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA.
  12. Alex Williams: Gladstone Institutes, San Francisco, CA 94158, USA.
  13. Md Abul Hassan Samee: Gladstone Institutes, San Francisco, CA 94158, USA.
  14. Sean Thomas: Gladstone Institutes, San Francisco, CA 94158, USA.
  15. Robert Krencik: Department of Neurosurgery, Center for Neuroregeneration, Houston Methodist Research Institute, Houston, TX, USA.
  16. Erik M Ullian: Departments of Ophthalmology and Physiology, University of California, San Francisco, San Francisco, CA, USA; Kavli Institute for Fundamental Neuroscience, University of California, San Francisco, San Francisco, CA, USA.
  17. Arnold Kriegstein: Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Neurology, University of California, San Francisco, San Francisco, CA 94158, USA.
  18. John L Rubenstein: Department of Psychiatry, University of California, San Francisco, San Francisco, CA, USA.
  19. Jay Shendure: Department of Genome Sciences, University of Washington, Seattle, WA 98195, USA; Howard Hughes Medical Institute, Seattle, WA 98195, USA; Brotman Baty Institute for Precision Medicine, Seattle, WA 98195, USA.
  20. Alex A Pollen: Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA; Eli and Edythe Broad Center of Regeneration Medicine and Stem Cell Research, University of California, San Francisco, San Francisco, CA 94143, USA; Department of Neurology, University of California, San Francisco, San Francisco, CA 94158, USA.
  21. Nadav Ahituv: Department of Bioengineering and Therapeutic Sciences, University of California, San Francisco, San Francisco, CA, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA. Electronic address: nadav.ahituv@ucsf.edu.
  22. Katherine S Pollard: Gladstone Institutes, San Francisco, CA 94158, USA; Institute for Human Genetics, University of California, San Francisco, San Francisco, CA, USA; Department of Epidemiology and Biostatistics and Institute for Computational Health Sciences, University of California, San Francisco, San Francisco, CA, USA; Chan-Zuckerberg Biohub, San Francisco, CA, USA. Electronic address: katherine.pollard@gladstone.ucsf.edu.

Abstract

Using machine learning (ML), we interrogated the function of all human-chimpanzee variants in 2,645 human accelerated regions (HARs), finding 43% of HARs have variants with large opposing effects on chromatin state and 14% on neurodevelopmental enhancer activity. This pattern, consistent with compensatory evolution, was confirmed using massively parallel reporter assays in chimpanzee and human neural progenitor cells. The species-specific enhancer activity of HARs was accurately predicted from the presence and absence of transcription factor footprints in each species. Despite these striking cis effects, activity of a given HAR sequence was nearly identical in human and chimpanzee cells. This suggests that HARs did not evolve to compensate for changes in the trans environment but instead altered their ability to bind factors present in both species. Thus, ML prioritized variants with functional effects on human neurodevelopment and revealed an unexpected reason why HARs may have evolved so rapidly.

Grants

  1. U01 MH116438/NIMH NIH HHS
  2. T32 GM136547/NIGMS NIH HHS
  3. UM1 HG009408/NHGRI NIH HHS
  4. F31 HG011569/NHGRI NIH HHS
  5. R01 MH109907/NIMH NIH HHS
  6. R35 NS097305/NINDS NIH HHS
  7. UM1 HG011966/NHGRI NIH HHS
  8. R01 NS099099/NINDS NIH HHS
  9. R01 MH123178/NIMH NIH HHS
  10. DP2 MH122400/NIMH NIH HHS

MeSH Terms

Animals
Humans
Chromatin
Enhancer Elements, Genetic
Machine Learning
Pan troglodytes
Transcription Factors
Brain

Chemicals

Chromatin
Transcription Factors