Machine-guided design of synthetic cell type-specific cis-regulatory elements.
S J Gosai, R I Castro, N Fuentes, J C Butts, S Kales, R R Noche, K Mouri, P C Sabeti, S K Reilly, R Tewhey
Author Information
S J Gosai: Broad Institute of MIT and Harvard, Cambridge, MA, USA.
R I Castro: The Jackson Laboratory, Bar Harbor, ME, USA.
N Fuentes: The Jackson Laboratory, Bar Harbor, ME, USA.
J C Butts: The Jackson Laboratory, Bar Harbor, ME, USA.
S Kales: The Jackson Laboratory, Bar Harbor, ME, USA.
R R Noche: Department of Comparative Medicine, Yale School of Medicine, New Haven, CT, USA.
K Mouri: The Jackson Laboratory, Bar Harbor, ME, USA.
P C Sabeti: Broad Institute of MIT and Harvard, Cambridge, MA, USA.
S K Reilly: Department of Genetics, Yale School of Medicine, New Haven, CT, USA.
R Tewhey: The Jackson Laboratory, Bar Harbor, ME, USA.
Cis-regulatory elements (CREs) control gene expression, orchestrating tissue identity, developmental timing, and stimulus responses, which collectively define the thousands of unique cell types in the body. While there is great potential for strategically incorporating CREs in therapeutic or biotechnology applications that require tissue specificity, there is no guarantee that an optimal CRE for an intended purpose has arisen naturally through evolution. Here, we present a platform to engineer and validate synthetic CREs capable of driving gene expression with programmed cell type specificity. We leverage innovations in deep neural network modeling of CRE activity across three cell types, efficient optimization, and massively parallel reporter assays (MPRAs) to design and empirically test thousands of CREs. Through in vitro and in vivo validation, we show that synthetic sequences outperform natural sequences from the human genome in driving cell type-specific expression. Synthetic sequences leverage unique sequence syntax to promote activity in the on-target cell type and simultaneously reduce activity in off-target cells. Together, we provide a generalizable framework to prospectively engineer CREs and demonstrate the required literacy to write regulatory code that is fit-for-purpose across vertebrates.
R00 HG010669/NHGRI NIH HHS
R01 HG012872/NHGRI NIH HHS
R35 HG011329/NHGRI NIH HHS
UM1 HG009435/NHGRI NIH HHS