Quentin Piet, Gaetan Droc, William Marande, Gautier Sarah, Stéphanie Bocs, Christophe Klopp, Mickael Bourge, Sonja Siljak-Yakovlev, Olivier Bouchez, Céline Lopez-Roques, Sandra Lepers-Andrzejewski, Laurent Bourgois, Joseph Zucca, Michel Dron, Pascale Besse, Michel Grisoni, Cyril Jourda, Carine Charron
Vanilla planifolia, the species cultivated to produce one of the world's most popular flavors, is highly prone to partial genome endoreplication, which leads to strongly unbalanced DNA content in cells. We report here the first molecular evidence of partial endoreplication at the chromosome scale, obtained through the assembly and annotation of an accurate haplotype-phased genome of V. planifolia. Cytogenetic data demonstrated that the diploid genome size is 4.09 Gb, with 16 chromosome pairs, although aneuploid cells are frequently observed. Using PacBio HiFi and optical mapping, we assembled and phased a diploid genome of 3.4 Gb with a scaffold N50 of 1.2 Mb and 59 128 predicted protein-coding genes. The atypical k-mer frequencies and the uneven sequencing depth observed agreed with our expectation of unbalanced genome representation. Sixty-seven percent of the genes were scattered over only 30% of the genome, putatively linking gene-rich regions and the endoreplication phenomenon. By contrast, low-coverage (non-endoreplicated) regions were rich in repeated elements but also contained 33% of the annotated genes. Furthermore, this assembly showed distinct haplotype-specific patterns of sequencing depth variation, suggesting complex molecular regulation of endoreplication along the chromosomes. This high-quality, anchored assembly represents 83% of the estimated V. planifolia genome and is a significant step toward the elucidation of this complex genome. To support post-genomics efforts, we developed the Vanilla Genome Hub, a user-friendly integrated web portal that enables centralized access to high-throughput genomic and other omics data and interoperable use of bioinformatics tools.
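The 83% figure follows from the two sizes reported in the abstract (a 3.4 Gb assembly against a 4.09 Gb cytogenetic estimate), and the scaffold N50 is a standard contiguity statistic. The short Python sketch below illustrates both calculations. The function names and the toy scaffold lengths are hypothetical and chosen for illustration only; this is not the authors' pipeline.

```python
def assembled_fraction(assembly_gb, estimated_gb):
    """Share of the estimated genome size covered by the assembly."""
    return assembly_gb / estimated_gb


def n50(lengths):
    """Length L such that scaffolds of length >= L hold at least half
    of the total assembled bases (the usual N50 definition)."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("lengths must be non-empty")


# Sizes taken from the abstract: 3.4 Gb assembled, 4.09 Gb estimated.
print(f"{assembled_fraction(3.4, 4.09):.0%}")  # 83%

# Toy scaffold lengths (hypothetical, not from the paper).
print(n50([5, 4, 3, 2, 1]))  # 4
```

Sorting scaffolds from longest to shortest and stopping once the cumulative length reaches half the total is the conventional way to obtain N50; on the real assembly the same procedure would be applied to the scaffold lengths in the released FASTA index.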