A phased Vanilla planifolia genome enables genetic improvement of flavour and production.
Tomas Hasing, Haibao Tang, Maria Brym, Fayaz Khazi, Tengfang Huang, Alan H Chambers
Author Information
Tomas Hasing: Elo Life Systems, Durham, NC, USA.
Haibao Tang: Center for Genomics and Biotechnology, Fujian Agricultural and Forestry University, Fuzhou, China.
Maria Brym: Tropical Research and Education Center, Horticultural Sciences Department, University of Florida, Homestead, FL, USA.
Fayaz Khazi: Elo Life Systems, Durham, NC, USA.
Tengfang Huang: Elo Life Systems, Durham, NC, USA. thuang@elolife.ag.
Alan H Chambers: Tropical Research and Education Center, Horticultural Sciences Department, University of Florida, Homestead, FL, USA. ac@ufl.edu.
The global supply of vanilla extract is primarily sourced from the cured beans of the tropical orchid species Vanilla planifolia. Vanilla plants were collected from Mesoamerica, clonally propagated and globally distributed as part of the early spice trade. Today, the global food and beverage industry depends on descendants of these original plants that have not generally benefited from genetic improvement. As a result, vanilla growers and processors struggle to meet global demand for vanilla extract and are challenged by inefficient and unsustainable production practices. Here, we report a chromosome-scale, phased V. planifolia genome, which reveals sequence variants for genes that may impact the vanillin pathway and therefore influence bean quality. Resequencing of related vanilla species, including the minor commercial species Vanilla × tahitensis, identified genes that could impact productivity and post-harvest losses through pod dehiscence, flower anatomy and disease resistance. The vanilla genome reported in this study may enable accelerated breeding of vanilla to improve high-value traits.