Theobroma cacao


Overview

Cocoa (Theobroma cacao) is an evergreen tree of the genus Theobroma in the family Malvaceae. Its fruit is oval or elongated, light green at first and later turning dark yellow or reddish, with a thick rind that turns brown when dried. The seeds, once roasted and ground, are processed into liquid, solid, or paste food products such as chocolate and cocoa drinks. Cocoa is one of the world's three major beverage crops, alongside coffee and tea, and is widely cultivated across the tropics.


Geographical Distribution

Cocoa is native to tropical regions of the Americas and was introduced to Asia and Africa after the 16th century. A method for producing solid chocolate was invented in 1674. In the 18th century, the price of chocolate began to fall, and chocolate gradually became popular.

Cocoa trees grow only within a narrow band extending approximately 20 degrees of latitude north and south of the equator. In China, cocoa is grown mainly in Taiwan, Guangdong, Hainan, and southern Yunnan.


Applications

  • Edible use: Cocoa is the primary raw material for high-grade beverages, chocolate, candy, pastries, ice cream, and similar products.
  • Economic use: Cocoa is an important raw material in the food industry and has high economic value. It is rich in nutrients and has a distinctive, aromatic flavor. It is now extensively cultivated in Hainan Province, China, where it is an important local cash crop.
  • Health benefits: Cocoa seeds are rich in flavonoids, fats, proteins, and dietary fiber. These components are reported to support heart, kidney, and intestinal function, relieve angina, aid digestion, and help treat anemia.
  • Medicinal use: Flavanols and proanthocyanidins extracted from cocoa are reported to protect the cardiovascular system. They help regulate blood pressure, increase nitric oxide production, strengthen antioxidant defenses, and enhance immune function.

Genome Sequencing

The Criollo cocoa variety has a nearly pure genotype and was among the first varieties to be cultivated. Criollo is now one of the two cocoa varieties that yield premium-quality chocolate. Its genome was sequenced under the leadership of the French Agricultural Research Centre for International Development (CIRAD). The reference genome was obtained with a whole-genome shotgun strategy that combined Roche/454, Illumina, and Sanger sequencing technologies. This approach produced 25,912 contigs and 4,792 scaffolds, with a scaffold N50 of 0.47 Mb. The assembled genome has a total length of 326.9 Mb, approximately 76% of the estimated genome size (430 Mb) of cocoa genotype B97-61/B2. A total of 28,798 protein-coding genes were identified, of which 23,529 were anchored to the 10 chromosomes.
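The N50 statistic and the percentages quoted above can be reproduced with a few lines of arithmetic. The following is a minimal sketch using the common definition of N50 (the length L such that sequences of length L or longer cover at least half of the assembly). The function name `n50` and the toy scaffold lengths are hypothetical illustrations, not data from the cacao assembly; only the totals are taken from the text.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Common definition: sort lengths from longest to shortest and return
    the length at which the running total first reaches half of the sum.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy scaffold lengths in Mb (NOT the real cacao scaffolds).
toy_scaffolds = [8, 5, 4, 2, 1]
toy_n50 = n50(toy_scaffolds)  # 8 + 5 = 13 >= 10, so the N50 is 5

# Figures quoted in the text.
assembled_mb = 326.9
estimated_genome_mb = 430.0
genes_total = 28798
genes_anchored = 23529

assembled_fraction = assembled_mb / estimated_genome_mb
anchored_gene_fraction = genes_anchored / genes_total

print(f"toy N50: {toy_n50} Mb")
print(f"assembled share of estimated genome: {assembled_fraction:.1%}")
print(f"share of genes anchored to chromosomes: {anchored_gene_fraction:.1%}")
```

Under these figures, the assembly covers about 76% of the estimated genome, and roughly 82% of the predicted genes sit on anchored sequence.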

In 2017, the researchers used a next-generation sequencing (NGS) approach to substantially improve the genome assembly. They combined four Illumina libraries with large insert sizes and 52x coverage of Pacific Biosciences long reads to correct misassembled regions and reduce the number of scaffolds. Genotyping by sequencing (GBS) was also used to increase the proportion of the assembly anchored to chromosomes. As a result, the cumulative size of the new assembly is 2.2 Mb smaller than that of the first assembly. The number of scaffolds decreased from 4,792 to 554, while the scaffold N50 increased from 0.47 Mb to 6.5 Mb. In total, 96.7% of the assembly is anchored to the 10 chromosomes, compared with 66.8% in the previous version.
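To put the two Criollo assemblies side by side, the quoted figures can be turned into simple ratios. This is only an illustrative sketch: the `AssemblyStats` class and its field names are hypothetical, and the values are the ones stated in the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssemblyStats:
    """Summary metrics for one genome assembly (hypothetical container)."""
    scaffolds: int
    scaffold_n50_mb: float
    anchored_pct: float


# Values as quoted in the text for the first and the 2017 assemblies.
v1 = AssemblyStats(scaffolds=4792, scaffold_n50_mb=0.47, anchored_pct=66.8)
v2 = AssemblyStats(scaffolds=554, scaffold_n50_mb=6.5, anchored_pct=96.7)

# How many times fewer scaffolds, and how many times larger the N50.
scaffold_reduction = v1.scaffolds / v2.scaffolds
n50_gain = v2.scaffold_n50_mb / v1.scaffold_n50_mb
# Change in anchored share, in percentage points.
anchoring_gain_points = v2.anchored_pct - v1.anchored_pct

print(f"scaffold count reduced {scaffold_reduction:.1f}-fold")
print(f"scaffold N50 increased {n50_gain:.1f}-fold")
print(f"anchored share up {anchoring_gain_points:.1f} percentage points")
```

On these numbers the scaffold count drops by a factor of roughly 8.6, the scaffold N50 grows nearly 14-fold, and the anchored share rises by about 30 percentage points.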

The Matina 1-6 clone is a traditional cultivar that exhibits the Amelonado phenotype and belongs to the Amelonado genetic group. This group shows limited genetic diversity and is the most widely cultivated type of cocoa worldwide. Mars Incorporated used Sanger sequencing and Roche 454 pyrosequencing to assemble the genome, resulting in a size of 445 Mb. The chromosome-scale assembly consists of 711 scaffolds, with a contig N50 of 84.4 kb and a scaffold N50 of 34.4 Mb.


References

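1. Cao Hengchun, Wang Yi, Huang Lisha, et al. Development and analysis of genome-wide SSR markers in cacao [J]. Journal of Shandong Agricultural University (Natural Science Edition), 2013, 44(3): 340-344. DOI: 10.3969/j.issn.1000-2324.2013.03.005. (In Chinese)

2. Qin Xiaowei, Hao Chaoyun, Wu Gang, et al. Research progress on the diversity and innovative utilization of cacao germplasm resources [J]. Chinese Journal of Tropical Crops, 2014, 35(1): 188-194. DOI: 10.3969/j.issn.1000-2561.2014.01.033. (In Chinese)

3. Liu Yuxi, Liu Mingxue. Cacao cultivation, processing, and product development [J]. Journal of Anhui Agricultural Sciences, 2014(22): 7541-7544. DOI: 10.3969/j.issn.0517-6611.2014.22.084. (In Chinese)

4. Li Mingzhou. Evidence for the cardiovascular health benefits of antioxidant flavonoids in tea and cocoa [J]. World Medical Journal, 2002, 6(14): 53-59. (In Chinese)

5. Fang Yiming, Chu Zhong, Gu Fenglin, et al. Optimization of the extraction process for proanthocyanidins from Hainan cocoa beans by response surface methodology [J]. Chinese Journal of Tropical Crops, 2020, 41(4): 779-786. DOI: 10.3969/j.issn.1000-2561.2020.04.020. (In Chinese)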

6. Argout X, Salse J, Aury JM, et al. The genome of Theobroma cacao. Nat Genet. 2011;43(2):101-108. [OpenLBID: OLB-PM-21186351]

7. Argout X, Martin G, Droc G, et al. The cacao Criollo genome v2.0: an improved version of the genome for genetic and functional genomic studies. BMC Genomics. 2017;18(1):730. [OpenLBID: OLB-PM-28915793]

8. Motamayor JC, Mockaitis K, Schmutz J, et al. The genome sequence of the most widely cultivated cacao type and its use to identify candidate genes regulating pod color. Genome Biol. 2013;14(6):r53. [OpenLBID: OLB-PM-23731509]