De novo assembly of a new Olea europaea genome accession using nanopore sequencing.

Guodong Rao, Jianguo Zhang, Xiaoxia Liu, Chunfu Lin, Huaigen Xin, Li Xue, Chenhe Wang
Author Information
  1. Guodong Rao: State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China. rgd@caf.ac.cn.
  2. Jianguo Zhang: State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China. Ralf02@163.com.
  3. Xiaoxia Liu: State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China.
  4. Chunfu Lin: MIANNING Yuansheng Agricultural Science and Technology Co., Ltd., Liangshan Yi Autonomous Prefecture Mianning County, Sichuan, 615600, China.
  5. Huaigen Xin: Biomarker Technologies Corporation, Beijing, 101300, China.
  6. Li Xue: State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China.
  7. Chenhe Wang: State Key Laboratory of Tree Genetics and Breeding, Research Institute of Forestry, Chinese Academy of Forestry, Beijing, 100091, China.

Abstract

Olive (Olea europaea L.) is internationally renowned for its high-end product, extra virgin olive oil. An incomplete genome of O. europaea was previously obtained using shotgun sequencing in 2016. To further explore the genetics and breeding potential of olive, an updated draft genome was obtained using Oxford Nanopore third-generation sequencing and Hi-C technology. Seven different assembly strategies were used to assemble the final genome of 1.30 Gb, with contig and scaffold N50 sizes of 4.67 Mb and 42.60 Mb, respectively, which greatly increased the quality of the olive genome. We anchored 1.1 Gb of the assembled sequence to 23 pseudochromosomes using Hi-C, and 53,518 protein-coding genes were predicted in the current assembly. Comparative genomics analyses, including gene family expansion and contraction, whole-genome duplication, phylogenetic analysis, and positive selection, were performed. Based on this high-quality olive genome, nine gene families comprising 202 genes were identified in the oleuropein biosynthesis pathway, twice the number of genes identified from the previous data. This new accession of the olive genome is of sufficient quality for genome-wide studies on gene function in olive and provides a foundation for the molecular breeding of olive species.

