Multi-Omics Driven Assembly and Annotation of the Sandalwood (Santalum album) Genome.

Hirehally Basavarajegowda Mahesh, Pratigya Subba, Jayshree Advani, Meghana Deepak Shirke, Ramya Malarini Loganathan, Shankara Lingu Chandana, Siddappa Shilpa, Oishi Chatterjee, Sneha Maria Pinto, Thottethodi Subrahmanya Keshava Prasad, Malali Gowda
Author Information
  1. Hirehally Basavarajegowda Mahesh: Center for Functional Genomics and Bioinformatics, TransDisciplinary University, Institute of Trans-Disciplinary Health Sciences and Technology, Bengaluru 560064, India.
  2. Pratigya Subba: Center for Systems Biology and Molecular Medicine, Yenepoya University, Mangalore 575018, India.
  3. Jayshree Advani: Institute of Bioinformatics, International Technology Park, Bengaluru 560066, India.
  4. Meghana Deepak Shirke: Center for Cellular and Molecular Platforms, National Centre for Biological Sciences, Bengaluru 560065, India.
  5. Ramya Malarini Loganathan: Center for Cellular and Molecular Platforms, National Centre for Biological Sciences, Bengaluru 560065, India.
  6. Shankara Lingu Chandana: Center for Cellular and Molecular Platforms, National Centre for Biological Sciences, Bengaluru 560065, India.
  7. Siddappa Shilpa: Center for Cellular and Molecular Platforms, National Centre for Biological Sciences, Bengaluru 560065, India.
  8. Oishi Chatterjee: Institute of Bioinformatics, International Technology Park, Bengaluru 560066, India.
  9. Sneha Maria Pinto: Center for Systems Biology and Molecular Medicine, Yenepoya University, Mangalore 575018, India.
  10. Thottethodi Subrahmanya Keshava Prasad: Center for Systems Biology and Molecular Medicine, Yenepoya University, Mangalore 575018, India. keshav@ibioinformatics.org
  11. Malali Gowda: Center for Functional Genomics and Bioinformatics, TransDisciplinary University, Institute of Trans-Disciplinary Health Sciences and Technology, Bengaluru 560064, India. malalig@tdu.edu.in

Abstract

Indian sandalwood (Santalum album) is an important tropical evergreen tree known for its fragrant heartwood-derived essential oil and its valuable carving wood. Here, we applied an integrated genomic, transcriptomic, and proteomic approach to assemble and annotate the Indian sandalwood genome. Our genome sequencing resulted in the establishment of a draft map of the smallest genome for any woody tree species to date (221 Mb). The genome annotation predicted 38,119 protein-coding genes and 27.42% repetitive DNA elements. In-depth proteome analysis revealed the identities of 72,325 unique peptides, which confirmed 10,076 of the predicted genes. The addition of transcriptomic and proteogenomic approaches resulted in the identification of 53 novel proteins and 34 gene-correction events that were missed by genomic approaches. Proteogenomic analysis also helped in reassigning 1,348 potential noncoding RNAs as bona fide protein-coding messenger RNAs. Gene expression patterns at the RNA and protein levels indicated that peptide sequencing was useful in capturing proteins encoded by nuclear and organellar genomes alike. Mass spectrometry-based proteomic evidence provided an unbiased approach toward the identification of proteins encoded by organellar genomes. Such proteins are often missed in transcriptome data sets due to the enrichment of only messenger RNAs that contain poly(A) tails. Overall, the use of integrated omic approaches enhanced the quality of the assembly and annotation of this nonmodel plant genome. The availability of genomic, transcriptomic, and proteomic data will enhance genomics-assisted breeding, germplasm characterization, and conservation of sandalwood trees.

MeSH Terms

Gene Expression Profiling
Gene Expression Regulation, Plant
Genome, Plant
Genomics
High-Throughput Nucleotide Sequencing
Molecular Sequence Annotation
Phylogeny
Plant Proteins
Proteome
Proteomics
Santalum

Chemicals

Plant Proteins
Proteome
