Whole genome sequence and de novo assembly revealed genomic architecture of Indian Mithun (Bos frontalis).

Sabyasachi Mukherjee, Zexi Cai, Anupama Mukherjee, Imsusosang Longkumer, Moonmoon Mech, Kezhavituo Vupru, Kobu Khate, Chandan Rajkhowa, Abhijit Mitra, Bernt Guldbrandtsen, Mogens Sandø Lund, Goutam Sahana
Author Information
  1. Sabyasachi Mukherjee: Animal Genetics and Breeding Lab., ICAR-National Research Centre on Mithun, Medziphema, Nagaland, 797106, India. smup0336@gmail.com.
  2. Zexi Cai: Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830, Tjele, Denmark.
  3. Anupama Mukherjee: Animal Genetics and Breeding Lab., ICAR-National Research Centre on Mithun, Medziphema, Nagaland, 797106, India.
  4. Imsusosang Longkumer: Animal Genetics and Breeding Lab., ICAR-National Research Centre on Mithun, Medziphema, Nagaland, 797106, India.
  5. Moonmoon Mech: Animal Genetics and Breeding Lab., ICAR-National Research Centre on Mithun, Medziphema, Nagaland, 797106, India.
  6. Kezhavituo Vupru: Animal Genetics and Breeding Lab., ICAR-National Research Centre on Mithun, Medziphema, Nagaland, 797106, India.
  7. Kobu Khate: Animal Genetics and Breeding Lab., ICAR-National Research Centre on Mithun, Medziphema, Nagaland, 797106, India.
  8. Chandan Rajkhowa: Animal Genetics and Breeding Lab., ICAR-National Research Centre on Mithun, Medziphema, Nagaland, 797106, India.
  9. Abhijit Mitra: Animal Genetics and Breeding Lab., ICAR-National Research Centre on Mithun, Medziphema, Nagaland, 797106, India.
  10. Bernt Guldbrandtsen: Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830, Tjele, Denmark.
  11. Mogens Sandø Lund: Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830, Tjele, Denmark.
  12. Goutam Sahana: Center for Quantitative Genetics and Genomics, Department of Molecular Biology and Genetics, Aarhus University, 8830, Tjele, Denmark.

Abstract

BACKGROUND: Mithun (Bos frontalis), also called gayal, is an endangered bovine species of the tribe Bovini with a chromosome complement of 2n = 58, XX. It is reared in the tropical rainforest regions of India, China, Myanmar, Bhutan and Bangladesh. However, the origin of this species is still disputed, and information on its genomic architecture remains scanty. We expect that the availability of its whole-genome sequence data and assembly will help resolve this problem and yield much new information, including the phylogenetic status of mithun. Recently, the first genome assembly of gayal, mithun of Chinese origin, was published. However, an improved reference genome assembly would still help in understanding genetic variation in mithun populations reared in diverse geographical locations and in building a superior consensus assembly. We therefore performed deep sequencing of the genome of an adult female mithun from India, assembled and annotated its genome, and performed extensive bioinformatic analyses to produce a superior de novo genome assembly of mithun.
RESULTS: We generated ≈300 Gigabyte (Gb) of raw reads from whole-genome deep sequencing platforms and assembled the sequence data using a hybrid assembly strategy to create a high-quality de novo assembly of mithun, with 96% recovered according to BUSCO analysis. The final genome assembly has a total length of 3.0 Gb and contains 5,015 scaffolds with an N50 value of 1 Mb. Repeat sequences constitute about 43.66% of the assembly. Genomic alignments between mithun and cattle showed that their genomes, as expected, are highly conserved. Gene annotation identified 28,044 protein-coding genes in the mithun genome. The orthologous gene groups of mithun showed a high degree of similarity to those of other species, while fewer mithun-specific coding sequences were found compared with those in cattle.
CONCLUSION: Here we present the first de novo draft genome assembly of Indian mithun, which has better coverage, is less fragmented and better annotated, and constitutes a reasonably complete assembly compared with the previously published gayal genome. This comprehensive assembly reveals the genomic architecture of mithun to a great extent and will provide the research community with a reference genome assembly for elucidating the evolutionary history of mithun across its distinct geographical locations.
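
The scaffold N50 quoted in the RESULTS is a standard contiguity statistic: the length of the scaffold at which the cumulative length, counted from the longest scaffold downwards, first reaches half of the total assembly length. The abstract does not say which tool was used to compute it, so the following is only a minimal sketch of the usual definition. The function name `n50` and the toy lengths are illustrative assumptions and are not taken from the mithun assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp).

    Assumed definition: sort lengths from longest to shortest and return
    the length at which the running sum first reaches half of the total.
    """
    if not lengths:
        raise ValueError("need at least one scaffold length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy scaffold lengths (hypothetical, NOT the mithun scaffolds).
toy_lengths = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths))  # total is 300; 80 + 70 = 150 reaches half, so this prints 70
```

Under this definition, an N50 of 1 Mb means that scaffolds of at least 1 Mb together account for at least half of the 3.0 Gb assembly.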

Grants

  1. IXX10452/Indian Council of Agricultural Research
  2. Overseas Associateship/Department of Biotechnology, Ministry of Science and Technology

MeSH Term

Animals
Genomics
Molecular Sequence Annotation
Repetitive Sequences, Nucleic Acid
Ruminants
Whole Genome Sequencing
