Novel Megaptera novaeangliae (Humpback whale) haplotype chromosome-level reference genome.

Maria-Vittoria Carminati, Vlonjat Lonnie Gashi, Ruiqi Li, Daniel Jacob Klee, Sara Rose Padula, Ajay Manish Patel, Andy Dick Yee Tan, Jacqueline Mattos, Nolan Kane
Author Information
  1. Maria-Vittoria Carminati: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA. Maria-Vittoria.Carminati@colorado.edu.
  2. Vlonjat Lonnie Gashi: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA.
  3. Ruiqi Li: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA.
  4. Daniel Jacob Klee: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA.
  5. Sara Rose Padula: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA.
  6. Ajay Manish Patel: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA.
  7. Andy Dick Yee Tan: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA.
  8. Jacqueline Mattos: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA.
  9. Nolan Kane: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA.

Abstract

Sequencing of a kidney sample (KW2013002) from a stranded Megaptera novaeangliae (Humpback whale) calf produced the first chromosome-level reference genome for this species. The calf, a 457 cm, 2,500 lb male, was found stranded in Hawai'i Kai, HI, in 2013 and was marked as abandoned/orphaned. In 2023, 1 g of kidney tissue was processed with PacBio long-read DNA sequencing, chromatin conformation capture (Hi-C), RNA sequencing, and mitochondrial sequencing to comprehensively characterize the genome and transcriptome of M. novaeangliae. Data validation includes a synteny analysis, mitochondrial annotation, and a comparison of BUSCO scores (scaffold v. reference genome and Balaenoptera musculus (Blue whale) v. M. novaeangliae). BUSCO analysis was also performed on an M. novaeangliae scaffold-level assembly to gauge the genomic completeness of the reference genome: the scaffold-level assembly scored 91.2%, versus 95.4% for the reference genome. Synteny analysis was performed against the B. musculus genome to determine chromosome-level coverage and structure. Further, a time-based phylogenetic tree was constructed using the sequenced data and publicly available genomes.

MeSH Terms

Animals
Genome
Humpback Whale
Haplotypes
Male
Sequence Analysis, DNA
