Maria-Vittoria Carminati, Vlonjat Lonnie Gashi, Ruiqi Li, Daniel Jacob Klee, Sara Rose Padula, Ajay Manish Patel, Andy Dick Yee Tan, Jacqueline Mattos, Nolan Kane
Author Information
Maria-Vittoria Carminati: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA. Maria-Vittoria.Carminati@colorado.edu.
Vlonjat Lonnie Gashi: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA.
Ruiqi Li: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA.
Daniel Jacob Klee: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA.
Sara Rose Padula: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA.
Ajay Manish Patel: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA.
Andy Dick Yee Tan: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA.
Jacqueline Mattos: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA.
Nolan Kane: Department of Ecology and Evolutionary Biology, University of Colorado Boulder, Boulder, CO, USA.
Sequencing of a kidney sample (KW2013002) from a stranded Megaptera novaeangliae (humpback whale) calf yielded the first chromosome-level reference genome for this species. The calf, a 457 cm, 2,500 lb male, was found stranded in Hawai'i Kai, HI, in 2013 and was recorded as abandoned/orphaned. In 2023, 1 g of kidney tissue was used for PacBio long-read DNA sequencing, chromatin conformation capture (Hi-C), RNA sequencing, and mitochondrial sequencing to comprehensively characterize the genome and transcriptome of M. novaeangliae. Data validation includes a synteny analysis, mitochondrial annotation, and a comparison of BUSCO scores (scaffold v. reference genome, and Balaenoptera musculus (blue whale) v. M. novaeangliae). BUSCO analysis was also run on an M. novaeangliae scaffold-level assembly to gauge the completeness of the reference genome: the scaffold-level assembly scored 91.2%, versus 95.4% for the reference genome. Synteny analysis against the B. musculus genome was used to assess chromosome-level coverage and structure. In addition, a time-calibrated phylogenetic tree was constructed from the sequenced data and publicly available genomes.
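As a rough illustration of the BUSCO comparison described above, the sketch below shows how a completeness percentage is conventionally derived from BUSCO category counts, and computes the gap between the two reported scores. The function name and the example counts are hypothetical and are not taken from the study. The sketch also assumes that the 95.4% figure refers to the chromosome-level reference genome.

```python
def busco_completeness(single: int, duplicated: int,
                       fragmented: int, missing: int) -> float:
    """Percent of BUSCO groups found complete (single-copy + duplicated)."""
    total = single + duplicated + fragmented + missing
    if total == 0:
        raise ValueError("no BUSCO groups were searched")
    return 100.0 * (single + duplicated) / total


# Hypothetical counts, for illustration only (not data from the study).
example_score = busco_completeness(single=900, duplicated=50,
                                   fragmented=20, missing=30)

# Scores as reported in the abstract.
scaffold_score = 91.2    # scaffold-level assembly
reference_score = 95.4   # assumed: chromosome-level reference genome

# Difference in percentage points, rounded to the reported precision.
gain_points = round(reference_score - scaffold_score, 1)

print(f"example completeness: {example_score:.1f}%")
print(f"reference vs. scaffold: +{gain_points} percentage points")
```

The difference is expressed in percentage points rather than as a relative change, since both scores are measured against the same set of expected orthologs.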
References
Carminati, M.-V. G. et al. Megaptera novaeangliae isolate KW2013002, whole genome shotgun sequencing project. GenBank https://identifiers.org/ncbi/insdc:JBGMDX010000001.1 (2024).
Jackson, J. A. et al. Global diversity and oceanic divergence of humpback whales (Megaptera novaeangliae). Proc. R. Soc. B 281, 1786, https://doi.org/10.1098/rspb.2013.3222 (2014).
Roman, J. & McCarthy, J. J. The whale pump: marine mammals enhance primary productivity in a coastal basin. PLoS One 5, 10, https://doi.org/10.1371/journal.pone.0013255 (2010).
McGowen, M. R. et al. Phylogenomic Resolution of the Cetacean Tree of Life Using Target Sequence Capture. Syst. Biol. 69, 479–501, https://doi.org/10.1093/sysbio/syz068 (2020).
Morin, P. A. et al. Building genomic infrastructure: Sequencing platinum-standard reference-quality genomes of all cetacean species. Marine Mammal Science 36(4), 1356–1366, https://doi.org/10.1111/mms.12721 (2020).
GenBank. Megaptera novaeangliae (humpback whale) genome assembly, megNov1, GCA_004329385.1. National Center for Biotechnology Information (NCBI). Available from: https://www.ncbi.nlm.nih.gov/assembly/GCA_004329385.1 (2019).
Tollis, M. et al. Return to the Sea, Get Huge, Beat Cancer: An Analysis of Cetacean Genomes Including an Assembly for the Humpback Whale (Megaptera novaeangliae). Molecular Biology and Evolution 36(8), 1746–1763, https://doi.org/10.1093/molbev/msz099 (2019).
Carminati, M.-V. G. et al. Novel Megaptera novaeangliae (Humpback whale) haplotype reference genome [Dataset]. Dryad https://doi.org/10.5061/dryad.dv41ns271 (2024).
GenBank. Balaenoptera musculus (blue whale) genome assembly, mBalMus1.pri.v3, GCA_009873245.3. National Center for Biotechnology Information (NCBI). Available from: https://www.ncbi.nlm.nih.gov/assembly/GCA_009873245.3 (2020).
Katoh, K., Rozewicki, J. & Yamada, K. D. MAFFT online service: multiple sequence alignment, interactive sequence choice and visualization. Briefings in Bioinformatics 20(4), 1160–1166, https://doi.org/10.1093/bib/bbx108 (2019).
Allio, R., Donegà, S., Galtier, N. & Nabholz, B. Large Variation in the Ratio of Mitochondrial to Nuclear Mutation Rate across Animals: Implications for Genetic Diversity and the Use of Mitochondrial DNA as a Molecular Marker. Molecular Biology and Evolution 34(11), 2762–2772, https://doi.org/10.1093/molbev/msx197 (2017).
Cummins, J. Mitochondrial DNA in mammalian reproduction. Reviews of Reproduction 3(3), 172–182, https://doi.org/10.1530/ror.0.0030172 (1998).
Carminati, M.-V. G. et al. Megaptera novaeangliae voucher NIST KW2013002 mitochondrion, complete genome. https://identifiers.org/ncbi/nucleotide:PP475430.1 (2024).
Katoh, K. & Standley, D. M. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol. Biol. Evol. 30, 772–780, https://doi.org/10.1093/molbev/mst010 (2013).
Nguyen, L. T., Schmidt, H. A., Von Haeseler, A. & Minh, B. Q. IQ-TREE: A fast and effective stochastic algorithm for estimating maximum-likelihood phylogenies. Mol. Biol. Evol. 32, 268–274, https://doi.org/10.1093/molbev/msu300 (2015).
Smith, S. A. LSD2: Least-Squares Dating for Estimating Species Divergence Times. Bioinformatics 35(21), 4429–4431 (2019).
Kuznetsov, D. et al. OrthoDB v11: annotation of orthologs in the widest sampling of organismal diversity. Nucleic Acids Res 51, D445–D451, https://doi.org/10.1093/nar/gkac998 (2023).