Characterization of novel human endogenous retrovirus structures on chromosomes 6 and 7.

Nicholas Pasternack, Ole Paulsen, Avindra Nath
Author Information
  1. Nicholas Pasternack: Section of Infections of the Nervous System, National Institute of Neurological Disorders and Stroke (NINDS), National Institutes of Health (NIH), Bethesda, MD, United States.
  2. Ole Paulsen: Department of Physiology, Development and Neuroscience, University of Cambridge, Cambridge, United Kingdom.
  3. Avindra Nath: Section of Infections of the Nervous System, National Institute of Neurological Disorders and Stroke (NINDS), National Institutes of Health (NIH), Bethesda, MD, United States.

Abstract

Human endogenous retroviruses (HERVs) represent nearly 8% of the human genome. Of these, HERV-K subtype HML-2 is a transposable element that plays a critical role in embryonic development and in the pathogenesis of several diseases. Quantifying and characterizing the multiple HML-2 insertions in the human genome has been challenging because of their size, their sequence homology with each other, and their repetitive nature. Using long-read DNA sequencing, we examined a cohort of 222 individuals for HML-2 proviruses at 6q14.1 and 7p22.1a, two loci that are capable of producing full-length viral proteins and have previously been implicated in several cancers, autoimmune disorders, and neurodegenerative diseases. Although the reference genome suggests that these two loci are structurally dissimilar, we found that at both loci about 5% of individuals carry a unique tandem repeat-like sequence (three long terminal repeats sandwiching two internal, potentially protein-coding sequences), while most individuals have the standard proviral structure (one internal region sandwiched by two long terminal repeats). Moreover, both proviruses can produce full-length, or nearly full-length, HERV-K proteins in multiple transcriptional orientations. The amino acid sequences from the two loci in the same transcriptional orientation share sequence homology with each other. These results demonstrate a clear, previously unreported relationship between HML-2 loci 6q14.1 and 7p22.1a and highlight the utility of long-read sequencing for studying repetitive elements. Future studies are needed to establish whether these polymorphisms determine genetic susceptibility to the diseases associated with these loci.

Grants

  1. P30 AG072946/NIA NIH HHS
  2. Z01 AG000949/Intramural NIH HHS
  3. Z01 ES101986/Intramural NIH HHS
  4. ZIA NS003154/Intramural NIH HHS
