MalKinID: A classification model for identifying malaria parasite genealogical relationships using identity-by-descent.

Wesley Wong, Lea Wang, Stephen F Schaffner, Xue Li, Ian Cheeseman, Timothy J C Anderson, Ashley Vaughan, Michael Ferdig, Sarah K Volkman, Daniel L Hartl, Dyann F Wirth
Author Information
  1. Wesley Wong: Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Harvard University, Boston, MA 02115, USA.
  2. Lea Wang: Harvard College, Harvard University, Cambridge, MA 02138, USA.
  3. Stephen F Schaffner: Infectious Disease and Microbiome Program, Broad Institute, Cambridge, MA 02142, USA.
  4. Xue Li: Program in Disease Intervention and Prevention, Texas Biomedical Research Institute, San Antonio, TX 78227, USA.
  5. Ian Cheeseman: Program in Host Pathogen Interactions, Texas Biomedical Research Institute, San Antonio, TX 78227, USA.
  6. Timothy J C Anderson: Program in Disease Intervention and Prevention, Texas Biomedical Research Institute, San Antonio, TX 78227, USA.
  7. Ashley Vaughan: Center for Global Infectious Disease Research, Seattle Children's Research Institute, Seattle, WA 98105, USA.
  8. Michael Ferdig: Department of Biological Sciences, Eck Institute for Global Health, University of Notre Dame, Notre Dame, IN 46556, USA.
  9. Sarah K Volkman: Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Harvard University, Boston, MA 02115, USA.
  10. Daniel L Hartl: Department of Organismic and Evolutionary Biology, Harvard University, Cambridge, MA 02138, USA.
  11. Dyann F Wirth: Department of Immunology and Infectious Diseases, Harvard T. H. Chan School of Public Health, Harvard University, Boston, MA 02115, USA.

Abstract

Pathogen genomics is a powerful tool for tracking infectious disease transmission. In malaria, identity-by-descent is used to assess the genetic relatedness between parasites and has been applied to study transmission and importation. In theory, identity-by-descent can also distinguish genealogical relationships, which would make it possible to reconstruct transmission history or to identify parasites for quantitative-trait-locus (QTL) experiments. MalKinID (Malaria Kinship Identifier) is a new classification model designed to identify genealogical relationships among malaria parasites based on genome-wide identity-by-descent proportions and identity-by-descent segment distributions. MalKinID was calibrated to genomic data from 3 laboratory-based genetic crosses (yielding 440 parent-child and 9060 full-sibling comparisons). MalKinID identified lab-generated F1 progeny with >80% sensitivity and showed that 0.39 (95% CI 0.28, 0.49) of the second-generation progeny of an NF54 and NHP4026 cross were F1s and 0.56 (0.45, 0.67) were backcrosses of an F1 with the parental NF54 strain. In simulated outcrossed importations, MalKinID reconstructed genealogical history with high precision and sensitivity, with F1-scores (the harmonic mean of precision and sensitivity) exceeding 0.84. However, when importation involved inbreeding, such as during serial co-transmission, the precision and sensitivity of MalKinID declined, with F1-scores of 0.76 (0.56, 0.92) and 0.23 (0.0, 0.4) for parent-child and full-sibling relationships, respectively, and <0.05 for second-degree and third-degree relatives. Disentangling inbred relationships required adapting MalKinID to perform multisample comparisons. Genealogical inference has the most power when (1) outcrossing is the norm or (2) multisample comparisons based on a predefined pedigree are used. MalKinID lays the foundations for using identity-by-descent to track parasite transmission history and for separating progeny for QTL experiments.
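
The F1-score reported in the abstract is the harmonic mean of precision and sensitivity, and should not be confused with F1 progeny. The following is a minimal sketch of that calculation, not the authors' code: the function names and the true-positive, false-positive, and false-negative counts are hypothetical and chosen only for illustration.

```python
def precision_sensitivity(tp: int, fp: int, fn: int) -> tuple[float, float]:
    """Return (precision, sensitivity) from classification counts.

    tp: relationships correctly assigned to a class
    fp: relationships wrongly assigned to that class
    fn: relationships of that class that were missed
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, sensitivity


def f1_score(precision: float, sensitivity: float) -> float:
    """Harmonic mean of precision and sensitivity; 0.0 when both are zero."""
    total = precision + sensitivity
    if total == 0:
        return 0.0
    return 2 * precision * sensitivity / total


# Hypothetical counts for one relationship class (illustrative only).
p, s = precision_sensitivity(tp=8, fp=2, fn=2)
score = f1_score(p, s)
print(f"precision={p:.2f} sensitivity={s:.2f} F1={score:.2f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a class with high precision but low sensitivity (or the reverse) still receives a low F1-score.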

Grants

  1. OPP1156051/Bill and Melinda Gates Foundation
  2. R21 AI141843/NIAID NIH HHS
  3. C06 RR013556/NCRR NIH HHS
  4. P51 OD011133/NIH HHS
  5. 5P01AI127338-07/NIH HHS

MeSH Terms

Malaria
Models, Genetic
Humans
Animals
Crosses, Genetic
Pedigree
