Phylogenetic inference of inter-population transmission rates for infectious diseases.
Skylar A Gay, Gregory Ellison, Jianing Xu, Jialin Yang, Yiliang Wei, Shaoyuan Wu, Lili Yu, Christopher C Whalen, Jonathan Arnold, Liang Liu
Author Information
Skylar A Gay: Institute of Bioinformatics, University of Georgia, 120 Green Street, Athens, GA 30602, United States.
Gregory Ellison: Department of Statistics, University of Georgia, 310 Herty Drive, Athens, GA 30602, United States.
Jianing Xu: Department of Statistics, University of Georgia, 310 Herty Drive, Athens, GA 30602, United States.
Jialin Yang: Department of Statistics, University of Georgia, 310 Herty Drive, Athens, GA 30602, United States.
Yiliang Wei: Jiangsu Key Laboratory of Phylogenomics and Comparative Genomics, Jiangsu International Joint Center of Genomics, School of Life Sciences, Jiangsu Normal University, 101 Shanghai Road, Xuzhou, Jiangsu 221116, China.
Shaoyuan Wu: Jiangsu Key Laboratory of Phylogenomics and Comparative Genomics, Jiangsu International Joint Center of Genomics, School of Life Sciences, Jiangsu Normal University, 101 Shanghai Road, Xuzhou, Jiangsu 221116, China.
Lili Yu: Department of Biostatistics, Epidemiology and Environmental Health Sciences, College of Public Health, Georgia Southern University, 1332 Southern Drive, Statesboro, GA 30677, United States.
Christopher C Whalen: Global Health Institute, Department of Epidemiology and Biostatistics, College of Public Health, University of Georgia, 100 Foster Road, Athens, GA 30602, United States.
Jonathan Arnold: Institute of Bioinformatics, University of Georgia, 120 Green Street, Athens, GA 30602, United States.
Liang Liu: Institute of Bioinformatics, University of Georgia, 120 Green Street, Athens, GA 30602, United States.
Estimating transmission rates is a challenging yet essential part of understanding and controlling the spread of infectious diseases. Various methods exist for estimating transmission rates, each with distinct assumptions, data needs, and constraints. This study introduces a novel phylogenetic approach called transRate, which integrates genetic information with traditional epidemiological approaches to estimate inter-population transmission rates. Under the multi-population susceptible-infected-recovered model, the method is statistically consistent as the sample size (i.e. the number of pathogen genomes) approaches infinity. Simulation analyses indicate that transRate can accurately estimate the transmission rate with a sample size of 200 to 400 pathogen genomes. Using transRate, we analyzed 40,028 high-quality sequences of SARS-CoV-2 in human hosts during the early pandemic. Our analysis uncovered significant transmission between populations even before widespread travel restrictions were implemented. transRate provides scientists and public health officials with insights that enhance their understanding of the pandemic's progression and aid in preparedness for future viral outbreaks. As public databases for genomic sequences continue to expand, transRate becomes increasingly valuable for tracking and mitigating the spread of infectious diseases.
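For readers unfamiliar with the multi-population susceptible-infected-recovered model named above, the following is a minimal illustrative sketch of such a forward model. It is not the transRate method and does not use genetic data. The function name `simulate_multipop_sir`, the forward-Euler integration, the frequency-dependent force of infection, and all rate values are assumptions made for illustration only. Entry `beta[p][q]` stands for a hypothetical rate at which infected hosts in population `q` infect susceptible hosts in population `p`, so the off-diagonal entries play the role of inter-population transmission rates.

```python
def simulate_multipop_sir(beta, gamma, s0, i0, r0, dt=0.1, steps=1000):
    """Forward-Euler sketch of a multi-population SIR model (illustrative only).

    beta[p][q]: assumed transmission rate from population q into population p.
    gamma: assumed recovery rate, shared by all populations.
    s0, i0, r0: initial susceptible, infected, recovered counts per population.
    """
    k = len(s0)
    s, i, r = list(s0), list(i0), list(r0)
    # Population sizes are constant because the sketch has no births or deaths.
    n = [s[p] + i[p] + r[p] for p in range(k)]
    for _ in range(steps):
        # Force of infection on population p, summed over all source populations q.
        force = [sum(beta[p][q] * i[q] / n[q] for q in range(k)) for p in range(k)]
        new_inf = [force[p] * s[p] * dt for p in range(k)]
        new_rec = [gamma * i[p] * dt for p in range(k)]
        for p in range(k):
            s[p] -= new_inf[p]
            i[p] += new_inf[p] - new_rec[p]
            r[p] += new_rec[p]
    return s, i, r


# Hypothetical two-population example: the outbreak starts in population 0 only.
beta_linked = [[0.3, 0.0], [0.05, 0.3]]    # population 0 can seed population 1
beta_isolated = [[0.3, 0.0], [0.0, 0.3]]   # no inter-population transmission
linked = simulate_multipop_sir(beta_linked, 0.1, [990, 1000], [10, 0], [0, 0])
isolated = simulate_multipop_sir(beta_isolated, 0.1, [990, 1000], [10, 0], [0, 0])
```

In this sketch, population 1 is infected only when the off-diagonal rate `beta[1][0]` is nonzero, which is the kind of quantity an inter-population transmission rate describes.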