Hidden population size estimation and diagnostics using two respondent-driven samples with applications in Armenia.

Brian J Kim, Lisa G Johnston, Trdat Grigoryan, Arshak Papoyan, Samvel Grigoryan, Katherine R McLaughlin
Author Information
  1. Brian J Kim: Joint Program in Survey Methodology, University of Maryland, College Park, Maryland, USA.
  2. Lisa G Johnston: Independent Consultant, LGJ Consultants, Inc., Valencia, Spain.
  3. Trdat Grigoryan: National Center for AIDS Prevention, Yerevan, Armenia.
  4. Arshak Papoyan: National Center for AIDS Prevention, Yerevan, Armenia.
  5. Samvel Grigoryan: National Center for AIDS Prevention, Yerevan, Armenia.
  6. Katherine R McLaughlin: Department of Statistics, Oregon State University, Corvallis, Oregon, USA.

Abstract

Estimating the size of hidden populations is essential to understand the magnitude of social and healthcare needs, risk behaviors, and disease burden. However, due to the hidden nature of these populations, they are difficult to survey, and there are no gold standard size estimation methods. Many different methods and variations exist, and diagnostic tools are needed to help researchers assess method-specific assumptions as well as compare across methods. Further, because many necessary mathematical assumptions are unrealistic for real survey implementation, assessment of how robust methods are to deviations from the stated assumptions is essential. We describe diagnostics and assess the performance of a new population size estimation method, capture-recapture with successive sampling population size estimation (CR-SS-PSE), which we apply to data from three years of studies from three cities and three hidden populations in Armenia. CR-SS-PSE relies on data from two sequential respondent-driven sampling surveys and extends the successive sampling population size estimation (SS-PSE) framework by using the number of individuals in the overlap between the two surveys and a model for the successive sampling process to estimate population size. We demonstrate that CR-SS-PSE is more robust to violations of successive sampling assumptions than SS-PSE. Further, we compare the CR-SS-PSE estimates to population size estimates from other common methods, including unique object and service multipliers, wisdom of the crowd, and two-source capture-recapture, to illustrate volatility across estimation methods.
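
As a rough illustration of how the overlap between two surveys carries information about population size, the sketch below implements the classical two-source capture-recapture estimators (Lincoln-Petersen and Chapman's bias-corrected variant), which the abstract lists among the comparison methods. It is not an implementation of CR-SS-PSE, which additionally models the successive sampling process. The function names and the survey counts are hypothetical and chosen only for illustration.

```python
def lincoln_petersen(n1, n2, m):
    """Classical two-source estimate N_hat = n1 * n2 / m.

    n1: number of individuals in the first survey
    n2: number of individuals in the second survey
    m:  number of individuals appearing in both surveys (the overlap)
    """
    if m <= 0:
        raise ValueError("overlap m must be positive")
    if m > min(n1, n2):
        raise ValueError("overlap m cannot exceed either sample size")
    return n1 * n2 / m


def chapman(n1, n2, m):
    """Chapman's bias-corrected variant: (n1 + 1)(n2 + 1) / (m + 1) - 1.

    Unlike Lincoln-Petersen, this stays finite when the overlap is zero.
    """
    if m < 0 or m > min(n1, n2):
        raise ValueError("overlap m must lie between 0 and min(n1, n2)")
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


if __name__ == "__main__":
    # Hypothetical counts, not taken from the Armenia surveys.
    n1, n2, m = 200, 150, 30
    print("Lincoln-Petersen:", lincoln_petersen(n1, n2, m))
    print("Chapman:", round(chapman(n1, n2, m), 1))
```

Both estimators assume a closed population and equal capture probabilities within each survey, assumptions that respondent-driven samples typically violate; this is the kind of limitation that motivates model-based approaches such as the one described above.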


MeSH Terms

Humans
Population Density
Armenia
Surveys and Questionnaires
Cities
Sampling Studies
