The SAFE procedure: a practical stopping heuristic for active learning-based screening in systematic reviews and meta-analyses.

Josien Boetje, Rens van de Schoot
Author Information
  1. Josien Boetje: Research Group Digital Ethics, Knowledge Center Learning and Innovation (LENI), Archimedes Institute, HU University of Applied Sciences Utrecht, Utrecht, The Netherlands. josien.boetje@hu.nl.
  2. Rens van de Schoot: Department of Methodology and Statistics, Faculty of Social and Behavioral Sciences, Utrecht University, Utrecht, The Netherlands.

Abstract

Active learning has become an increasingly popular method for screening large amounts of data in systematic reviews and meta-analyses. The active learning process continually improves its predictions on the remaining unlabeled records, with the goal of identifying all relevant records as early as possible. However, determining the optimal point at which to stop the active learning process is a challenge. The cost of additional labeling of records by the reviewer must be balanced against the cost of erroneous exclusions. This paper introduces the SAFE procedure, a practical and conservative set of stopping heuristics that offers a clear guideline for determining when to end the active learning process in screening software like ASReview. The eclectic mix of stopping heuristics helps to minimize the risk of missing relevant papers in the screening process. The proposed procedure balances the costs of continued screening against the risk of missing relevant records, providing a practical way for reviewers to make informed decisions on when to stop screening. Although active learning can significantly enhance the quality and efficiency of screening, this method may be more applicable to certain types of datasets and problems. Ultimately, the decision to stop the active learning process depends on careful consideration of the trade-off between the cost of additional record labeling and the potential errors of the current model for the specific dataset and context.
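
The abstract does not spell out the individual heuristics that make up the SAFE procedure, so the following is only a minimal sketch of what a conservative stopping check could look like in general. It assumes two hypothetical conditions that must both hold: a minimum share of the dataset has been screened, and the most recent run of labels contains no relevant records. The function name `should_stop` and both threshold defaults are illustrative assumptions, not values taken from the paper or from ASReview.

```python
from typing import Sequence


def should_stop(
    labels: Sequence[int],
    total_records: int,
    min_fraction_screened: float = 0.1,   # hypothetical threshold
    consecutive_irrelevant: int = 50,     # hypothetical threshold
) -> bool:
    """Return True only when both illustrative stopping conditions hold.

    labels: reviewer decisions in screening order (1 = relevant, 0 = irrelevant).
    total_records: size of the full dataset, labeled plus unlabeled.
    """
    if total_records <= 0:
        raise ValueError("total_records must be positive")

    # Condition 1: a minimum share of the dataset has been screened.
    screened_enough = len(labels) / total_records >= min_fraction_screened

    # Condition 2: the last N decisions were all "irrelevant".
    if len(labels) < consecutive_irrelevant:
        return False
    recent = labels[-consecutive_irrelevant:]
    no_recent_hits = not any(recent)

    return screened_enough and no_recent_hits


# Example: 5 early decisions followed by 50 irrelevant records in a row.
decisions = [1, 0, 1, 1, 0] + [0] * 50
print(should_stop(decisions, total_records=500))   # 11% screened
print(should_stop(decisions, total_records=1000))  # 5.5% screened
```

Requiring both conditions at once reflects the conservative trade-off described above: a long run of irrelevant records alone does not end screening if too little of the dataset has been seen, which lowers the risk of erroneous exclusions at the cost of extra labeling.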

MeSH Terms

Humans
Heuristics
Problem-Based Learning
Systematic Reviews as Topic
Software
