Increasing the efficiency of study selection for systematic reviews using prioritization tools and a single-screening approach.
Siw Waffenschmidt, Wiebke Sieben, Thomas Jakubeit, Marco Knelangen, Inga Overesch, Stefanie Bühn, Dawid Pieper, Nicole Skoetz, Elke Hausner
Author Information
Siw Waffenschmidt: Institute for Quality and Efficiency in Health Care, Cologne, Germany. siw.waffenschmidt@iqwig.de.
Wiebke Sieben: Institute for Quality and Efficiency in Health Care, Cologne, Germany.
Thomas Jakubeit: Institute for Quality and Efficiency in Health Care, Cologne, Germany.
Marco Knelangen: Institute for Quality and Efficiency in Health Care, Cologne, Germany.
Inga Overesch: Institute for Quality and Efficiency in Health Care, Cologne, Germany.
Stefanie Bühn: Institute for Research in Operative Medicine, Herdecke University, Witten, Germany.
Dawid Pieper: Institute for Research in Operative Medicine, Herdecke University, Witten, Germany.
Nicole Skoetz: Evidence-Based Medicine, Department I of Internal Medicine, Faculty of Medicine, University Hospital Cologne, University of Cologne, Cologne, Germany.
Elke Hausner: Institute for Quality and Efficiency in Health Care, Cologne, Germany.
BACKGROUND: Systematic literature screening is a key component of systematic reviews. However, it is resource intensive, as generally two persons screen a vast number of search results independently of each other (double screening). To develop approaches for increasing efficiency, we tested the use of text mining to prioritize search results as well as the involvement of only one person (single screening) in the study selection process.
METHOD: Our study is based on health technology assessments (HTAs) of drug and non-drug interventions. Based on a sample size calculation, we consecutively included 11 searches, resulting in 33 study selection processes. Of the three screeners for each search, two used screening tools with prioritization (Rayyan, EPPI Reviewer) and one used a tool without prioritization. For each prioritization tool, we investigated the proportion of citations classified as relevant at three cut-offs or STOP criteria (after screening 25%, 50% and 75% of the citation set). For each STOP criterion, we measured sensitivity (number of correctly identified relevant studies divided by the total number of relevant studies in the study pool). In addition, we determined the number of relevant studies identified per single screening round and investigated whether missed studies were relevant to the HTA conclusion.
RESULTS: Overall, EPPI Reviewer performed better than Rayyan and identified the vast majority of relevant citations (88% versus 66% for Rayyan) after screening half of the citation set. As long as additional information sources were screened, a single-screening approach was sufficient to identify all studies relevant to the HTA conclusion. Although many relevant publications (n = 63) and studies (n = 29) were incorrectly excluded, ultimately only 5 studies could not be identified at all, and these were confined to 2 of the 11 searches (1 study in one search, 4 studies in the other). However, their omission did not change the overall conclusion in any HTA.
CONCLUSIONS: EPPI Reviewer helped to identify relevant citations earlier in the screening process than Rayyan. Single screening would have been sufficient to identify all studies relevant to the HTA conclusion. However, this requires screening of further information sources. It also needs to be considered that the credibility of an HTA may be questioned if studies are missing, even if they are not relevant to the HTA conclusion.
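The sensitivity measure at each STOP criterion described in the METHOD section can be illustrated with a minimal sketch. It assumes that a prioritization tool returns citations in ranked order and that each citation carries a binary relevance label from the final study pool. The function name `sensitivity_at_cutoffs`, the rounding-up rule for the cut-off position, and the example data are hypothetical and are not taken from the study.

```python
import math
from typing import Dict, Sequence


def sensitivity_at_cutoffs(
    ranked_relevance: Sequence[bool],
    cutoffs: Sequence[float] = (0.25, 0.50, 0.75),
) -> Dict[float, float]:
    """Sensitivity after screening the first fraction of a ranked citation set.

    ranked_relevance: relevance labels in the order the tool presents them
                      (True = relevant according to the final study pool).
    cutoffs:          fractions of the citation set screened before stopping.
    """
    total_relevant = sum(ranked_relevance)
    if total_relevant == 0:
        raise ValueError("sensitivity is undefined without relevant items")

    results = {}
    for cutoff in cutoffs:
        # Assumption: the cut-off position is rounded up to a whole citation.
        n_screened = math.ceil(cutoff * len(ranked_relevance))
        found = sum(ranked_relevance[:n_screened])
        # Sensitivity = relevant items found / all relevant items.
        results[cutoff] = found / total_relevant
    return results


if __name__ == "__main__":
    # Hypothetical ranking of 8 citations, 3 of which are relevant.
    example = [True, False, True, False, False, True, False, False]
    for cutoff, value in sensitivity_at_cutoffs(example).items():
        print(f"STOP at {cutoff:.0%}: sensitivity = {value:.2f}")
```

A tool that prioritizes well pushes relevant citations toward the front of the ranking, so its sensitivity approaches 1 at an earlier cut-off. That is the sense in which the abstract compares the two tools at the 50% cut-off.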