Alexander Brenner, Jutta Esser, Franziska Schuler, Julian Varghese, Frieder Schaumburg
Urine samples are frequently analyzed in microbiology laboratories, but a large proportion of them are culture-negative. The aim of this study was to test whether positive urine cultures can be predicted from routine flow cytometric data. Urine samples (n = 1325) were used for a training dataset (n = 1032) and three independent test datasets (n = 93-100 samples each) that were collected three months apart. Predictors from flow cytometry were total counts per µl of bacteria, erythrocytes, yeast-like cells, hyaline casts, crystals, leukocytes, squamous epithelial cells, non-hyaline casts and non-squamous epithelial cells, in addition to age, sex and type of urine sample. Labels were positive culture and detection of clinically relevant uropathogens. Three classifiers (decision tree, random forest classifier, CatBoost) were 5-fold cross-validated on the training dataset to select an optimized model with ≥95% sensitivity. The optimized model was trained on the complete training dataset and evaluated on the three independent test datasets. In total, 72.5% (960/1325) of the samples were culture-positive, with a predominance of Escherichia coli (n = 295). CatBoost outperformed the other classifiers in terms of balanced accuracy (training data) and was selected as the classifier for predictions. With optimized hyperparameters, the balanced accuracy was 62-74% for the prediction of a positive culture (test data), and the sensitivity was stable over a period of six months (94-96%; negative predictive value [NPV]: 67-77%; positive predictive value [PPV]: 78-81%). For the prediction of uropathogens, the balanced accuracy was 57-63% with a stable sensitivity (95-100%; NPV: 83-100%; PPV: 48-59%). In conclusion, the machine learning algorithms showed high sensitivity for detecting positive urine cultures.
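The reported metrics (sensitivity, balanced accuracy, PPV, NPV) and the ≥95% sensitivity target can be illustrated with a minimal, dependency-free sketch. The abstract does not state how the decision threshold was chosen, so the threshold search below is only one plausible assumption, and the names `binary_metrics` and `threshold_for_sensitivity` are hypothetical helpers rather than part of the study's code.

```python
from typing import NamedTuple, Sequence


class Metrics(NamedTuple):
    sensitivity: float
    specificity: float
    balanced_accuracy: float
    ppv: float
    npv: float


def _ratio(numerator: int, denominator: int) -> float:
    """Safe division; returns NaN when the denominator is zero."""
    return numerator / denominator if denominator else float("nan")


def binary_metrics(y_true: Sequence[int], y_pred: Sequence[int]) -> Metrics:
    """Compute the metrics named in the abstract from 0/1 labels and predictions."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = _ratio(tp, tp + fn)
    specificity = _ratio(tn, tn + fp)
    return Metrics(
        sensitivity=sensitivity,
        specificity=specificity,
        # Balanced accuracy is the mean of sensitivity and specificity.
        balanced_accuracy=(sensitivity + specificity) / 2,
        ppv=_ratio(tp, tp + fp),
        npv=_ratio(tn, tn + fn),
    )


def threshold_for_sensitivity(
    y_true: Sequence[int], scores: Sequence[float], target: float = 0.95
) -> float:
    """Return the highest score threshold whose sensitivity is >= target.

    Assumption: a sample is predicted positive when score >= threshold.
    """
    n_pos = sum(1 for t in y_true if t == 1)
    if n_pos == 0:
        raise ValueError("at least one positive sample is required")
    # Only scores of true positives can change the sensitivity.
    candidates = sorted({s for t, s in zip(y_true, scores) if t == 1}, reverse=True)
    for thr in candidates:
        detected = sum(1 for t, s in zip(y_true, scores) if t == 1 and s >= thr)
        if detected / n_pos >= target:
            return thr
    return candidates[-1]  # lowest positive score always gives sensitivity 1.0
```

In a workflow like the one described, such a threshold would be derived from cross-validated scores on the training data and then applied unchanged to each test dataset, which is why the sensitivity on unseen data can fall slightly below the target (for example 94%).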