Privacy-Preserving Artificial Intelligence Techniques in Biomedicine.
Reihaneh Torkzadehmahani, Reza Nasirigerdeh, David B Blumenthal, Tim Kacprowski, Markus List, Julian Matschinske, Julian Spaeth, Nina Kerstin Wenke, Jan Baumbach
Author Information
Reihaneh Torkzadehmahani: Institute for Artificial Intelligence in Medicine and Healthcare, Technical University of Munich, Munich, Germany.
Reza Nasirigerdeh: Institute for Artificial Intelligence in Medicine and Healthcare, Technical University of Munich, Munich, Germany.
David B Blumenthal: Department of Artificial Intelligence in Biomedical Engineering (AIBE), Friedrich-Alexander University Erlangen-Nürnberg (FAU), Erlangen, Germany.
Tim Kacprowski: Division of Data Science in Biomedicine, Peter L. Reichertz Institute for Medical Informatics of TU Braunschweig and Medical School Hannover, Braunschweig, Germany.
Markus List: Chair of Experimental Bioinformatics, Technical University of Munich, Munich, Germany.
BACKGROUND: Artificial intelligence (AI) has been successfully applied in numerous scientific domains. In biomedicine, AI has already shown tremendous potential, e.g., in the interpretation of next-generation sequencing data and in the design of clinical decision support systems.
OBJECTIVES: However, training an AI model on sensitive data raises concerns about the privacy of individual participants. For example, the summary statistics of a genome-wide association study can be used to determine the presence or absence of an individual in a given dataset. This considerable privacy risk has led to restrictions in accessing genomic and other biomedical data, which is detrimental for collaborative research and impedes scientific progress. Hence, there has been a substantial effort to develop AI methods that can learn from sensitive data while protecting individuals' privacy.
METHOD: This paper provides a structured overview of recent advances in privacy-preserving AI techniques in biomedicine. It places the most important state-of-the-art approaches within a unified taxonomy and discusses their strengths, limitations, and open problems.
CONCLUSION: As the most promising direction, we suggest combining federated machine learning, as the more scalable approach, with additional privacy-preserving techniques. This would combine their advantages to provide privacy guarantees in a distributed way for biomedical applications. Nonetheless, more research is necessary, as hybrid approaches pose new challenges such as additional network or computation overhead.
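To make the conclusion concrete, the following is a minimal toy sketch (not the paper's method) of the hybrid direction it suggests: federated averaging, where clients train on their private data and a server averages the resulting model weights, combined with Gaussian noise as a crude stand-in for an additional privacy-preserving technique such as differential privacy. All function names, the least-squares objective, and the noise scale are hypothetical choices for illustration.

```python
# Illustrative sketch only: toy federated averaging with server-side Gaussian
# noise. All names and parameters here are hypothetical, not from the paper.
import numpy as np

rng = np.random.default_rng(0)

def local_update(global_weights, local_data, lr=0.1):
    """One gradient-descent step on a client's private data
    (toy least-squares objective; raw data never leaves the client)."""
    X, y = local_data
    grad = X.T @ (X @ global_weights - y) / len(y)
    return global_weights - lr * grad

def federated_round(global_weights, clients, noise_scale=0.01):
    """Each client trains locally; the server averages the returned
    weights and adds Gaussian noise (a crude nod to differential privacy)."""
    updates = [local_update(global_weights, data) for data in clients]
    averaged = np.mean(updates, axis=0)
    return averaged + rng.normal(0.0, noise_scale, size=averaged.shape)

# Toy setup: three clients, each holding its own private regression data.
true_w = np.array([1.0, -2.0])
clients = []
for _ in range(3):
    X = rng.normal(size=(50, 2))
    clients.append((X, X @ true_w))

w = np.zeros(2)
for _ in range(100):
    w = federated_round(w, clients)
```

After the rounds complete, `w` approaches `true_w` up to the noise floor, illustrating the trade-off the abstract mentions: the added noise protects individual contributions but perturbs the shared model, and each round incurs extra communication between clients and server.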