A deep learning approach to private data sharing of medical images using conditional generative adversarial networks (GANs).

Hanxi Sun, Jason Plawinski, Sajanth Subramaniam, Amir Jamaludin, Timor Kadir, Aimee Readie, Gregory Ligozio, David Ohlssen, Mark Baillie, Thibaud Coroller
Author Information
  1. Hanxi Sun: Department of Statistics, Purdue University, West Lafayette, IN, United States of America.
  2. Jason Plawinski: Novartis Pharmaceutical Corporation, East Hanover, New Jersey, United States of America.
  3. Sajanth Subramaniam: Novartis Pharmaceutical Corporation, East Hanover, New Jersey, United States of America.
  4. Amir Jamaludin: Oxford Big Data Institute, Oxford, United Kingdom.
  5. Timor Kadir: Plexalis Ltd, Oxford, United Kingdom.
  6. Aimee Readie: Novartis Pharmaceutical Corporation, East Hanover, New Jersey, United States of America.
  7. Gregory Ligozio: Novartis Pharmaceutical Corporation, East Hanover, New Jersey, United States of America.
  8. David Ohlssen: Novartis Pharmaceutical Corporation, East Hanover, New Jersey, United States of America.
  9. Mark Baillie: Novartis Pharmaceutical Corporation, East Hanover, New Jersey, United States of America.
  10. Thibaud Coroller: Novartis Pharmaceutical Corporation, East Hanover, New Jersey, United States of America.

Abstract

Clinical data sharing can facilitate data-driven scientific research, allowing a broader range of questions to be addressed and thereby leading to greater understanding and innovation. However, sharing biomedical data can put sensitive personal information at risk. This risk is usually addressed by data anonymization, which is a slow and expensive process. An alternative to anonymization is the construction of a synthetic dataset that behaves similarly to the real clinical data but preserves patient privacy. As part of a collaboration between Novartis and the Oxford Big Data Institute, a synthetic dataset was generated based on images from COSENTYX® (secukinumab) ankylosing spondylitis (AS) clinical studies. An auxiliary classifier Generative Adversarial Network (ac-GAN) was trained to generate synthetic magnetic resonance images (MRIs) of vertebral units (VUs), conditioned on the VU location (cervical, thoracic or lumbar). Here, we present a method for generating a synthetic dataset and conduct an in-depth analysis of its properties along three key metrics: image fidelity, sample diversity and dataset privacy.
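The abstract does not give the training objective, so the following is only a minimal sketch of the standard auxiliary classifier GAN losses as usually formulated, not the authors' implementation. It assumes the discriminator returns a real/fake logit plus one logit per VU location; the function names, the batch layout and the use of raw logits are all illustrative assumptions.

```python
import math

# Class labels named in the abstract; the index order is an assumption.
VU_LOCATIONS = ["cervical", "thoracic", "lumbar"]


def log_sigmoid(x):
    """Numerically stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def log_softmax(logits):
    """Numerically stable log-softmax over a list of class logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(v - m) for v in logits))
    return [v - log_z for v in logits]


def _mean(values):
    values = list(values)
    return sum(values) / len(values)


def acgan_log_likelihoods(real_batch, fake_batch):
    """Return (L_S, L_C) for two batches of discriminator outputs.

    Each item is (source_logit, class_logits, label_index), where a
    positive source_logit means "looks real" (hypothetical layout).
    """
    # L_S: log-likelihood of the correct source (real vs. generated).
    l_s = _mean(log_sigmoid(s) for s, _, _ in real_batch) + _mean(
        log_sigmoid(-s) for s, _, _ in fake_batch  # log(1 - sigmoid(s))
    )
    # L_C: log-likelihood of the correct VU location on both batches.
    l_c = _mean(log_softmax(c)[y] for _, c, y in real_batch) + _mean(
        log_softmax(c)[y] for _, c, y in fake_batch
    )
    return l_s, l_c


def acgan_losses(real_batch, fake_batch):
    """Losses to minimise: D maximises L_S + L_C, G maximises L_C - L_S."""
    l_s, l_c = acgan_log_likelihoods(real_batch, fake_batch)
    d_loss = -(l_s + l_c)
    g_loss = -(l_c - l_s)
    return d_loss, g_loss


# Toy usage: one real and one generated image, both labelled "lumbar".
lumbar = VU_LOCATIONS.index("lumbar")
real = [(2.0, [0.1, 0.2, 3.0], lumbar)]
fake = [(-1.5, [0.0, 0.5, 2.0], lumbar)]
d_loss, g_loss = acgan_losses(real, fake)
```

Conditioning enters through the class term: the generator is rewarded when its output is classified as the requested location, which is what lets one model produce cervical, thoracic or lumbar VUs on demand.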

MeSH Terms

Humans
Deep Learning
Academies and Institutes
Benchmarking
Big Data
Information Dissemination
Image Processing, Computer-Assisted
