Classification of the Clinical Images for Benign and Malignant Cutaneous Tumors Using a Deep Learning Algorithm.

Seung Seog Han, Myoung Shin Kim, Woohyung Lim, Gyeong Hun Park, Ilwoo Park, Sung Eun Chang
Author Information
  1. Seung Seog Han: I Dermatology Clinic, Seoul, Korea.
  2. Myoung Shin Kim: Department of Dermatology, Sanggye Paik Hospital, Inje University College of Medicine, Seoul, Korea.
  3. Woohyung Lim: SK Telecom, Human Machine Interface Technology Laboratory, Seoul, Korea.
  4. Gyeong Hun Park: Department of Dermatology, Dongtan Sacred Heart Hospital, Hallym University College of Medicine, Dongtan, Korea.
  5. Ilwoo Park: Department of Radiology, Chonnam National University Medical School and Hospital, Gwangju, Korea.
  6. Sung Eun Chang: Department of Dermatology, Asan Medical Center, Ulsan University College of Medicine, Seoul, Korea. Electronic address: csesnumd@gmail.com.

Abstract

We tested the use of a deep learning algorithm to classify clinical images of 12 skin diseases: basal cell carcinoma, squamous cell carcinoma, intraepithelial carcinoma, actinic keratosis, seborrheic keratosis, malignant melanoma, melanocytic nevus, lentigo, pyogenic granuloma, hemangioma, dermatofibroma, and wart. A convolutional neural network (Microsoft ResNet-152 model; Microsoft Research Asia, Beijing, China) was fine-tuned with images from the training portion of the Asan dataset, the MED-NODE dataset, and atlas site images (19,398 images in total). The trained model was validated with the testing portions of the Asan, Hallym, and Edinburgh datasets. With the Asan dataset, the area under the curve for the diagnosis of basal cell carcinoma, squamous cell carcinoma, intraepithelial carcinoma, and melanoma was 0.96 ± 0.01, 0.83 ± 0.01, 0.82 ± 0.02, and 0.96 ± 0.00, respectively. With the Edinburgh dataset, the area under the curve for the corresponding diseases was 0.90 ± 0.01, 0.91 ± 0.01, 0.83 ± 0.01, and 0.88 ± 0.01, respectively. With the Hallym dataset, the sensitivity for basal cell carcinoma diagnosis was 87.1% ± 6.0%. On 480 Asan and Edinburgh images, the algorithm's performance was comparable to that of 16 dermatologists. To improve the performance of the convolutional neural network, additional images covering a broader range of ages and ethnicities should be collected.
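
The abstract reports a per-disease area under the curve and a sensitivity. As a rough illustration of what those two metrics measure, the sketch below computes a one-vs-rest AUC (using the rank-based Mann-Whitney formulation) and a sensitivity at a fixed threshold. The function names, the 0.5 threshold, and the toy labels and scores are all assumptions made up for this example; they are not taken from the study, and the abstract does not describe the authors' evaluation code.

```python
from typing import Sequence


def one_vs_rest_auc(labels: Sequence[str], scores: Sequence[float], positive: str) -> float:
    """AUC for one class against all others, via the Mann-Whitney U statistic."""
    pos = [s for y, s in zip(labels, scores) if y == positive]
    neg = [s for y, s in zip(labels, scores) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    # Fraction of (positive, negative) pairs ranked correctly; ties count half.
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def sensitivity(labels: Sequence[str], scores: Sequence[float],
                positive: str, threshold: float) -> float:
    """True-positive rate: share of positive cases scored at or above threshold."""
    pos_scores = [s for y, s in zip(labels, scores) if y == positive]
    if not pos_scores:
        raise ValueError("no positive examples")
    return sum(1 for s in pos_scores if s >= threshold) / len(pos_scores)


# Hypothetical toy data: true diagnoses and the model's score for "bcc".
toy_labels = ["bcc", "nevus", "bcc", "wart", "nevus", "bcc"]
toy_scores = [0.9, 0.2, 0.7, 0.4, 0.8, 0.3]

auc_bcc = one_vs_rest_auc(toy_labels, toy_scores, positive="bcc")
sens_bcc = sensitivity(toy_labels, toy_scores, positive="bcc", threshold=0.5)
print(f"AUC = {auc_bcc:.3f}, sensitivity = {sens_bcc:.3f}")
```

The AUC does not depend on a decision threshold, whereas sensitivity does, which is why the two figures in the abstract are not directly comparable.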

MeSH Terms

Adult
Aged
Aged, 80 and over
Area Under Curve
Biopsy
Datasets as Topic
Deep Learning
Diagnosis, Differential
False Positive Reactions
Female
Granuloma, Pyogenic
Humans
Image Processing, Computer-Assisted
Keratosis, Actinic
Keratosis, Seborrheic
Lentigo
Male
Middle Aged
Photography
Predictive Value of Tests
ROC Curve
Skin
Skin Neoplasms
Software
Warts
Young Adult

