Scaling multi-instance support vector machine to breast cancer detection on the BreaKHis dataset.

Hoon Seo, Lodewijk Brand, Lucia Saldana Barco, Hua Wang
Author Information
  1. Hoon Seo: Department of Computer Science, Colorado School of Mines, Golden, CO 80401, USA.
  2. Lodewijk Brand: Department of Computer Science, Colorado School of Mines, Golden, CO 80401, USA.
  3. Lucia Saldana Barco: Department of Computer Science, Colorado School of Mines, Golden, CO 80401, USA.
  4. Hua Wang: Department of Computer Science, Colorado School of Mines, Golden, CO 80401, USA.

Abstract

MOTIVATION: Breast cancer develops in breast tissue and, after skin cancer, is the most commonly diagnosed cancer among women in the United States. Because early diagnosis is imperative to prevent breast cancer progression, many machine learning models have been developed in recent years to automate the histopathological classification of the different types of carcinomas. However, many of these models do not scale to large datasets.
RESULTS: In this study, we propose a novel Primal-Dual Multi-Instance Support Vector Machine that determines which tissue segments in an image show signs of abnormality. We derive an efficient optimization algorithm for the proposed objective that bypasses the quadratic programming and least-squares problems commonly employed to optimize Support Vector Machine models. The proposed method is computationally efficient and therefore scales to large datasets. We applied our method to the public BreaKHis dataset and achieved promising prediction performance and scalability for histopathological classification.
AVAILABILITY AND IMPLEMENTATION: Software is publicly available at: https://1drv.ms/u/s!AiFpD21bgf2wgRLbQq08ixD0SgRD?e=OpqEmY.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.
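
The multi-instance setting described under RESULTS can be illustrated with a minimal sketch. It is not the authors' primal-dual algorithm, whose details are not given in this abstract; it is a generic max-pooling multi-instance linear SVM trained by plain subgradient descent. Each image is treated as a "bag" of tissue-segment feature vectors, the bag score is the highest instance score, and the highest-scoring instance is the segment flagged as abnormal. All function names, hyperparameters, and the synthetic data are hypothetical and chosen only for illustration.

```python
import numpy as np

def bag_scores(bags, w, b):
    """Score each bag by the maximum linear score over its instances."""
    return np.array([np.max(X @ w + b) for X in bags])

def train_mi_svm(bags, labels, lam=0.01, lr=0.1, epochs=200):
    """Subgradient descent on the hinge loss of max-pooled bag scores.

    Generic illustration only (assumed setup), not the paper's solver.
    """
    n, d = len(bags), bags[0].shape[1]
    w, b = np.zeros(d), 0.0
    for _ in range(epochs):
        grad_w, grad_b = lam * w, 0.0
        for X, y in zip(bags, labels):
            s = X @ w + b
            k = int(np.argmax(s))        # "witness" instance of the bag
            if y * s[k] < 1.0:           # bag violates the margin
                grad_w = grad_w - y * X[k] / n
                grad_b = grad_b - y / n
        w = w - lr * grad_w
        b = b - lr * grad_b
    return w, b

def make_toy_bags(n_bags=40, n_inst=5, d=2, seed=0):
    """Synthetic bags: positive bags hide one shifted instance at index 0."""
    rng = np.random.default_rng(seed)
    bags, labels = [], []
    for i in range(n_bags):
        X = rng.normal(0.0, 0.5, size=(n_inst, d))
        y = 1 if i % 2 == 0 else -1
        if y == 1:
            X[0] += 4.0                  # the single "abnormal" segment
        bags.append(X)
        labels.append(y)
    return bags, np.array(labels)

bags, labels = make_toy_bags()
w, b = train_mi_svm(bags, labels)
pred = np.where(bag_scores(bags, w, b) >= 0, 1, -1)
print("training accuracy:", (pred == labels).mean())
print("flagged segment in first bag:", int(np.argmax(bags[0] @ w + b)))
```

Only bag-level labels are used during training, yet the arg-max instance gives a segment-level indication, which is the property the abstract highlights. The per-epoch cost of this sketch is linear in the total number of instances; it makes no claim about the efficiency of the published method.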

MeSH Term

Algorithms
Breast Neoplasms
Female
Humans
Machine Learning
Software
Support Vector Machine