MOTIVATION: Breast cancer develops in breast tissue and, after skin cancer, is the most commonly diagnosed cancer in women in the United States. Because early diagnosis is imperative to prevent its progression, many machine learning models have been developed in recent years to automate the histopathological classification of the different types of carcinomas. However, many of these models do not scale to large datasets.
RESULTS: In this study, we propose a novel Primal-Dual Multi-Instance Support Vector Machine to determine which tissue segments in an image show signs of abnormality. We derive an efficient optimization algorithm for the proposed objective that bypasses the quadratic programming and least-squares problems commonly used to optimize Support Vector Machine models. The proposed method is computationally efficient and therefore scales to large datasets. We applied our method to the public BreaKHis dataset and achieved promising prediction performance and scalability for histopathological classification.
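To illustrate the multi-instance setting described above, the following is a minimal sketch of how a linear scorer could flag individual tissue segments (instances) within an image (bag). It assumes the standard multi-instance convention that a bag is positive when at least one instance scores above zero. The function names, weights, and feature values are hypothetical, and the sketch does not reproduce the proposed primal-dual optimization algorithm, which is not specified in this abstract.

```python
from typing import List, Sequence, Tuple


def score_instances(bag: Sequence[Sequence[float]],
                    weights: Sequence[float],
                    bias: float) -> List[float]:
    """Return the linear score w.x + b for every instance in a bag."""
    return [sum(w * x for w, x in zip(weights, instance)) + bias
            for instance in bag]


def classify_bag(bag: Sequence[Sequence[float]],
                 weights: Sequence[float],
                 bias: float) -> Tuple[int, List[int]]:
    """Label a bag and list the indices of the instances that triggered it.

    Assumption (standard multi-instance convention, not taken from the
    abstract): a bag is positive (+1) if any instance has a positive score,
    otherwise negative (-1).
    """
    scores = score_instances(bag, weights, bias)
    flagged = [i for i, s in enumerate(scores) if s > 0]
    label = 1 if flagged else -1
    return label, flagged


# Hypothetical 2-D features for three segments of one image.
example_bag = [[0.2, 0.1], [2.0, 0.5], [0.0, 1.0]]
example_weights = [1.0, -1.0]  # hypothetical, already-trained weights
example_bias = -0.5

label, flagged_segments = classify_bag(example_bag, example_weights, example_bias)
```

In this sketch, `flagged_segments` holds the indices of the segments that the scorer marks as abnormal, which is the kind of instance-level output a multi-instance model provides in addition to the image-level label.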
AVAILABILITY AND IMPLEMENTATION: Software is publicly available at: https://1drv.ms/u/s!AiFpD21bgf2wgRLbQq08ixD0SgRD?e=OpqEmY.
SUPPLEMENTARY INFORMATION: Supplementary data are available at Bioinformatics online.