Introduction

In prognosis and survival studies, an important goal is to identify multi-biomarker panels with predictive power from molecular characteristics or clinical observations. Such analysis is often challenged by censored, small-sample-size but high-dimensional genomic profiles or clinical data, so sophisticated models and algorithms are in pressing need.

In this study, we propose a novel Area Under Curve (AUC) optimization method for multi-biomarker panel identification, named Nearest Centroid Classifier for AUC optimization (NCC-AUC). The method is motivated by the connection between the AUC score used to evaluate classification accuracy and Harrell's concordance index in survival analysis. This connection allows us to convert the survival time regression problem into a binary classification problem. An optimization model is then formulated to directly maximize AUC while minimizing the number of selected features, constructing a predictor in the nearest centroid classifier framework.

NCC-AUC is validated on both genomic data of breast cancer and clinical data of stage IB Non-Small-Cell Lung Cancer (NSCLC). On the genomic data, NCC-AUC outperforms Support Vector Machine (SVM) and Support Vector Machine-based Recursive Feature Elimination (SVM-RFE) in classification accuracy, and it tends to select a multi-biomarker panel with low average redundancy and enriched biological meaning. NCC-AUC also separates low- and high-risk cohorts more significantly than the widely used Cox model (Cox proportional-hazards regression model) and L1-Cox model (L1-penalized Cox model). These performance gains are robust across 5 subtypes of breast cancer. Further, on an independent clinical dataset, NCC-AUC outperforms SVM and SVM-RFE in predictive accuracy and is consistently better than the Cox model and L1-Cox model in grouping patients into high- and low-risk categories.

In summary, NCC-AUC provides a rigorous optimization framework to systematically reveal multi-biomarker panels from genomic and clinical data. It can serve as a useful tool to identify prognostic biomarkers for survival analysis.

NCC-AUC is available at http://doc.aporc.org/wiki/NCC-AUC. Contact: ywang@amss.ac.cn. Supplementary data are available at Bioinformatics online.
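The link between AUC and Harrell's concordance index mentioned in the abstract can be illustrated with a small sketch. Both quantities count correctly ordered pairs: the concordance index is the fraction of comparable patient pairs in which the patient who failed earlier received the higher risk score, which has the same form as AUC over positive/negative pairs. The code below is only an illustration of that pairwise view, not the NCC-AUC implementation; the function names `comparable_pairs` and `concordance_index` and the toy data are assumptions made for this example.

```python
from itertools import combinations


def comparable_pairs(times, events):
    """Return (a, b) pairs where subject a is known to fail before subject b.

    A pair is comparable only if the subject with the shorter observed time
    had an actual event (events[a] is truthy); pairs with tied times or with
    the earlier subject censored are skipped.
    """
    pairs = []
    for i, j in combinations(range(len(times)), 2):
        a, b = (i, j) if times[i] < times[j] else (j, i)
        if times[a] < times[b] and events[a]:
            pairs.append((a, b))
    return pairs


def concordance_index(times, events, risk):
    """Fraction of comparable pairs ordered correctly by the risk score.

    Each comparable pair acts like one binary classification instance:
    it is 'correct' when the earlier-failing subject has the higher risk.
    Ties in risk count as half, as in the usual AUC convention.
    """
    pairs = comparable_pairs(times, events)
    if not pairs:
        raise ValueError("no comparable pairs")
    score = 0.0
    for a, b in pairs:
        if risk[a] > risk[b]:
            score += 1.0
        elif risk[a] == risk[b]:
            score += 0.5
    return score / len(pairs)


if __name__ == "__main__":
    # Hypothetical toy cohort: survival times, event flags (0 = censored),
    # and risk scores from some predictor.
    times = [2, 4, 6, 8]
    events = [1, 1, 0, 1]
    risk = [0.9, 0.3, 0.1, 0.4]
    print(concordance_index(times, events, risk))
```

In this toy cohort the third subject is censored, so the pair formed with the fourth subject is not comparable and is left out of the count.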

Publications

  1. NCC-AUC: an AUC optimization method to identify multi-biomarker panel for cancer prognosis from genomic and clinical data.
    Zou M, Liu Z, Zhang XS, Wang Y, 2015-10-01 - Bioinformatics (Oxford, England)

Credits

  1. Meng Zou
    Developer

    Academy of Mathematics and Systems Science, National Center for Mathematics and Interdisciplinary Sciences, China

  2. Zhaoqi Liu
    Developer

    Academy of Mathematics and Systems Science, National Center for Mathematics and Interdisciplinary Sciences, China

  3. Xiang-Sun Zhang
    Developer

    Academy of Mathematics and Systems Science, National Center for Mathematics and Interdisciplinary Sciences, China

  4. Yong Wang
    Investigator

    Academy of Mathematics and Systems Science, National Center for Mathematics and Interdisciplinary Sciences, China

Summary

Accession: BT006520
Tool Type: Application
Category:
Platforms: Linux/Unix
Technologies:
User Interface: Terminal Command Line
Download Count: 0
Country/Region: China
Submitted By: Yong Wang