Enhanced heart failure mortality prediction through model-independent hybrid feature selection and explainable machine learning.

Georgios Petmezas, Vasileios E Papageorgiou, Vassilios Vassilikos, Efstathios Pagourelias, Dimitrios Tachmatzidis, George Tsaklidis, Aggelos K Katsaggelos, Nicos Maglaveras
Author Information
  1. Georgios Petmezas: 2nd Department of Obstetrics and Gynecology, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece. Electronic address: petmezgs@auth.gr.
  2. Vasileios E Papageorgiou: Department of Mathematics, Aristotle University of Thessaloniki, Thessaloniki, Greece.
  3. Vassilios Vassilikos: 3rd Department of Cardiology, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece.
  4. Efstathios Pagourelias: 3rd Department of Cardiology, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece.
  5. Dimitrios Tachmatzidis: 3rd Department of Cardiology, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece.
  6. George Tsaklidis: Department of Mathematics, Aristotle University of Thessaloniki, Thessaloniki, Greece.
  7. Aggelos K Katsaggelos: Department of Electrical and Computer Engineering, Northwestern University, Evanston, IL, USA.
  8. Nicos Maglaveras: 2nd Department of Obstetrics and Gynecology, School of Medicine, Aristotle University of Thessaloniki, Thessaloniki, Greece.

Abstract

Heart failure (HF) remains a significant public health challenge with high mortality rates. Machine learning (ML) techniques offer a promising approach to predicting HF mortality, potentially improving clinical outcomes. However, the effectiveness of these techniques depends heavily on the quality and relevance of the features used. This study introduces a novel hybrid feature selection methodology that combines Extremely Randomized Trees (Extra-Trees) and non-linear correlation measures to enhance 1-year all-cause mortality prediction in HF patients using echocardiographic and key demographic data. Unlike existing feature selection methods, which are often tied to specific ML models and produce inconsistent feature sets across different algorithms, our proposed approach is model-independent, ensuring robustness and generalizability. Moreover, the optimal number of predictive features is identified through loss graph inspection, leading to a compact and highly informative subset of seven features. We trained and evaluated seven widely used ML models on both the full feature set and the selected subset, finding that most models maintained or improved their predictive performance despite an 80% reduction in the number of features. Model interpretability was enhanced using SHapley Additive exPlanations (SHAP), allowing for a detailed examination of how individual features influence predictions. To further assess the effectiveness of our methodology, we compared it against widely known feature selection techniques across all seven ML models. The results underscore that our proposed feature set predicts HF mortality more accurately than the feature sets produced by conventional methods, offering new opportunities for personalized management strategies based on a streamlined and explainable feature subset.
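
The SHAP analysis mentioned above rests on Shapley values, which split the difference between a model's prediction for one patient and its prediction for a reference patient into one additive contribution per feature. The following is a minimal sketch of that idea, not the study's implementation: the risk function, its coefficients, the three feature names, and the patient and baseline values are all hypothetical, and the exact enumeration used here is only practical for a handful of features (the SHAP library relies on faster approximations and model-specific algorithms).

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley values for one instance x relative to a baseline instance."""
    n = len(x)
    features = range(n)

    def value(subset):
        # Features in the subset take the patient's value; the rest stay at baseline.
        mixed = [x[i] if i in subset else baseline[i] for i in features]
        return predict(mixed)

    phi = []
    for i in features:
        others = [j for j in features if j != i]
        total = 0.0
        for size in range(n):
            # Shapley weight for coalitions of this size that exclude feature i.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                total += weight * (value(s | {i}) - value(s))
        phi.append(total)
    return phi


def toy_risk(v):
    # Hypothetical risk score (NOT the paper's model): age, ejection fraction,
    # and a filling-pressure index, with an age x filling-pressure interaction.
    age, lvef, filling = v
    return 0.01 * age - 0.02 * lvef + 0.05 * filling + 0.001 * age * filling


# Hypothetical patient and reference values, chosen only for illustration.
patient = [78.0, 30.0, 18.0]
baseline = [65.0, 50.0, 10.0]

phi = shapley_values(toy_risk, patient, baseline)
for name, contribution in zip(["age", "lvef", "filling"], phi):
    print(f"{name}: {contribution:+.3f}")
```

By construction the contributions sum to the gap between the patient's score and the baseline score, which is the property that lets a SHAP plot be read as a decomposition of a single prediction.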

Keywords

MeSH Term

Heart Failure
Humans
Machine Learning
Algorithms
Female
Male
Aged
Middle Aged
Prognosis
Echocardiography
