AUTHOR=Bian Junjie, Chee Kok-Han, Liu Chengyu, Sun Hongwei, Zhang Shixi, Chen Peili, Ting Hua-Nong
TITLE=Machine learning-based screening of heart failure using the integrated features of electrocardiogram and phonocardiogram: a multicenter study in China
JOURNAL=Frontiers in Cardiovascular Medicine
VOLUME=Volume 12 - 2025
YEAR=2025
URL=https://www.frontiersin.org/journals/cardiovascular-medicine/articles/10.3389/fcvm.2025.1613577
DOI=10.3389/fcvm.2025.1613577
ISSN=2297-055X
ABSTRACT=Background: Heart failure (HF) is a major health concern associated with poor prognosis, and there is an urgent clinical need for an easy and accurate method of HF screening. This multicenter study aims to validate a novel AI-based phono-electrocardiogram algorithm (AI-PECG) for early HF detection. Methods: A total of 1,017 individuals were divided into a training cohort and an external validation cohort at a ratio of 8:2. Within the training cohort, patient data were further split randomly into a training set and a test set at an 8:2 ratio. The least absolute shrinkage and selection operator (LASSO) with five-fold cross-validation was used for dimensionality reduction and to select model features from clinical variables, phonocardiogram (PCG) parameters, and electrocardiogram (ECG) parameters. Five machine learning (ML) algorithms (logistic regression, random forest, eXtreme Gradient Boosting, Category Boosting (CatBoost), and Naive Bayes) were then compared to select the classifier that best recognized HF. The importance ranking of the predictive factors in the final screening model was calculated using SHapley Additive exPlanations (SHAP) analysis. Results: Among eligible participants, 302 had HF. In total, 17 variables were selected to construct the screening models. In the training set, the area under the curve (AUC) of the CatBoost model was 0.998 [95% confidence interval (CI): 0.996–1.000], higher than that of the other ML models.
The sensitivity and specificity of the CatBoost model were 0.989 (95% CI: 0.978–0.996) and 0.989 (95% CI: 0.979–0.999), respectively. In the screening model, the five most important factors were EMAT, lymphocyte, LVST, CRP, and platelet. Conclusion: The ML model incorporating general data alongside ECG and PCG features showed good detection performance for HF. It has the potential to serve as a tool for clinicians to screen for HF as early as possible and to guide further clinical intervention.
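
The abstract reports sensitivity, specificity, and AUC with 95% confidence intervals but does not say how they were computed. The following is a minimal sketch of one common approach, using only the Python standard library. Everything in it is an assumption made for illustration: the toy labels and scores are invented (they are not study data), the 0.5 decision threshold is arbitrary, and the percentile bootstrap is only one of several ways a CI for the AUC can be obtained. The function names are hypothetical.

```python
import random


def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, where 1 = HF."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def bootstrap_auc_ci(y_true, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, not the paper's)."""
    rng = random.Random(seed)
    n = len(y_true)
    stats = []
    while len(stats) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        ys = [y_true[i] for i in idx]
        # Skip resamples with only one class, where the AUC is undefined.
        if 0 < sum(ys) < n:
            stats.append(auc(ys, [scores[i] for i in idx]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper


# Invented toy data: 1 = HF, 0 = no HF; scores are model probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
scores = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
auc_value = auc(y_true, scores)
ci_low, ci_high = bootstrap_auc_ci(y_true, scores)

print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
print(f"AUC={auc_value:.4f} (95% CI: {ci_low:.3f}-{ci_high:.3f})")
```

On this toy data, sensitivity and specificity are both 0.75 and the AUC is 0.9375. The pairwise AUC is quadratic in sample size, which is fine for a sketch; a rank-based formulation would be preferable for a cohort of about a thousand participants.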