AUTHOR=Chen Hong, Huang Yuping, Chen Lizhen
TITLE=Ensemble machine learning for predicting renal function decline in chronic kidney disease: development and external validation
JOURNAL=Frontiers in Medicine
VOLUME=Volume 12 - 2025
YEAR=2025
URL=https://www.frontiersin.org/journals/medicine/articles/10.3389/fmed.2025.1598065
DOI=10.3389/fmed.2025.1598065
ISSN=2296-858X
ABSTRACT=Introduction: Chronic kidney disease (CKD) poses a significant global health challenge, requiring timely interventions to manage renal function decline. Traditional predictive models often lack accuracy and generalizability. This study aimed to develop and validate a machine learning model to enhance risk prediction of renal function decline in CKD patients, enabling early and personalized interventions. Methods: We developed an ensemble machine learning model using Random Forest, XGBoost, and LightGBM algorithms, incorporating advanced feature selection and hyperparameter tuning. The model was trained and validated on data from 1,200 CKD patients across multiple clinics, selected through stringent inclusion and exclusion criteria. Clinical, demographic, and laboratory data were processed with rigorous quality control. Model performance was assessed using area under the curve (AUC), calibration metrics, and five-fold cross-validation, with external validation across three medical centers. Results: The ensemble model achieved an AUC of 0.89 (95% CI: 0.87-0.91), outperforming traditional Cox models (AUC: 0.82, 95% CI: 0.79-0.85) and standard machine learning approaches (AUC: 0.85, 95% CI: 0.83-0.87). Key predictors identified via SHAP analysis included estimated glomerular filtration rate (eGFR), age, and urinary protein-creatinine ratio.
The model demonstrated excellent calibration (slope: 0.96, 95% CI: 0.94-0.98) and robust performance across diverse patient subgroups, with a 60.6% reduction in computational resource use compared to traditional methods. Discussion: This machine learning model offers a significant advancement in predicting CKD progression, providing a reliable, generalizable tool for early risk stratification. Its superior accuracy and efficiency support integration into clinical workflows, potentially transforming CKD management by enabling proactive, data-driven interventions. Future research should focus on incorporating novel biomarkers and expanding multicenter validation to further enhance clinical applicability.
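
The abstract reports AUC for an ensemble of three models but does not say how the individual predictions are combined. As a minimal sketch, assuming simple soft voting (averaging each model's predicted probability per patient), the Python below shows how an ensemble AUC could be computed. The function names `soft_vote` and `auc`, the labels, and all probability values are hypothetical illustrations and are not taken from the study.

```python
from statistics import mean


def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def soft_vote(*model_probs):
    """Assumed combination rule: average the per-patient probabilities
    across models (the abstract does not specify the actual rule)."""
    return [mean(ps) for ps in zip(*model_probs)]


# Hypothetical toy data: 1 = renal function declined, 0 = did not.
labels = [0, 0, 1, 1, 0, 1]
# Hypothetical predicted probabilities standing in for the three base models.
rf_probs = [0.2, 0.6, 0.7, 0.4, 0.1, 0.9]
xgb_probs = [0.3, 0.3, 0.6, 0.7, 0.4, 0.8]
lgbm_probs = [0.1, 0.5, 0.4, 0.8, 0.3, 0.7]

ensemble_probs = soft_vote(rf_probs, xgb_probs, lgbm_probs)

for name, probs in [("RF", rf_probs), ("XGBoost", xgb_probs),
                    ("LightGBM", lgbm_probs), ("Ensemble", ensemble_probs)]:
    print(f"{name}: AUC = {auc(labels, probs):.3f}")
```

The pairwise definition of AUC is quadratic in the number of patients, which is fine for a toy example; on a cohort of the size described (1,200 patients) a rank-based implementation from a standard library would be the more usual choice.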