AUTHOR=Rodea-Montero Edel Rafael, Rodríguez-Alcántar Brenda Jesús, Armenta-Medina Dagoberto TITLE=Prediction of in-hospital death among patients admitted to a tertiary care hospital over the first 10 years: a machine learning approach JOURNAL=Frontiers in Public Health VOLUME=Volume 13 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/public-health/articles/10.3389/fpubh.2025.1635708 DOI=10.3389/fpubh.2025.1635708 ISSN=2296-2565 ABSTRACT=Purpose: To describe the pre- and post-admission characteristics of patients hospitalized in a tertiary care hospital and to fit machine learning models that predict in-hospital death and identify the factors associated with it that have the greatest prognostic value. Materials and methods: This was a retrospective study based on data from patients who were discharged from a Mexican tertiary care hospital during its first 10 years of operation (2007–2016). Preadmission characteristics were analyzed using descriptive statistics. Comparison tests (Mann–Whitney U) and association tests (chi-square) were applied according to the absence or presence of in-hospital death. Multivariate models (logistic regression, random forest and XGBoost) were fitted. Their ROC curves were compared using the DeLong test, and performance metrics were evaluated. Results: In total, 55,253 hospital discharges were considered, of which 45,011 (patients aged 0–101 years) had complete data; the rate of in-hospital death was 4.17%. In total, 70% of the data were used for training and 30% for testing. Pairwise comparisons between areas under the curve (AUCs) revealed that XGBoost (AUC = 0.9162) outperformed logistic regression (AUC = 0.9036) and random forest (AUC = 0.8978) (p-value < 0.001 in both cases). XGBoost had a sensitivity of 87%, a specificity of 81.3% and a balanced accuracy of 84.2%.
The most relevant predictive factors were the medical service that performed the admission, the number of conditions, origin from the hospital's outpatient consultation, the main condition diagnosed at admission according to the ICD-10, age, month of admission, and day of the week of admission. Conclusions: Owing to its ability to capture complex patterns, the XGBoost model makes it possible to identify patients with a relatively high risk of in-hospital death using the data available at hospital admission. This constitutes a support tool for decision-making, helping to determine which patients require closer monitoring and follow-up during their hospital stay to improve the quality of medical care.
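
The performance metrics named in the abstract (sensitivity, specificity, their average, and AUC) can be illustrated with a minimal Python sketch. This is not the authors' code: the function names are hypothetical, the toy labels and scores are invented for illustration and are not study data, and the AUC here is the plain pairwise (Mann–Whitney) estimate rather than the DeLong comparison used in the study.

```python
from typing import Sequence, Tuple


def confusion_counts(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[int, int, int, int]:
    """Return (tp, fp, tn, fn) for binary labels where 1 = in-hospital death."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 0 and pred == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def sensitivity(tp: int, fn: int) -> float:
    """Fraction of actual deaths that the model flags."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of survivors that the model correctly leaves unflagged."""
    return tn / (tn + fp)


def balanced_accuracy(sens: float, spec: float) -> float:
    """Mean of sensitivity and specificity; robust to class imbalance."""
    return (sens + spec) / 2


def auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Invented toy data (NOT from the study): 3 deaths, 5 survivors.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
scores = [0.9, 0.7, 0.2, 0.8, 0.3, 0.1, 0.1, 0.05]
threshold = 0.25  # hypothetical decision threshold
y_pred = [1 if s >= threshold else 0 for s in scores]

tp, fp, tn, fn = confusion_counts(y_true, y_pred)
toy_sens = sensitivity(tp, fn)
toy_spec = specificity(tn, fp)
toy_auc = auc(y_true, scores)

# Averaging the sensitivity and specificity reported in the abstract.
reported_balanced = balanced_accuracy(0.870, 0.813)
```

Averaging the reported sensitivity (87%) and specificity (81.3%) gives about 0.8415, which agrees with the reported 84.2%. With a death rate of 4.17%, plain accuracy would be dominated by survivors, which is presumably why a class-balanced measure is reported.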