AUTHOR=Xiong Yu , Liu Yu-Meng , Hu Jia-Qiang , Zhu Bao-Qiang , Wei Yuan-Kui , Yang Yan , Wu Xing-Wei , Long En-Wu TITLE=A personalized prediction model for urinary tract infections in type 2 diabetes mellitus using machine learning JOURNAL=Frontiers in Pharmacology VOLUME=Volume 14 - 2023 YEAR=2024 URL=https://www.frontiersin.org/journals/pharmacology/articles/10.3389/fphar.2023.1259596 DOI=10.3389/fphar.2023.1259596 ISSN=1663-9812 ABSTRACT=Patients with type 2 diabetes mellitus (T2DM) are at higher risk for urinary tract infections (UTIs), which greatly impact their quality of life. A risk prediction model that identifies T2DM patients at high risk of UTIs and supports clinical decision-making can help reduce the incidence of UTIs in this population. To construct the predictive model, potentially relevant variables were first selected from the literature, and data were then extracted from the hospital information system (HIS). The data set was split into training and test sets in an 8:2 ratio. To process the data and establish risk warning models, four imputation methods, four balancing methods, three feature screening methods, and eighteen machine learning algorithms were employed. Ten-fold cross-validation was applied for internal validation on the training set, while the bootstrap method was used for external validation on the test set. The area under the receiver operating characteristic curve (AUC) was used to evaluate model performance and to select the best models. Feature contributions were interpreted using the SHapley Additive exPlanations (SHAP) approach, and a web-based prediction platform for UTIs in T2DM was built with the Flask framework. In total, 106 variables and 1340 patients were included in the study. After comprehensive data preprocessing, 48 datasets were generated (4 imputation × 4 balancing × 3 feature screening methods), and 864 risk warning models were constructed (48 datasets × 18 algorithms). ROC curves were used to assess the performance of these models, and the best model achieved an AUC of 0.9789 upon external validation.
Notably, the most critical factors contributing to UTIs in T2DM patients were UTI-related inflammatory markers, medication use (mainly SGLT2 inhibitors), severity of comorbidities, routine blood indicators, and other factors such as length of hospital stay and estimated glomerular filtration rate (eGFR). The SHAP method was used to interpret the contribution of each feature to the model, and a user-friendly prediction platform for UTIs in T2DM was built on the optimal predictive model to assist clinicians in making clinical decisions. The machine learning-based prediction system developed in this study exhibited favorable predictive ability and promising clinical utility. Combined with the professional judgment of clinicians, the web-based prediction platform can support better clinical decisions.
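
The counts reported in the abstract (48 datasets, 864 models) are consistent with a full grid over the preprocessing choices and algorithms. The following is a minimal sketch of that enumeration, not the authors' code. The abstract gives only the number of methods in each stage, so every label below (`imputer_1`, `balancer_1`, and so on) is a hypothetical placeholder, and the assumption that every combination was evaluated is inferred from the reported totals.

```python
from itertools import product

# Placeholder labels (assumption): the abstract states only how many methods
# were used at each stage, not which ones.
IMPUTERS = [f"imputer_{i}" for i in range(1, 5)]        # 4 imputation methods
BALANCERS = [f"balancer_{i}" for i in range(1, 5)]      # 4 balancing methods
SELECTORS = [f"selector_{i}" for i in range(1, 4)]      # 3 feature screening methods
ALGORITHMS = [f"algorithm_{i}" for i in range(1, 19)]   # 18 machine learning algorithms


def enumerate_datasets():
    """Return every (imputer, balancer, selector) preprocessing combination."""
    return list(product(IMPUTERS, BALANCERS, SELECTORS))


def enumerate_models():
    """Pair every preprocessed dataset with every algorithm."""
    return [(*dataset, algorithm)
            for dataset, algorithm in product(enumerate_datasets(), ALGORITHMS)]


datasets = enumerate_datasets()
models = enumerate_models()
print(f"{len(datasets)} datasets, {len(models)} candidate models")
```

In a real pipeline each tuple would select concrete preprocessing steps and an estimator, and each candidate would be scored by AUC as the abstract describes. Here the tuples serve only to check the arithmetic.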