AUTHOR=Tong Yi-Tong, Gao Guang-Jie, Chang Huan, Wu Xing-Wei, Li Meng-Ting TITLE=Development and economic assessment of machine learning models to predict glycosylated hemoglobin in type 2 diabetes JOURNAL=Frontiers in Pharmacology VOLUME=Volume 14 - 2023 YEAR=2023 URL=https://www.frontiersin.org/journals/pharmacology/articles/10.3389/fphar.2023.1216182 DOI=10.3389/fphar.2023.1216182 ISSN=1663-9812 ABSTRACT=Background: Glycosylated hemoglobin (HbA1c) is recommended for diagnosing and monitoring type 2 diabetes (T2D). However, real-world monitoring often falls short of the frequency recommended in the guidelines. Developing machine learning models to screen for poor glycemic control in patients with T2D could optimize management and decrease medical service costs. This study was carried out on patients with T2D at the Sichuan Provincial People’s Hospital from April 2018 to December 2019 who were examined for HbA1c. Characteristics were extracted from interviews and electronic medical records. After pre-processing, two versions of the data, one excluding and one including fasting blood glucose (FBG), were each randomly divided into a training dataset and a test dataset at a ratio of 8:2. Four imputation methods, four feature screening methods, and six machine learning algorithms were used to optimize the data and develop models. Models were compared on predictive performance metrics, especially the Model Benefit (MB, the confusion matrix combined with the economic burden associated with therapeutic inertia). The contributions of features were interpreted using SHapley Additive exPlanations (SHAP). Finally, we validated the sample size on the best model. The study included patients with T2D, of whom 513 (52.3%) were defined as positive (needing an HbA1c test). The results indicated that models trained on the data including FBG showed better predictive performance than models trained on the data excluding FBG. 
The best-performing model used Random Forest as the imputation method, ElasticNet as the feature screening method, and LightGBM as the learning algorithm. The MB, AUC, and AUPRC of the best model, among a total of 224 trained models, were 43475.750 (¥), 0.972, 0.944, and 0.974, respectively. FBG value, previous HbA1c values, having a rational and reasonable diet, health status scores, type of metformin manufacturer, interval of measurement, EQ-5D scores, occupational status, and age were the most significant contributors to the prediction model. We found that MB could serve as an indicator for evaluating model prediction performance. The proposed model performed well in identifying patients with T2D who need to undergo the HbA1c test and could help improve individualized T2D management.
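
The abstract describes the Model Benefit only as a confusion matrix combined with an economic burden, without giving the formula. The sketch below shows one way such a metric could be computed. It is an illustration under stated assumptions, not the authors' method: the `CostWeights` fields, the additive formula, and all monetary values are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CostWeights:
    """Hypothetical per-patient monetary values in yuan (not from the paper)."""
    tp: float  # saving when a patient who needs the test is correctly flagged
    fp: float  # cost of an unnecessary HbA1c test
    fn: float  # burden of a missed patient (therapeutic inertia)
    tn: float  # saving when a test is correctly skipped


def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 = needs the HbA1c test."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            tp += 1
        elif truth == 0 and pred == 1:
            fp += 1
        elif truth == 1 and pred == 0:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def model_benefit(y_true, y_pred, weights):
    """Assumed formula: savings from correct calls minus costs of wrong calls."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    return tp * weights.tp + tn * weights.tn - fp * weights.fp - fn * weights.fn


# Toy labels and illustrative weights only.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
weights = CostWeights(tp=100.0, fp=20.0, fn=150.0, tn=10.0)
benefit = model_benefit(y_true, y_pred, weights)
```

With weights like these, a missed patient is penalized more heavily than an unnecessary test, so two models with the same accuracy can differ in benefit. That difference is the motivation for ranking models on an economic metric alongside AUC and AUPRC.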