AUTHOR=Liu Taobin, Zhang Xiaoming, Chen Ru, Deng Xinxi, Fu Bin
TITLE=Development, comparison, and validation of four intelligent, practical machine learning models for patients with prostate-specific antigen in the gray zone
JOURNAL=Frontiers in Oncology
VOLUME=Volume 13 - 2023
YEAR=2023
URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2023.1157384
DOI=10.3389/fonc.2023.1157384
ISSN=2234-943X
ABSTRACT=Purpose: To develop and compare machine learning prediction models based on LogisticRegression, XGBoost, GaussianNB, and LGBMClassifier for patients in the prostate-specific antigen (PSA) gray zone, to identify valuable predictors, and to integrate the predictive models into actual clinical decisions.
Results: Machine learning prediction models based on LogisticRegression, XGBoost, GaussianNB, and LGBMClassifier exhibit higher predictive power than individual metrics. The area under the curve (AUC) (95% CI), accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score were 0.932 (0.881–0.983), 0.792, 0.824, 0.919, 0.652, 0.920, and 0.728, respectively, for the LogisticRegression model; 0.813 (0.723–0.904), 0.771, 0.800, 0.768, 0.737, 0.793, and 0.767, respectively, for the XGBoost model; 0.902 (0.843–0.962), 0.813, 0.875, 0.819, 0.600, 0.909, and 0.712, respectively, for the GaussianNB model; and 0.886 (0.809–0.963), 0.833, 0.882, 0.806, 0.725, 0.911, and 0.796, respectively, for the LGBMClassifier model. The LogisticRegression model has the highest AUC among all prediction models, and the differences between its AUC and those of the XGBoost, GaussianNB, and LGBMClassifier models are statistically significant (p < 0.001).
Conclusion: Machine learning prediction models based on the LogisticRegression, XGBoost, GaussianNB, and LGBMClassifier algorithms show superior predictive performance for patients in the PSA gray zone, with the LogisticRegression model yielding the best prediction. These predictive models can be used in actual clinical decision-making.
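
The threshold-based metrics reported in the abstract (accuracy, sensitivity, specificity, positive predictive value, negative predictive value, and F1 score) all follow from the four cells of a binary confusion matrix. The sketch below shows their standard definitions only; it is not the authors' code, the function name `classification_metrics` is hypothetical, and the counts are made up for illustration rather than taken from the study. AUC is not covered, since it requires predicted scores rather than counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Compute standard binary-classification metrics from confusion-matrix counts.

    tp, fp, tn, fn: true positives, false positives, true negatives, false negatives.
    """
    sensitivity = tp / (tp + fn)            # recall: share of actual positives detected
    specificity = tn / (tn + fp)            # share of actual negatives correctly ruled out
    ppv = tp / (tp + fp)                    # positive predictive value (precision)
    npv = tn / (tn + fn)                    # negative predictive value
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f1 = 2 * ppv * sensitivity / (ppv + sensitivity)  # harmonic mean of PPV and sensitivity
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "ppv": ppv,
        "npv": npv,
        "f1": f1,
    }


# Hypothetical counts for illustration only; NOT data from the study.
metrics = classification_metrics(tp=40, fp=10, tn=45, fn=5)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Because PPV and NPV depend on how common the positive class is in the evaluated cohort, they can differ noticeably between cohorts even when sensitivity and specificity are unchanged.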