AUTHOR=Qiu Binxu , Shen Zixiong , Yang Dongliang , Wang Quan TITLE=Applying machine learning techniques to predict the risk of lung metastases from rectal cancer: a real-world retrospective study JOURNAL=Frontiers in Oncology VOLUME=Volume 13 - 2023 YEAR=2023 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2023.1183072 DOI=10.3389/fonc.2023.1183072 ISSN=2234-943X ABSTRACT=Background Metastasis in the lungs is common in patients with rectal cancer, and it can have severe consequences on their survival and quality of life. Therefore, it is essential to identify patients who may be at risk of developing lung metastasis from rectal cancer. Result Our study employed tenfold cross-validation to assess the performance of eight machine-learning models for predicting the risk of lung metastasis in patients with rectal cancer. The AUC values ranged from 0.73 to 0.96 in the training set, with the extreme gradient boosting (XGB) model achieving the highest AUC value of 0.96. Moreover, the XGB model obtained the best AUPR and MCC in the training set, reaching 0.98 and 0.88, respectively. We found that the XGB model demonstrated the best predictive power, achieving an AUC of 0.87, an AUPR of 0.60, an accuracy of 0.92, and a sensitivity of 0.93 in the internal test set. Furthermore, the XGB model was evaluated in the external test set and achieved an AUC of 0.91, an AUPR of 0.63, an accuracy of 0.93, a sensitivity of 0.92, and a specificity of 0.93. The XGB model obtained the highest MCC in the internal test set and external validation set, with 0.61 and 0.68, respectively. Based on the DCA and calibration curve analysis, the XGB model had better clinical decision-making ability and predictive power than the other seven models. Lastly, we developed an online web calculator using the XGB model to assist doctors in making informed decisions and to facilitate the model's wider adoption (https://share.streamlit.io/woshiwz/rectal_cancer/main/lung.py). Conclusion In this study, we developed an XGB model based on clinicopathological information to predict the risk of lung metastasis in patients with rectal cancer, which may help physicians make clinical decisions.