AUTHOR=Wang Yating, Bai Genji, Huang Min, Chen Wei TITLE=Machine learning model based on enhanced CT radiomics for the preoperative prediction of lymphovascular invasion in esophageal squamous cell carcinoma JOURNAL=Frontiers in Oncology VOLUME=Volume 14 - 2024 YEAR=2024 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2024.1308317 DOI=10.3389/fonc.2024.1308317 ISSN=2234-943X ABSTRACT=Objective: To evaluate the value of a machine learning model based on enhanced CT radiomics features for predicting lymphovascular invasion (LVI) in esophageal squamous cell carcinoma (ESCC) before treatment. Methods: The enhanced CT images of 258 ESCC patients from June 2017 to December 2019 were reviewed and analyzed. The patients were randomly assigned in a 7:3 ratio to a training set and a validation set. Clinical risk factors and CT image characteristics were recorded, and multivariate logistic regression was used to screen for independent risk factors of LVI in ESCC patients. CT radiomics features were extracted with FAE software and screened with the MRMR and LASSO algorithms, and a radiomics label was then established for each patient. Five machine learning algorithms (SVM, KNN, LR, GNB, and MLP) were used to construct models from the radiomics labels and the screened clinical features. The predictive efficacy of the machine learning models for LVI in ESCC was evaluated using ROC curves. Results: Tumor thickness, TNR, and clinical N stage were identified as independent risk factors of LVI. Fourteen radiomics features were selected by MRMR and LASSO to construct the radiomics labels. In the test set, the SVM, KNN, LR, and GNB models showed high predictive performance, while the MLP model performed poorly. In the training set, the AUCs of the KNN and SVM models were 0.945 and 0.905, but they decreased to 0.866 and 0.867 in the validation set, indicating significant overfitting. The GNB and LR models had AUCs of 0.905 and 0.911 in the training set and 0.900 and 0.893 in the validation set, showing stable performance with good fitting and predictive ability.
The MLP model had AUCs of 0.658 and 0.674 in the training and validation sets, indicating poor performance. A multi-scale combined model constructed by multivariate logistic regression had AUCs of 0.911 and 0.893, accuracies of 84.4% and 79.7%, sensitivities of 90.8% and 87.1%, and specificities of 80.5% and 79.0% in the training and validation sets, respectively. Conclusion: Machine learning models based on enhanced CT radiomics features can effectively predict LVI status preoperatively in patients with ESCC. The GNB and LR models show good stability for the noninvasive prediction of LVI status in ESCC patients before treatment.
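
The evaluation metrics named in the abstract (AUC, accuracy, sensitivity, specificity) and the training-versus-validation comparison can be illustrated with a minimal sketch. This is not the authors' code: the function names `roc_auc` and `confusion_metrics`, the 0.5 decision threshold, and the toy labels and scores are assumptions made for illustration only. The AUC pairs in `REPORTED_AUC` are the values quoted in the abstract.

```python
from typing import Dict, Sequence, Tuple


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case receives the higher score; ties count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def confusion_metrics(
    labels: Sequence[int], scores: Sequence[float], threshold: float = 0.5
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity at a fixed threshold.
    The 0.5 default is an assumption; the abstract does not state a cutoff."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative case")
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # true positive rate
        "specificity": tn / (tn + fp),  # true negative rate
    }


# Hypothetical toy data (1 = LVI present, 0 = LVI absent); not patient data.
toy_labels = [1, 1, 1, 0, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.4, 0.3, 0.6, 0.2, 0.7, 0.1]

toy_auc = roc_auc(toy_labels, toy_scores)
toy_metrics = confusion_metrics(toy_labels, toy_scores)
print("toy AUC:", toy_auc)
print("toy metrics:", toy_metrics)

# (training AUC, validation AUC) as reported in the abstract.
REPORTED_AUC: Dict[str, Tuple[float, float]] = {
    "KNN": (0.945, 0.866),
    "SVM": (0.905, 0.867),
    "GNB": (0.905, 0.900),
    "LR": (0.911, 0.893),
    "MLP": (0.658, 0.674),
}

# Training AUC minus validation AUC; a larger positive gap suggests overfitting.
auc_gaps = {
    name: round(train - val, 3) for name, (train, val) in REPORTED_AUC.items()
}
for name, gap in auc_gaps.items():
    print(f"{name}: train-validation AUC gap = {gap:+.3f}")
```

With the reported values, the gap is largest for KNN and SVM and smallest for GNB and LR, which matches the stability comparison drawn in the abstract. The pairwise AUC computation is quadratic in the number of cases, which is acceptable for a cohort of this size.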