AUTHOR=Su Yuqi, Cai Yuhuang, Jin Shui, Ye Xuemei, Jeong Jaesik, Yuan Ye, Yi Heqing TITLE=Explainable multi-modal machine learning for predicting occult pulmonary metastases in differentiated thyroid cancer: a SHAP-based approach prior to radioactive iodine scans JOURNAL=Frontiers in Medical Technology VOLUME=Volume 7 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/medical-technology/articles/10.3389/fmedt.2025.1685088 DOI=10.3389/fmedt.2025.1685088 ISSN=2673-3129 ABSTRACT=Background: Patients with differentiated thyroid cancer (DTC) may have occult lung metastases before iodine-131 (131I) treatment. Identifying these metastases before 131I treatment is clinically valuable for accurate staging and for planning 131I treatment. This study applies machine learning algorithms to clinical data to build statistical models that predict lung metastasis before 131I treatment. Methods: Patients were selected from Zhejiang Cancer Hospital, and the data came from two groups of DTC patients treated with 131I. The experimental group consisted of 55 patients who showed no lung metastases on CT but tested positive on 131I whole-body scan (131I-WBS). The control group included 316 patients who tested negative for metastases on CT, ultrasound, and 131I-WBS. Six machine learning algorithms, namely Support Vector Machines (SVM), Decision Trees (DT), Random Forests (RF), Logistic Regression (LR), Extreme Gradient Boosting (XGBoost), and K-Nearest Neighbors (KNN), were used to build prediction models, and AUC, sensitivity, accuracy, precision, specificity, and F1 score were used to compare the models' performance. Finally, the SHAP algorithm was used to rank and explain feature importance. Results: A total of 371 thyroid cancer patients were included in this study: 55 patients with occult lung metastasis and 316 patients in the control group.
The data were divided into a training set and a testing set in a 7:3 ratio. Eleven variables were retained after screening by multivariate Cox regression: gender, age, T stage, N stage, tumor size, degree of invasion, number of lymph node metastases, thyroid-stimulating hormone (TSH), thyroglobulin (Tg), thyroglobulin antibodies (TgAb), and administered activity. The best model, LR, achieved the following evaluation metrics: accuracy 0.91, recall 0.64, precision 0.92, F1 score 0.70, area under the curve (AUC) 0.93, and specificity 0.96. Conclusion: The logistic regression (LR) model showed the best performance in predicting occult lung metastases in thyroid cancer patients before 131I-WBS. Lymph node metastases and thyroglobulin had the greatest impact on the prediction.
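As a reading aid for the evaluation metrics named in the abstract, the following is a minimal sketch of how accuracy, recall (sensitivity), precision, specificity, and F1 score are conventionally derived from a binary confusion matrix. It is not the authors' code. The names `ConfusionCounts` and `binary_metrics` and the example counts are hypothetical and are not taken from the study. AUC is omitted because it depends on ranked prediction scores, not on a single confusion matrix.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts (hypothetical helper type).

    The positive class is assumed to be "occult lung metastasis present".
    """
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def safe_div(numerator: float, denominator: float) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def binary_metrics(c: ConfusionCounts) -> Dict[str, float]:
    """Compute the standard threshold-based metrics from the four counts."""
    precision = safe_div(c.tp, c.tp + c.fp)
    recall = safe_div(c.tp, c.tp + c.fn)  # also called sensitivity
    return {
        "accuracy": safe_div(c.tp + c.tn, c.tp + c.fp + c.tn + c.fn),
        "recall": recall,
        "precision": precision,
        "specificity": safe_div(c.tn, c.tn + c.fp),
        # F1 is the harmonic mean of precision and recall.
        "f1": safe_div(2 * precision * recall, precision + recall),
    }


# Illustrative counts only; these are NOT data from the study.
example = ConfusionCounts(tp=8, fp=2, tn=88, fn=2)
metrics = binary_metrics(example)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

With an imbalanced cohort such as the one described (55 positive versus 316 negative patients), accuracy and specificity can stay high while recall is lower, which is why the abstract reports several metrics side by side.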