AUTHOR=Huang Yuhong, Wei Lihong, Hu Yalan, Shao Nan, Lin Yingyu, He Shaofu, Shi Huijuan, Zhang Xiaoling, Lin Ying TITLE=Multi-Parametric MRI-Based Radiomics Models for Predicting Molecular Subtype and Androgen Receptor Expression in Breast Cancer JOURNAL=Frontiers in Oncology VOLUME=Volume 11 - 2021 YEAR=2021 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2021.706733 DOI=10.3389/fonc.2021.706733 ISSN=2234-943X ABSTRACT=Objective: To investigate whether radiomics features extracted from multi-parametric MRI, combined with a machine learning approach, can predict molecular subtypes and androgen receptor (AR) expression of breast cancer non-invasively. Materials and Methods: Patients diagnosed with clinical T2-4 stage breast cancer from March 2016 to July 2020 were retrospectively enrolled. The molecular subtypes and AR expression in pre-treatment biopsy specimens were assessed. A total of 4198 radiomics features were extracted from the pre-biopsy multi-parametric MRI (including dynamic contrast-enhanced T1-weighted images, fat-suppressed T2-weighted images and apparent diffusion coefficient maps) of each patient. Based on leave-one-out cross-validation (LOOCV), the least absolute shrinkage and selection operator (LASSO) and recursive feature elimination (RFE) were applied to select the optimal features in the training dataset. We built 12 diagnostic models using distinct classification algorithms to predict molecular subtypes and AR expression of breast cancer in the testing dataset. The performance of the models was assessed via the area under the receiver operating characteristic curve (AUC), accuracy, sensitivity and specificity. Results: A total of 162 patients (mean age, 46.91±10.08 years) were enrolled in this study; 30 had low AR expression and 132 had high AR expression. HR+/HER2- cancers were diagnosed in 56 cases (34.6%), HER2+ cancers in 81 cases (50.0%), and TNBC in 25 cases (15.4%).
There was no significant difference in clinicopathologic characteristics between the low-AR and high-AR groups (P>0.05), except for menopausal status, ER, PR, HER2, and Ki-67 index (P=0.043, <0.001, <0.001, 0.015 and 0.006, respectively). No significant difference in clinicopathologic characteristics was observed among the three molecular subtypes except for AR status and Ki-67 (P<0.001 and P=0.012, respectively). The Support Vector Machine (SVM) showed the best performance in discriminating AR expression, with an AUC of 0.916 and an accuracy of 93.2% in the testing dataset. For molecular subtypes, the best-performing models were Random Forest for TNBC vs. non-TNBC (AUC: 0.851, accuracy: 77.8%), SVM for HER2+ vs. HER2- (AUC: 0.831, accuracy: 80.2%) and Logistic Regression for HR+/HER2- vs. others (AUC: 0.848, accuracy: 80.2%). Conclusions: Multi-parametric MRI-based radiomics combined with machine learning approaches provides a promising method to predict the molecular subtypes and AR expression of breast cancer non-invasively.
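
The four metrics named in the abstract (AUC, accuracy, sensitivity and specificity) can be illustrated with a minimal sketch. The abstract does not say how the authors computed them, so everything here is an assumption: the function names `roc_auc` and `binary_metrics`, the 0.5 decision threshold, and the toy labels and scores, which are hypothetical and not study data. AUC is computed as the fraction of positive/negative pairs that the classifier ranks correctly, with ties counted as half.

```python
from typing import Dict, Sequence

def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # correctly ranked pair
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))

def binary_metrics(y_true: Sequence[int], scores: Sequence[float],
                   threshold: float = 0.5) -> Dict[str, float]:
    """Accuracy, sensitivity and specificity at an assumed decision threshold."""
    y_pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 1)
    tn = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 0)
    fp = sum(1 for y, p in zip(y_true, y_pred) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(y_true, y_pred) if y == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }

# Hypothetical toy data (1 = high-AR, 0 = low-AR); NOT taken from the study.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
scores = [0.9, 0.8, 0.35, 0.4, 0.2, 0.7, 0.6, 0.55]

auc = roc_auc(y_true, scores)
metrics = binary_metrics(y_true, scores)
print(f"AUC={auc:.3f}", {k: round(v, 3) for k, v in metrics.items()})
```

AUC does not depend on a threshold, whereas accuracy, sensitivity and specificity do. With groups as imbalanced as the 30 low-AR versus 132 high-AR patients reported here, a high accuracy can hide poor performance on the smaller group, which is one reason to report all four metrics together.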