AUTHOR=Zhang Wentai, Li Dongfang, Feng Ming, Hu Baotian, Fan Yanghua, Chen Qingcai, Wang Renzhi TITLE=Electronic Medical Records as Input to Predict Postoperative Immediate Remission of Cushing’s Disease: Application of Word Embedding JOURNAL=Frontiers in Oncology VOLUME=Volume 11 - 2021 YEAR=2021 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2021.754882 DOI=10.3389/fonc.2021.754882 ISSN=2234-943X ABSTRACT=Background: There are no machine learning (ML)-based models that use free text from electronic medical records (EMR) as input to predict immediate remission (IR) of Cushing’s disease (CD) after transsphenoidal surgery (TSS). Purpose: The present study aims to develop an ML-based model that uses EMR, including both structured features and free text, as input to preoperatively predict IR after TSS. Methods: A total of 419 patients with CD from Peking Union Medical College Hospital (PUMCH) between January 2014 and August 2020 were enrolled. The EMR of the patients were embedded and transformed into low-dimensional dense vectors that could be included in four ML-based models together with structured features. The area under the receiver operating characteristic (ROC) curve (AUC) was used to evaluate the performance of the models. Results: The overall remission rate of the 419 patients was 75.7%. First operation (P=0.005), invasion of cavernous sinus from MRI (IOMRI) (P=0.007), sellar floor changes (P=0.010) and adrenocorticotropic hormone (ACTH) (P=0.004) were strongly correlated with IR according to univariate logistic analysis. The AUC values for the four ML-based models ranged from 0.686 to 0.793. The highest AUC (0.793) was achieved by logistic regression (LR) when 11 structured features and “individual conclusions of the case by doctor” were included.
Conclusion: We developed an ML-based model that uses both structured features and unstructured features (the latter processed by a word embedding method) as input to preoperatively predict postoperative IR.
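
The pipeline described in the Methods (free text embedded into dense vectors, combined with structured features, and models compared by AUC) can be illustrated with a minimal sketch. The abstract does not say which embedding method or pooling strategy was used, so the mean pooling below is an assumption, and the function names (`embed_text`, `build_features`, `roc_auc`) and toy word vectors are hypothetical and not taken from the study.

```python
from typing import Dict, List, Sequence


def embed_text(tokens: Sequence[str],
               vectors: Dict[str, List[float]],
               dim: int) -> List[float]:
    """Mean-pool the word vectors of known tokens (assumed pooling strategy).

    Returns a zero vector when no token has a known vector.
    """
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    return [sum(v[i] for v in known) / len(known) for i in range(dim)]


def build_features(structured: Sequence[float],
                   tokens: Sequence[str],
                   vectors: Dict[str, List[float]],
                   dim: int) -> List[float]:
    """Concatenate structured features with the pooled text embedding."""
    return list(structured) + embed_text(tokens, vectors, dim)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    A tie between a positive and a negative score counts as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy, made-up 2-dimensional word vectors for illustration only.
toy_vectors = {"adenoma": [1.0, 0.0], "invasive": [0.0, 1.0]}

# One hypothetical patient: two structured values plus a short note.
features = build_features([1.0, 0.0], ["invasive", "adenoma"], toy_vectors, 2)

# Made-up remission labels and model scores for four patients.
auc = roc_auc([1, 0, 1, 0], [0.9, 0.1, 0.4, 0.6])
```

The feature vector produced by `build_features` is what a classifier such as logistic regression would take as input; the pairwise definition of AUC used here is equivalent to the area under the ROC curve and avoids any plotting or library dependency.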