AUTHOR=Subramanian Harry, Dey Rahul, Brim Waverly Rose, Tillmanns Niklas, Cassinelli Petersen Gabriel, Brackett Alexandria, Mahajan Amit, Johnson Michele, Malhotra Ajay, Aboian Mariam
TITLE=Trends in Development of Novel Machine Learning Methods for the Identification of Gliomas in Datasets That Include Non-Glioma Images: A Systematic Review
JOURNAL=Frontiers in Oncology
VOLUME=Volume 11 - 2021
YEAR=2021
URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2021.788819
DOI=10.3389/fonc.2021.788819
ISSN=2234-943X
ABSTRACT=Purpose: Machine learning has been applied to the diagnostic imaging of gliomas to augment classification, prognostication, segmentation, and treatment planning. A systematic literature review was performed to identify how machine learning has been applied to identify gliomas in datasets that include non-glioma images, thereby simulating normal clinical practice. Materials and Methods: Four databases were searched by a medical librarian, with the search confirmed by a second librarian, for all articles published prior to February 1, 2021: Ovid Embase, Ovid MEDLINE, Cochrane trials (CENTRAL), and Web of Science-Core Collection. The search strategy included both keywords and controlled vocabulary, combining terms for artificial intelligence, machine learning, deep learning, radiomics, magnetic resonance imaging, and glioma, as well as related terms. The review was conducted in a stepwise fashion with abstract screening, full-text screening, and data extraction. Quality of reporting was assessed using TRIPOD criteria. Results: A total of 11,727 candidate articles were identified, of which 12 articles were included in the final analysis. Studies investigated the differentiation of normal from abnormal images (7 articles) and the differentiation of glioma images from non-glioma or normal images (5 articles). Single-institution datasets were most common (5 articles), followed by BRATS (3 articles). The median sample size was 280 patients (9 articles). Algorithm training and validation strategies consisted of five-fold cross-validation (6 articles) and the use of exclusive sets of images within the same dataset for training and for validation (6 articles). Neural networks were the most common type of algorithm, used in 83% of studies. The accuracy of algorithms ranged from 0.75 to 1.00 (median 0.96, 10 articles). Quality of reporting assessment using TRIPOD criteria yielded a mean individual TRIPOD ratio of 0.50 (standard deviation 0.14, range 0.37 to 0.85). Conclusion: This systematic review of the identification of gliomas in datasets that include non-glioma images demonstrated predominant use of neural network algorithms on a few established databases, with low sample sizes. The same dataset was used for both algorithm training and validation, limiting generalizability. TRIPOD criteria scoring indicates that the quality of reporting is lacking in multiple domains.
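The five-fold cross-validation strategy mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from any of the reviewed studies: the function name `five_fold_splits`, the fixed seed, and the use of 280 samples (borrowed from the reported median sample size purely as an example) are assumptions made for illustration. The sketch shows that every training and validation index is drawn from the same dataset, which is the limitation on generalizability the conclusion points to.

```python
import random


def five_fold_splits(n_samples, k=5, seed=0):
    """Yield (train_idx, val_idx) pairs for k-fold cross-validation.

    Hypothetical helper for illustration only; it is not the procedure
    of any specific study in the review.
    """
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of near-equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        val_idx = folds[i]
        # Every fold except the i-th contributes to the training set.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train_idx, val_idx


# 280 is used only as an example size (the median reported in the abstract).
splits = list(five_fold_splits(280))
for fold_number, (train_idx, val_idx) in enumerate(splits, start=1):
    print(fold_number, len(train_idx), len(val_idx))
```

Each sample is used for validation exactly once and for training in the other four folds, so the resulting accuracy estimate reflects performance on one dataset only. Assessing generalizability would require an external validation set from a different source.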