AUTHOR=Huang Jian, Ding Wen, Zhang Jiarun, Li Zhao, Shu Ting, Kuosmanen Pekka, Zhou Guanqun, Zhou Chuan, Yu Gang TITLE=Variational deep embedding-based active learning for the diagnosis of pneumonia JOURNAL=Frontiers in Neurorobotics VOLUME=Volume 16 - 2022 YEAR=2022 URL=https://www.frontiersin.org/journals/neurorobotics/articles/10.3389/fnbot.2022.1059739 DOI=10.3389/fnbot.2022.1059739 ISSN=1662-5218 ABSTRACT=Machine learning works in much the same way that humans train their brains. Previous experiences prepare the brain by firing specific nerve cells and increasing the weight of the links between them. Machine learning likewise completes a classification task by repeatedly updating the weights in the model through training on the training set. It can carry out far more training than the human brain and, in specific fields, achieve higher recognition accuracy. In this paper, we propose an active learning framework called Variational Deep Embedding Based Active Learning (VaDEAL) as a human-centric computing method to improve the accuracy of diagnosing pneumonia. Specifically, because active learning (AL) achieves label-efficient learning by labeling the most valuable queries, we propose a new active learning strategy that incorporates clustering to improve the sampling quality. Our framework consists of a variational deep embedding (VaDE) module, a task learner, and a sampling calculator. First, the VaDE performs unsupervised dimension reduction and clustering over the entire data set. The end-to-end task learner obtains the embedding representations of the VaDE-processed samples while training the model's target classifier.
The sampling calculator obtains the representativeness of the samples, G, via VaDE and the uncertainty of the samples, h, through task learning, and it ensures the overall diversity of the selected samples, R, by computing similarity constraints between the current candidate samples and the samples already selected. By combining the uncertainty, representativeness, and diversity scores, our design selects the most informative samples for labeling, thus improving overall performance. Extensive experiments and evaluations on a large dataset demonstrate that our proposed method outperforms the state-of-the-art methods and achieves the highest accuracy in pneumonia diagnosis.
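
The abstract does not give the scoring formulas, so the following is only a minimal sketch of how uncertainty (h), representativeness (G), and diversity (R) scores might be combined to pick samples for labeling. Everything specific here is an assumption made for illustration: the entropy-based uncertainty, the cosine-similarity diversity term, the weighted sum, the greedy loop, and the names `entropy_uncertainty`, `cosine_similarity_matrix`, and `select_batch`. The representativeness scores are taken as a given input, standing in for whatever the VaDE module would produce.

```python
import numpy as np


def entropy_uncertainty(probs):
    """Assumed uncertainty h: Shannon entropy of each row of class probabilities."""
    p = np.clip(np.asarray(probs, dtype=float), 1e-12, 1.0)
    return -(p * np.log(p)).sum(axis=1)


def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarity between embedding rows."""
    emb = np.asarray(embeddings, dtype=float)
    norms = np.maximum(np.linalg.norm(emb, axis=1, keepdims=True), 1e-12)
    unit = emb / norms
    return unit @ unit.T


def select_batch(probs, representativeness, embeddings, budget,
                 weights=(1.0, 1.0, 1.0)):
    """Greedily pick `budget` sample indices by a weighted sum of h, G, and R.

    The weighted sum and the greedy procedure are hypothetical choices,
    not the scoring rule from the paper.
    """
    h = entropy_uncertainty(probs)
    g = np.asarray(representativeness, dtype=float)
    sim = cosine_similarity_matrix(embeddings)
    w_h, w_g, w_r = weights

    chosen = []
    available = np.ones(len(h), dtype=bool)
    for _ in range(min(budget, len(h))):
        if chosen:
            # Assumed diversity R: 1 minus the highest similarity to any
            # sample that has already been selected.
            r = 1.0 - sim[:, chosen].max(axis=1)
        else:
            r = np.ones(len(h))
        score = w_h * h + w_g * g + w_r * r
        score[~available] = -np.inf  # never pick the same sample twice
        best = int(np.argmax(score))
        chosen.append(best)
        available[best] = False
    return chosen


# Toy data: samples 0 and 1 are equally uncertain and have identical
# embeddings; sample 2 is slightly less uncertain but lies elsewhere.
toy_probs = [[0.5, 0.5], [0.5, 0.5], [0.6, 0.4], [0.99, 0.01]]
toy_g = [1.0, 1.0, 1.0, 1.0]
toy_emb = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]]
print(select_batch(toy_probs, toy_g, toy_emb, budget=2))  # [0, 2]
```

In this toy run the diversity term steers the second pick away from sample 1, which duplicates sample 0, and toward sample 2. Setting the diversity weight to zero would instead select the two near-identical samples.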