AUTHOR=Zhang Hua, Chen Kai, Xu Xiaotong, You Tao, Sun Wenzheng, Dang Jun TITLE=Spatiotemporal correlation enhanced real-time 4D-CBCT imaging using convolutional LSTM networks JOURNAL=Frontiers in Oncology VOLUME=Volume 14 - 2024 YEAR=2024 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2024.1390398 DOI=10.3389/fonc.2024.1390398 ISSN=2234-943X ABSTRACT=This study aims to enhance the accuracy of real-time four-dimensional cone beam CT (4D-CBCT) imaging by incorporating spatiotemporal correlation from sequential projection images into the single projection-based 4D-CBCT estimation process. We first derived 4D deformation vector fields (DVFs) from patient 4D-CT. Principal component analysis (PCA) was then employed to extract distinctive feature labels for each DVF, focusing on the first three PCA coefficients. To simulate a wide range of respiratory motion, we expanded the motion amplitude and used random sampling to generate approximately 900 sets of PCA labels. These labels were used to produce 900 simulated 4D-DVFs, which in turn deformed the 0% phase 4D-CT to obtain 900 CBCT volumes with continuous motion amplitudes. Forward projection was then performed at a single angle to generate the corresponding digitally reconstructed radiographs (DRRs). These DRRs and the PCA labels were used as the training data set. To capture the spatiotemporal correlation in the projections, we propose using a convolutional LSTM (ConvLSTM) network for PCA coefficient estimation. At test time, several online CBCT projections (with different motion amplitudes that cover the full respiration range) are acquired and fed into the network, which outputs the corresponding 4D-PCA coefficients; these in turn yield a full online 4D-CBCT prediction.
A phantom experiment was first performed with the XCAT phantom, followed by a pilot clinical evaluation. Results on the XCAT phantom and the patient data show that the proposed approach outperformed other networks in both visual inspection and quantitative metrics. For the XCAT phantom experiment, ConvLSTM achieved the highest quantification accuracy, with a Feature Similarity Index Measure (FSIM), Peak Signal-to-Noise Ratio (PSNR), and Multi-scale Structural Similarity Index Measure (MSSIM) of 0.9998, 64.6742, and 0.9998, respectively. For the pilot clinical experiment, ConvLSTM also achieved the best quantification accuracy, with corresponding values of 0.9999, 63.7294, and 0.9999. The spatiotemporal correlation-based respiratory motion modeling offers a potential solution for accurate real-time 4D-CBCT reconstruction.
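
The PCA-based motion model described in the abstract (fit PCA to the 4D-CT DVFs, keep the first three coefficients, then randomly sample new coefficient sets over an expanded amplitude range to synthesize DVFs) can be illustrated with a minimal NumPy sketch. This is not the authors' code: the function names, the uniform sampling scheme, the expansion factor of 1.2, and the toy array sizes are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np


def fit_pca_motion_model(dvfs, n_components=3):
    """Fit a PCA model to flattened DVFs of shape (n_phases, n_values).

    Returns the mean DVF, the leading principal directions, and the
    per-phase PCA coefficients (the "labels").
    """
    mean = dvfs.mean(axis=0)
    centered = dvfs - mean
    # Rows of vt are the principal directions of the centered data.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]
    coeffs = centered @ components.T
    return mean, components, coeffs


def sample_pca_labels(coeffs, n_samples=900, expand=1.2, seed=0):
    """Draw random PCA labels from an expanded coefficient range.

    ASSUMPTION: uniform sampling over the per-component range observed in
    the 4D-CT phases, widened by `expand`. The abstract only states that
    the amplitude was expanded and that sampling was random.
    """
    rng = np.random.default_rng(seed)
    lo, hi = coeffs.min(axis=0), coeffs.max(axis=0)
    center = (lo + hi) / 2.0
    half_width = (hi - lo) / 2.0 * expand
    return rng.uniform(center - half_width, center + half_width,
                       size=(n_samples, coeffs.shape[1]))


def synthesize_dvfs(mean, components, labels):
    """Map PCA labels back to flattened DVFs: mean + labels @ components."""
    return mean + labels @ components


# Toy stand-in for 10 respiratory phases on a 4x4x4 grid with 3 vector
# components per voxel (hypothetical sizes, not patient data).
rng = np.random.default_rng(42)
toy_dvfs = rng.normal(size=(10, 4 * 4 * 4 * 3))

mean, components, coeffs = fit_pca_motion_model(toy_dvfs)
labels = sample_pca_labels(coeffs)
simulated = synthesize_dvfs(mean, components, labels)
print(labels.shape, simulated.shape)
```

In the pipeline the abstract describes, each synthesized DVF would then deform the 0% phase volume and be forward projected to a DRR, and the network would be trained to regress the sampled labels from those DRRs. Those steps need an image warping and projection toolkit and are left out of this sketch.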