AUTHOR=Zhou Yongqiang , Gong Changfei , Jian Junming , Zhang Yun TITLE=Innovative patient-specific delivered-dose prediction for volumetric modulated arc therapy using lightweight Swin-Transformer JOURNAL=Frontiers in Oncology VOLUME=Volume 15 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/oncology/articles/10.3389/fonc.2025.1640685 DOI=10.3389/fonc.2025.1640685 ISSN=2234-943X ABSTRACT=BackgroundVolumetric modulated arc therapy (VMAT) necessitates rigorous pre-treatment patient-specific quality assurance (PSQA) to ensure dosimetric accuracy, yet conventional manual verification methods encounter time and labor constraints in clinical workflows. While deep learning (DL) models have advanced PSQA by automating metrics prediction, existing approaches relying on convolutional neural networks struggle to reconcile local feature extraction with global contextual awareness. This study aims to develop a novel lightweight DL framework that synergizes hierarchical spatial feature learning and computational efficiency to enhance VMAT-delivered dose (VTDose) prediction.MethodsWe propose a hybrid architecture featuring a novel hierarchical fusion framework that synergizes shifted-window self-attention with adaptive local-global feature interaction. (termed “STQA”). Specially, strategic replacement of Swin-Transformer blocks with ResNet residual modules in deep layers, coupled with depthwise separable attention mechanisms, enables 40% parameter reduction while preserving spatial resolution. The model was trained on multimodal inputs and evaluated against state-of-the-art methods using structural similarity index (SSIM), mean absolute error (MAE), root mean square error (RMSE), and gamma passing rate (GPR).ResultsVisual evaluation of VTDose and discrepancy maps across axial, coronal, and sagittal planes demonstrated enhanced fidelity of STQA to ground truth (GT). Quantitative analysis revealed superior performance of STQA across all evaluation metrics: SSIM=0.978, MAE=0.163, and RMSE= 0.416. GPR analysis confirmed clinical applicability, with STQA achieving 95.43%±3.41% agreement with GT (94.63%±2.84%).ConclusionsSTQA establishes a paradigm for efficient and accurate VTDose prediction. Its lightweight design, validated through multi-site clinical data, addresses critical limitations in current DL-based PSQA, offering a clinically viable solution to enhance radiotherapy PSQA workflows.