AUTHOR=Shi Yuxuan , Li Fang , Zhao Shuting , Yu Hongmeng , Chen Xinrong , Liu Quan TITLE=IAP-TransUNet: integration of the attention mechanism and pyramid pooling for medical image segmentation JOURNAL=Frontiers in Neurorobotics VOLUME=Volume 19 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/neurorobotics/articles/10.3389/fnbot.2025.1706626 DOI=10.3389/fnbot.2025.1706626 ISSN=1662-5218 ABSTRACT=Introduction: The combination of CNNs and Transformers has recently attracted much attention in medical image segmentation because of its superior performance. However, segmentation performance is limited by the local receptive field and static weights of CNN convolution operations, as well as by insufficient information exchange between local regions in the Transformer. Methods: To address these issues, this paper proposes IAP-TransUNet, a network that integrates attention mechanisms with pyramid pooling. First, an efficient channel attention mechanism is embedded into the CNN to extract more comprehensive image features. Then, a CBAM_ASPP module is introduced into the bottleneck layer to obtain multi-scale context information. Finally, to address the limitations of traditional convolution, depthwise separable convolution is used to make the network lightweight. Results: Experiments on the Synapse multi-organ segmentation dataset and the ACDC dataset showed that the proposed IAP-TransUNet achieved Dice similarity coefficients (DSCs) of 78.85% and 90.46%, respectively. Compared with the state-of-the-art method, the Hausdorff distance on the Synapse multi-organ segmentation dataset was reduced by 2.92%. 
For the ACDC dataset, the segmentation accuracy of the left ventricle, myocardium, and right ventricle improved by 0.14%, 1.89%, and 0.23%, respectively. Discussion: The experimental results demonstrate that the proposed network improves segmentation effectiveness and performs strongly on both CT and MRI data, which suggests its potential for generalization across different medical imaging modalities.
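
The Dice similarity coefficient (DSC) reported in the results measures the overlap between a predicted mask and a reference mask: twice the number of shared pixels of a class, divided by the total number of pixels of that class in both masks. The following is a minimal sketch of that computation in plain Python, not the authors' evaluation code. The function names `dice_coefficient` and `mean_dice`, the flat-list mask representation, and the choice to return 1.0 when a class is absent from both masks are assumptions made for illustration.

```python
from typing import Sequence


def dice_coefficient(pred: Sequence[int], target: Sequence[int], label: int) -> float:
    """Dice similarity coefficient for one class label.

    `pred` and `target` are flattened label maps of equal length
    (hypothetical representation; real pipelines usually use arrays).
    """
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    # Count pixels assigned to `label` in each mask and in both at once.
    pred_count = sum(1 for p in pred if p == label)
    target_count = sum(1 for t in target if t == label)
    overlap = sum(1 for p, t in zip(pred, target) if p == label and t == label)

    denominator = pred_count + target_count
    if denominator == 0:
        # Assumption: a class absent from both masks counts as a perfect match.
        return 1.0
    return 2.0 * overlap / denominator


def mean_dice(pred: Sequence[int], target: Sequence[int], labels: Sequence[int]) -> float:
    """Average the per-class Dice scores over the given foreground labels."""
    if not labels:
        raise ValueError("labels must not be empty")
    return sum(dice_coefficient(pred, target, label) for label in labels) / len(labels)


if __name__ == "__main__":
    # Toy 6-pixel example with background 0 and two foreground classes.
    prediction = [0, 1, 1, 2, 2, 0]
    reference = [0, 1, 2, 2, 2, 0]
    for cls in (1, 2):
        print(cls, dice_coefficient(prediction, reference, cls))
    print("mean", mean_dice(prediction, reference, [1, 2]))
```

Averaging per-class scores over the foreground labels, as `mean_dice` does, is one common way to summarize multi-organ results in a single figure; whether the paper averages per class, per case, or both is not stated in the abstract.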