AUTHOR=Wu Mengyao, Wang Xinrui, Ji Rongbiao, Li Yadong, Lv Zongyuan, Chen Li, Yang Jianping, Wang Canyu TITLE=SinGAN-CBAM: a multi-scale GAN with attention for few-shot plant disease image generation JOURNAL=Frontiers in Plant Science VOLUME=Volume 16 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2025.1703529 DOI=10.3389/fpls.2025.1703529 ISSN=1664-462X ABSTRACT=Introduction: To address the limited performance of tea and coffee disease identification models caused by scarce, low-quality image samples, this paper proposes SinGAN-CBAM, a few-shot multi-scale image generation method that aims to improve the detail fidelity and semantic usability of generated images. Methods: The research data were collected from the Kunming, Baoshan, and Pu’er regions of Yunnan Province, covering seven typical diseases affecting both tea and coffee plants. Using the SinGAN framework as the baseline, we incorporate the Convolutional Block Attention Module (CBAM), which applies both channel and spatial attention to strengthen the model’s ability to capture the texture, edges, and spatial distribution of diseased regions. Additionally, a SinGAN-SE model is constructed for comparison, to evaluate the improvement brought by channel-wise attention mechanisms. The generated images are validated through classification with a YOLOv8 model to assess their effectiveness in real-world recognition tasks. Results: Experimental results demonstrate that SinGAN-CBAM significantly outperforms GAN, Fast-GAN, and the original SinGAN on metrics such as SSIM, PSNR, and Tenengrad, exhibiting superior structural consistency and edge clarity when generating both tea and coffee disease images. Compared with SinGAN-SE, SinGAN-CBAM further improves the naturalness of texture details and lesion distribution, with particularly notable advantages for complex diseases such as rust and leaf miner infestations.
Downstream classification results indicate that the YOLOv8 model trained on data generated by SinGAN-CBAM achieves higher precision, recall, and F1-score than models trained on data from the other generators, with recognition performance on key categories approaching or exceeding 0.98. Discussion: This study validates the effectiveness of dual-dimensional (channel and spatial) attention mechanisms in enhancing the quality of agricultural few-shot image generation, providing a high-quality data augmentation solution for intelligent disease identification with promising practical applications.
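
The abstract names PSNR and Tenengrad as image-quality metrics and precision, recall, and F1-score as classification metrics, but does not say how they were computed. The following is a minimal sketch of their textbook definitions in plain Python, not the authors' evaluation code. The function names (`psnr`, `tenengrad`, `f1_score`) are hypothetical. The sketch also assumes 8-bit grayscale images stored as nested lists, and it uses the mean squared Sobel gradient as the Tenengrad score, which is one common variant among several.

```python
import math

def psnr(ref, img, max_val=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-sized grayscale images."""
    h, w = len(ref), len(ref[0])
    mse = sum((ref[y][x] - img[y][x]) ** 2
              for y in range(h) for x in range(w)) / (h * w)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)

def tenengrad(img):
    """Mean squared Sobel gradient magnitude over interior pixels.

    Assumption: this is one common Tenengrad variant; the image must be
    at least 3x3. Higher values indicate sharper edges.
    """
    h, w = len(img), len(img[0])
    total = 0.0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Horizontal Sobel response (right column minus left column).
            gx = (img[y - 1][x + 1] + 2 * img[y][x + 1] + img[y + 1][x + 1]) \
               - (img[y - 1][x - 1] + 2 * img[y][x - 1] + img[y + 1][x - 1])
            # Vertical Sobel response (bottom row minus top row).
            gy = (img[y + 1][x - 1] + 2 * img[y + 1][x] + img[y + 1][x + 1]) \
               - (img[y - 1][x - 1] + 2 * img[y - 1][x] + img[y - 1][x + 1])
            total += gx * gx + gy * gy
    return total / ((h - 2) * (w - 2))

def f1_score(tp, fp, fn):
    """Return (precision, recall, F1) from true/false positive and false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1
```

In practice these metrics would more likely come from an image-processing library. The pure-Python form is used here only to make each definition explicit.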