AUTHOR=Zhao Shilong , Lyu Jun , Liu Shuhua , Feng Zelin , Ling Heping , Jiao Jiabao , Ni Zhaoxin , Yang Baojun , Yao Qing , Luo Ju TITLE=A counting method of whiteflies on crop leave images captured by AR glasses based on segmentation and improved YOLOv11 models JOURNAL=Frontiers in Plant Science VOLUME=Volume 16 - 2025 YEAR=2025 URL=https://www.frontiersin.org/journals/plant-science/articles/10.3389/fpls.2025.1687282 DOI=10.3389/fpls.2025.1687282 ISSN=1664-462X ABSTRACT=The whitefly (Bemisia tabaci) is a globally distributed agricultural pest. Accurate monitoring of this species is crucial for early warning and efficient pest control, but traditional manual monitoring is subjective, inaccurate when populations are large, and difficult to trace. To address these problems, this paper proposes an automatic counting method for whitefly adults and late-instar nymphs, based on images acquired with augmented reality (AR) glasses and a segmentation-then-detection approach. Surveyors wearing AR glasses capture images of whiteflies on the undersides of crop leaves, and the images are transmitted to a server via Wi-Fi/5G. The system then runs the automatic whitefly counting model to enumerate the adults and late-instar nymphs, and the results can be viewed on both the AR glasses and mobile devices. The method uses Mask2Former-Leaf to segment the foreground primary leaf, minimizing the influence of non-primary leaf areas and background noise, and then detects tiny whitefly adults and late-instar nymphs in the high-resolution images with the YOLOv11-Whitefly detection model. This model integrates Slicing Aided Hyper Inference (SAHI), which substantially strengthens the feature representation of tiny objects by slicing large images into overlapping windows for both training and inference.
Furthermore, DyCM-C3K2 is introduced into the YOLOv11 backbone network; it enhances the detection of small whitefly targets by dynamically generating input-dependent convolutional kernels and injecting global contextual information into local convolution operations. In addition, a Multi-Branch Channel Re-Weighted Feature Pyramid Network (MCRFPN) is designed to replace the original neck network, optimizing the fusion of shallow and deep features. Compared with mainstream detection models such as YOLO, RTMDet, Cascade-CNN, DETR, and DINO, YOLOv11-Whitefly performs better, attaining an average recall of 86.20%, an average precision of 84.25%, and an mAP50 of 91.60% for whitefly adults and late-instar nymphs. To visualize whitefly infestation data, the paper also presents an intelligent whitefly survey system that provides on-site visualization of whitefly images together with the adult and late-instar nymph counts. This helps surveyors understand pest populations and make scientifically grounded control decisions.
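
The overlapping-window slicing that the abstract attributes to SAHI can be illustrated with a minimal Python sketch. It is not the authors' implementation and does not call the SAHI library. The names `slice_windows` and `to_image_coords`, the 640-pixel tile size, and the 128-pixel overlap are all assumptions chosen for illustration; the abstract does not state the values actually used. The sketch only computes window coordinates and maps a detection back to the full image. Merging duplicate detections in the overlap regions, which a real pipeline would also need, is omitted.

```python
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (x_min, y_min, x_max, y_max) in pixels


def _starts(length: int, window: int, step: int) -> List[int]:
    """Window start offsets along one axis so the whole length is covered."""
    if length <= window:
        return [0]
    starts = list(range(0, length - window + 1, step))
    # Add a final window flush with the far edge if the stride left a gap.
    if starts[-1] != length - window:
        starts.append(length - window)
    return starts


def slice_windows(width: int, height: int,
                  tile: int = 640, overlap: int = 128) -> List[Box]:
    """Return overlapping tile boxes covering a width x height image.

    tile and overlap are illustrative defaults (assumed), not paper values.
    """
    if not 0 <= overlap < tile:
        raise ValueError("overlap must satisfy 0 <= overlap < tile")
    step = tile - overlap  # stride between neighbouring windows
    return [
        (x, y, min(x + tile, width), min(y + tile, height))
        for y in _starts(height, tile, step)
        for x in _starts(width, tile, step)
    ]


def to_image_coords(box: Box, window: Box) -> Box:
    """Shift a box predicted inside a window back to full-image coordinates."""
    ox, oy = window[0], window[1]
    return (box[0] + ox, box[1] + oy, box[2] + ox, box[3] + oy)
```

Under these assumptions, each window would be passed to the detector at its native resolution, so a whitefly that spans only a few pixels of the full image occupies a larger share of the detector input. Detections are then shifted back with `to_image_coords` before counting.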