<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Plant Sci.</journal-id>
<journal-title>Frontiers in Plant Science</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Plant Sci.</abbrev-journal-title>
<issn pub-type="epub">1664-462X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpls.2022.850606</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Plant Science</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Tea Chrysanthemum Detection by Leveraging Generative Adversarial Networks and Edge Computing</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Qi</surname> <given-names>Chao</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="author-notes" rid="fn002"><sup>&#x2020;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1625856/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Gao</surname> <given-names>Junfeng</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="author-notes" rid="fn002"><sup>&#x2020;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1346691/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Chen</surname> <given-names>Kunjie</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Shu</surname> <given-names>Lei</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c002"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1309858/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Pearson</surname> <given-names>Simon</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/384688/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>College of Engineering, Nanjing Agricultural University</institution>, <addr-line>Nanjing</addr-line>, <country>China</country></aff>
<aff id="aff2"><sup>2</sup><institution>Lincoln Agri-Robotics Centre, Lincoln Institute for Agri-Food Technology, University of Lincoln</institution>, <addr-line>Lincoln</addr-line>, <country>United Kingdom</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Daobilige Su, China Agricultural University, China</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Marcin Wozniak, Silesian University of Technology, Poland; Saeed Hamood Alsamhi, Ibb University, Yemen</p></fn>
<corresp id="c001">&#x002A;Correspondence: Kunjie Chen, <email>kunjiechen@njau.edu.cn</email></corresp>
<corresp id="c002">Lei Shu, <email>lei.shu@njau.edu.cn</email></corresp>
<fn fn-type="equal" id="fn002"><p><sup>&#x2020;</sup>These authors have contributed equally to this work and share first authorship</p></fn>
<fn fn-type="other" id="fn004"><p>This article was submitted to Sustainable and Intelligent Phytoprotection, a section of the journal Frontiers in Plant Science</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>07</day>
<month>04</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>13</volume>
<elocation-id>850606</elocation-id>
<history>
<date date-type="received">
<day>07</day>
<month>01</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>09</day>
<month>03</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2022 Qi, Gao, Chen, Shu and Pearson.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Qi, Gao, Chen, Shu and Pearson</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>A high resolution dataset is one of the prerequisites for tea chrysanthemum detection with deep learning algorithms, and it is crucial for further developing a selective chrysanthemum harvesting robot. However, generating high resolution datasets of tea chrysanthemum in complex unstructured environments is challenging. In this context, we propose a novel tea chrysanthemum &#x2013; generative adversarial network (TC-GAN) that attempts to deal with this challenge. First, we designed a non-linear mapping network for untangling the features of the underlying latent code. Then, a customized regularization method was used to provide fine-grained control over the image details. Finally, a gradient diversion design with multi-scale feature extraction capability was adopted to optimize the training process. The proposed TC-GAN was compared with 12 state-of-the-art generative adversarial networks, showing that an optimal average precision (AP) of 90.09% was achieved with the generated images (512 &#x00D7; 512) on the developed TC-YOLO object detection model under the NVIDIA Tesla P100 GPU environment. Moreover, the detection model was deployed on the embedded NVIDIA Jetson TX2 platform with a 0.1 s inference time, and this edge computing device could be further developed into a perception system for selective chrysanthemum picking robots in the future.</p>
</abstract>
<kwd-group>
<kwd>tea chrysanthemum</kwd>
<kwd>generative adversarial network</kwd>
<kwd>deep learning</kwd>
<kwd>edge computing</kwd>
<kwd>NVIDIA Jetson TX2</kwd>
</kwd-group>
<contract-sponsor id="cn001">Nanjing Agricultural University<named-content content-type="fundref-id">10.13039/501100008562</named-content></contract-sponsor><contract-sponsor id="cn002">Nanjing Agricultural University<named-content content-type="fundref-id">10.13039/501100008562</named-content></contract-sponsor><contract-sponsor id="cn003">Lincoln University<named-content content-type="fundref-id">10.13039/501100004594</named-content></contract-sponsor>
<counts>
<fig-count count="8"/>
<table-count count="7"/>
<equation-count count="13"/>
<ref-count count="52"/>
<page-count count="16"/>
<word-count count="9555"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1" sec-type="intro">
<title>Introduction</title>
<p>Several studies have indicated that tea chrysanthemum has great commercial value (<xref ref-type="bibr" rid="B24">Liu et al., 2019</xref>). Besides, tea chrysanthemum offers a range of health benefits (<xref ref-type="bibr" rid="B43">Yue et al., 2018</xref>). For instance, it can considerably suppress carcinogenic activity and has significant anti-aging effects (<xref ref-type="bibr" rid="B49">Zheng et al., 2021</xref>). In the field, a tea chrysanthemum plant can present multiple flower heads that vary in growth stage and size. Tea chrysanthemums at the early flowering stage normally hold the best commercial value and health benefits, so they are mainly harvested manually at this stage, which is a labor-intensive and time-consuming process.</p>
<p>Rapid developments in artificial intelligence and robotics offer a new opportunity to automate this harvesting task and to deal with the current scarcity of skilled laborers (<xref ref-type="bibr" rid="B8">Dhaka et al., 2021</xref>; <xref ref-type="bibr" rid="B19">Kundu et al., 2021</xref>; <xref ref-type="bibr" rid="B22">Liu et al., 2021</xref>; <xref ref-type="bibr" rid="B41">Wieczorek et al., 2021</xref>). Hence, there is an urgent need to develop a selective harvesting robot, for which the perception system and the manipulator are the two key components. Many studies have shown that a high resolution image dataset has a profound impact on detection performance, as it contains fine-grained features for object recognition (<xref ref-type="bibr" rid="B52">Zhou et al., 2021</xref>). However, collecting a dataset of tea chrysanthemums presents inherent difficulties. Tea chrysanthemums normally mature once a year and have to be picked at the early flowering stage to maximize commercial value. Moreover, the early flowering stage is remarkably short, typically lasting from only 2 days to 1 week. Currently, there is no publicly available tea chrysanthemum dataset for developing a detection algorithm, which hinders building an intelligent selective harvesting robot and other intelligent phytoprotection equipment (<xref ref-type="bibr" rid="B3">Ansari et al., 2020</xref>; <xref ref-type="bibr" rid="B2">Alsamhi et al., 2021</xref>, <xref ref-type="bibr" rid="B1">2022</xref>), e.g., Internet of Things based solar insecticidal lamps. Therefore, building a good tea chrysanthemum dataset is an important first step.</p>
<p>Using classical data augmentation to expand datasets and balance categories was reported in <xref ref-type="bibr" rid="B38">Tran et al. (2021)</xref>. Nevertheless, classical data enhancement methods (e.g., rotation, translation, flipping, and scaling) allow only restricted feature diversity, prompting the use of generated data. Generated samples provide more variation and further enrich the dataset to improve training accuracy. Recent approaches address data generation by utilizing generative adversarial networks (GANs) (<xref ref-type="bibr" rid="B39">Wang et al., 2019</xref>). These methods use an encoder-decoder strategy to generate fake images that can be used to enrich the original dataset. GANs have shown impressive results by generating stunning fake images such as human faces (<xref ref-type="bibr" rid="B46">Zhao et al., 2019</xref>). However, GANs still suffer from non-negligible flaws. In our case, three issues need to be further investigated.</p>
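The limitation of classical augmentation noted above can be made concrete: every such transform merely rearranges the pixels that are already present, so no new texture or lighting appears. A minimal numpy sketch (the random array stands in for a real field photograph):

```python
import numpy as np

# A stand-in for one collected image (H x W x 3); real data would be photographs.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)

# Classical augmentations: each output is a deterministic rearrangement
# of the original pixels, so no genuinely new visual content appears.
rotated = np.rot90(img, k=1, axes=(0, 1))   # 90-degree rotation
flipped = img[:, ::-1, :]                   # horizontal flip
shifted = np.roll(img, shift=8, axis=1)     # crude translation (wrap-around)

# The pixel multiset (and hence the intensity histogram) is identical
# for every augmented copy:
for aug in (rotated, flipped, shifted):
    assert np.array_equal(np.sort(aug.ravel()), np.sort(img.ravel()))
```

Because the pixel content is unchanged, such copies add pose variation but nothing more, which is precisely the gap that generated samples aim to fill.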
<p>Issue 1: In the current agricultural field, GANs generate images with a maximum resolution of 256 &#x00D7; 256 pixels. This is not suitable for the chrysanthemum detection task, as low resolution images contain restricted information about environment-related features, which affects the robustness of the whole model. How to generate images that meet the resolution required by the tea chrysanthemum detection task is an issue requiring further exploration.</p>
<p>Issue 2: The traditional GAN feeds the latent code directly into the generative network, resulting in massive feature entanglement that directly limits the diversity of the generated chrysanthemum images. How to design a network structure that improves the diversity of the generated chrysanthemum images is an issue to be further explored.</p>
<p>Issue 3: The alternating optimization of generators and discriminators makes GANs prone to mode collapse and gradient vanishing during training, so how to achieve stable training is an issue to be further explored.</p>
<p>Based on these three issues, we propose a tea chrysanthemum &#x2013; generative adversarial network (TC-GAN) that can generate diverse images at 512 &#x00D7; 512 resolution and train stably. We decouple the latent code into intermediate vectors via a mapping network, giving control over the diversity of chrysanthemum features. We also apply path length regularization in the mapping network, leading to more reliable and consistent model behavior and making architectural exploration easier. In the generative network, we add stochastic variation after each convolutional layer to increase the diversity of the chrysanthemum images. Finally, we embed Res2Net into the generative network to better guide the gradient flow and alleviate mode collapse and gradient vanishing during training.</p>
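The decoupling step above can be sketched as follows: the latent code z is not fed to the synthesis network directly but first passed through a small MLP that produces an intermediate vector w. This is a minimal numpy illustration, not the trained TC-GAN; the 512-d width and the initialization scale are assumptions for the sketch:

```python
import numpy as np

def leaky_relu(x, alpha=0.2):
    return np.where(x > 0, x, alpha * x)

def mapping_network(z, weights, biases, alpha=0.2):
    """Non-linear mapping f: z -> w, implemented as a 4-layer MLP.

    Decoupling z into an intermediate vector w means the synthesis
    network is no longer tied to the sampling distribution of z,
    which is what reduces feature entanglement.
    """
    x = z / np.linalg.norm(z)          # normalize the latent code
    for W, b in zip(weights, biases):
        x = leaky_relu(x @ W + b, alpha)
    return x

# Hypothetical dimensions: a 512-d latent mapped through 4 layers of width 512.
rng = np.random.default_rng(42)
dims = [512, 512, 512, 512, 512]       # input + 4 layers
weights = [rng.normal(0, 0.02, (a, b)) for a, b in zip(dims[:-1], dims[1:])]
biases = [np.zeros(b) for b in dims[1:]]

z = rng.normal(size=512)
w = mapping_network(z, weights, biases)
assert w.shape == (512,)
```

In the full model, w additionally feeds learned affine transforms that produce per-layer styles, and path length regularization is applied on this mapping; both are omitted here for brevity.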
<p>In this article, our goal is to generate datasets that can be used for the tea chrysanthemum detection task. We tested the images generated by TC-GAN on several state-of-the-art object detection models, as well as on our own proposed detection model (TC-YOLO) (<xref ref-type="bibr" rid="B33">Qi et al., 2022</xref>). Moreover, for the subsequent development of an automated selective chrysanthemum picking robot, we tested the detection model trained on TC-GAN-generated images on a low-power embedded GPU platform, the NVIDIA Jetson TX2, as shown in <xref ref-type="fig" rid="F1">Figure 1</xref>.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>The results of testing tea chrysanthemum &#x2013; generative adversarial network (TC-GAN) on NVIDIA Jetson TX2. First, we used an HDMI cable to connect the laptop with the Jetson TX2 and ensured that the laptop and the Jetson TX2 were on the same wireless network. Then, the TC-YOLO model and the tea chrysanthemum dataset were embedded in the flashed Jetson TX2 for testing.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-850606-g001.tif"/>
</fig>
<p>The contributions of this article are as follows:</p>
<list list-type="simple">
<list-item>
<label>1.</label>
<p>High resolution (512 &#x00D7; 512) images of tea chrysanthemums with complex unstructured environments (illumination variations, occlusions, overlaps) were generated using the proposed TC-GAN model.</p>
</list-item>
<list-item>
<label>2.</label>
<p>Using the images generated by TC-GAN, we quantified the impact of five factors on the TC-YOLO model, i.e., (1) dataset size, (2) epoch number, (3) different data enhancement methods, (4) various object detection models, and (5) complex unstructured environments, and verified the superiority of the TC-GAN model by comparing it with several state-of-the-art GANs.</p>
</list-item>
<list-item>
<label>3.</label>
<p>TC-YOLO, developed from images generated by TC-GAN, was successfully deployed and tested in the edge device NVIDIA Jetson TX2.</p>
</list-item>
</list>
<p>The rest of this article is organized as follows. Section &#x201C;Related Work&#x201D; describes the research background. Section &#x201C;Materials and Methods&#x201D; depicts the proposed TC-GAN structure. Section &#x201C;Results&#x201D; presents the experimental details. Section &#x201C;Discussion&#x201D; describes the contribution of this article and the limitations of the research, as well as pointing out possible future solutions. Section &#x201C;Conclusion&#x201D; gives a concise summary of this article.</p>
</sec>
<sec id="S2">
<title>Related Work</title>
<p>Some GANs have emerged in response to the aforementioned Issue 3. Conditional Generative Adversarial Net (CGAN) (<xref ref-type="bibr" rid="B21">Liu et al., 2020</xref>) can strengthen the robustness of the model by applying conditional variables to the generator and discriminator, which alleviates mode collapse. Deep convolutional generative adversarial networks (DCGAN) (<xref ref-type="bibr" rid="B16">Jeon and Lee, 2021</xref>), the first GAN architecture based on convolutional neural networks, demonstrates a stable training process that effectively mitigates mode collapse and gradient vanishing, but suffers from low quality and inadequate diversity of the generated images. Wasserstein GAN (WGAN) (<xref ref-type="bibr" rid="B51">Zhou et al., 2022</xref>) uses the Wasserstein distance as an alternative to the Jensen-Shannon (JS) divergence for comparing distributions, producing better gradients and improving training stability. Nevertheless, WGAN has difficulty converging because of its weight clipping, which can lead to sub-optimal performance.</p>
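The WGAN critic objective and the weight clipping mentioned above can be sketched numerically; the linear critic and the clipping constant 0.01 below are illustrative stand-ins, not the cited implementations:

```python
import numpy as np

def critic(x, W):
    """A linear 'critic' standing in for the WGAN discriminator."""
    return x @ W

def wgan_critic_loss(real, fake, W):
    # The critic maximizes E[D(real)] - E[D(fake)]; written as a loss
    # to minimize, the signs flip:
    return np.mean(critic(fake, W)) - np.mean(critic(real, W))

def clip_weights(W, c=0.01):
    # Weight clipping is WGAN's crude way of enforcing the Lipschitz
    # constraint the Wasserstein formulation requires -- the step that
    # makes vanilla WGAN hard to tune and motivates the gradient
    # penalty of WGAN-GP.
    return np.clip(W, -c, c)

rng = np.random.default_rng(1)
W = rng.normal(size=(8,))
real = rng.normal(loc=1.0, size=(32, 8))   # samples from the data distribution
fake = rng.normal(loc=-1.0, size=(32, 8))  # samples from the generator

loss = wgan_critic_loss(real, fake, W)
W = clip_weights(W)
assert np.all(np.abs(W) <= 0.01)
```

Clipping every weight into a narrow box restricts the critic's capacity, which is one intuition for the convergence difficulties noted above.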
<p>Conditional Generative Adversarial Net, DCGAN, and WGAN have had a profound impact on the development of GANs. Moreover, with the development of deep learning techniques, some high-performance GANs have emerged that mitigate mode collapse and gradient vanishing, resulting in stable training. Specific details are shown in <xref ref-type="table" rid="T1">Table 1</xref>. We compare these models with the proposed TC-GAN in section &#x201C;Results.&#x201D;</p>
<table-wrap position="float" id="T1">
<label>TABLE 1</label>
<caption><p>Details of the twelve latest generative adversarial networks.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Algorithm</td>
<td valign="top" align="center">Published year</td>
<td valign="top" align="center">Characteristic</td>
<td valign="top" align="center">Resolution</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Progressive GAN (<xref ref-type="bibr" rid="B7">Collier et al., 2018</xref>)</td>
<td valign="top" align="center">2017</td>
<td valign="top" align="center">Grow the generator and discriminator progressively</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
</tr>
<tr>
<td valign="top" align="left">LSGAN (<xref ref-type="bibr" rid="B26">Mao et al., 2019</xref>)</td>
<td valign="top" align="center">2017</td>
<td valign="top" align="center">Applying the least squares loss function</td>
<td valign="top" align="center">112 &#x00D7; 112</td>
</tr>
<tr>
<td valign="top" align="left">SN-GAN (<xref ref-type="bibr" rid="B28">Mufti et al., 2019</xref>)</td>
<td valign="top" align="center">2018</td>
<td valign="top" align="center">Applying spectral normalization</td>
<td valign="top" align="center">32 &#x00D7; 32</td>
</tr>
<tr>
<td valign="top" align="left">MGAN (<xref ref-type="bibr" rid="B13">He et al., 2019</xref>)</td>
<td valign="top" align="center">2018</td>
<td valign="top" align="center">Applying multi-channel gait templates</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
</tr>
<tr>
<td valign="top" align="left">Dist-GAN (<xref ref-type="bibr" rid="B37">Tran et al., 2018</xref>)</td>
<td valign="top" align="center">2018</td>
<td valign="top" align="center">Applying a latent-data distance constraint</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
</tr>
<tr>
<td valign="top" align="left">Rob-GAN (<xref ref-type="bibr" rid="B23">Liu and Hsieh, 2019</xref>)</td>
<td valign="top" align="center">2019</td>
<td valign="top" align="center">Jointly optimize generator and discriminator</td>
<td valign="top" align="center">128 &#x00D7; 128</td>
</tr>
<tr>
<td valign="top" align="left">AutoGAN (<xref ref-type="bibr" rid="B12">Gong et al., 2019</xref>)</td>
<td valign="top" align="center">2019</td>
<td valign="top" align="center">Applying NAS algorithm</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
</tr>
<tr>
<td valign="top" align="left">BigGAN (<xref ref-type="bibr" rid="B34">Qiao et al., 2020</xref>)</td>
<td valign="top" align="center">2018</td>
<td valign="top" align="center">Applying orthogonal regularization</td>
<td valign="top" align="center">512 &#x00D7; 512</td>
</tr>
<tr>
<td valign="top" align="left">Improved WGAN (<xref ref-type="bibr" rid="B42">Yang et al., 2020</xref>)</td>
<td valign="top" align="center">2020</td>
<td valign="top" align="center">Injecting an instance noise</td>
<td valign="top" align="center">128 &#x00D7; 128</td>
</tr>
<tr>
<td valign="top" align="left">Improved WGAN-GP (<xref ref-type="bibr" rid="B17">Kim et al., 2021</xref>)</td>
<td valign="top" align="center">2021</td>
<td valign="top" align="center">Wasserstein GAN with gradient penalty</td>
<td valign="top" align="center">28 &#x00D7; 28</td>
</tr>
<tr>
<td valign="top" align="left">Improved DCGAN (<xref ref-type="bibr" rid="B6">Chao et al., 2021</xref>)</td>
<td valign="top" align="center">2021</td>
<td valign="top" align="center">Applying batch normalization</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
</tr>
<tr>
<td valign="top" align="left">DAG (<xref ref-type="bibr" rid="B38">Tran et al., 2021</xref>)</td>
<td valign="top" align="center">2021</td>
<td valign="top" align="center">Improve learning of the original distribution</td>
<td valign="top" align="center">48 &#x00D7; 48</td>
</tr>
</tbody>
</table></table-wrap>
<p>We collated the available literature on image recognition using GANs in agriculture, with a particular focus on the resolution of the generated images and the complex unstructured environments they depict, as shown in <xref ref-type="table" rid="T2">Table 2</xref>. High resolution images contain finer-grained features and more complex unstructured environments, facilitating the extraction of abundant image features for robust detection results. High resolution images also make transfer learning easier, and current object detection frameworks typically require datasets with a resolution higher than 416 &#x00D7; 416 (<xref ref-type="bibr" rid="B20">Liu and Wang, 2020</xref>). Furthermore, summarizing the GANs in <xref ref-type="table" rid="T1">Tables 1</xref>, <xref ref-type="table" rid="T2">2</xref>, several structural improvements are needed. First, the latent codes (input vectors) in these GANs are fed directly into the generator network. This design constrains the mapping from latent codes to specific visual features, because the mapping must follow the probability density of the input data; as a result, some latent codes cannot be mapped to features, producing feature entanglement. Through a custom mapping network, the proposed model generates intermediate vectors without considering the input data distribution, reducing the correlation between different features. Second, multi-scale extraction and feature fusion can effectively guide the gradient flow, but the structures of the GANs in <xref ref-type="table" rid="T1">Tables 1</xref>, <xref ref-type="table" rid="T2">2</xref> are designed mainly around normalization approaches, loss functions, control variables and the mapping relationships between generators and discriminators. The structure of current GANs lacks a design for multi-scale extraction and feature fusion, whereas the generator of the proposed model is built around their combination.</p>
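The multi-scale extraction and feature fusion idea can be sketched with a simplified, Res2Net-style channel split; the box filter below is a stand-in for a learned 3&#x00D7;3 convolution, and the group count of 4 is illustrative:

```python
import numpy as np

def conv3x3_mean(x):
    """Stand-in for a learned 3x3 convolution: a 3x3 box filter per channel."""
    p = np.pad(x, ((1, 1), (1, 1), (0, 0)), mode="edge")
    out = np.zeros_like(x, dtype=float)
    for dy in range(3):
        for dx in range(3):
            out += p[dy:dy + x.shape[0], dx:dx + x.shape[1], :]
    return out / 9.0

def res2net_style_block(x, scales=4):
    """Hierarchical multi-scale processing in the spirit of Res2Net.

    Channels are split into `scales` groups; each group after the first
    is filtered together with the output of the previous group, so later
    groups see progressively larger receptive fields, and the results
    are fused by concatenation.
    """
    groups = np.split(x.astype(float), scales, axis=-1)
    outs = [groups[0]]                     # first split passes through
    prev = None
    for g in groups[1:]:
        inp = g if prev is None else g + prev
        prev = conv3x3_mean(inp)
        outs.append(prev)
    return np.concatenate(outs, axis=-1)   # feature fusion

x = np.arange(8 * 8 * 8, dtype=float).reshape(8, 8, 8)
y = res2net_style_block(x, scales=4)
assert y.shape == x.shape
```

The hierarchical connections give each group a shortcut back to earlier groups, which is how this kind of design helps guide the gradient flow during training.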
<table-wrap position="float" id="T2">
<label>TABLE 2</label>
<caption><p>Available literature using GAN for image recognition in agriculture.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Algorithm</td>
<td valign="top" align="center">Published year</td>
<td valign="top" align="center">Task</td>
<td valign="top" align="center">Accuracy (%)</td>
<td valign="top" align="center">Resolution</td>
<td valign="top" align="center">Test environment</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">DCGAN (<xref ref-type="bibr" rid="B11">Gandhi et al., 2018</xref>)</td>
<td valign="top" align="center">2018</td>
<td valign="top" align="center">Plant disease detection</td>
<td valign="top" align="center">88.6</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
<td valign="top" align="center">Illumination</td>
</tr>
<tr>
<td valign="top" align="left">C-DCGAN (<xref ref-type="bibr" rid="B14">Hu et al., 2019</xref>)</td>
<td valign="top" align="center">2019</td>
<td valign="top" align="center">Tea leaf&#x2019;s disease identification</td>
<td valign="top" align="center">90</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
<td valign="top" align="center">Illumination</td>
</tr>
<tr>
<td valign="top" align="left">DCGAN (<xref ref-type="bibr" rid="B9">Douarre et al., 2019</xref>)</td>
<td valign="top" align="center">2019</td>
<td valign="top" align="center">Apple scab segmentation</td>
<td valign="top" align="center">60</td>
<td valign="top" align="center">28 &#x00D7; 28</td>
<td valign="top" align="center">Ideal</td>
</tr>
<tr>
<td valign="top" align="left">CycleGAN (<xref ref-type="bibr" rid="B32">Padilla-Medina et al., 2019</xref>)</td>
<td valign="top" align="center">2019</td>
<td valign="top" align="center">Detection of apple lesions in orchards</td>
<td valign="top" align="center">95.57</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
<td valign="top" align="center">Ideal</td>
</tr>
<tr>
<td valign="top" align="left">DCGAN (<xref ref-type="bibr" rid="B4">Bian et al., 2019</xref>)</td>
<td valign="top" align="center">2019</td>
<td valign="top" align="center">Tea clones identifications</td>
<td valign="top" align="center">76</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
<td valign="top" align="center">Ideal</td>
</tr>
<tr>
<td valign="top" align="left">Deep CORAL (<xref ref-type="bibr" rid="B27">Marino et al., 2020</xref>)</td>
<td valign="top" align="center">2020</td>
<td valign="top" align="center">Potato defects classification</td>
<td valign="top" align="center">90</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
<td valign="top" align="center">Ideal</td>
</tr>
<tr>
<td valign="top" align="left">CAAE (<xref ref-type="bibr" rid="B50">Zhong et al., 2020</xref>)</td>
<td valign="top" align="center">2020</td>
<td valign="top" align="center">Citrus plant diseases recognition</td>
<td valign="top" align="center">53.4</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
<td valign="top" align="center">Illumination</td>
</tr>
<tr>
<td valign="top" align="left">DCGAN (<xref ref-type="bibr" rid="B29">Nafi and Hsu, 2020</xref>)</td>
<td valign="top" align="center">2020</td>
<td valign="top" align="center">Plant disease detection</td>
<td valign="top" align="center">86.63</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
<td valign="top" align="center">Ideal</td>
</tr>
<tr>
<td valign="top" align="left">BEGAN (<xref ref-type="bibr" rid="B25">Luo et al., 2020</xref>)</td>
<td valign="top" align="center">2020</td>
<td valign="top" align="center">Pine cone detection</td>
<td valign="top" align="center">95.3</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
<td valign="top" align="center">Ideal</td>
</tr>
<tr>
<td valign="top" align="left">CGAN (<xref ref-type="bibr" rid="B31">Olatunji et al., 2020</xref>)</td>
<td valign="top" align="center">2020</td>
<td valign="top" align="center">Kiwi geometry reconstruction</td>
<td valign="top" align="center">75</td>
<td valign="top" align="center">28 &#x00D7; 28</td>
<td valign="top" align="center">Ideal</td>
</tr>
<tr>
<td valign="top" align="left">DCGAN (<xref ref-type="bibr" rid="B36">Talukdar, 2020</xref>)</td>
<td valign="top" align="center">2020</td>
<td valign="top" align="center">Plant disease classification</td>
<td valign="top" align="center">95.88</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
<td valign="top" align="center">Ideal</td>
</tr>
<tr>
<td valign="top" align="left">DCGAN (<xref ref-type="bibr" rid="B15">Hu et al., 2020</xref>)</td>
<td valign="top" align="center">2020</td>
<td valign="top" align="center">Recognition of diseased pinus trees</td>
<td valign="top" align="center">78.6</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
<td valign="top" align="center">Ideal</td>
</tr>
<tr>
<td valign="top" align="left">TasselGAN (<xref ref-type="bibr" rid="B35">Shete et al., 2020</xref>)</td>
<td valign="top" align="center">2020</td>
<td valign="top" align="center">Plant traits detection</td>
<td valign="top" align="center">94</td>
<td valign="top" align="center">128 &#x00D7; 128</td>
<td valign="top" align="center">Illumination</td>
</tr>
<tr>
<td valign="top" align="left">CycleGAN (<xref ref-type="bibr" rid="B47">Zhao et al., 2021a</xref>)</td>
<td valign="top" align="center">2021</td>
<td valign="top" align="center">Bale detection</td>
<td valign="top" align="center">93</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
<td valign="top" align="center">Ideal</td>
</tr>
<tr>
<td valign="top" align="left">DCGAN (<xref ref-type="bibr" rid="B10">Espejo-Garcia et al., 2021</xref>)</td>
<td valign="top" align="center">2021</td>
<td valign="top" align="center">Weeds identification</td>
<td valign="top" align="center">93.23</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
<td valign="top" align="center">Ideal</td>
</tr>
<tr>
<td valign="top" align="left">DoubleGAN (<xref ref-type="bibr" rid="B48">Zhao et al., 2021b</xref>)</td>
<td valign="top" align="center">2021</td>
<td valign="top" align="center">Plant disease detection</td>
<td valign="top" align="center">99.06</td>
<td valign="top" align="center">64 &#x00D7; 64</td>
<td valign="top" align="center">Ideal</td>
</tr>
<tr>
<td valign="top" align="left">AR-GAN (<xref ref-type="bibr" rid="B30">Nazki et al., 2020</xref>)</td>
<td valign="top" align="center">2020</td>
<td valign="top" align="center">Plant disease recognition</td>
<td valign="top" align="center">86.1</td>
<td valign="top" align="center">256 &#x00D7; 256</td>
<td valign="top" align="center">Illumination</td>
</tr>
</tbody>
</table></table-wrap>
</sec>
<sec id="S3" sec-type="materials|methods">
<title>Materials and Methods</title>
<sec id="S3.SS1">
<title>Datasets</title>
<p>The tea chrysanthemum dataset utilized in this article was collected from October 2019 to October 2020 in Sheyang County, Dongzhi County and Nanjing Agricultural University, China. All images were collected with an Apple iPhone X at a resolution of 1080 &#x00D7; 1920. The images were captured in natural light under three environments: illumination variation, overlap and occlusion. The chrysanthemums in the dataset comprise three flowering stages: the bud stage, when the petals are not yet open; the early flowering stage, when the petals are partially open; and the full bloom stage, when the petals are fully open. Three examples of the original images are shown in <xref ref-type="fig" rid="F2">Figure 2</xref>.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>Examples of the collected original images.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-850606-g002.tif"/>
</fig>
</sec>
<sec id="S3.SS2">
<title>NVIDIA Jetson TX2</title>
<p>There is no need to transmit all gathered image data back to the cloud for further processing, since the communication environment in the countryside is generally unstable and a long time delay is unacceptable for smart equipment such as a chrysanthemum picking robot. The NVIDIA Jetson TX2 has a 6-core ARMv8 64-bit CPU complex and a 256-core NVIDIA Pascal architecture GPU. The CPU complex consists of a dual-core Denver2 processor and a quad-core ARM Cortex-A57, paired with 8 GB of LPDDR4 memory on a 128-bit interface, making it ideal for applications requiring low power and high computational performance. Thus, this edge computing device was chosen to design and implement a real-time object detection system. The NVIDIA Jetson TX2 is introduced in <xref ref-type="fig" rid="F3">Figure 3</xref>.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p>NVIDIA Jetson TX2 parameters.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-850606-g003.tif"/>
</fig>
</sec>
<sec id="S3.SS3">
<title>Architecture</title>
<p>The proposed TC-GAN comprises a generator and a discriminator. In the generator, the non-linear mapping network <italic>f</italic> is implemented as a 4-layer multilayer perceptron (MLP), and path length regularization is applied to decorrelate neighboring features for finer-grained control of the generated images. A learned affine transform then specializes <italic>w</italic> to the style <italic>y</italic> = (<italic>y</italic><sub><italic>s</italic></sub>, <italic>y</italic><sub><italic>b</italic></sub>), which controls the Adaptive Instance Normalization (AdaIN) operation after each convolutional layer of the synthesis network <italic>g</italic>; a Res2Net module follows to better guide the gradient flow without increasing the computational workload of the network. Finally, we introduce noise inputs that enable the generator to produce stochastic detail. A dedicated noise image is injected into each layer (4<sup>2</sup>&#x2013;512<sup>2</sup>) of the generator network; these are single-channel images consisting of Gaussian noise. Each noise image is broadcast to all feature maps with a learned per-channel scaling factor and added to the output of the corresponding convolution. Leaky ReLU is employed as the activation function throughout the generator. In the discriminator, the generated 512 &#x00D7; 512 image and a real image of the same resolution are fed into the discriminator network simultaneously and mapped down to 4 &#x00D7; 4 by convolution. Throughout this convolution process, several modules are inserted, including CL (Convolution + Leaky ReLU) and CBL (Convolution + Batch Normalization + Leaky ReLU). Notably, since GAN training tends to be unstable, no extra modules are inserted to guide the gradient flow, keeping the overall discriminator network as simple as possible. In addition, because of the weak gradient flow in the lower layers, no BN module is inserted in those convolutions. Leaky ReLU is utilized as the activation function throughout the discriminator. Moreover, both the generator and the discriminator employ the Wasserstein distance with gradient penalty as the loss function. The structure of TC-GAN is shown in <xref ref-type="fig" rid="F4">Figure 4</xref>.</p>
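<p>The style-modulation step described above can be sketched in a few lines. The following NumPy snippet is an illustrative toy of the AdaIN operation, with per-channel style parameters <italic>y</italic><sub><italic>s</italic></sub>, <italic>y</italic><sub><italic>b</italic></sub>; the function name and shapes are assumptions for exposition, not the authors' code.</p>

```python
import numpy as np

def adain(x, y_s, y_b, eps=1e-5):
    """Adaptive Instance Normalization: normalize each feature map of x
    to zero mean and unit variance, then scale by y_s and shift by y_b.
    x: (C, H, W) feature maps; y_s, y_b: per-channel style vectors (C,)."""
    mu = x.mean(axis=(1, 2), keepdims=True)
    sigma = x.std(axis=(1, 2), keepdims=True)
    x_norm = (x - mu) / (sigma + eps)
    return y_s[:, None, None] * x_norm + y_b[:, None, None]

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8, 8))
out = adain(x, y_s=np.full(4, 2.0), y_b=np.full(4, 0.5))
# After AdaIN, each channel has approximately mean y_b and std y_s.
```

In a style-based generator this operation runs after every convolution of the synthesis network, so the affine-transformed <italic>w</italic> steers the statistics of every feature map.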
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption><p>Structure of the proposed TC-GAN network. Mapping network can effectively capture the location of potential codes with rich features, benefiting the generator network to accurately extract complex unstructured features. A represents the learned affine transformation. B denotes the learned per-channel scaling factor applied to the noisy input. Discriminator network is designed to guide the training of the generator network, which is continuously confronted by alternating training between the two networks, ultimately enabling the generator network to better execute the generation task.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-850606-g004.tif"/>
</fig>
</sec>
<sec id="S3.SS4">
<title>Mapping Network</title>
<p>The mapping network consists of four fully connected layers that map the latent space <italic>z</italic> to the intermediate latent space <italic>w</italic> via affine transformations. <xref ref-type="fig" rid="F4">Figure 4</xref> depicts the structure of the mapping network. To capture the locations of latent codes with rich features, the network encourages feature-based localization. A mixing regularization strategy is adopted: during training, a proportion of images is generated from two random latent codes instead of one. When generating such an image, we simply switch from one latent code to the other at a randomly selected point in the generative network. Specifically, the two latent codes <italic>z</italic><sub>1</sub>, <italic>z</italic><sub>2</sub> are both processed by the mapping network, and the corresponding <italic>w</italic><sub>1</sub>, <italic>w</italic><sub>2</sub> control the styles so that <italic>w</italic><sub>1</sub> applies before the crossover point and <italic>w</italic><sub>2</sub> after it. This regularization strategy prevents neighboring features from becoming correlated. Furthermore, drawing latent vectors from a truncated or shrunken sampling space improves the average quality of the generated images, although some diversity in the generated images is lost. Based on this, we adopt a similar approach. First, after training, intermediate vectors are generated in the mapping network by feeding in random inputs, and the center of mass of these vectors is computed:</p>
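<p>The mixing regularization described above amounts to choosing a crossover layer and feeding a different intermediate code to the layers on either side of it. The following NumPy sketch illustrates the mechanism under stated assumptions: the mapping network is replaced by a trivial stand-in, and the crossover point is passed explicitly rather than sampled.</p>

```python
import numpy as np

rng = np.random.default_rng(1)

def mapping(z):
    """Illustrative stand-in for the trained 4-layer MLP f: z -> w."""
    return np.tanh(z)

def mixed_styles(z1, z2, n_layers, crossover):
    """Mixing regularization: w1 drives the synthesis layers before the
    crossover point, w2 drives the layers after it."""
    w1, w2 = mapping(z1), mapping(z2)
    return [w1 if i < crossover else w2 for i in range(n_layers)]

styles = mixed_styles(rng.normal(size=8), rng.normal(size=8),
                      n_layers=7, crossover=3)
# styles[0..2] come from z1, styles[3..6] from z2.
```

Because adjacent layers can now be driven by independent codes, the network cannot rely on neighboring styles being correlated.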
<disp-formula id="S3.E1">
<label>(1)</label>
<mml:math id="M1">
<mml:mrow>
<mml:mover accent="true">
<mml:mi>w</mml:mi>
<mml:mo stretchy="false">&#x00AF;</mml:mo>
</mml:mover>
<mml:mo>=</mml:mo>
<mml:msub>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mi>z</mml:mi>
<mml:mo>&#x223C;</mml:mo>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>z</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <inline-formula><mml:math id="INEQ1"><mml:mover accent="true"><mml:mi>w</mml:mi><mml:mo stretchy="false">&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> stands for the center of mass and <italic>z</italic> denotes the latent space.</p>
<p>We can then scale the deviation of a given <italic>w</italic> from the center as:</p>
<disp-formula id="S3.E2">
<label>(2)</label>
<mml:math id="M2">
<mml:mrow>
<mml:msup>
<mml:mi>w</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msup>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>w</mml:mi>
<mml:mo stretchy="false">&#x00AF;</mml:mo>
</mml:mover>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">&#x03C8;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo>-</mml:mo>
<mml:mover accent="true">
<mml:mi>w</mml:mi>
<mml:mo stretchy="false">&#x00AF;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>w</italic>&#x2032; refers to the truncated <italic>w</italic> and &#x03C8; is the truncation coefficient that scales the deviation of the intermediate vector from the center of mass.</p>
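<p>Equations (1) and (2) can be realized directly: estimate the center of mass by averaging many mapped samples, then pull any given <italic>w</italic> toward it. The snippet below is a minimal sketch, assuming a toy stand-in for the trained mapping network and an illustrative &#x03C8; = 0.7.</p>

```python
import numpy as np

rng = np.random.default_rng(2)

def mapping(z):
    """Illustrative stand-in for the trained mapping network f."""
    return np.tanh(z)

# Eq. (1): estimate the center of mass  w_bar = E_{z~P(z)}[f(z)]
w_bar = np.mean([mapping(rng.normal(size=16)) for _ in range(10000)], axis=0)

# Eq. (2): truncate a given w toward the center of mass
def truncate(w, w_bar, psi=0.7):
    return w_bar + psi * (w - w_bar)

w = mapping(rng.normal(size=16))
w_prime = truncate(w, w_bar, psi=0.7)
# psi < 1 pulls w toward w_bar, trading diversity for sample quality.
```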
</sec>
<sec id="S3.SS5">
<title>Stochastic Variation</title>
<p>In a traditional generator, the sole input is through the input layer, so the network must derive spatially varying pseudo-random numbers from earlier activations whenever stochastic detail is needed. This consumes network capacity and makes it difficult to hide the periodicity of the generated signal, rendering the whole generation process unstable. To address this challenge, we inject noise at each convolutional layer. In a style-based generator network, the entire feature map is scaled and biased with the same values, so global effects such as shape, illumination or background style are controlled consistently. The noise, by contrast, is applied to each pixel independently and is therefore well suited to controlling stochastic variation. If the generative network attempted to use the noise to control global aspects, the resulting spatially inconsistent decisions would be penalized by the discriminator. Accordingly, TC-GAN learns to use the global and local channels appropriately without explicit guidance.</p>
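<p>The per-layer noise input reduces to one broadcast-and-scale operation. The following NumPy sketch shows it under stated assumptions: <code>scale</code> stands in for the learned per-channel scaling factor, and the feature maps are zeros so the effect of the noise is easy to see.</p>

```python
import numpy as np

rng = np.random.default_rng(3)

def inject_noise(x, scale, rng):
    """Per-layer noise input: a single-channel Gaussian noise image is
    broadcast to all feature maps, weighted by a learned per-channel
    scaling factor (`scale` here), and added to the conv output x."""
    c, h, w = x.shape
    noise = rng.normal(size=(1, h, w))        # one noise image per layer
    return x + scale[:, None, None] * noise   # broadcast over channels

x = np.zeros((4, 8, 8))
out = inject_noise(x, scale=np.array([0.0, 0.1, 0.5, 1.0]), rng=rng)
# Channel 0 (scale 0.0) is untouched; the others share the same noise
# image, each weighted by its own factor.
```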
</sec>
<sec id="S3.SS6">
<title>Path Length Regularization</title>
<p>Path length regularization makes the network more reliable and makes architectural exploration easier. Specifically, we encourage fixed-size steps in <italic>W</italic> to produce non-zero, fixed-magnitude changes in the image. The deviation from this ideal is measured by observing the corresponding gradients with respect to <italic>W</italic> in random directions; these gradients should have nearly equal lengths regardless of <italic>w</italic> or the image-space direction, which indicates that the mapping from latent space to image space is well-conditioned.</p>
<p>At a single <italic>w</italic> &#x2208; <italic>W</italic>, the local metric scaling properties of the generator mapping <italic>g</italic>(<italic>w</italic>): <italic>W</italic> &#x2192; <italic>Y</italic> are captured by the Jacobian matrix <italic>J</italic><sub><italic>w</italic></sub> = &#x2202;&#x2061;<italic>g</italic>(<italic>w</italic>)/&#x2202;&#x2061;<italic>w</italic>. Since we wish to preserve the expected lengths of vectors regardless of their direction, we formulate the regularizer as:</p>
<disp-formula id="S3.E3">
<label>(3)</label>
<mml:math id="M3">
<mml:mrow>
<mml:msub>
<mml:mi>E</mml:mi>
<mml:mrow>
<mml:mrow>
<mml:mi>w</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mo>&#x223C;</mml:mo>
<mml:mrow>
<mml:mpadded width="+2.8pt">
<mml:mi>N</mml:mi>
</mml:mpadded>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi>I</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo>&#x2225;</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>J</mml:mi>
<mml:mi>w</mml:mi>
<mml:mi>T</mml:mi>
</mml:msubsup>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mo>&#x2225;</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>y</italic> is a random image with normally distributed pixel intensities, and <italic>w</italic> &#x223C; <italic>f</italic>(<italic>z</italic>), where <italic>z</italic> is normally distributed. In high dimensions, this prior is minimized when <italic>J</italic><sub><italic>w</italic></sub> is orthogonal at any <italic>w</italic>. An orthogonal matrix preserves lengths and introduces no squeezing along any dimension.</p>
<p>This prior is minimized when the inner expectation over <italic>y</italic> attains its minimum at each latent space point <italic>w</italic> separately, so we start from the inner expectation:</p>
<disp-formula id="S3.E4">
<label>(4)</label>
<mml:math id="M4">
<mml:mrow>
<mml:msub>
<mml:mi class="ltx_font_mathcaligraphic">&#x2112;</mml:mi>
<mml:mi>w</mml:mi>
</mml:msub>
<mml:mo rspace="8.1pt">:</mml:mo>
<mml:mo>=</mml:mo>
<mml:mpadded width="+2.8pt">
<mml:msub>
<mml:mi>E</mml:mi>
<mml:mi>y</mml:mi>
</mml:msub>
</mml:mpadded>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mo>&#x2225;</mml:mo>
<mml:msubsup>
<mml:mi>J</mml:mi>
<mml:mi>w</mml:mi>
<mml:mi>T</mml:mi>
</mml:msubsup>
<mml:mi>y</mml:mi>
<mml:msub>
<mml:mo>&#x2225;</mml:mo>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:mi>a</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:math>
</disp-formula>
<p>We use the singular value decomposition <inline-formula><mml:math id="INEQ5"><mml:mrow><mml:mpadded width="+5.6pt"><mml:msubsup><mml:mi>J</mml:mi><mml:mi>w</mml:mi><mml:mi>T</mml:mi></mml:msubsup></mml:mpadded><mml:mo>=</mml:mo><mml:mrow><mml:mi>U</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mover accent="true"><mml:mi mathvariant="normal">&#x03A3;</mml:mi><mml:mo stretchy="false">~</mml:mo></mml:mover><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>V</mml:mi><mml:mi>T</mml:mi></mml:msup></mml:mrow></mml:mrow></mml:math></inline-formula> for analysis, where <italic>U</italic> &#x2208; &#x211D;<italic><sup>L</sup></italic><sup>&#x00D7;<italic>L</italic></sup> and <italic>V</italic> &#x2208; &#x211D;<italic><sup>M</sup></italic><sup>&#x00D7;<italic>M</italic></sup> are orthogonal matrices. Since rotating a unit normal random variable by an orthogonal matrix leaves its distribution invariant, the equation simplifies to:</p>
<disp-formula id="S3.E5">
<label>(5)</label>
<mml:math id="M5">
<mml:mrow>
<mml:mpadded width="+5.6pt">
<mml:msub>
<mml:mi class="ltx_font_mathcaligraphic">&#x2112;</mml:mi>
<mml:mi>w</mml:mi>
</mml:msub>
</mml:mpadded>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mpadded width="+2.8pt">
<mml:msub>
<mml:mi>E</mml:mi>
<mml:mi>y</mml:mi>
</mml:msub>
</mml:mpadded>
<mml:mo>&#x2062;</mml:mo>
<mml:mpadded width="+5.6pt">
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo>&#x2225;</mml:mo>
<mml:mrow>
<mml:mi>U</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mover accent="true">
<mml:mi mathvariant="normal">&#x03A3;</mml:mi>
<mml:mo stretchy="false">~</mml:mo>
</mml:mover>
<mml:mo>&#x2062;</mml:mo>
<mml:msup>
<mml:mi>V</mml:mi>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mo>&#x2225;</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mpadded>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>E</mml:mi>
<mml:mpadded width="+2.8pt">
<mml:mi>y</mml:mi>
</mml:mpadded>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo>&#x2225;</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mi mathvariant="normal">&#x03A3;</mml:mi>
<mml:mo stretchy="false">~</mml:mo>
</mml:mover>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mo>&#x2225;</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
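<p>The norm-preserving step in the simplification above, that multiplying by the orthogonal factors of an SVD does not change the relevant norm, can be checked numerically. The snippet below is a toy verification with arbitrary dimensions, not part of the paper's pipeline.</p>

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy Jacobian transpose and its SVD:  J_w^T = U @ Sigma @ V^T
L, M = 6, 4
J_T = rng.normal(size=(L, M))
U, s, Vt = np.linalg.svd(J_T, full_matrices=True)
Sigma = np.zeros((L, M))          # rectangular singular-value matrix
Sigma[:M, :M] = np.diag(s)

y = rng.normal(size=M)
# U is orthogonal, so it preserves norms; V^T y is again unit-normal in
# distribution.  Hence ||J_w^T y|| = ||U Sigma V^T y|| = ||Sigma (V^T y)||.
assert np.allclose(U @ Sigma @ Vt, J_T)
assert np.isclose(np.linalg.norm(J_T @ y),
                  np.linalg.norm(Sigma @ (Vt @ y)))
```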
<p>Moreover, the zero rows of the rectangular matrix effectively marginalize out the corresponding dimensions of the distribution. We therefore only need to consider minimizing the expression:</p>
<disp-formula id="S3.E6">
<label>(6)</label>
<mml:math id="M6">
<mml:mrow>
<mml:mpadded width="+5.6pt">
<mml:msub>
<mml:mi class="ltx_font_mathcaligraphic">&#x2112;</mml:mi>
<mml:mi>w</mml:mi>
</mml:msub>
</mml:mpadded>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mpadded width="+2.8pt">
<mml:msub>
<mml:mi>E</mml:mi>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo stretchy="false">~</mml:mo>
</mml:mover>
</mml:msub>
</mml:mpadded>
<mml:mo>&#x2062;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo>&#x2225;</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">&#x03A3;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo stretchy="false">~</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x2225;</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <inline-formula><mml:math id="INEQ8"><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false">~</mml:mo></mml:mover></mml:math></inline-formula> is a unit-normal random variable of dimension <italic>L</italic>. All matrices <inline-formula><mml:math id="INEQ9"><mml:msubsup><mml:mi>J</mml:mi><mml:mi>W</mml:mi><mml:mi>T</mml:mi></mml:msubsup></mml:math></inline-formula> that share the same singular values as &#x03A3; yield the same raw loss value; it remains to determine when the loss is minimized, in particular whether this occurs when every diagonal entry of the diagonal matrix &#x03A3; takes the same value. Writing the expectation as an integral of the probability density over <inline-formula><mml:math id="INEQ10"><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false">~</mml:mo></mml:mover></mml:math></inline-formula>:</p>
<disp-formula id="S3.Ex1">
<mml:math id="M7">
<mml:mrow>
<mml:mpadded width="+5.6pt">
<mml:msub>
<mml:mi class="ltx_font_mathcaligraphic">&#x2112;</mml:mi>
<mml:mrow>
<mml:mtext>w</mml:mtext>
</mml:mrow>
</mml:msub>
</mml:mpadded>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo mathsize="90%" movablelimits="false" stretchy="false">&#x222B;</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo>&#x2225;</mml:mo>
<mml:mrow>
<mml:mtext>&#x03A3;</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mover accent="true">
<mml:mtext>y</mml:mtext>
<mml:mo stretchy="false">~</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x2225;</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mover accent="true">
<mml:mrow>
<mml:mtext>y</mml:mtext>
</mml:mrow>
<mml:mo stretchy="false">~</mml:mo>
</mml:mover>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mover accent="true">
<mml:mtext>y</mml:mtext>
<mml:mo stretchy="false">~</mml:mo>
</mml:mover>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mtext>d</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mover accent="true">
<mml:mtext>y</mml:mtext>
<mml:mo stretchy="false">~</mml:mo>
</mml:mover>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S3.E7">
<label>(7)</label>
<mml:math id="M8">
<mml:mrow>
<mml:mi/>
<mml:mo lspace="8.1pt">=</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">&#x03C0;</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mfrac>
<mml:mi>L</mml:mi>
<mml:mn>2</mml:mn>
</mml:mfrac>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo mathsize="90%" movablelimits="false" stretchy="false">&#x222B;</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mo>&#x2225;</mml:mo>
<mml:mrow>
<mml:mtext>&#x03A3;</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mover accent="true">
<mml:mtext>y</mml:mtext>
<mml:mo stretchy="false">~</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x2225;</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mtext>exp</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mstyle displaystyle="true">
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mover accent="true">
<mml:mtext>y</mml:mtext>
<mml:mo stretchy="false">~</mml:mo>
</mml:mover>
<mml:mi>T</mml:mi>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mover accent="true">
<mml:mtext>y</mml:mtext>
<mml:mo stretchy="false">~</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:mfrac>
</mml:mstyle>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mtext>d</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mover accent="true">
<mml:mtext>y</mml:mtext>
<mml:mo stretchy="false">~</mml:mo>
</mml:mover>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>To exploit the radially symmetric form of the density, we switch to polar coordinates <inline-formula><mml:math id="INEQ11"><mml:mrow><mml:mpadded width="+5.6pt"><mml:mover accent="true"><mml:mi>y</mml:mi><mml:mo stretchy="false">~</mml:mo></mml:mover></mml:mpadded><mml:mo>=</mml:mo><mml:mrow><mml:mi>r</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi mathvariant="normal">&#x03D5;</mml:mi></mml:mrow></mml:mrow></mml:math></inline-formula>. This change of variables introduces the Jacobian factor <italic>r</italic><sup><italic>L</italic>&#x2212;1</sup>:</p>
<disp-formula id="S3.E8">
<label>(8)</label>
<mml:math id="M9">
<mml:mrow>
<mml:mpadded width="+5.6pt">
<mml:msub>
<mml:mi class="ltx_font_mathcaligraphic">&#x2112;</mml:mi>
<mml:mi>w</mml:mi>
</mml:msub>
</mml:mpadded>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">&#x03C0;</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mfrac>
<mml:mi>L</mml:mi>
<mml:mn>2</mml:mn>
</mml:mfrac>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mstyle displaystyle="false">
<mml:munder>
<mml:mo mathsize="90%" movablelimits="false" stretchy="false">&#x222B;</mml:mo>
<mml:mi>&#x1D54A;</mml:mi>
</mml:munder>
</mml:mstyle>
<mml:mrow>
<mml:mstyle displaystyle="false">
<mml:munderover>
<mml:mo mathsize="90%" movablelimits="false" stretchy="false">&#x222B;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mi mathvariant="normal">&#x221E;</mml:mi>
</mml:munderover>
</mml:mstyle>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mi>r</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mo>&#x2225;</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">&#x03A3;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">&#x03D5;</mml:mi>
</mml:mrow>
<mml:mo>&#x2225;</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:msup>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mtext>exp</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mfrac>
<mml:msup>
<mml:mi>r</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mn>2</mml:mn>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>d</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>r</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>d</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">&#x03D5;</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>r</italic> represents the distance from the origin and &#x03D5; is a unit vector. The factor <inline-formula><mml:math id="INEQ13"><mml:mrow><mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mi mathvariant="normal">&#x03C0;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:mi>L</mml:mi><mml:mo>/</mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:mrow></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>r</mml:mi><mml:mrow><mml:mi>L</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mtext>exp</mml:mtext><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mfrac><mml:msup><mml:mi>r</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mn>2</mml:mn></mml:mfrac></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> is thus the <italic>L</italic>-dimensional unit normal density expressed in polar coordinates. A Taylor approximation argument shows that when <italic>L</italic> is large, this density is well approximated by <inline-formula><mml:math id="INEQ14"><mml:mrow><mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mi mathvariant="normal">&#x03C0;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>e</mml:mi></mml:mrow><mml:mo>/</mml:mo><mml:mi>L</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mfrac><mml:mi>L</mml:mi><mml:mn>2</mml:mn></mml:mfrac></mml:mrow></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mi>exp</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>r</mml:mi><mml:mo>-</mml:mo><mml:mi mathvariant="normal">&#x03BC;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mo>/</mml:mo><mml:msup><mml:mi mathvariant="normal">&#x03C3;</mml:mi><mml:mn>2</mml:mn></mml:msup></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula> for any &#x03D5;. Substituting this density into the integral, the loss is approximately:</p>
<disp-formula id="S3.E9">
<label>(9)</label>
<mml:math id="M10">
<mml:mrow>
<mml:msub>
<mml:mi class="ltx_font_mathcaligraphic">&#x2112;</mml:mi>
<mml:mrow>
<mml:mtext>w</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2248;</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">&#x03C0;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>e</mml:mi>
</mml:mrow>
<mml:mo>/</mml:mo>
<mml:mi>L</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mstyle displaystyle="false">
<mml:munder>
<mml:mo mathsize="90%" movablelimits="false" stretchy="false">&#x222B;</mml:mo>
<mml:mrow>
<mml:mtext>S</mml:mtext>
</mml:mrow>
</mml:munder>
</mml:mstyle>
<mml:mrow>
<mml:mstyle displaystyle="false">
<mml:munderover>
<mml:mo mathsize="90%" movablelimits="false" stretchy="false">&#x222B;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mi mathvariant="normal">&#x221E;</mml:mi>
</mml:munderover>
</mml:mstyle>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mi>r</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mo>&#x2225;</mml:mo>
<mml:mrow>
<mml:mtext>&#x03A3;</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">&#x03D5;</mml:mi>
</mml:mrow>
<mml:mo>&#x2225;</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mtext>exp</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mfrac>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>r</mml:mi>
<mml:mo>-</mml:mo>
<mml:msqrt>
<mml:mi>L</mml:mi>
</mml:msqrt>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#x2062;</mml:mo>
<mml:msup>
<mml:mi mathvariant="normal">&#x03C3;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>d</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>r</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mtext>d</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">&#x03D5;</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where the approximation turns out to be exact in the limit of infinite dimension <italic>L</italic>.</p>
<p>By minimizing this loss with respect to &#x03A3;, we seek the minimum of the function (<italic>r</italic>&#x2225;&#x03A3;&#x03D5;&#x2225;<sub>2</sub>&#x2212;<italic>a</italic>)<sup>2</sup> over a spherical shell of radius <inline-formula><mml:math id="INEQ16"><mml:msqrt><mml:mi>L</mml:mi></mml:msqrt></mml:math></inline-formula>. Since this function is then constant in &#x03D5;, the equation reduces to:</p>
<disp-formula id="S3.E10">
<label>(10)</label>
<mml:math id="M11">
<mml:mrow>
<mml:msub>
<mml:mi class="ltx_font_mathcaligraphic">&#x2112;</mml:mi>
<mml:mi>w</mml:mi>
</mml:msub>
<mml:mo>&#x2248;</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">&#x03C0;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>e</mml:mi>
</mml:mrow>
<mml:mo>/</mml:mo>
<mml:mi>L</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mi class="ltx_font_mathcaligraphic">&#x1D49C;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:msup>
<mml:mi>a</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:msup>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mstyle displaystyle="false">
<mml:munderover>
<mml:mo mathsize="90%" movablelimits="false" stretchy="false">&#x222B;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mi mathvariant="normal">&#x221E;</mml:mi>
</mml:munderover>
</mml:mstyle>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>r</mml:mi>
<mml:mo>-</mml:mo>
<mml:msqrt>
<mml:mi>L</mml:mi>
</mml:msqrt>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mtext>exp</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mfrac>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>r</mml:mi>
<mml:mo>-</mml:mo>
<mml:msqrt>
<mml:mi>L</mml:mi>
</mml:msqrt>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#x2062;</mml:mo>
<mml:msup>
<mml:mi mathvariant="normal">&#x03C3;</mml:mi>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>d</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>r</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic><bold>&#x1D49C;</bold></italic>(<italic>S</italic>) indicates the surface area of the unit sphere.</p>
<p>To summarize, we have shown that, assuming a high dimensionality <italic>L</italic> of the latent space, the path length prior is minimized at every latent space point <italic>w</italic> when all the singular values of the generator's Jacobian matrix are equal to a global constant, that is, when the Jacobian is orthogonal up to a global scale. We avoid explicitly computing the Jacobian matrix by using the identity <inline-formula><mml:math id="INEQ18"><mml:mrow><mml:mrow><mml:msubsup><mml:mi>J</mml:mi><mml:mi>w</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:mo>&#x2062;</mml:mo><mml:mi>y</mml:mi></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mrow><mml:mo>&#x2207;</mml:mo><mml:mo>&#x2061;</mml:mo><mml:mpadded width="+3.3pt"><mml:mi>w</mml:mi></mml:mpadded></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mpadded width="+1.7pt"><mml:mi>g</mml:mi></mml:mpadded><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>w</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>&#x22C5;</mml:mo><mml:mi>y</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula>, which can be computed efficiently with standard back-propagation. The constant <italic>a</italic> is set dynamically to a long-term exponential moving average of the lengths <inline-formula><mml:math id="INEQ19"><mml:msub><mml:mrow><mml:mo>&#x2225;</mml:mo><mml:mrow><mml:msubsup><mml:mi>J</mml:mi><mml:mi>w</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:mo>&#x2062;</mml:mo><mml:mi>y</mml:mi></mml:mrow><mml:mo>&#x2225;</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msub></mml:math></inline-formula>, enabling the optimization to discover the appropriate global scale on its own.</p>
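<p>The penalty and the Jacobian-vector identity above can be made concrete with a toy generator. In the sketch below the generator is deliberately linear, g(w) = Aw, so that the gradient identity reduces to a matrix product that is easy to verify; the constant <code>a</code> stands in for the running exponential moving average.</p>

```python
import numpy as np

rng = np.random.default_rng(5)

# Toy linear generator g(w) = A @ w, so J_w = A and
# J_w^T y = grad_w (g(w) . y) = A^T y  -- the identity used above.
L_dim, img_dim = 8, 16
A = rng.normal(size=(img_dim, L_dim))

def path_length_penalty(w, y, a):
    """(||J_w^T y||_2 - a)^2 for the toy generator; `a` plays the role
    of the long-term exponential moving average of the lengths."""
    JTy = A.T @ y                  # = grad_w (g(w) . y) for g(w) = A w
    return (np.linalg.norm(JTy) - a) ** 2

w = rng.normal(size=L_dim)
y = rng.normal(size=img_dim)       # random "image" direction
a = np.linalg.norm(A.T @ y)        # EMA stand-in: the current length
# Penalty vanishes when the observed length matches the running average.
```

For a non-linear generator the same quantity is obtained by back-propagating the scalar g(w)&#x22C5;y with respect to w, avoiding any explicit Jacobian.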
</sec>
<sec id="S3.SS7">
<title>Res2Net</title>
<p>To alleviate mode collapse and gradient vanishing, we use a gradient diversion approach (Res2Net) with stronger multi-scale feature extraction capabilities. In essence, a set of 3 &#x00D7; 3 filters is substituted with groups of smaller filters, connected in a manner similar to the residual mechanism. <xref ref-type="fig" rid="F5">Figure 5</xref> illustrates Res2Net: after the 1 &#x00D7; 1 convolution, we split the feature map uniformly into <italic>s</italic> subsets of feature maps, denoted by <italic>x</italic><sub><italic>i</italic></sub>, where <italic>i</italic> &#x2208; {1,&#x2004;2,&#x2026;,&#x2004;<italic>s</italic>}. Each feature subset <italic>x</italic><sub><italic>i</italic></sub> has the same spatial size as the input feature map, but 1/<italic>s</italic> of its channels. Except for <italic>x</italic><sub>1</sub>, each <italic>x</italic><sub><italic>i</italic></sub> has a corresponding 3 &#x00D7; 3 convolution, denoted by <italic>K</italic><sub><italic>i</italic></sub>(), and we denote the output of <italic>K</italic><sub><italic>i</italic></sub>() by <italic>y</italic><sub><italic>i</italic></sub>. Each feature subset <italic>x</italic><sub><italic>i</italic></sub> is summed with the output of <italic>K</italic><sub><italic>i</italic>&#x2013;1</sub>() and then fed into <italic>K</italic><sub><italic>i</italic></sub>(). To reduce the number of parameters while increasing <italic>s</italic>, we skip the 3 &#x00D7; 3 convolution of <italic>x</italic><sub>1</sub>. Hence, <italic>y</italic><sub><italic>i</italic></sub> can be written as:</p>
<disp-formula id="S3.E11">
<label>(11)</label>
<mml:math id="M12">
<mml:mrow>
<mml:mpadded width="+5.6pt">
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mpadded>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mtable displaystyle="true" rowspacing="0pt">
<mml:mtr>
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo mathvariant="italic" separator="true">&#x2003;&#x2003;&#x2002;&#x2003;</mml:mo>
<mml:mpadded width="+5.6pt">
<mml:mi>i</mml:mi>
</mml:mpadded>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo>;</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>K</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo mathvariant="italic" separator="true">&#x2003;&#x2006;</mml:mo>
<mml:mpadded width="+5.6pt">
<mml:mi>i</mml:mi>
</mml:mpadded>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mo>;</mml:mo>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="center">
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>K</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo separator="true">&#x2003;&#x2006;</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mo>&lt;</mml:mo>
<mml:mpadded width="+5.6pt">
<mml:mi>i</mml:mi>
</mml:mpadded>
<mml:mo>&#x2264;</mml:mo>
<mml:mi>s</mml:mi>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mi/>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption><p>Impact of dataset size and epoch time on TC-GAN.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-850606-g005.tif"/>
</fig>
<p>Each 3 &#x00D7; 3 convolutional operator <italic>K</italic><sub><italic>i</italic></sub>() can potentially capture feature information from all the feature splits {<italic>x</italic><sub><italic>j</italic></sub>, <italic>j</italic> &#x2264; <italic>i</italic>}. Each time a feature split <italic>x</italic><sub><italic>j</italic></sub> passes through a 3 &#x00D7; 3 convolutional operator, the output can have a larger receptive field than <italic>x</italic><sub><italic>j</italic></sub>. Due to this combinatorial explosion effect, the output of the Res2Net module contains a different number and different combinations of receptive field sizes.</p>
<p>In Res2Net, the global and local information of the chrysanthemum image is extracted by processing the splits in a multi-scale manner. To better fuse feature information at different scales, we concatenate all the splits and pass them through a 1 &#x00D7; 1 convolution. The split-and-concatenate strategy allows for efficient convolution operations and feature processing. To minimize the parameter count, we skip the convolution of the first split. In this article, we employ <italic>s</italic> as the control parameter for the scale dimension. A larger <italic>s</italic> potentially allows features with richer receptive field sizes to be learned, while the computational cost of concatenation is negligible.</p>
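The hierarchical computation of Equation (11) can be sketched independently of any deep learning framework. In the minimal Python illustration below, the <italic>K</italic><sub><italic>i</italic></sub> are arbitrary callables standing in for the 3 &#x00D7; 3 convolutions (hypothetical stand-ins, not the trained filters), and the splits are plain numbers rather than feature maps:

```python
def res2net_outputs(splits, convs):
    """Compute y_i per Eq. (11):
    y_1 = x_1;  y_2 = K_2(x_2);  y_i = K_i(x_i + y_{i-1}) for 2 < i <= s.
    `splits` is [x_1, ..., x_s]; `convs[i-1]` stands in for K_i (convs[0] unused,
    since the first split skips its convolution)."""
    ys = []
    for i, x in enumerate(splits, start=1):
        if i == 1:
            ys.append(x)                         # first split: identity
        elif i == 2:
            ys.append(convs[i - 1](x))           # K_2 applied to x_2 alone
        else:
            ys.append(convs[i - 1](x + ys[-1]))  # hierarchical residual sum
    return ys

# toy example with s = 3 and K_i(v) = 2*v
outs = res2net_outputs([1, 2, 3], [None, lambda v: 2 * v, lambda v: 2 * v])
# -> [1, 4, 14]: y_3 sees x_1's information only indirectly through y_2
```

This makes explicit how later splits accumulate the receptive fields of earlier ones, which is the source of the multi-scale behavior described above.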
</sec>
<sec id="S3.SS8">
<title>Evaluation Metrics</title>
<p>Average precision (AP) is a common evaluation metric in object detection tasks. In this article, we calculate the average precision (IoU = 0.5) of the tea chrysanthemum to test the performance of the model. The equation is as follows:</p>
<disp-formula id="S3.E12">
<label>(12)</label>
<mml:math id="M13">
<mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>P</mml:mi>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:munderover>
<mml:mo movablelimits="false">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>N</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">&#x0394;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>r</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>e</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>c</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>a</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>l</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mpadded width="+2.8pt">
<mml:mi>l</mml:mi>
</mml:mpadded>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>N</italic> represents the size of the test dataset, <italic>P</italic>(<italic>k</italic>) stands for the precision value at the <italic>k</italic>-th tea chrysanthemum image, and &#x0394;recall(<italic>k</italic>) denotes the change in recall from the (<italic>k</italic>&#x2013;1)-th to the <italic>k</italic>-th tea chrysanthemum image.</p>
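Equation (12) can be implemented directly from per-image precision and recall values. The sketch below is a plain-Python illustration with made-up numbers (not values from this article), accumulating <italic>P</italic>(<italic>k</italic>)&#x0394;recall(<italic>k</italic>):

```python
def average_precision(precisions, recalls):
    """AP = sum_k P(k) * (recall(k) - recall(k-1)), with recall(0) = 0."""
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += p * (r - prev_recall)  # P(k) * delta recall
        prev_recall = r
    return ap

# illustrative values only
ap = average_precision([1.0, 1.0, 0.67], [0.33, 0.67, 0.67])  # ~0.67
```

In practice the precision/recall pairs would come from ranking detections by confidence at IoU = 0.5, as stated above.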
<p>In addition, error and miss rates were introduced in section &#x201C;Impact of Different Unstructured Environments on the TC-YOLO&#x201D; to investigate the ability of TC-GAN to generate unstructured environments. The error rate is the ratio of the number of falsely detected samples to the total number of samples, and the miss rate is the ratio of the number of undetected samples to the total number of samples.</p>
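Both rates can be computed directly from the raw counts; a minimal sketch with illustrative numbers (not values from this article) follows:

```python
def error_and_miss_rates(total, falsely_detected, undetected):
    """Error rate: falsely detected / total; miss rate: undetected / total."""
    return falsely_detected / total, undetected / total

# illustrative counts: 100 samples, 8 false detections, 5 missed
rates = error_and_miss_rates(100, 8, 5)  # -> (0.08, 0.05)
```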
</sec>
<sec id="S3.SS9">
<title>Experimental Setup</title>
<p>The experiments were conducted on a server with an NVIDIA Tesla P100 GPU and CUDA 11.2. We built the proposed model in Python with the PyTorch framework. During training, the key hyperparameters were set as follows: epochs = 500, learning rate = 0.001, and the Adam optimizer.</p>
</sec>
</sec>
<sec id="S4" sec-type="results">
<title>Results</title>
<sec id="S4.SS1">
<title>Performance of Tea Chrysanthemum &#x2013; Generative Adversarial Network in Datasets of Different Sizes</title>
<p>To verify the effect of the generated dataset size and the number of training epochs on the chrysanthemum detection task, we randomly selected from the generated chrysanthemum dataset ten subsets of different sizes (100, 500, 1000, 1500, 2000, 2500, 3000, 3500, 4000, and 4500 training samples), with ten corresponding numbers of training epochs (100, 200, 250, 300, 350, 400, 450, 500, 550, and 600, respectively), and tested them on the proposed TC-YOLO. The results are shown in <xref ref-type="fig" rid="F5">Figure 5</xref>.</p>
<p>It can be seen that the performance of TC-YOLO improves as the dataset size and the number of training epochs increase. When the dataset size is less than 1500 and the number of training epochs is less than 300, the AP value increases rapidly with both (from 13.54 to 80.53%, an improvement of 494.76%). When the dataset size reached 2500 and the training epochs reached 400, the AP value improved only slightly and finally converged (from 87.29 to 90.09%) as the number of samples and training epochs increased further. After the dataset size reached 4000 and the training epochs reached 550, the AP value decreased slightly to 89.51%. Combining these results, we set the optimal dataset size to 3500 and the optimal number of training epochs to 500 for the test experiments in the following sections, as this setting achieved the highest AP value with the smallest dataset size and the fewest training epochs.</p>
</sec>
<sec id="S4.SS2">
<title>Study on the Performance of Traditional Data Enhancement Methods and Tea Chrysanthemum &#x2013; Generative Adversarial Network</title>
<p>To compare the performance of classical data enhancement methods with that of TC-GAN, we selected nine classical data enhancement methods alongside TC-GAN (<xref ref-type="table" rid="T3">Table 3</xref>). Each data enhancement method was configured and tested in the TC-YOLO object detection model. The results are shown in <xref ref-type="table" rid="T3">Table 3</xref>. TC-GAN shows the best performance, with an AP value of 90.09%. It was surprising that the advanced data enhancement methods, such as Mixup, Cutout, and Mosaic, performed disappointingly, with AP values of only 80.33, 81.86, and 84.31%, respectively. This may be because a large amount of redundant gradient flow greatly reduces the learning capacity of the network. We also found that Flip and Rotation performed second only to TC-GAN, with AP values of 86.33 and 86.96%, respectively. The performance of the model improves slightly, to an AP value of 87.39%, when Flip and Rotation are both configured on TC-YOLO. Even so, this AP is still 2.7% lower than that of TC-GAN.</p>
<table-wrap position="float" id="T3">
<label>TABLE 3</label>
<caption><p>Performance comparison of different data enhancement methods.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Flip</td>
<td valign="top" align="center">Shear</td>
<td valign="top" align="center">Crop</td>
<td valign="top" align="center">Rotation</td>
<td valign="top" align="center">Grayscale</td>
<td valign="top" align="center">Blur</td>
<td valign="top" align="center">Mixup</td>
<td valign="top" align="center">Cutout</td>
<td valign="top" align="center">Mosaic</td>
<td valign="top" align="center">TC-GAN</td>
<td valign="top" align="center">AP</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">&#x221A;</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="center">86.33</td>
</tr>
<tr>
<td/>
<td valign="top" align="center">&#x221A;</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="center">84.21</td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center">&#x221A;</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="center">83.99</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td valign="top" align="center">&#x221A;</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="center">86.96</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="center">&#x221A;</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="center">82.09</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="center">&#x221A;</td>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="center">80.13</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="center">&#x221A;</td>
<td/>
<td/>
<td/>
<td valign="top" align="center">80.33</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="center">&#x221A;</td>
<td/>
<td/>
<td valign="top" align="center">81.86</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="center">&#x221A;</td>
<td/>
<td valign="top" align="center">84.31</td>
</tr>
<tr>
<td valign="top" align="left">&#x221A;</td>
<td/>
<td/>
<td valign="top" align="center">&#x221A;</td>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="center">87.39</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td/>
<td valign="top" align="center">&#x221A;</td>
<td valign="top" align="center">90.09</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn><p><italic>&#x221A; means that this data enhancement method has been adopted.</italic></p></fn>
</table-wrap-foot>
</table-wrap>
</sec>
<sec id="S4.SS3">
<title>Comparisons With State-of-the-Art Detection Models</title>
<p>To verify the superiority of the proposed model, the tea chrysanthemum dataset generated by TC-GAN was used to compare TC-YOLO with nine state-of-the-art object detection frameworks (<xref ref-type="bibr" rid="B18">Kim et al., 2018</xref>; <xref ref-type="bibr" rid="B45">Zhang et al., 2018</xref>; <xref ref-type="bibr" rid="B5">Cao et al., 2020</xref>; <xref ref-type="bibr" rid="B44">Zhang and Li, 2020</xref>), and the results are shown in <xref ref-type="table" rid="T4">Table 4</xref>.</p>
<table-wrap position="float" id="T4">
<label>TABLE 4</label>
<caption><p>Comparisons with state-of-the-art detection methods.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Method</td>
<td valign="top" align="center">Backbone</td>
<td valign="top" align="center">Size</td>
<td valign="top" align="center">FPS</td>
<td valign="top" align="center">mAP</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">RetinaNet</td>
<td valign="top" align="center">ResNet101</td>
<td valign="top" align="center">800 &#x00D7; 800</td>
<td valign="top" align="center">4.54</td>
<td valign="top" align="center">82.62</td>
</tr>
<tr>
<td valign="top" align="left">RetinaNet</td>
<td valign="top" align="center">ResNet50</td>
<td valign="top" align="center">800 &#x00D7; 800</td>
<td valign="top" align="center">5.31</td>
<td valign="top" align="center">80.59</td>
</tr>
<tr>
<td valign="top" align="left">RetinaNet</td>
<td valign="top" align="center">ResNet101</td>
<td valign="top" align="center">500 &#x00D7; 500</td>
<td valign="top" align="center">7.23</td>
<td valign="top" align="center">79.13</td>
</tr>
<tr>
<td valign="top" align="left">RetinaNet</td>
<td valign="top" align="center">ResNet50</td>
<td valign="top" align="center">500 &#x00D7; 500</td>
<td valign="top" align="center">7.87</td>
<td valign="top" align="center">83.68</td>
</tr>
<tr>
<td valign="top" align="left">EfficientDetD6</td>
<td valign="top" align="center">EfficientB6</td>
<td valign="top" align="center">1280 &#x00D7; 1280</td>
<td valign="top" align="center">5.29</td>
<td valign="top" align="center">81.23</td>
</tr>
<tr>
<td valign="top" align="left">EfficientDetD5</td>
<td valign="top" align="center">EfficientB5</td>
<td valign="top" align="center">1280 &#x00D7; 1280</td>
<td valign="top" align="center">6.21</td>
<td valign="top" align="center">83.51</td>
</tr>
<tr>
<td valign="top" align="left">EfficientDetD4</td>
<td valign="top" align="center">EfficientB4</td>
<td valign="top" align="center">1024 &#x00D7; 1024</td>
<td valign="top" align="center">7.93</td>
<td valign="top" align="center">83.19</td>
</tr>
<tr>
<td valign="top" align="left">EfficientDetD3</td>
<td valign="top" align="center">EfficientB3</td>
<td valign="top" align="center">896 &#x00D7; 896</td>
<td valign="top" align="center">9.28</td>
<td valign="top" align="center">84.83</td>
</tr>
<tr>
<td valign="top" align="left">EfficientDetD2</td>
<td valign="top" align="center">EfficientB2</td>
<td valign="top" align="center">768 &#x00D7; 768</td>
<td valign="top" align="center">11.66</td>
<td valign="top" align="center">84.22</td>
</tr>
<tr>
<td valign="top" align="left">EfficientDetD1</td>
<td valign="top" align="center">EfficientB1</td>
<td valign="top" align="center">640 &#x00D7; 640</td>
<td valign="top" align="center">15.26</td>
<td valign="top" align="center">82.93</td>
</tr>
<tr>
<td valign="top" align="left">EfficientDetD0</td>
<td valign="top" align="center">EfficientB0</td>
<td valign="top" align="center">512 &#x00D7; 512</td>
<td valign="top" align="center">37.61</td>
<td valign="top" align="center">82.81</td>
</tr>
<tr>
<td valign="top" align="left">M2Det</td>
<td valign="top" align="center">VGG16</td>
<td valign="top" align="center">800 &#x00D7; 800</td>
<td valign="top" align="center">7.08</td>
<td valign="top" align="center">80.63</td>
</tr>
<tr>
<td valign="top" align="left">M2Det</td>
<td valign="top" align="center">ResNet101</td>
<td valign="top" align="center">320 &#x00D7; 320</td>
<td valign="top" align="center">16.89</td>
<td valign="top" align="center">85.16</td>
</tr>
<tr>
<td valign="top" align="left">M2Det</td>
<td valign="top" align="center">VGG16</td>
<td valign="top" align="center">512 &#x00D7; 512</td>
<td valign="top" align="center">21.22</td>
<td valign="top" align="center">80.88</td>
</tr>
<tr>
<td valign="top" align="left">M2Det</td>
<td valign="top" align="center">VGG16</td>
<td valign="top" align="center">300 &#x00D7; 300</td>
<td valign="top" align="center">42.53</td>
<td valign="top" align="center">78.24</td>
</tr>
<tr>
<td valign="top" align="left">YOLOv3</td>
<td valign="top" align="center">DarkNet53</td>
<td valign="top" align="center">608 &#x00D7; 608</td>
<td valign="top" align="center">12.14</td>
<td valign="top" align="center">86.52</td>
</tr>
<tr>
<td valign="top" align="left">YOLOv3 (SPP)</td>
<td valign="top" align="center">DarkNet53</td>
<td valign="top" align="center">608 &#x00D7; 608</td>
<td valign="top" align="center">15.66</td>
<td valign="top" align="center">83.89</td>
</tr>
<tr>
<td valign="top" align="left">YOLOv3</td>
<td valign="top" align="center">DarkNet53</td>
<td valign="top" align="center">416 &#x00D7; 416</td>
<td valign="top" align="center">43.25</td>
<td valign="top" align="center">84.13</td>
</tr>
<tr>
<td valign="top" align="left">PFPNet (R)</td>
<td valign="top" align="center">VGG16</td>
<td valign="top" align="center">512 &#x00D7; 512</td>
<td valign="top" align="center">24.35</td>
<td valign="top" align="center">82.41</td>
</tr>
<tr>
<td valign="top" align="left">RFBNetE</td>
<td valign="top" align="center">VGG16</td>
<td valign="top" align="center">512 &#x00D7; 512</td>
<td valign="top" align="center">21.54</td>
<td valign="top" align="center">77.37</td>
</tr>
<tr>
<td valign="top" align="left">RFBNet</td>
<td valign="top" align="center">VGG16</td>
<td valign="top" align="center">512 &#x00D7; 512</td>
<td valign="top" align="center">45.46</td>
<td valign="top" align="center">85.53</td>
</tr>
<tr>
<td valign="top" align="left">RefineDet</td>
<td valign="top" align="center">VGG16</td>
<td valign="top" align="center">512 &#x00D7; 512</td>
<td valign="top" align="center">31.33</td>
<td valign="top" align="center">81.12</td>
</tr>
<tr>
<td valign="top" align="left">RefineDet</td>
<td valign="top" align="center">VGG16</td>
<td valign="top" align="center">448 &#x00D7; 448</td>
<td valign="top" align="center">43.31</td>
<td valign="top" align="center">79.66</td>
</tr>
<tr>
<td valign="top" align="left">YOLOv4</td>
<td valign="top" align="center">CSPDarknet53</td>
<td valign="top" align="center">608 &#x00D7; 608</td>
<td valign="top" align="center">19.22</td>
<td valign="top" align="center">85.11</td>
</tr>
<tr>
<td valign="top" align="left">YOLOv4</td>
<td valign="top" align="center">CSPDarknet53</td>
<td valign="top" align="center">512 &#x00D7; 512</td>
<td valign="top" align="center">24.63</td>
<td valign="top" align="center">84.34</td>
</tr>
<tr>
<td valign="top" align="left">YOLOv5l</td>
<td valign="top" align="center">CSPDenseNet</td>
<td valign="top" align="center">416 &#x00D7; 416</td>
<td valign="top" align="center">42.24</td>
<td valign="top" align="center">88.83</td>
</tr>
<tr>
<td valign="top" align="left">YOLOv5m</td>
<td valign="top" align="center">CSPDenseNet</td>
<td valign="top" align="center">416 &#x00D7; 416</td>
<td valign="top" align="center">36.91</td>
<td valign="top" align="center">86.68</td>
</tr>
<tr>
<td valign="top" align="left">YOLOv5x</td>
<td valign="top" align="center">CSPDenseNet</td>
<td valign="top" align="center">416 &#x00D7; 416</td>
<td valign="top" align="center">32.28</td>
<td valign="top" align="center">84.02</td>
</tr>
<tr>
<td valign="top" align="left">YOLOv5s</td>
<td valign="top" align="center">CSPDenseNet</td>
<td valign="top" align="center">416 &#x00D7; 416</td>
<td valign="top" align="center">47.88</td>
<td valign="top" align="center">88.29</td>
</tr>
<tr>
<td valign="top" align="left">TC-YOLO</td>
<td valign="top" align="center">CSPDenseNet</td>
<td valign="top" align="center">416 &#x00D7; 416</td>
<td valign="top" align="center">47.53</td>
<td valign="top" align="center">90.09</td>
</tr>
</tbody>
</table></table-wrap>
<p><xref ref-type="table" rid="T4">Table 4</xref> shows that TC-GAN not only achieves excellent performance with the TC-YOLO object detection model, with a mAP of 90.09%, but also performs well with the other state-of-the-art object detection frameworks. TC-GAN is thus a general data enhancement method and is not constrained to a specific object detector. Generally speaking, large image sizes benefit model training by providing more local feature information; however, large input sizes (&#x003E;512 &#x00D7; 512) do not always result in improved performance. In <xref ref-type="table" rid="T4">Table 4</xref>, none of the models with input sizes above 512 &#x00D7; 512 achieved a performance above 87%. The main reason may be that the images generated in this article are 512 &#x00D7; 512, which penalizes models requiring a larger input size: to match the required input size, the images had to be artificially resized, reducing the effective image resolution, and this considerably affects the final test performance of those models. In addition, transfer learning ability varies between models, which may also explain why some models with input sizes above 512 &#x00D7; 512 performed poorly. Given these two factors, TC-YOLO has relatively better transfer learning ability than the other object detection models, and it is therefore used as the test model for the generated chrysanthemum images in this article. Moreover, TC-YOLO requires an input size of 416 &#x00D7; 416, so the image resolution has a relatively minor impact on its final performance. Furthermore, we deployed the trained TC-YOLO on the NVIDIA Jetson TX2 embedded platform to evaluate its performance for the development of robotics and solar insecticidal lamp systems. <xref ref-type="fig" rid="F6">Figure 6</xref> shows the detection results.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption><p>Qualitative results of our method. The red box indicates the recognised tea chrysanthemum.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-850606-g006.tif"/>
</fig>
</sec>
<sec id="S4.SS4">
<title>Impact of Different Unstructured Environments on the TC-YOLO</title>
<p>Datasets with complex unstructured environments can effectively improve the robustness of detection models. This study investigated the ability of the proposed TC-GAN to generate complex unstructured environments, including strong light, weak light, normal light, high overlap, moderate overlap, normal overlap, high occlusion, moderate occlusion, and normal occlusion, as shown in <xref ref-type="fig" rid="F7">Figure 7</xref>. A total of 26,432 chrysanthemums were at the early flowering stage in the nine unstructured environments. Since there are no mature standards defining these environments, we set the criteria based on empirical inspection. Strong light is defined as sunlight obscuring more than fifty percent of the petal area. Weak light is defined as shadows covering more than fifty percent of the petal area. Normal light is defined as sunlight covering between zero and fifty percent of the petal area. High overlap is defined as an overlapping area between petals of more than sixty percent. Moderate overlap is defined as an overlapping area between petals of between thirty and sixty percent. Normal overlap is defined as an overlapping area between petals of between zero and thirty percent. High occlusion is defined as more than sixty percent of the petal area being obscured. Moderate occlusion is defined as thirty to sixty percent of the petal area being obscured. Normal occlusion is defined as zero to thirty percent of the petal area being obscured. The chrysanthemums are counted separately in the different environments; for example, when chrysanthemums under normal light, normal overlap, and normal occlusion appear in one image simultaneously, each of these three counts is increased by one.</p>
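The 30%/60% cut-offs used for the overlap and occlusion categories can be expressed as a simple rule. The sketch below is a plain-Python illustration, assuming a hypothetical input fraction in [0, 1] measured beforehand (how the fraction itself is estimated is not part of this sketch):

```python
def severity_category(fraction):
    """Map a measured overlap (or occlusion) fraction to its category
    per the empirical criteria: normal 0-30%, moderate 30-60%, high >60%."""
    if fraction > 0.60:
        return "high"
    if fraction > 0.30:
        return "moderate"
    return "normal"

# e.g. a flower with 45% of its petal area overlapped
category = severity_category(0.45)  # -> "moderate"
```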
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption><p>Example of nine unstructured scenarios.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-850606-g007.tif"/>
</fig>
<p><xref ref-type="table" rid="T5">Table 5</xref> shows that under normal conditions, that is, normal light, normal overlap, and normal occlusion, the AP values reached 93.43, 94.59, and 94.03%, respectively. When the unstructured environment became complicated, the AP values dropped significantly, especially in the strong light environment, with an AP value of only 77.12%. Intriguingly, the error rate (10.54%) was highest under strong light, probably because the light added shadows to the chrysanthemums; it may also be due to a limited ability of TC-GAN to generate high-quality images under strong light. High overlap had the highest miss rate, at 13.39%. Overall, overlap had the least influence on the detection of chrysanthemums at the early flowering stage: under high overlap, the AP, error, and miss rates were 79.39, 7.22, and 13.39%, respectively. Illumination had the biggest effect on chrysanthemum detection at the early flowering stage: under strong light, the AP, error, and miss rates were 77.12, 10.54, and 7.25%, respectively.</p>
<table-wrap position="float" id="T5">
<label>TABLE 5</label>
<caption><p>Impact of different unstructured scenarios on the TC-YOLO.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Environment</td>
<td valign="top" align="center">Count</td>
<td valign="top" align="center" colspan="2">Correctly identified</td>
<td valign="top" align="center" colspan="2">Falsely identified</td>
<td valign="top" align="center" colspan="2">Missed</td>
</tr>
<tr>
<td valign="top" align="center"></td>
<td valign="top" align="center"></td>
<td valign="top" align="center" colspan="2"><hr/></td>
<td valign="top" align="center" colspan="2"><hr/></td>
<td valign="top" align="center" colspan="2"><hr/></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center">Amount</td>
<td valign="top" align="center">Rate (%)</td>
<td valign="top" align="center">Amount</td>
<td valign="top" align="center">Rate (%)</td>
<td valign="top" align="center">Amount</td>
<td valign="top" align="center">Rate (%)</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Strong light</td>
<td valign="top" align="center">6511</td>
<td valign="top" align="center">5021</td>
<td valign="top" align="center">77.12</td>
<td valign="top" align="center">686</td>
<td valign="top" align="center">10.54</td>
<td valign="top" align="center">804</td>
<td valign="top" align="center">7.25</td>
</tr>
<tr>
<td valign="top" align="left">Weak light</td>
<td valign="top" align="center">10162</td>
<td valign="top" align="center">8786</td>
<td valign="top" align="center">86.46</td>
<td valign="top" align="center">857</td>
<td valign="top" align="center">8.43</td>
<td valign="top" align="center">519</td>
<td valign="top" align="center">5.11</td>
</tr>
<tr>
<td valign="top" align="left">Normal light</td>
<td valign="top" align="center">18686</td>
<td valign="top" align="center">17458</td>
<td valign="top" align="center">93.43</td>
<td valign="top" align="center">988</td>
<td valign="top" align="center">5.29</td>
<td valign="top" align="center">240</td>
<td valign="top" align="center">1.28</td>
</tr>
<tr>
<td valign="top" align="left">High overlap</td>
<td valign="top" align="center">5249</td>
<td valign="top" align="center">4167</td>
<td valign="top" align="center">79.39</td>
<td valign="top" align="center">379</td>
<td valign="top" align="center">7.22</td>
<td valign="top" align="center">703</td>
<td valign="top" align="center">13.39</td>
</tr>
<tr>
<td valign="top" align="left">Moderate overlap</td>
<td valign="top" align="center">11892</td>
<td valign="top" align="center">10420</td>
<td valign="top" align="center">87.62</td>
<td valign="top" align="center">659</td>
<td valign="top" align="center">5.54</td>
<td valign="top" align="center">813</td>
<td valign="top" align="center">6.84</td>
</tr>
<tr>
<td valign="top" align="left">Normal overlap</td>
<td valign="top" align="center">17443</td>
<td valign="top" align="center">16499</td>
<td valign="top" align="center">94.59</td>
<td valign="top" align="center">419</td>
<td valign="top" align="center">2.4</td>
<td valign="top" align="center">525</td>
<td valign="top" align="center">3.01</td>
</tr>
<tr>
<td valign="top" align="left">High occlusion</td>
<td valign="top" align="center">7811</td>
<td valign="top" align="center">6284</td>
<td valign="top" align="center">80.45</td>
<td valign="top" align="center">729</td>
<td valign="top" align="center">9.33</td>
<td valign="top" align="center">798</td>
<td valign="top" align="center">10.22</td>
</tr>
<tr>
<td valign="top" align="left">Moderate occlusion</td>
<td valign="top" align="center">12162</td>
<td valign="top" align="center">10661</td>
<td valign="top" align="center">87.66</td>
<td valign="top" align="center">630</td>
<td valign="top" align="center">5.18</td>
<td valign="top" align="center">890</td>
<td valign="top" align="center">7.16</td>
</tr>
<tr>
<td valign="top" align="left">Normal occlusion</td>
<td valign="top" align="center">19299</td>
<td valign="top" align="center">18147</td>
<td valign="top" align="center">94.03</td>
<td valign="top" align="center">648</td>
<td valign="top" align="center">3.36</td>
<td valign="top" align="center">504</td>
<td valign="top" align="center">2.61</td>
</tr>
</tbody>
</table></table-wrap>
</sec>
<sec id="S4.SS5">
<title>Comparison of the Latest Generative Adversarial Neural Networks</title>
<p>To fully investigate the performance of TC-GAN, TC-GAN and 12 state-of-the-art generative adversarial networks were tested on the chrysanthemum dataset using the TC-YOLO model. The proposed TC-GAN generates chrysanthemum images with a resolution of 512 &#x00D7; 512, whereas the resolution of the images generated by the other generative adversarial networks varies. Therefore, to facilitate testing with the TC-YOLO model and to ensure a fair comparison between TC-GAN and these networks, we modified their output resolutions. Based on their original output resolutions, we set the output resolution of LSGAN and Improved WGAN-GP to 448 &#x00D7; 448, kept the original resolution of BigGAN unchanged, and adjusted the output resolution of the remaining generative adversarial networks to 512 &#x00D7; 512, while all other parameters were kept fixed. The performance is shown in <xref ref-type="table" rid="T6">Table 6</xref>.</p>
<table-wrap position="float" id="T6">
<label>TABLE 6</label>
<caption><p>Comparison between tea chrysanthemum &#x2013; generative adversarial network (TC-GAN) and state-of-the-art GANs.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Method</td>
<td valign="top" align="left">Original output size</td>
<td valign="top" align="left">Training time (min)</td>
<td valign="top" align="left">AP (%)</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Improved SN-GAN</td>
<td valign="top" align="left">32 &#x00D7; 32</td>
<td valign="top" align="left">1290</td>
<td valign="top" align="left">80.61</td>
</tr>
<tr>
<td valign="top" align="left">BigGAN</td>
<td valign="top" align="left">512 &#x00D7; 512</td>
<td valign="top" align="left">1610</td>
<td valign="top" align="left">86.45</td>
</tr>
<tr>
<td valign="top" align="left">Dist-GAN</td>
<td valign="top" align="left">64 &#x00D7; 64</td>
<td valign="top" align="left">1322</td>
<td valign="top" align="left">80.68</td>
</tr>
<tr>
<td valign="top" align="left">Progressive GAN</td>
<td valign="top" align="left">64 &#x00D7; 64</td>
<td valign="top" align="left">1256</td>
<td valign="top" align="left">81.11</td>
</tr>
<tr>
<td valign="top" align="left">LSGAN</td>
<td valign="top" align="left">112 &#x00D7; 112</td>
<td valign="top" align="left">1410</td>
<td valign="top" align="left">84.03</td>
</tr>
<tr>
<td valign="top" align="left">Rob-GAN</td>
<td valign="top" align="left">128 &#x00D7; 128</td>
<td valign="top" align="left">1293</td>
<td valign="top" align="left">85.28</td>
</tr>
<tr>
<td valign="top" align="left">MGAN</td>
<td valign="top" align="left">64 &#x00D7; 64</td>
<td valign="top" align="left">1151</td>
<td valign="top" align="left">82.39</td>
</tr>
<tr>
<td valign="top" align="left">AutoGAN</td>
<td valign="top" align="left">64 &#x00D7; 64</td>
<td valign="top" align="left">1340</td>
<td valign="top" align="left">83.25</td>
</tr>
<tr>
<td valign="top" align="left">Improved DCGAN</td>
<td valign="top" align="left">64 &#x00D7; 64</td>
<td valign="top" align="left">1280</td>
<td valign="top" align="left">84.38</td>
</tr>
<tr>
<td valign="top" align="left">DAG</td>
<td valign="top" align="left">48 &#x00D7; 48</td>
<td valign="top" align="left">1768</td>
<td valign="top" align="left">83.29</td>
</tr>
<tr>
<td valign="top" align="left">Improved WGAN-GP</td>
<td valign="top" align="left">28 &#x00D7; 28</td>
<td valign="top" align="left">1640</td>
<td valign="top" align="left">76.16</td>
</tr>
<tr>
<td valign="top" align="left">Improved WGAN</td>
<td valign="top" align="left">128 &#x00D7; 128</td>
<td valign="top" align="left">1501</td>
<td valign="top" align="left">87.16</td>
</tr>
<tr>
<td valign="top" align="left">TC-GAN</td>
<td valign="top" align="left">512 &#x00D7; 512</td>
<td valign="top" align="left">1460</td>
<td valign="top" align="left">90.09</td>
</tr>
</tbody>
</table></table-wrap>
<p><xref ref-type="table" rid="T6">Table 6</xref> shows the experimental details. TC-GAN achieves the best performance among the 12 state-of-the-art generative adversarial networks, with an AP of 90.09%. It is worth noting that TC-GAN holds no advantage in training time: eight of the 12 models train faster than TC-GAN, and only BigGAN, DAG, Improved WGAN-GP, and Improved WGAN are slower, by 150, 308, 180, and 41 min, respectively. This may be due to the design of the network structure, which increases network depth and adds a gradient penalty mechanism. In contrast to most convolutional neural networks, deepening the structure of a generative adversarial network tends to make training unstable. Moreover, the gradient penalty mechanism is very sensitive to the choice of parameters; it helps training initially but subsequently becomes difficult to optimize. Furthermore, in general, the smaller the originally generated image, the worse the generative adversarial network performs in the detection task. This is because, first, current mainstream generative adversarial networks produce low-resolution images, and artificially enlarging them blurs the image and thus degrades detection accuracy. Second, some recent models, such as Progressive GAN and Improved DCGAN, are designed for face generation, and these models are not robust in terms of transfer ability. Interestingly, among the 12 networks, most of the structures are unconditional. Nevertheless, from a comprehensive performance perspective, structures with conditional mechanisms, such as Improved WGAN, perform surprisingly well: its training time is only 41 min slower than TC-GAN, while its AP is only 2.93% lower. Network structures with conditional mechanisms are undoubtedly worth learning from, and adding a conditional mechanism could be a future direction for improving the performance of TC-GAN. To visualize the performance of TC-GAN, images generated by TC-GAN and the other GANs are shown in <xref ref-type="table" rid="T7">Table 7</xref>.</p>
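The sensitivity of the gradient penalty to its weighting parameter, noted above, can be illustrated with a toy example. The real WGAN-GP penalty differentiates the critic at interpolated samples via autograd; here, as a hedged sketch, a linear critic f(x) = w &#x00B7; x is assumed, whose input gradient is exactly w, so the penalty can be computed in closed form (the weight `lam` and the vectors are illustrative):

```python
import numpy as np

# Toy sketch of the WGAN-GP gradient penalty term. For a linear critic
# f(x) = w . x the input gradient is exactly w, so no autograd is needed.
def gradient_penalty(w, lam):
    grad_norm = np.linalg.norm(w)           # ||grad_x f(x)|| for a linear critic
    return lam * (grad_norm - 1.0) ** 2     # penalizes deviation from 1-Lipschitz

w_far = np.array([3.0, 4.0])                # gradient norm 5: far from 1-Lipschitz
print(gradient_penalty(w_far, lam=10.0))    # 160.0
print(gradient_penalty(w_far, lam=0.5))     # 8.0 -- same critic, much weaker push
```

The 20-fold gap between the two penalty values for the same critic shows how strongly the choice of `lam` shapes the optimization pressure, which is consistent with the tuning difficulty described in the text.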
<table-wrap position="float" id="T7">
<label>TABLE 7</label>
<caption><p>Generation results of different GANs.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Methods</td>
<td valign="top" align="center">Result</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Improved<break/> WGAN-GP</td>
<td valign="top" align="center"><inline-graphic xlink:href="fpls-13-850606-i001.jpg"/><inline-graphic xlink:href="fpls-13-850606-i002.jpg"/><inline-graphic xlink:href="fpls-13-850606-i003.jpg"/><inline-graphic xlink:href="fpls-13-850606-i004.jpg"/></td>
</tr>
<tr>
<td valign="top" align="left">SN-GAN</td>
<td valign="top" align="center"><inline-graphic xlink:href="fpls-13-850606-i005.jpg"/><inline-graphic xlink:href="fpls-13-850606-i006.jpg"/><inline-graphic xlink:href="fpls-13-850606-i007.jpg"/><inline-graphic xlink:href="fpls-13-850606-i008.jpg"/></td>
</tr>
<tr>
<td valign="top" align="left">Dist-GAN</td>
<td valign="top" align="center"><inline-graphic xlink:href="fpls-13-850606-i009.jpg"/><inline-graphic xlink:href="fpls-13-850606-i010.jpg"/><inline-graphic xlink:href="fpls-13-850606-i011.jpg"/><inline-graphic xlink:href="fpls-13-850606-i012.jpg"/></td>
</tr>
<tr>
<td valign="top" align="left">Progressive GAN</td>
<td valign="top" align="center"><inline-graphic xlink:href="fpls-13-850606-i013.jpg"/><inline-graphic xlink:href="fpls-13-850606-i014.jpg"/><inline-graphic xlink:href="fpls-13-850606-i015.jpg"/><inline-graphic xlink:href="fpls-13-850606-i016.jpg"/></td>
</tr>
<tr>
<td valign="top" align="left">MGAN</td>
<td valign="top" align="center"><inline-graphic xlink:href="fpls-13-850606-i017.jpg"/><inline-graphic xlink:href="fpls-13-850606-i018.jpg"/><inline-graphic xlink:href="fpls-13-850606-i019.jpg"/><inline-graphic xlink:href="fpls-13-850606-i020.jpg"/></td>
</tr>
<tr>
<td valign="top" align="left">AutoGAN</td>
<td valign="top" align="center"><inline-graphic xlink:href="fpls-13-850606-i021.jpg"/><inline-graphic xlink:href="fpls-13-850606-i022.jpg"/><inline-graphic xlink:href="fpls-13-850606-i023.jpg"/><inline-graphic xlink:href="fpls-13-850606-i024.jpg"/></td>
</tr>
<tr>
<td valign="top" align="left">DAG</td>
<td valign="top" align="center"><inline-graphic xlink:href="fpls-13-850606-i025.jpg"/><inline-graphic xlink:href="fpls-13-850606-i026.jpg"/><inline-graphic xlink:href="fpls-13-850606-i027.jpg"/><inline-graphic xlink:href="fpls-13-850606-i028.jpg"/></td>
</tr>
<tr>
<td valign="top" align="left">LSGAN</td>
<td valign="top" align="center"><inline-graphic xlink:href="fpls-13-850606-i029.jpg"/><inline-graphic xlink:href="fpls-13-850606-i030.jpg"/><inline-graphic xlink:href="fpls-13-850606-i031.jpg"/><inline-graphic xlink:href="fpls-13-850606-i032.jpg"/></td>
</tr>
<tr>
<td valign="top" align="left">Improved DCGAN</td>
<td valign="top" align="center"><inline-graphic xlink:href="fpls-13-850606-i033.jpg"/><inline-graphic xlink:href="fpls-13-850606-i034.jpg"/><inline-graphic xlink:href="fpls-13-850606-i035.jpg"/><inline-graphic xlink:href="fpls-13-850606-i036.jpg"/></td>
</tr>
<tr>
<td valign="top" align="left">Rob-GAN</td>
<td valign="top" align="center"><inline-graphic xlink:href="fpls-13-850606-i037.jpg"/><inline-graphic xlink:href="fpls-13-850606-i038.jpg"/><inline-graphic xlink:href="fpls-13-850606-i039.jpg"/><inline-graphic xlink:href="fpls-13-850606-i040.jpg"/></td>
</tr>
<tr>
<td valign="top" align="left">BigGAN</td>
<td valign="top" align="center"><inline-graphic xlink:href="fpls-13-850606-i041.jpg"/><inline-graphic xlink:href="fpls-13-850606-i042.jpg"/><inline-graphic xlink:href="fpls-13-850606-i043.jpg"/><inline-graphic xlink:href="fpls-13-850606-i044.jpg"/></td>
</tr>
<tr>
<td valign="top" align="left">Improved WGAN</td>
<td valign="top" align="center"><inline-graphic xlink:href="fpls-13-850606-i045.jpg"/><inline-graphic xlink:href="fpls-13-850606-i046.jpg"/><inline-graphic xlink:href="fpls-13-850606-i047.jpg"/><inline-graphic xlink:href="fpls-13-850606-i048.jpg"/></td>
</tr>
<tr>
<td valign="top" align="left">TC-GAN</td>
<td valign="top" align="center"><inline-graphic xlink:href="fpls-13-850606-i049.jpg"/><inline-graphic xlink:href="fpls-13-850606-i050.jpg"/><inline-graphic xlink:href="fpls-13-850606-i051.jpg"/><inline-graphic xlink:href="fpls-13-850606-i052.jpg"/></td>
</tr>
</tbody>
</table></table-wrap>
</sec>
</sec>
<sec id="S5" sec-type="discussion">
<title>Discussion</title>
<p>To investigate the three issues summarized in the section &#x201C;Introduction,&#x201D; we proposed TC-YOLO and compared its results with the related work in <xref ref-type="table" rid="T2">Table 2</xref>. Our proposed TC-GAN generates high-resolution images (512 &#x00D7; 512), and the <italic>E</italic> section of the experimental results shows that high-resolution images can significantly enrich environmental features and thus improve the robustness of the model. GANs are prone to mode collapse and gradient vanishing during training, resulting in a lack of diversity in the generated image features (<xref ref-type="bibr" rid="B40">Wang et al., 2021</xref>). TC-GAN is able to generate images containing complex unstructured environments, including illumination, overlap, and occlusion, which benefits detection in field environments, whereas most of the synthetic images generated by the other GANs listed in <xref ref-type="table" rid="T2">Table 2</xref> provide limited diversity and clear backgrounds. To intuitively view the image features through the generation process, we show the visualization and training processes of TC-YOLO (<xref ref-type="fig" rid="F8">Figure 8</xref>). It can be seen that the important part of the plants (the flower heads) is clearly activated and captured by TC-YOLO.</p>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption><p><bold>(A)</bold> Visualization results and <bold>(B,C)</bold> training process.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-13-850606-g008.tif"/>
</fig>
<p>Several aspects of TC-GAN remain to be optimized despite its good detection performance. First, there are currently no suitable metrics for evaluating synthetic images. FID is a widely recognized metric, but it is dedicated to several specific datasets and is not applicable to customized datasets; we can only evaluate the quality of synthetic images by their detection results in an object detection model. Therefore, establishing a standard set of evaluation metrics is an urgent issue to be addressed. Second, the training cost of TC-GAN is high. As can be seen from <xref ref-type="table" rid="T6">Table 6</xref>, training the whole model takes 1460 min with the 16 GB video memory of a Tesla P100, making effective training difficult on an ordinary device. Thus, a lightweight TC-GAN would benefit the promotion of the technology. Besides, according to the experimental results in the section &#x201C;Impact of Different Unstructured Environments on the TC-YOLO,&#x201D; TC-GAN cannot fully construct images well for the illumination environmental setting. Note that the lack of efficient interaction between the generator and discriminator networks leads to constant oscillation of the gradient and difficulty in convergence, as shown in <xref ref-type="fig" rid="F8">Figure 8B</xref>. This remains a challenge not fully addressed in generative adversarial networks, and we suggest that more attention be paid to solving it. Finally, our proposed model was deployed on an NVIDIA Jetson TX2 with an inference time of approximately 0.1 s per chrysanthemum image (image size 416 &#x00D7; 416). This is not real-time performance, and it deserves further optimization of the network architecture, such as network pruning and quantization.</p>
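Of the optimizations mentioned above, network pruning is the most self-contained to sketch. The following is a minimal, hedged illustration of one common heuristic (magnitude-based weight pruning), not the authors' method; the weight values and sparsity level are purely illustrative:

```python
import numpy as np

# Sketch of magnitude-based weight pruning: zero out the fraction of
# weights with the smallest absolute values, shrinking effective model size.
def prune_by_magnitude(weights, sparsity):
    flat = np.abs(weights).ravel()
    k = int(flat.size * sparsity)                      # how many weights to drop
    thresh = np.sort(flat)[k] if k < flat.size else np.inf
    return np.where(np.abs(weights) >= thresh, weights, 0.0)

w = np.array([[0.05, -0.9],
              [0.40, -0.01]])
pruned = prune_by_magnitude(w, sparsity=0.5)
print(pruned)   # the two smallest-magnitude weights are zeroed
```

Combined with quantization, zeroing small weights in this way is one standard route toward faster inference on edge devices such as the Jetson TX2.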
</sec>
<sec id="S6" sec-type="conclusion">
<title>Conclusion</title>
<p>This article presents TC-GAN, a novel generative adversarial network architecture for generating tea chrysanthemum images under unstructured environments (illumination, overlap, occlusion). TC-GAN is able to generate images with a resolution of 512 &#x00D7; 512 and, combined with the TC-YOLO detection model, achieves an AP of 90.09%, surpassing the other state-of-the-art generative adversarial networks. Finally, we deployed and tested the TC-YOLO model on the NVIDIA Jetson TX2 for the development of robotic harvesting and solar insecticidal lamp systems, achieving approximately 0.1 s per image (512 &#x00D7; 512). The proposed TC-GAN has the potential to be integrated into selective picking robots and solar insecticidal lamp systems via the NVIDIA Jetson TX2 in the future.</p>
</sec>
<sec id="S7" sec-type="data-availability">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article; further inquiries can be directed to the corresponding authors.</p>
</sec>
<sec id="S8">
<title>Author Contributions</title>
<p>CQ: conceptualization, methodology, software, writing &#x2013; original draft, and writing &#x2013; review and editing. JG: conceptualization and writing &#x2013; review and editing. KC: supervision and writing &#x2013; review and editing. LS: supervision, project administration, funding acquisition, and writing &#x2013; review and editing. SP: writing &#x2013; review and editing. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec id="conf1" sec-type="COI-statement">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="pudiscl1" sec-type="disclaimer">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<sec id="S9" sec-type="funding-information">
<title>Funding</title>
<p>This work was supported by the Senior Foreign Expert Admission Scheme (G2021145009L), the Modern Agricultural Equipment and Technology Demonstration and Promotion Project in Jiangsu Province (NJ2021-11), Lincoln Agri-Robotics as part of the Expanding Excellence in England (E3) Program, and the National Key Research and Development Plan Project: Robotic Systems for Agriculture (RS-Agri) (Grant No. 2019YFE0125200).</p>
</sec>
<ack><p>We thank the editor and reviewers of the journal.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alsamhi</surname> <given-names>S. H.</given-names></name> <name><surname>Almalki</surname> <given-names>F. A.</given-names></name> <name><surname>Afghah</surname> <given-names>F.</given-names></name> <name><surname>Hawbani</surname> <given-names>A.</given-names></name> <name><surname>Shvetsov</surname> <given-names>A. V.</given-names></name> <name><surname>Lee</surname> <given-names>B.</given-names></name><etal/></person-group> (<year>2022</year>). <article-title>Drones&#x2019; edge intelligence over smart environments in B5G: blockchain and federated learning synergy.</article-title> <source><italic>IEEE Trans. Green Commun. Netw.</italic></source> <volume>6</volume> <fpage>295</fpage>&#x2013;<lpage>312</lpage>. <pub-id pub-id-type="doi">10.1109/tgcn.2021.3132561</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alsamhi</surname> <given-names>S. H.</given-names></name> <name><surname>Almalki</surname> <given-names>F. A.</given-names></name> <name><surname>Al-Dois</surname> <given-names>H.</given-names></name> <name><surname>Shvetsov</surname> <given-names>A. V.</given-names></name> <name><surname>Ansari</surname> <given-names>M. S.</given-names></name> <name><surname>Hawbani</surname> <given-names>A.</given-names></name><etal/></person-group> (<year>2021</year>). <article-title>Multi-drone edge intelligence and SAR smart wearable devices for emergency communication.</article-title> <source><italic>Wirel. Commun. Mob. Com.</italic></source> <volume>2021</volume>:<issue>6710074</issue>.</citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ansari</surname> <given-names>M. S.</given-names></name> <name><surname>Alsamhi</surname> <given-names>S. H.</given-names></name> <name><surname>Qiao</surname> <given-names>Y.</given-names></name> <name><surname>Ye</surname> <given-names>Y.</given-names></name> <name><surname>Lee</surname> <given-names>B.</given-names></name></person-group> (<year>2020</year>). <article-title>&#x201C;Security of distributed intelligence in edge computing: threats and countermeasures,&#x201D;</article-title> in <source><italic>The Cloud-to-Thing Continuum. Palgrave Studies in Digital Business &#x0026; Enabling Technologies</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Lynn</surname> <given-names>T.</given-names></name> <name><surname>Mooney</surname> <given-names>J.</given-names></name> <name><surname>Lee</surname> <given-names>B.</given-names></name> <name><surname>Endo</surname> <given-names>P.</given-names></name></person-group> (<publisher-loc>Cham</publisher-loc>: <publisher-name>Palgrave Macmillan</publisher-name>), <fpage>95</fpage>&#x2013;<lpage>122</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-41110-7_6</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bian</surname> <given-names>Y.</given-names></name> <name><surname>Wang</surname> <given-names>J.</given-names></name> <name><surname>Jun</surname> <given-names>J. J.</given-names></name> <name><surname>Xie</surname> <given-names>X.-Q.</given-names></name></person-group> (<year>2019</year>). <article-title>Deep convolutional generative adversarial network (dcGAN) models for screening and design of small molecules targeting cannabinoid receptors.</article-title> <source><italic>Mol. Pharm.</italic></source> <volume>16</volume> <fpage>4451</fpage>&#x2013;<lpage>4460</lpage>. <pub-id pub-id-type="doi">10.1021/acs.molpharmaceut.9b00500</pub-id> <pub-id pub-id-type="pmid">31589460</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cao</surname> <given-names>L.</given-names></name> <name><surname>Zhang</surname> <given-names>X.</given-names></name> <name><surname>Pu</surname> <given-names>J.</given-names></name> <name><surname>Xu</surname> <given-names>S.</given-names></name> <name><surname>Cai</surname> <given-names>X.</given-names></name> <name><surname>Li</surname> <given-names>Z.</given-names></name></person-group> (<year>2020</year>). &#x201C;<article-title>The field wheat count based on the efficientdet algorithm</article-title>,&#x201D; in <source><italic>Proceedings of the 2020 IEEE 3rd International Conference on Information Systems and Computer Aided Education (ICISCAE)</italic></source> (<publisher-loc>New Delhi</publisher-loc>: <publisher-name>ICISCAE</publisher-name>). <fpage>557</fpage>&#x2013;<lpage>561</lpage>.</citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chao</surname> <given-names>W.</given-names></name> <name><surname>Wenhui</surname> <given-names>W.</given-names></name> <name><surname>Jiahan</surname> <given-names>D.</given-names></name> <name><surname>Guangxin</surname> <given-names>G.</given-names></name></person-group> (<year>2021</year>). &#x201C;<article-title>Research on network intrusion detection technology based on dcgan</article-title>,&#x201D; in <source><italic>Proceedings of the 2021 IEEE 5th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC)</italic></source>, (<publisher-loc>Vienna</publisher-loc>: <publisher-name>IAEAC</publisher-name>). <fpage>1418</fpage>&#x2013;<lpage>1422</lpage>. <pub-id pub-id-type="doi">10.3390/s19143075</pub-id> <pub-id pub-id-type="pmid">31336814</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Collier</surname> <given-names>E.</given-names></name> <name><surname>Duffy</surname> <given-names>K.</given-names></name> <name><surname>Ganguly</surname> <given-names>S.</given-names></name> <name><surname>Madanguit</surname> <given-names>G.</given-names></name> <name><surname>Kalia</surname> <given-names>S.</given-names></name> <name><surname>Shreekant</surname> <given-names>G.</given-names></name><etal/></person-group> (<year>2018</year>). &#x201C;<article-title>Progressively growing generative adversarial networks for high resolution semantic segmentation of satellite images</article-title>,&#x201D; in <source><italic>Proceedings of the 2018 IEEE International Conference on Data Mining Workshops (ICDMW)</italic></source>, (<publisher-loc>New Jersey, NJ</publisher-loc>: <publisher-name>IEEE</publisher-name>). <fpage>763</fpage>&#x2013;<lpage>769</lpage>.</citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dhaka</surname> <given-names>V. S.</given-names></name> <name><surname>Meena</surname> <given-names>S. V.</given-names></name> <name><surname>Rani</surname> <given-names>G.</given-names></name> <name><surname>Sinwar</surname> <given-names>D.</given-names></name> <name><surname>Kavita</surname> <given-names>K.</given-names></name> <name><surname>Ijaz</surname> <given-names>M. F.</given-names></name><etal/></person-group> (<year>2021</year>). <article-title>A Survey of deep convolutional neural networks applied for prediction of plant leaf diseases.</article-title> <source><italic>Sensors</italic></source> <volume>21</volume>:<issue>4749</issue>. <pub-id pub-id-type="doi">10.3390/s21144749</pub-id> <pub-id pub-id-type="pmid">34300489</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Douarre</surname> <given-names>C.</given-names></name> <name><surname>Crispim-Junior</surname> <given-names>C. F.</given-names></name> <name><surname>Gelibert</surname> <given-names>A.</given-names></name> <name><surname>Tougne</surname> <given-names>L.</given-names></name> <name><surname>Rousseau</surname> <given-names>D.</given-names></name></person-group> (<year>2019</year>). <article-title>Novel data augmentation strategies to boost supervised segmentation of plant disease.</article-title> <source><italic>Comput. Electron. Agric.</italic></source> <volume>165</volume>:<issue>104967</issue>. <pub-id pub-id-type="doi">10.1016/j.compag.2019.104967</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Espejo-Garcia</surname> <given-names>B.</given-names></name> <name><surname>Mylonas</surname> <given-names>N.</given-names></name> <name><surname>Athanasakos</surname> <given-names>L.</given-names></name> <name><surname>Vali</surname> <given-names>E.</given-names></name> <name><surname>Fountas</surname> <given-names>S.</given-names></name></person-group> (<year>2021</year>). <article-title>Combining generative adversarial networks and agricultural transfer learning for weeds identification.</article-title> <source><italic>Biosyst. Eng.</italic></source> <volume>204</volume> <fpage>79</fpage>&#x2013;<lpage>89</lpage>. <pub-id pub-id-type="doi">10.1016/j.biosystemseng.2021.01.014</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gandhi</surname> <given-names>R.</given-names></name> <name><surname>Nimbalkar</surname> <given-names>S.</given-names></name> <name><surname>Yelamanchili</surname> <given-names>N.</given-names></name> <name><surname>Ponkshe</surname> <given-names>S.</given-names></name></person-group> (<year>2018</year>). &#x201C;<article-title>Plant disease detection using CNNs and GANs as an augmentative approach</article-title>,&#x201D; in <source><italic>Proceedings of the 2018 IEEE International Conference on Innovative Research and Development (ICIRD)</italic></source>, (<publisher-loc>New Jersey, NJ</publisher-loc>: <publisher-name>IEEE</publisher-name>). <fpage>1</fpage>&#x2013;<lpage>5</lpage>.</citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gong</surname> <given-names>X. Y.</given-names></name> <name><surname>Chang</surname> <given-names>S. Y.</given-names></name> <name><surname>Jiang</surname> <given-names>Y. F.</given-names></name> <name><surname>Wang</surname> <given-names>Z. Y.</given-names></name></person-group> (<year>2019</year>). &#x201C;<article-title>AutoGAN: neural architecture search for generative adversarial networks</article-title>,&#x201D; in <source><italic>Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (Seoul, South Korea)</italic></source>, (<publisher-loc>New Jersey, NJ</publisher-loc>: <publisher-name>IEEE</publisher-name>). <fpage>3223</fpage>&#x2013;<lpage>3233</lpage>.</citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>He</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Shan</surname> <given-names>H.</given-names></name> <name><surname>Wang</surname> <given-names>L.</given-names></name></person-group> (<year>2019</year>). <article-title>Multi-Task GANs for view-specific feature learning in gait recognition.</article-title> <source><italic>IEEE Trans. Inf. Forensic Secur.</italic></source> <volume>14</volume> <fpage>102</fpage>&#x2013;<lpage>113</lpage>. <pub-id pub-id-type="doi">10.1109/tifs.2018.2844819</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hu</surname> <given-names>G.</given-names></name> <name><surname>Wu</surname> <given-names>H.</given-names></name> <name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Wan</surname> <given-names>M.</given-names></name></person-group> (<year>2019</year>). <article-title>A low shot learning method for tea leaf&#x2019;s disease identification.</article-title> <source><italic>Comput. Electron. Agric.</italic></source> <volume>163</volume>:<issue>104852</issue>. <pub-id pub-id-type="doi">10.1016/j.compag.2019.104852</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hu</surname> <given-names>G.</given-names></name> <name><surname>Yin</surname> <given-names>C.</given-names></name> <name><surname>Wan</surname> <given-names>M.</given-names></name> <name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Fang</surname> <given-names>Y.</given-names></name></person-group> (<year>2020</year>). <article-title>Recognition of diseased pinus trees in UAV images using deep learning and AdaBoost classifier.</article-title> <source><italic>Biosyst. Eng.</italic></source> <volume>194</volume> <fpage>138</fpage>&#x2013;<lpage>151</lpage>. <pub-id pub-id-type="doi">10.1016/j.biosystemseng.2020.03.021</pub-id></citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jeon</surname> <given-names>H.</given-names></name> <name><surname>Lee</surname> <given-names>D.</given-names></name></person-group> (<year>2021</year>). <article-title>A New Data augmentation method for time series wearable sensor data using a learning mode switching-based DCGAN.</article-title> <source><italic>IEEE Robot. Autom. Lett.</italic></source> <volume>6</volume> <fpage>8671</fpage>&#x2013;<lpage>8677</lpage>. <pub-id pub-id-type="doi">10.1109/lra.2021.3103648</pub-id></citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>C.</given-names></name> <name><surname>Park</surname> <given-names>S.</given-names></name> <name><surname>Hwang</surname> <given-names>H. J.</given-names></name></person-group> (<year>2021</year>). <article-title>Local stability of wasserstein GANs with abstract gradient penalty.</article-title> <source><italic>IEEE Trans. Neural Netw. Learn. Syst.</italic></source> <fpage>1</fpage>&#x2013;<lpage>11</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2021.3057885</pub-id> <pub-id pub-id-type="pmid">33606646</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>S. W.</given-names></name> <name><surname>Kook</surname> <given-names>H. K.</given-names></name> <name><surname>Sun</surname> <given-names>J. Y.</given-names></name> <name><surname>Kang</surname> <given-names>M. C.</given-names></name> <name><surname>Ko</surname> <given-names>S. J.</given-names></name></person-group> (<year>2018</year>). &#x201C;<article-title>Parallel feature pyramid network for object detection</article-title>,&#x201D; in <source><italic>Proceedings of the European Conference on Computer Vision (ECCV), (Munich, Germany)</italic></source>, (<publisher-loc>Tel Aviv</publisher-loc>: <publisher-name>ECCV</publisher-name>). <fpage>239</fpage>&#x2013;<lpage>256</lpage>.</citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kundu</surname> <given-names>N.</given-names></name> <name><surname>Rani</surname> <given-names>G.</given-names></name> <name><surname>Dhaka</surname> <given-names>V. S.</given-names></name> <name><surname>Gupta</surname> <given-names>K.</given-names></name> <name><surname>Nayak</surname> <given-names>S. C.</given-names></name> <name><surname>Verma</surname> <given-names>S.</given-names></name><etal/></person-group> (<year>2021</year>). <article-title>IoT and interpretable machine learning based framework for disease prediction in pearl millet.</article-title> <source><italic>Sensors Basel.</italic></source> <volume>21</volume>:<issue>5386</issue>. <pub-id pub-id-type="doi">10.3390/s21165386</pub-id> <pub-id pub-id-type="pmid">34450827</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>J.</given-names></name> <name><surname>Wang</surname> <given-names>X. W.</given-names></name></person-group> (<year>2020</year>). <article-title>Tomato diseases and pests detection based on improved yolo V3 convolutional neural network.</article-title> <source><italic>Front. Plant Sci.</italic></source> <volume>11</volume>:<issue>898</issue>. <pub-id pub-id-type="doi">10.3389/fpls.2020.00898</pub-id> <pub-id pub-id-type="pmid">32612632</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>L.</given-names></name> <name><surname>Zhang</surname> <given-names>H.</given-names></name> <name><surname>Xu</surname> <given-names>X.</given-names></name> <name><surname>Zhang</surname> <given-names>Z.</given-names></name> <name><surname>Yan</surname> <given-names>S.</given-names></name></person-group> (<year>2020</year>). <article-title>Collocating clothes with generative adversarial networks cosupervised by categories and attributes: a multidiscriminator framework.</article-title> <source><italic>IEEE Trans. Neural Netw. Learn. Syst.</italic></source> <volume>31</volume> <fpage>3540</fpage>&#x2013;<lpage>3554</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2019.2944979</pub-id> <pub-id pub-id-type="pmid">31714238</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>X.</given-names></name> <name><surname>Chen</surname> <given-names>S.</given-names></name> <name><surname>Song</surname> <given-names>L.</given-names></name> <name><surname>Wo&#x017A;niak</surname> <given-names>M.</given-names></name> <name><surname>Liu</surname> <given-names>S.</given-names></name></person-group> (<year>2021</year>). <article-title>Self-attention negative feedback network for real-time image super-resolution.</article-title> <source><italic>J. King Saud Univ. Comput. Inf. Sci.</italic></source> <pub-id pub-id-type="doi">10.1016/j.jksuci.2021.07.014</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>X.</given-names></name> <name><surname>Hsieh</surname> <given-names>C.</given-names></name></person-group> (<year>2019</year>). &#x201C;<article-title>Rob-GAN: generator, discriminator, and adversarial attacker</article-title>,&#x201D; in <source><italic>Proceedings of the 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)</italic></source>, (<publisher-loc>Piscataway, NJ</publisher-loc>: <publisher-name>IEEE</publisher-name>). <fpage>11226</fpage>&#x2013;<lpage>11235</lpage>.</citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>Z. L.</given-names></name> <name><surname>Wang</surname> <given-names>J.</given-names></name> <name><surname>Tian</surname> <given-names>Y.</given-names></name> <name><surname>Dai</surname> <given-names>S. L.</given-names></name></person-group> (<year>2019</year>). <article-title>Deep learning for image-based large-flowered chrysanthemum cultivar recognition.</article-title> <source><italic>Plant Methods</italic></source> <volume>15</volume>:<issue>146</issue>. <pub-id pub-id-type="doi">10.1186/s13007-019-0532-7</pub-id> <pub-id pub-id-type="pmid">31827578</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Luo</surname> <given-names>Z.</given-names></name> <name><surname>Yu</surname> <given-names>H.</given-names></name> <name><surname>Zhang</surname> <given-names>Y.</given-names></name></person-group> (<year>2020</year>). <article-title>Pine cone detection using boundary equilibrium generative adversarial networks and improved YOLOv3 model.</article-title> <source><italic>Sensors</italic></source> <volume>20</volume>:<issue>4430</issue>. <pub-id pub-id-type="doi">10.3390/s20164430</pub-id> <pub-id pub-id-type="pmid">32784403</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mao</surname> <given-names>X.</given-names></name> <name><surname>Li</surname> <given-names>Q.</given-names></name> <name><surname>Xie</surname> <given-names>H.</given-names></name> <name><surname>Lau</surname> <given-names>R. Y. K.</given-names></name> <name><surname>Wang</surname> <given-names>Z.</given-names></name> <name><surname>Smolley</surname> <given-names>S. P.</given-names></name></person-group> (<year>2019</year>). <article-title>On the effectiveness of least squares generative adversarial networks.</article-title> <source><italic>IEEE Trans. Pattern Anal. Mach. Intell.</italic></source> <volume>41</volume> <fpage>2947</fpage>&#x2013;<lpage>2960</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2018.2872043</pub-id> <pub-id pub-id-type="pmid">30273144</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Marino</surname> <given-names>S.</given-names></name> <name><surname>Beauseroy</surname> <given-names>P.</given-names></name> <name><surname>Smolarz</surname> <given-names>A.</given-names></name></person-group> (<year>2020</year>). <article-title>Unsupervised adversarial deep domain adaptation method for potato defects classification.</article-title> <source><italic>Comput. Electron. Agric.</italic></source> <volume>174</volume>:<issue>105501</issue>. <pub-id pub-id-type="doi">10.1016/j.compag.2020.105501</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mufti</surname> <given-names>A.</given-names></name> <name><surname>Antonelli</surname> <given-names>B.</given-names></name> <name><surname>Monello</surname> <given-names>J.</given-names></name></person-group> (<year>2019</year>). &#x201C;<article-title>Conditional GANs for painting generation</article-title>,&#x201D; in <source><italic>Proceedings of the 12th International Conference on Machine Vision (ICMV)</italic></source>, (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>ICMV</publisher-name>). <fpage>1143335</fpage>.</citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nafi</surname> <given-names>N. M.</given-names></name> <name><surname>Hsu</surname> <given-names>W. H.</given-names></name></person-group> (<year>2020</year>). &#x201C;<article-title>Addressing class imbalance in image-based plant disease detection: deep generative vs. sampling-based approaches</article-title>,&#x201D; in <source><italic>Proceedings of the 2020 International Conference on Systems, Signals and Image Processing (IWSSIP)</italic></source>, (<publisher-loc>Niter&#x00F3;i</publisher-loc>: <publisher-name>IWSSIP</publisher-name>). <fpage>243</fpage>&#x2013;<lpage>248</lpage>.</citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nazki</surname> <given-names>H.</given-names></name> <name><surname>Yoon</surname> <given-names>S.</given-names></name> <name><surname>Fuentes</surname> <given-names>A.</given-names></name> <name><surname>Park</surname> <given-names>D. S.</given-names></name></person-group> (<year>2020</year>). <article-title>Unsupervised image translation using adversarial networks for improved plant disease recognition.</article-title> <source><italic>Comput. Electron. Agric.</italic></source> <volume>168</volume>:<issue>105117</issue>. <pub-id pub-id-type="doi">10.1016/j.compag.2019.105117</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Olatunji</surname> <given-names>J. R.</given-names></name> <name><surname>Redding</surname> <given-names>G. P.</given-names></name> <name><surname>Rowe</surname> <given-names>C. L.</given-names></name> <name><surname>East</surname> <given-names>A. R.</given-names></name></person-group> (<year>2020</year>). <article-title>Reconstruction of kiwifruit fruit geometry using a CGAN trained on a synthetic dataset.</article-title> <source><italic>Comput. Electron. Agric.</italic></source> <volume>177</volume>:<issue>12</issue>.</citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Padilla-Medina</surname> <given-names>J. A.</given-names></name> <name><surname>Contreras-Medina</surname> <given-names>L. M.</given-names></name> <name><surname>Gavil&#x00E1;n</surname> <given-names>M. U.</given-names></name> <name><surname>Millan-Almaraz</surname> <given-names>J. R.</given-names></name> <name><surname>Alvaro</surname> <given-names>J. E.</given-names></name></person-group> (<year>2019</year>). <article-title>Sensors in precision agriculture for the monitoring of plant development and improvement of food production.</article-title> <source><italic>J. Sens.</italic></source> <volume>2019</volume>:<issue>7138720</issue>.</citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Qi</surname> <given-names>C.</given-names></name> <name><surname>Gao</surname> <given-names>J.</given-names></name> <name><surname>Pearson</surname> <given-names>S.</given-names></name> <name><surname>Harman</surname> <given-names>H.</given-names></name> <name><surname>Chen</surname> <given-names>K.</given-names></name> <name><surname>Shu</surname> <given-names>L.</given-names></name></person-group> (<year>2022</year>). <article-title>Tea chrysanthemum detection under unstructured environments using the TC-YOLO model.</article-title> <source><italic>Expert Syst. Appl.</italic></source> <volume>193</volume>:<issue>116473</issue>. <pub-id pub-id-type="doi">10.1016/j.eswa.2021.116473</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Qiao</surname> <given-names>K.</given-names></name> <name><surname>Chen</surname> <given-names>J.</given-names></name> <name><surname>Wang</surname> <given-names>L. Y.</given-names></name> <name><surname>Zhang</surname> <given-names>C.</given-names></name> <name><surname>Tong</surname> <given-names>L.</given-names></name> <name><surname>Yan</surname> <given-names>B.</given-names></name></person-group> (<year>2020</year>). <article-title>BigGAN-based bayesian reconstruction of natural images from human brain activity.</article-title> <source><italic>Neuroscience.</italic></source> <volume>444</volume> <fpage>92</fpage>&#x2013;<lpage>105</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroscience.2020.07.040</pub-id> <pub-id pub-id-type="pmid">32736069</pub-id></citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shete</surname> <given-names>S.</given-names></name> <name><surname>Srinivasan</surname> <given-names>S.</given-names></name> <name><surname>Gonsalves</surname> <given-names>T. A.</given-names></name></person-group> (<year>2020</year>). <article-title>TasselGAN: an application of the generative adversarial model for creating field-based maize tassel data.</article-title> <source><italic>Plant Phenomics</italic></source> <volume>2020</volume>:<issue>8309605</issue>. <pub-id pub-id-type="doi">10.34133/2020/8309605</pub-id> <pub-id pub-id-type="pmid">33313564</pub-id></citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Talukdar</surname> <given-names>B.</given-names></name></person-group> (<year>2020</year>). &#x201C;<article-title>Handling of class imbalance for plant disease classification with variants of GANs</article-title>,&#x201D; in <source><italic>Proceedings of the 2020 IEEE 15th International Conference on Industrial and Information Systems (ICIIS)</italic></source>, (<publisher-loc>Sutton</publisher-loc>: <publisher-name>ICIIS</publisher-name>). <fpage>466</fpage>&#x2013;<lpage>471</lpage>.</citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tran</surname> <given-names>N. T.</given-names></name> <name><surname>Bui</surname> <given-names>A.</given-names></name> <name><surname>Cheung</surname> <given-names>N. M.</given-names></name></person-group> (<year>2018</year>). &#x201C;<article-title>Dist-GAN: an improved GAN using distance constraints</article-title>,&#x201D; in <source><italic>Proceedings of the European Conference on Computer Vision (ECCV)</italic></source>, (<publisher-loc>Munich</publisher-loc>: <publisher-name>ECCV</publisher-name>). <fpage>387</fpage>&#x2013;<lpage>401</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-01264-9_23</pub-id></citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tran</surname> <given-names>N. T.</given-names></name> <name><surname>Tran</surname> <given-names>V. H.</given-names></name> <name><surname>Nguyen</surname> <given-names>N. B.</given-names></name> <name><surname>Nguyen</surname> <given-names>T. K.</given-names></name> <name><surname>Cheung</surname> <given-names>N. M.</given-names></name></person-group> (<year>2021</year>). <article-title>On Data Augmentation for GAN Training.</article-title> <source><italic>IEEE Trans. Image Process.</italic></source> <volume>30</volume> <fpage>1882</fpage>&#x2013;<lpage>1897</lpage>. <pub-id pub-id-type="doi">10.1109/TIP.2021.3049346</pub-id> <pub-id pub-id-type="pmid">33428571</pub-id></citation></ref>
<ref id="B39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>C.</given-names></name> <name><surname>Xu</surname> <given-names>C.</given-names></name> <name><surname>Yao</surname> <given-names>X.</given-names></name> <name><surname>Tao</surname> <given-names>D.</given-names></name></person-group> (<year>2019</year>). <article-title>Evolutionary generative adversarial networks.</article-title> <source><italic>IEEE Trans. Evol. Comput.</italic></source> <volume>23</volume> <fpage>921</fpage>&#x2013;<lpage>934</lpage>.</citation></ref>
<ref id="B40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>Z.</given-names></name> <name><surname>She</surname> <given-names>Q.</given-names></name> <name><surname>Ward</surname> <given-names>T. E.</given-names></name></person-group> (<year>2021</year>). <article-title>Generative adversarial networks in computer vision: a survey and taxonomy.</article-title> <source><italic>ACM Comput. Surv.</italic></source> <volume>54</volume> <fpage>1</fpage>&#x2013;<lpage>38</lpage>. <pub-id pub-id-type="doi">10.1145/3439723</pub-id></citation></ref>
<ref id="B41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wieczorek</surname> <given-names>M.</given-names></name> <name><surname>Sika</surname> <given-names>J.</given-names></name> <name><surname>Wozniak</surname> <given-names>M.</given-names></name> <name><surname>Garg</surname> <given-names>S.</given-names></name> <name><surname>Hassan</surname> <given-names>M.</given-names></name></person-group> (<year>2021</year>). <article-title>Lightweight CNN model for human face detection in risk situations.</article-title> <source><italic>IEEE Trans. Ind. Info.</italic></source> <fpage>1</fpage>&#x2013;<lpage>1</lpage>. <pub-id pub-id-type="doi">10.1109/tii.2021.3129629</pub-id></citation></ref>
<ref id="B42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>L.</given-names></name> <name><surname>Fan</surname> <given-names>W.</given-names></name> <name><surname>Bouguila</surname> <given-names>N.</given-names></name></person-group> (<year>2020</year>). <article-title>Clustering analysis via deep generative models with mixture models.</article-title> <source><italic>IEEE Trans. Neural Netw. Learn. Syst.</italic></source> <volume>33</volume> <fpage>340</fpage>&#x2013;<lpage>350</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2020.3027761</pub-id> <pub-id pub-id-type="pmid">33048769</pub-id></citation></ref>
<ref id="B43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yue</surname> <given-names>J.</given-names></name> <name><surname>Zhu</surname> <given-names>C.</given-names></name> <name><surname>Zhou</surname> <given-names>Y.</given-names></name> <name><surname>Niu</surname> <given-names>X.</given-names></name> <name><surname>Miao</surname> <given-names>M.</given-names></name> <name><surname>Tang</surname> <given-names>X.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>Transcriptome analysis of differentially expressed unigenes involved in flavonoid biosynthesis during flower development of Chrysanthemum morifolium &#x2018;Chuju&#x2019;.</article-title> <source><italic>Sci. Rep.</italic></source> <volume>8</volume>:<issue>13414</issue>. <pub-id pub-id-type="doi">10.1038/s41598-018-31831-6</pub-id> <pub-id pub-id-type="pmid">30194355</pub-id></citation></ref>
<ref id="B44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>T.</given-names></name> <name><surname>Li</surname> <given-names>L.</given-names></name></person-group> (<year>2020</year>). &#x201C;<article-title>An improved object detection algorithm based on M2Det</article-title>,&#x201D; in <source><italic>Proceedings of the 2020 IEEE International Conference on Artificial Intelligence and Computer Applications (ICAICA)</italic></source>, (<publisher-loc>Dalian</publisher-loc>: <publisher-name>ICAICA</publisher-name>). <fpage>582</fpage>&#x2013;<lpage>585</lpage>.</citation></ref>
<ref id="B45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Z.</given-names></name> <name><surname>Qiao</surname> <given-names>S.</given-names></name> <name><surname>Xie</surname> <given-names>C.</given-names></name> <name><surname>Shen</surname> <given-names>W.</given-names></name> <name><surname>Wang</surname> <given-names>B.</given-names></name> <name><surname>Yuille</surname> <given-names>A. L.</given-names></name></person-group> (<year>2018</year>). &#x201C;<article-title>Single-shot object detection with enriched semantics</article-title>,&#x201D; in <source><italic>Proceedings of the 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition</italic></source>, (<publisher-loc>Piscataway, NJ</publisher-loc>: <publisher-name>IEEE</publisher-name>). <fpage>5813</fpage>&#x2013;<lpage>5821</lpage>.</citation></ref>
<ref id="B46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhao</surname> <given-names>J.</given-names></name> <name><surname>Xiong</surname> <given-names>L.</given-names></name> <name><surname>Li</surname> <given-names>J.</given-names></name> <name><surname>Xing</surname> <given-names>J.</given-names></name> <name><surname>Yan</surname> <given-names>S.</given-names></name> <name><surname>Feng</surname> <given-names>J.</given-names></name></person-group> (<year>2019</year>). <article-title>3D-Aided Dual-agent GANs for unconstrained face recognition.</article-title> <source><italic>IEEE Trans. Pattern Anal. Mach. Intell.</italic></source> <volume>41</volume> <fpage>2380</fpage>&#x2013;<lpage>2394</lpage>. <pub-id pub-id-type="doi">10.1109/TPAMI.2018.2858819</pub-id> <pub-id pub-id-type="pmid">30040629</pub-id></citation></ref>
<ref id="B47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhao</surname> <given-names>W.</given-names></name> <name><surname>Yamada</surname> <given-names>W.</given-names></name> <name><surname>Li</surname> <given-names>T. X.</given-names></name> <name><surname>Digman</surname> <given-names>M.</given-names></name> <name><surname>Runge</surname> <given-names>T.</given-names></name></person-group> (<year>2021a</year>). <article-title>Augmenting crop detection for precision agriculture with deep visual transfer learning-a case study of bale detection.</article-title> <source><italic>Remote Sens.</italic></source> <volume>13</volume>:<issue>17</issue>.</citation></ref>
<ref id="B48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhao</surname> <given-names>Y.</given-names></name> <name><surname>Chen</surname> <given-names>Z.</given-names></name> <name><surname>Gao</surname> <given-names>X.</given-names></name> <name><surname>Song</surname> <given-names>W.</given-names></name> <name><surname>Xiong</surname> <given-names>Q.</given-names></name> <name><surname>Hu</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2021b</year>). <article-title>Plant disease detection using generated leaves based on doubleGAN.</article-title> <source><italic>IEEE/ACM Trans. Comput. Biol. Bioinform.</italic></source> <fpage>1</fpage>&#x2013;<lpage>1</lpage>. <pub-id pub-id-type="doi">10.1109/TCBB.2021.3056683</pub-id> <pub-id pub-id-type="pmid">33534712</pub-id></citation></ref>
<ref id="B49"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zheng</surname> <given-names>J. Y.</given-names></name> <name><surname>Lu</surname> <given-names>B. Y.</given-names></name> <name><surname>Xu</surname> <given-names>B. J.</given-names></name></person-group> (<year>2021</year>). <article-title>An update on the health benefits promoted by edible flowers and involved mechanisms.</article-title> <source><italic>Food Chem.</italic></source> <volume>340</volume>:<issue>17</issue>. <pub-id pub-id-type="doi">10.1016/j.foodchem.2020.127940</pub-id> <pub-id pub-id-type="pmid">32889216</pub-id></citation></ref>
<ref id="B50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhong</surname> <given-names>F.</given-names></name> <name><surname>Chen</surname> <given-names>Z.</given-names></name> <name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Xia</surname> <given-names>F.</given-names></name></person-group> (<year>2020</year>). <article-title>Zero- and few-shot learning for diseases recognition of Citrus aurantium L. using conditional adversarial autoencoders.</article-title> <source><italic>Comput. Electron. Agric.</italic></source> <volume>179</volume>:<issue>105828</issue>. <pub-id pub-id-type="doi">10.1016/j.compag.2020.105828</pub-id></citation></ref>
<ref id="B51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>P.</given-names></name> <name><surname>Gao</surname> <given-names>B.</given-names></name> <name><surname>Wang</surname> <given-names>S.</given-names></name> <name><surname>Chai</surname> <given-names>T.</given-names></name></person-group> (<year>2022</year>). <article-title>Identification of abnormal conditions for fused magnesium melting process based on deep learning and multisource information fusion.</article-title> <source><italic>IEEE Trans. Ind. Electron.</italic></source> <volume>69</volume> <fpage>3017</fpage>&#x2013;<lpage>3026</lpage>. <pub-id pub-id-type="doi">10.1109/TIE.2021.3070512</pub-id></citation></ref>
<ref id="B52"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>X.</given-names></name> <name><surname>Liang</surname> <given-names>W.</given-names></name> <name><surname>Shimizu</surname> <given-names>S.</given-names></name> <name><surname>Ma</surname> <given-names>J.</given-names></name> <name><surname>Jin</surname> <given-names>Q.</given-names></name></person-group> (<year>2021</year>). <article-title>Siamese neural network based few-shot learning for anomaly detection in industrial cyber-physical systems.</article-title> <source><italic>IEEE Trans. Ind. Info.</italic></source> <volume>17</volume> <fpage>5790</fpage>&#x2013;<lpage>5798</lpage>. <pub-id pub-id-type="doi">10.1109/tii.2020.3047675</pub-id></citation></ref>
</ref-list>
</back>
</article>
