<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" article-type="research-article" dtd-version="2.3" xml:lang="EN">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Plant Sci.</journal-id>
<journal-title>Frontiers in Plant Science</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Plant Sci.</abbrev-journal-title>
<issn pub-type="epub">1664-462X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpls.2024.1408047</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Plant Science</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Integrating multi-modal remote sensing, deep learning, and attention mechanisms for yield prediction in plant breeding experiments</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Aviles Toledo</surname>
<given-names>Claudia</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="author-notes" rid="fn001">
<sup>*</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2018636"/>
<role content-type="https://credit.niso.org/contributor-roles/conceptualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/formal-analysis/"/>
<role content-type="https://credit.niso.org/contributor-roles/methodology/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Crawford</surname>
<given-names>Melba M.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1040612"/>
<role content-type="https://credit.niso.org/contributor-roles/conceptualization/"/>
<role content-type="https://credit.niso.org/contributor-roles/formal-analysis/"/>
<role content-type="https://credit.niso.org/contributor-roles/methodology/"/>
<role content-type="https://credit.niso.org/contributor-roles/supervision/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Tuinstra</surname>
<given-names>Mitchell R.</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1113748"/>
<role content-type="https://credit.niso.org/contributor-roles/supervision/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Lyles School of Civil Engineering, Purdue University</institution>, <addr-line>West Lafayette, IN</addr-line>, <country>United States</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>Department of Agronomy, Purdue University</institution>, <addr-line>West Lafayette, IN</addr-line>, <country>United States</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>Edited by: Weicheng Xu, Guangdong Academy of Agricultural Sciences (GDAAS), China</p>
</fn>
<fn fn-type="edited-by">
<p>Reviewed by: Weiguang Yang, South China Agricultural University, China</p>
<p>Wang Lele, Henan Agricultural University, China</p>
</fn>
<fn fn-type="corresp" id="fn001">
<p>*Correspondence: Claudia Aviles Toledo, <email xlink:href="mailto:cavilest@purdue.edu">cavilest@purdue.edu</email>
</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>25</day>
<month>07</month>
<year>2024</year>
</pub-date>
<pub-date pub-type="collection">
<year>2024</year>
</pub-date>
<volume>15</volume>
<elocation-id>1408047</elocation-id>
<history>
<date date-type="received">
<day>27</day>
<month>03</month>
<year>2024</year>
</date>
<date date-type="accepted">
<day>04</day>
<month>07</month>
<year>2024</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2024 Aviles Toledo, Crawford and Tuinstra</copyright-statement>
<copyright-year>2024</copyright-year>
<copyright-holder>Aviles Toledo, Crawford and Tuinstra</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>In both plant breeding and crop management, interpretability plays a crucial role in instilling trust in AI-driven approaches and enabling the provision of actionable insights. The primary objective of this research is to explore and evaluate the potential contributions of deep learning network architectures that employ stacked LSTMs for end-of-season maize grain yield prediction. A secondary aim is to expand the capabilities of these networks by adapting them to better accommodate and leverage the multi-modality properties of remote sensing data. In this study, a multi-modal deep learning architecture that assimilates inputs from heterogeneous data streams, including high-resolution hyperspectral imagery, LiDAR point clouds, and environmental data, is proposed to forecast maize crop yields. The architecture includes attention mechanisms that assign varying levels of importance to different modalities and temporal features that reflect the dynamics of plant growth and environmental interactions. The interpretability of the attention weights is investigated in multi-modal networks that seek to both improve predictions and attribute crop yield outcomes to genetic and environmental variables. This approach also contributes to increased interpretability of the model's predictions. The temporal attention weight distributions highlighted relevant factors and critical growth stages that contribute to the predictions. The results of this study affirm that the attention weights are consistent with recognized biological growth stages, thereby substantiating the network's capability to learn biologically interpretable features. Accuracies of the model's predictions of yield ranged from 0.82 to 0.93 <italic>R<sup>2</sup>
<sub>ref</sub>
</italic> in this genetics-focused study, further highlighting the potential of attention-based models. Further, this research facilitates understanding of how multi-modality remote sensing aligns with the physiological stages of maize. The proposed architecture shows promise in improving predictions and offering interpretable insights into the factors affecting maize crop yields, while demonstrating the impact of data collection by different modalities through the growing season. By identifying relevant factors and critical growth stages, the model's attention weights provide valuable information that can be used in both plant breeding and crop management. The consistency of attention weights with biological growth stages reinforces the potential of deep learning networks in agricultural applications, particularly in leveraging remote sensing data for yield prediction. To the best of our knowledge, this is the first study that investigates the use of hyperspectral and LiDAR UAV time series data for explaining/interpreting plant growth stages within deep learning networks and forecasting plot-level maize grain yield using late fusion modalities with attention mechanisms.</p>
</abstract>
<kwd-group>
<kwd>hyperspectral</kwd>
<kwd>LiDAR</kwd>
<kwd>stacked LSTM</kwd>
<kwd>attention mechanisms</kwd>
<kwd>multi-modal networks</kwd>
<kwd>yield prediction</kwd>
<kwd>precision agriculture</kwd>
</kwd-group>
<counts>
<fig-count count="7"/>
<table-count count="4"/>
<equation-count count="2"/>
<ref-count count="67"/>
<page-count count="16"/>
<word-count count="7552"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-in-acceptance</meta-name>
<meta-value>Technical Advances in Plant Science</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec id="s1" sec-type="intro">
<label>1</label>
<title>Introduction</title>
<p>Plant breeding experiments play a critical role in the development of future generations of crops that can effectively respond to the increasing global food demand and the impact of climate change (<xref ref-type="bibr" rid="B44">Razzaq et&#xa0;al., 2021</xref>). Using advanced technologies, such as remote sensing (RS) and machine learning, plant breeders and researchers seek to make more informed decisions regarding their crops (<xref ref-type="bibr" rid="B2">Akhter and Sofi, 2022</xref>). By including genetic information and environmental inputs such as soil properties and weather patterns, predictive models can now forecast future yields and rank new hybrids with increased precision. Use of advanced predictive models has significantly altered the approach of researchers toward the development of new crop varieties in maize breeding experiments (<xref ref-type="bibr" rid="B64">Xu et&#xa0;al., 2022</xref>). These predictive tools significantly accelerate the breeding process, allowing researchers to focus their efforts on the most promising candidates, thus increasing the rate of development of high-yielding and resilient varieties. Improved prediction of end-of-season traits in the field can also allow preliminary selection of the most promising individuals based on RS plant phenotyping at early developmental stages (<xref ref-type="bibr" rid="B60">Wang et&#xa0;al., 2023</xref>).</p>
<p>Maize is an annual grass species, completing its life cycle within one growing season (<xref ref-type="bibr" rid="B12">Eckhoff et&#xa0;al., 2003</xref>). Using RS, it is possible to model development of maize through the growing season by acquiring data during different stages of physiological development, thereby creating a time series. High-throughput phenotyping provides capability for monitoring and assessing crop growth and plant attributes. The accessibility of these technologies has increased due to recent advances in sensors and platforms. This provides breeders with the opportunity to explore larger datasets to examine the relationships between genetics, environment, and management practices. The Genomes 2 Fields (G2F) project is a multi-university research initiative aimed at enhancing the productivity and sustainability of crops by integrating genomics and field-based breeding efforts (<xref ref-type="bibr" rid="B4">AlKhalifah et&#xa0;al., 2018</xref>). In these experiments, the use of doubled haploids is intended to serve as an effective means of accelerating the breeding process, validating genomic discoveries, enhancing specific traits, and conserving genetic diversity, thus contributing to the development of resilient and high-yielding crop varieties (<xref ref-type="bibr" rid="B4">AlKhalifah et&#xa0;al., 2018</xref>). However, the small number of replicates of the same doubled haploid hybrids leads to a restricted portrayal of the phenotypic traits of the variety when training models. This issue was resolved in this study by utilizing publicly available genetic data, clustering similar genetic groups of varieties, and implementing stratified sampling during the training process.</p>
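The genetic clustering and stratified sampling step described above can be sketched in a few lines. The following Python fragment is illustrative only: the plot identifiers, cluster assignments, and 20% test fraction are hypothetical placeholders rather than the actual experimental configuration.

```python
import random
from collections import defaultdict

def stratified_split(plots, cluster_of, test_frac=0.2, seed=0):
    """Split plot IDs into train/test sets so each genetic cluster is proportionally represented."""
    rng = random.Random(seed)
    by_cluster = defaultdict(list)
    for p in plots:
        by_cluster[cluster_of[p]].append(p)   # group plots by their genetic cluster
    train, test = [], []
    for members in by_cluster.values():
        rng.shuffle(members)
        n_test = max(1, round(test_frac * len(members)))
        test.extend(members[:n_test])         # hold out the same fraction of every cluster
        train.extend(members[n_test:])
    return train, test

# Hypothetical example: 100 plots assigned to 4 genetic clusters
plots = list(range(100))
cluster_of = {p: p % 4 for p in plots}
train, test = stratified_split(plots, cluster_of)
```

Sampling within each cluster, rather than over the pooled plots, prevents under-represented hybrid groups from being absent from either split.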
<p>Long Short-Term Memory (LSTM) networks, a form of recurrent neural network (RNN), have recently demonstrated efficacy in handling time series data, including in agronomic scenarios. Previous studies using this network architecture have achieved high accuracy in crop yield prediction (<xref ref-type="bibr" rid="B39">Masjedi et&#xa0;al., 2020</xref>; <xref ref-type="bibr" rid="B23">Khaki et&#xa0;al., 2021</xref>; <xref ref-type="bibr" rid="B60">Wang et&#xa0;al., 2023</xref>). Attention mechanisms have been investigated to enhance model accuracy and have also demonstrated their effectiveness in improving model interpretability (<xref ref-type="bibr" rid="B14">Gangopadhyay et&#xa0;al., 2020</xref>; <xref ref-type="bibr" rid="B9">Danilevicz et&#xa0;al., 2021</xref>; <xref ref-type="bibr" rid="B52">Tian et&#xa0;al., 2021</xref>; <xref ref-type="bibr" rid="B53">Toledo and Crawford, 2023</xref>). Modeling based on multi-modal RS data has also been studied, primarily exploring early fusion (<xref ref-type="bibr" rid="B39">Masjedi et&#xa0;al., 2020</xref>; <xref ref-type="bibr" rid="B59">Wang and Crawford, 2021</xref>). Early fusion involves combining different modalities at the beginning of the processing pipeline, i.e., as integrated, normalized inputs to a model (<xref ref-type="bibr" rid="B62">Wang et&#xa0;al., 2020a</xref>). However, these multi-modal RS-based LSTM prediction models have several drawbacks: i) they force simultaneous representation of both internal and external interactions among the modalities; ii) they offer reduced understanding and interpretability of the predicted results; iii) they are less able to explore the connection between physical growth stages and the time series being modeled.</p>
<p>In this study, three LSTM-based architectures (vanilla stacked LSTM, stacked LSTM with a temporal attention mechanism, and multi-modal attention networks) are investigated in plot-level end-of-season maize yield prediction experiments using multi-modal RS data and weather data. Time-step importance is first evaluated using the time domain attention weights for each modality to investigate the impact of each sensor-based input during the growing season. Based on sensitivity analysis of the time-steps provided by the temporal attention weights, multiple scenarios are explored, where different growth stages within each modality are considered. The scenarios are investigated in all three proposed architectures. Data from a two-year GxE experiment of doubled haploids using the same tester parent were used to evaluate the proposed objectives. The paper is organized as follows: Related Work provides a review of deep learning prediction models that have RS data inputs, emphasizing attention mechanisms and multi-modal networks; Materials and Methods includes a description of the study sites, datasets, and the methodology; Results are presented and discussed in Experimental Results; Conclusions and recommendations for future work are summarized in Conclusions and Discussion.</p>
</sec>
<sec id="s2">
<label>2</label>
<title>Related work</title>
<sec id="s2_1">
<label>2.1</label>
<title>Yield prediction models</title>
<p>Regression models based on inputs, including RS data, weather, soils, genetics, and management practices, have been widely investigated in agriculture for yield prediction. Early studies based on multiple regression were followed by classical machine learning approaches, including support vector regression (SVR), partial least squares regression (PLSR), and random forests (RF), to predict grain yield and biomass (<xref ref-type="bibr" rid="B50">Sujatha and Isakki, 2016</xref>; <xref ref-type="bibr" rid="B39">Masjedi et&#xa0;al., 2020</xref>). (<xref ref-type="bibr" rid="B1">Aghighi et&#xa0;al., 2018</xref>) incorporated RS time series data into classical machine learning models, such as boosted regression trees (BRT), RF, and SVR. Traditional machine learning models are difficult to generalize to scenarios outside the domain of the training data. Furthermore, these models lack the ability to effectively leverage inputs from time series data across multiple modalities or differentiate between categorical variables and time series data, such as environmental conditions (soils), management practices, or genetic inputs. Finally, they do not incorporate time series data into a step-wise framework, which is crucial for simulating the growing season and comprehending prediction outcomes. More recently, yield prediction models have been developed using deep learning architectures, particularly at large spatial scales (<xref ref-type="bibr" rid="B37">Maimaitijiang et&#xa0;al., 2020</xref>; <xref ref-type="bibr" rid="B23">Khaki et&#xa0;al., 2021</xref>; <xref ref-type="bibr" rid="B49">Shook et&#xa0;al., 2021</xref>). These architectures have recently been investigated to predict yield using RS data as inputs (<xref ref-type="bibr" rid="B66">You et&#xa0;al., 2017</xref>; <xref ref-type="bibr" rid="B61">Wang et&#xa0;al., 2020c</xref>). 
(<xref ref-type="bibr" rid="B22">Jiang et&#xa0;al., 2020</xref>) developed an LSTM framework using MODIS remote sensing products to predict county-level yields. At the research plot scale, (<xref ref-type="bibr" rid="B38">Masjedi et&#xa0;al., 2019</xref>; <xref ref-type="bibr" rid="B58">Wan et&#xa0;al., 2020</xref>; <xref ref-type="bibr" rid="B59">Wang and Crawford, 2021</xref>) used RS data acquired by UAV platforms to predict yields in sorghum and maize. (<xref ref-type="bibr" rid="B59">Wang and Crawford, 2021</xref>) extended this work to investigate transfer learning of models to other locations and time periods. Despite these advances, further improvement is needed in predictive models to leverage multi-modal RS data and, most importantly, to achieve interpretability in the predicted outcomes.</p>
<sec id="s2_1_1">
<label>2.1.1</label>
<title>LSTM-based yield prediction models</title>
<p>The application of recurrent neural networks (RNNs) has led to the emergence of robust learning models capable of deriving meaningful patterns from complex or abstract features in the inputs (<xref ref-type="bibr" rid="B35">Lipton et&#xa0;al., 2015</xref>). As with many neural network architectures, the relationship between features is established through multi-level hierarchical representations, which enables them to extract features and learn from the datasets (<xref ref-type="bibr" rid="B29">LeCun et&#xa0;al., 2015</xref>). Long short-term memory (LSTM)-based networks were developed to address the well-known vanishing gradient problem of RNNs (<xref ref-type="bibr" rid="B13">Gamboa, 2017</xref>). The core architecture of these networks includes memory cells, referred to as LSTM cells, whose purpose is to store the data and update it through forget, input, and output gates (<xref ref-type="bibr" rid="B26">Kong et&#xa0;al., 2019</xref>). LSTMs have become popular in long-term time series prediction tasks, such as crop yield forecasting. The majority of studies concentrate on yield predictions at the county or regional level. (<xref ref-type="bibr" rid="B52">Tian et&#xa0;al., 2021</xref>; <xref ref-type="bibr" rid="B8">Chen et&#xa0;al., 2023</xref>) used time-accumulated satellite time series data with an LSTM model for wheat yield predictions. (<xref ref-type="bibr" rid="B51">Sun et&#xa0;al., 2019</xref>) examined the performance of CNN, LSTM, and CNN-LSTM architectures for predicting soybean yield at the county level. Small-scale experiments with high-resolution data have also demonstrated the advantages of LSTM models. (<xref ref-type="bibr" rid="B38">Masjedi et&#xa0;al., 2019</xref>) studied an LSTM-based RNN model using multi-temporal RS data to predict fresh sorghum biomass. 
(<xref ref-type="bibr" rid="B47">Shen et&#xa0;al., 2022</xref>) utilized both LSTM and LSTM-RF architectures with UAV thermal and multispectral imagery to forecast wheat yield at the plot level. Although all these studies produced results with R<sup>2</sup> values ranging from 0.60 to 0.94, they lack interpretability regarding the growing season.</p>
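As a concrete illustration of the gated memory-cell update described in this section, the following NumPy sketch performs one LSTM time-step with explicit forget, input, and output gates; the feature dimensions and random weights are hypothetical, chosen only to make the example runnable.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time-step; the four gates are slices of a fused linear projection."""
    n = h_prev.shape[0]
    z = W @ x + U @ h_prev + b       # pre-activations for all gates, shape (4n,)
    f = sigmoid(z[0:n])              # forget gate: what to discard from the cell state
    i = sigmoid(z[n:2 * n])          # input gate: what new information to store
    g = np.tanh(z[2 * n:3 * n])      # candidate cell update
    o = sigmoid(z[3 * n:4 * n])      # output gate: what to expose as the hidden state
    c = f * c_prev + i * g           # long-term (cell) state
    h = o * np.tanh(c)               # short-term (hidden) state
    return h, c

# Hypothetical sizes: 3 input features per date, hidden size 2, 5 acquisition dates
rng = np.random.default_rng(0)
n_in, n_hid = 3, 2
W = rng.normal(size=(4 * n_hid, n_in))
U = rng.normal(size=(4 * n_hid, n_hid))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for t in range(5):                   # unroll the cell over the time series
    h, c = lstm_step(rng.normal(size=n_in), h, c, W, U, b)
```

The explicit cell state `c`, carried across iterations and modified only multiplicatively and additively, is what mitigates the vanishing gradient problem of plain RNNs.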
</sec>
<sec id="s2_1_2">
<label>2.1.2</label>
<title>Attention mechanisms</title>
<p>Attention mechanisms were first introduced to address the problem of information overload in computer vision (<xref ref-type="bibr" rid="B20">Itti et&#xa0;al., 1998</xref>). In image classification, attention mechanisms were incorporated in a neural network to extract information from an image by adaptively selecting a sequence of spatial regions and focusing on these regions at high resolution (<xref ref-type="bibr" rid="B42">Mnih et&#xa0;al., 2014</xref>). Attention models were introduced to machine translation tasks by (<xref ref-type="bibr" rid="B5">Bahdanau et&#xa0;al., 2016</xref>) to distribute information from the source sentence across the full sequence, rather than encoding all the information into a fixed-length vector through the encoder. Attention mechanisms have commonly been categorized as spectral/channel, spatial, and temporal attention (<xref ref-type="bibr" rid="B17">Guo et&#xa0;al., 2022</xref>). The concept of spectral attention focuses on recalibration of channel weights and their interrelationships, thereby enhancing their representation (<xref ref-type="bibr" rid="B18">Hu et&#xa0;al., 2018</xref>). Given the high dimensionality and redundancy in adjacent spectral bands, spectral attention is commonly employed in hyperspectral image classification. The concept of temporal attention mechanisms originated in video processing, providing a dynamic method of determining &#x201c;when and where&#x201d; attention should be directed (<xref ref-type="bibr" rid="B32">Li et&#xa0;al., 2020</xref>). In time series sequences, the decoder can selectively retrieve the focused sequence at each time-step. Temporal attention mechanisms seek to localize important parts of the input features in the time dimension through attention weights from earlier time-steps. Attention weights represent a distribution over input features, providing a tool for interpretation (<xref ref-type="bibr" rid="B46">Serrano and Smith, 2019</xref>). 
Temporal attention enhances the inherent capability of LSTM cells to capture long-term dependencies by identifying the time-steps relevant to the prediction and extracting the information from those time-steps (<xref ref-type="bibr" rid="B48">Shih et&#xa0;al., 2019</xref>). Some recent studies have investigated attention networks for yield prediction in multiple crops using multispectral satellite data, focusing on the environmental component of GxE (<xref ref-type="bibr" rid="B24">Khaki and Wang, 2019</xref>; <xref ref-type="bibr" rid="B14">Gangopadhyay et&#xa0;al., 2020</xref>; <xref ref-type="bibr" rid="B49">Shook et&#xa0;al., 2021</xref>). Although integration of medium-resolution multispectral data for county-level yield predictions has been studied (<xref ref-type="bibr" rid="B66">You et&#xa0;al., 2017</xref>; <xref ref-type="bibr" rid="B58">Wan et&#xa0;al., 2020</xref>), the use of temporal attention mechanisms in conjunction with LSTMs and high-resolution UAV inputs for small-plot breeding trials has not, to the best of our knowledge, been previously explored.</p>
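A minimal sketch of such a temporal attention layer, operating on the hidden states an LSTM produces for each acquisition date, is given below; the scoring vector and dimensions are hypothetical stand-ins for learned parameters.

```python
import numpy as np

def temporal_attention(H, v):
    """Score each time-step, softmax the scores into weights, and pool a context vector.

    H : (T, n) hidden states over T acquisition dates
    v : (n,) scoring vector (learned in practice; fixed here for illustration)
    """
    scores = H @ v                       # relevance score per time-step, shape (T,)
    w = np.exp(scores - scores.max())    # numerically stable softmax
    w = w / w.sum()                      # attention weights: a distribution over dates
    context = w @ H                      # weighted sum passed to the downstream regressor
    return context, w

rng = np.random.default_rng(1)
H = rng.normal(size=(8, 4))              # e.g. 8 UAV flight dates, hidden size 4
v = rng.normal(size=4)
context, weights = temporal_attention(H, v)
```

Because the weights form a probability distribution over dates, inspecting them indicates which growth stages the model relied on for a given prediction.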
</sec>
<sec id="s2_1_3">
<label>2.1.3</label>
<title>Multi-modal deep learning</title>
<p>Multi-modal deep learning involves training deep neural networks to extract and learn features from multiple types of data. The core concept of multi-modal feature learning is that including multiple data modalities enables more effective learning of one modality compared to in-depth feature extraction from a single modality. For example, voice- and text-based fusion has been investigated to improve emotion recognition (Wang et&#xa0;al., 2020a; <xref ref-type="bibr" rid="B36">Liu et&#xa0;al., 2021</xref>). Recently, in time series prediction, (<xref ref-type="bibr" rid="B63">Xian and Liang, 2022</xref>) used holiday, weather, and quarterly market operation report data in multi-modal networks to predict traffic conditions. Development of maize crops is strongly influenced by both genetic and environmental factors, motivating their inclusion in models when the data are available. Several studies have employed multiple types of RS data as input for DL models to predict crop yield (<xref ref-type="bibr" rid="B9">Danilevicz et&#xa0;al., 2021</xref>; <xref ref-type="bibr" rid="B49">Shook et&#xa0;al., 2021</xref>; <xref ref-type="bibr" rid="B59">Wang and Crawford, 2021</xref>; <xref ref-type="bibr" rid="B54">Toledo et&#xa0;al., 2022</xref>). Multi-modal deep learning models characterize and learn from different sources of input data. For example, LiDAR represents structural attributes of maize through the growing season, while hyperspectral data are related to chemistry-related responses of the plant. The information represented by different modalities can potentially be leveraged in a combined model for a better representation of the task at hand (<xref ref-type="bibr" rid="B27">Kumar et&#xa0;al., 2022</xref>). (<xref ref-type="bibr" rid="B37">Maimaitijiang et&#xa0;al., 2020</xref>) integrated canopy structure, temperature, and texture, training each modality individually with multiple CNN layers and applying late fusion for the yield prediction. 
They also tested input-level feature fusion incorporating multiple CNN layers, but the approach underperformed in comparison to late fusion.</p>
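The late-fusion pattern discussed above can be summarized schematically: each modality is encoded by its own branch, and the embeddings are concatenated before a final regression head. In the sketch below, a single tanh layer per branch is a deliberate simplification (standing in for stacked LSTMs or CNNs), and all sizes and weights are hypothetical.

```python
import numpy as np

def branch(x, W):
    """One modality-specific encoder; a single tanh layer stands in for a deeper branch."""
    return np.tanh(W @ x)

def late_fusion_predict(hyper, lidar, weather, params):
    """Encode each modality separately, concatenate the embeddings, and regress yield."""
    e = np.concatenate([
        branch(hyper, params["W_hyp"]),
        branch(lidar, params["W_lid"]),
        branch(weather, params["W_wea"]),
    ])
    return float(params["w_out"] @ e + params["b_out"])  # scalar plot-level yield

rng = np.random.default_rng(2)
params = {
    "W_hyp": rng.normal(size=(4, 10)),   # 10 hyperspectral features -> 4-dim embedding
    "W_lid": rng.normal(size=(4, 3)),    # 3 LiDAR structure features
    "W_wea": rng.normal(size=(4, 5)),    # 5 weather features
    "w_out": rng.normal(size=12),        # fused embedding -> yield
    "b_out": 0.0,
}
y_hat = late_fusion_predict(rng.normal(size=10), rng.normal(size=3), rng.normal(size=5), params)
```

Keeping the branches separate until the final layers is what allows modality-specific attention weights, and hence per-sensor interpretation, in the architectures studied here.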
</sec>
</sec>
<sec id="s2_2">
<label>2.2</label>
<title>Multi-modal remote sensing</title>
<p>In agriculture, multiple RS technologies, including RGB, multi/hyperspectral, thermal cameras, and LiDAR (<xref ref-type="bibr" rid="B3">Ali et&#xa0;al., 2022</xref>), deployed on airborne and space-based platforms, have been used to assess crop properties. (<xref ref-type="bibr" rid="B67">Zhang et&#xa0;al., 2020</xref>) integrated optical, thermal, and environmental data to predict county-level maize yield, and focused on demonstrating that combining multi-modal, multi-source data explained the variation in yield. Hyperspectral sensors provide high spectral and spatial resolution data when flown on UAV platforms (<xref ref-type="bibr" rid="B30">Li et&#xa0;al., 2022</xref>). Many bands of the continuous, contiguous spectral data are highly correlated, motivating feature extraction and feature selection. In vegetation-related studies, spectral indices are commonly used to represent important chemistry-based absorption features (<xref ref-type="bibr" rid="B21">Jain et&#xa0;al., 2007</xref>). Derivative and integral characteristics of hyperspectral cubes also represent important spectral changes in reflectance that can characterize the crop canopies (<xref ref-type="bibr" rid="B40">Masjedi et&#xa0;al., 2018</xref>). Many indices are also highly correlated, resulting in redundancy that may either lead to overfitting or weaken the predictive capability of deep learning models (<xref ref-type="bibr" rid="B29">LeCun et&#xa0;al., 2015</xref>). LiDAR point clouds provide geometric characteristics of the plants such as plant height, canopy cover, and canopy volume. Because of the large number of candidate features from hyperspectral and LiDAR data, (<xref ref-type="bibr" rid="B54">Toledo et&#xa0;al., 2022</xref>) investigated DeepSHAP, which uses Shapley values to quantify the contribution of each feature in a prediction made by a deep learning model.</p>
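As an example of the spectral index computation described above, the sketch below derives a normalized difference vegetation index (NDVI) from a single hyperspectral pixel by selecting the bands nearest two target wavelengths; the 270-band 400-1000 nm grid and the synthetic spectrum are illustrative assumptions.

```python
import numpy as np

def ndvi(reflectance, wavelengths, red_nm=670.0, nir_nm=800.0):
    """NDVI from a hyperspectral pixel using the bands nearest the red and NIR targets."""
    red = reflectance[np.argmin(np.abs(wavelengths - red_nm))]
    nir = reflectance[np.argmin(np.abs(wavelengths - nir_nm))]
    return (nir - red) / (nir + red)

# Hypothetical 270-band VNIR grid spanning 400-1000 nm
wl = np.linspace(400.0, 1000.0, 270)
# Synthetic healthy-canopy spectrum: low red reflectance, high NIR plateau
refl = np.where(wl < 700.0, 0.05, 0.45)
value = ndvi(refl, wl)
```

The same nearest-band pattern generalizes to other narrow-band indices; with hundreds of candidate bands and index definitions, many such features are highly correlated, which is what motivates the feature selection discussed in this section.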
</sec>
</sec>
<sec id="s3" sec-type="materials|methods">
<label>3</label>
<title>Materials and methods</title>
<sec id="s3_1">
<label>3.1</label>
<title>Plant breeding field experiments</title>
<p>The experiments reported in this study were conducted in Indiana, USA. The experiments were planted in different fields in 2020 and 2021 at the Agronomy Center for Research and Education (ACRE) at Purdue University (40&#xb0;28&#x2019;37.18&#x201d;N, 86&#xb0;59&#x2019;22.67&#x201d;W), West Lafayette. Both were planted in a randomized incomplete block design with two replications. The core check hybrids had two complete replications, and the doubled haploid hybrids based on the PHK76 tester had an incomplete block design.</p>
<p>The experiments were planted as two-row plots, 4.575 m long by 1.5 m wide, with ~76 cm row spacing. Standard nutrients, herbicides, and insecticides were applied according to normal agronomic management practices at the beginning of the season, and there was no artificial irrigation. Both fields were planted in an annual crop rotation with soybeans. They were planted on May 12, 2020, and May 24, 2021, at a population of 74,000 seeds ha<sup>-1</sup>. Anhydrous ammonia (NH<sub>3</sub>) was applied prior to planting in 2020 and liquid Urea Ammonium Nitrate solution (UAN) was applied in 2021. <xref ref-type="fig" rid="f1">
<bold>Figure&#xa0;1</bold>
</xref> shows the geographic location and layout of the field experiments. The grain yield was harvested from both rows on October 1, 2020, and September 28, 2021, using a Kincaid plot combine (Kincaid 8-XP, Haven, KS, USA) with grain yields adjusted to 15% moisture.</p>
<fig id="f1" position="float">
<label>Figure&#xa0;1</label>
<caption>
<p>
<bold>(A)</bold> Geographic location of maize experiments at Purdue University&#x2019;s Agronomy Center for Research and Education. <bold>(B)</bold> Experimental plot layouts for GxE plant breeding experiments in 2020 and 2021. Check plots indicated in red.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-15-1408047-g001.tif"/>
</fig>
</sec>
<sec id="s3_2">
<label>3.2</label>
<title>Genetics and ground reference data</title>
<p>Because the G2F Initiative covers multiple environments and geographic locations, local core check hybrids are used as standards against which the performance of new breeding lines or varieties is compared. By evaluating the performance of new varieties relative to local checks, breeders can evaluate traits such as yield potential, disease resistance, and overall agronomic suitability (<xref ref-type="bibr" rid="B56">Ullah et&#xa0;al., 2017</xref>). Using local checks in multiple environments helps validate the data collected from experimental trials. The consistent performance of local checks in different environments instills confidence in the experimental setup, thus ensuring that the observed performance differences among the new breeding lines are significant and not influenced only by variation in environmental conditions. Because this study focuses on evaluating the performance of genetic variations of doubled haploids with the tester, local checks introduce an imbalance in terms of their genetic variation. The G2F provides a public genotypic dataset in which inbred parents of the hybrids tested were genotyped using the Practical Haplotype Graph (PHG) (<xref ref-type="bibr" rid="B15">Genomes To Fields, 2023</xref>). Quality control on the initial raw genotypic dataset of the inbred lines used in this study was performed as described in (<xref ref-type="bibr" rid="B55">Tolley et&#xa0;al., 2023</xref>). The resulting genetic marker matrix had 142,568 genetic markers from 401 varieties (including local checks). The dimensionality of the genetic data was reduced via principal components (PCs) computed from the original genetic marker data. A scree plot, which displays the explained variance of the individual principal components, was employed in conjunction with the elbow method to determine the appropriate number of principal components to utilize (<xref ref-type="bibr" rid="B7">Cattell, 1966</xref>). 
In the preliminary stage, the contributions of twenty PCs were computed. As evidenced in <xref ref-type="supplementary-material" rid="SF1">
<bold>Supplementary Figure&#xa0;1</bold>
</xref>, the elbow method indicates that six PCs are the appropriate number for representing the genetic variation. Two genetic clusters were identified based on the first three PCs, as shown in <xref ref-type="fig" rid="f2">
<bold>Figure&#xa0;2</bold>
</xref>, one associated with the DH hybrids and the other with the core check hybrids. The under-representation of local checks, which comprise less than ~5% of the hybrids, can decrease yield prediction accuracy because the model has few such samples to learn from. Because the objective is to evaluate the performance of genetic variations in doubled haploids crossed with the tester, local checks were excluded from the training and testing datasets. The ground reference grain yield data provide additional support for this decision, as shown in <xref ref-type="fig" rid="f2">
<bold>Figure&#xa0;2</bold>
</xref>.</p>
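<p>The scree-plot-and-elbow procedure described above can be sketched as follows. The marker matrix here is a small synthetic stand-in (the real matrix is 401 varieties &#xd7; 142,568 markers), and the distance-to-chord elbow rule is one common automation of the visual test, not necessarily the exact procedure used by the authors.</p>

```python
import numpy as np

def pca_explained_variance(X, n_components=20):
    """Explained-variance ratios of the leading principal components.
    X: (varieties x markers) matrix."""
    Xc = X - X.mean(axis=0)                    # center each marker
    s = np.linalg.svd(Xc, compute_uv=False)    # singular values, descending
    var = s ** 2 / (X.shape[0] - 1)            # PC variances
    return (var / var.sum())[:n_components]

def elbow_index(ratios):
    """Elbow as the scree point farthest from the chord joining the
    first and last points (a common automation of the visual test)."""
    y = np.asarray(ratios, dtype=float)
    x = np.linspace(0.0, 1.0, len(y))
    y = (y - y[-1]) / (y[0] - y[-1])           # rescale curve to [0, 1]
    d = np.abs(x + y - 1.0) / np.sqrt(2.0)     # distance to chord x + y = 1
    return int(np.argmax(d)) + 1               # 1-based number of PCs

rng = np.random.default_rng(0)
# Synthetic stand-in: 401 varieties, 500 markers with decaying variance
X = rng.standard_normal((401, 500)) * np.linspace(3.0, 1.0, 500)
ratios = pca_explained_variance(X)
print(elbow_index(ratios))
```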
<fig id="f2" position="float">
<label>Figure&#xa0;2</label>
<caption>
<p>
<bold>(A)</bold> Genetic variation based on PCA. <bold>(B)</bold> Ground reference data with and without check data.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-15-1408047-g002.tif"/>
</fig>
</sec>
<sec id="s3_3">
<label>3.3</label>
<title>Remote sensing data</title>
<p>The RS data were collected throughout the growing season in each study year; every data collection was conducted in cloud-free conditions with calm winds. Data were acquired using a DJI multi-rotor Matrice 600 Pro (<xref ref-type="fig" rid="f3">
<bold>Figure&#xa0;3</bold>
</xref>), equipped with an Applanix APX-15v3 GNSS/IMU for accurate geo-referencing. An integrated sensor package comprising three sensors was installed: (a) a Nano-Hyperspec<sup>&#xae;</sup> VNIR camera (Headwall Photonics Inc., Bolton, MA) with a spectral range of 400&#x2013;1000 nm (270 spectral bands at 2.2 nm/band) and 640 spatial channels at 7.4 &#x3bc;m/pixel, flown at 44 m to achieve 4 cm spatial resolution in the final orthorectified cubes; (b) a Velodyne VLP-16 Lite LiDAR sensor; and (c) a Sony Alpha 7RIII high-resolution RGB camera. Rigorous system calibration was performed to estimate camera distortion and the relevant rotation angles and lever arms of the pushbroom sensor (<xref ref-type="bibr" rid="B16">Gharibi and Habib, 2018</xref>; <xref ref-type="bibr" rid="B28">LaForest et&#xa0;al., 2019</xref>). The specifications of the remote sensing data collection and its products are presented in <xref ref-type="table" rid="T1">
<bold>Table 1</bold>
</xref>.</p>
<fig id="f3" position="float">
<label>Figure&#xa0;3</label>
<caption>
<p>UAV platform with APX <bold>(A)</bold>, RGB <bold>(B)</bold>, LiDAR <bold>(C)</bold> and Hyperspectral <bold>(D)</bold> sensors.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-15-1408047-g003.tif"/>
</fig>
<table-wrap id="T1" position="float">
<label>Table&#xa0;1</label>
<caption>
<p>Data collection specifications.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="top" colspan="2" align="center">Specifications</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Flying Height and Speed</td>
<td valign="top" align="left">44 m, 4.1 m/sec</td>
</tr>
<tr>
<td valign="top" align="left">Hyperspectral Ortho mosaic</td>
<td valign="top" align="left">4 cm (GSD)<break/>4.4 nm spectral resolution (400&#x2013;1000 nm)</td>
</tr>
<tr>
<td valign="top" align="left">RGB Orthophoto</td>
<td valign="top" align="left">0.5 cm</td>
</tr>
<tr>
<td valign="top" align="left">LiDAR Point Cloud and DSM</td>
<td valign="top" align="left">8 cm DSM</td>
</tr>
</tbody>
</table>
</table-wrap>
<sec id="s3_3_1">
<label>3.3.1</label>
<title>Hyperspectral data</title>
<p>A dark current spectral response was collected at the beginning of each flight to allow conversion of raw digital numbers (DNs) to radiance using the absolute radiometric coefficients provided by the camera manufacturer. Three calibrated spectral targets (11%, 30%, and 56% reflectance) were deployed for each UAV flight and used to convert radiance to reflectance values using the empirical line method. The hyperspectral imagery was orthorectified using the DSM derived from the LiDAR-based georeferenced point clouds (<xref ref-type="bibr" rid="B34">Lin and Habib, 2021</xref>) and the methodology described in <xref ref-type="bibr" rid="B16">Gharibi and Habib (2018)</xref>. Non-vegetation pixels were removed using the OSAVI value as a threshold (<xref ref-type="bibr" rid="B31">Li et&#xa0;al., 2016</xref>). The GxE experiment in 2020 is illustrated in <xref ref-type="supplementary-material" rid="SF2">
<bold>Supplementary Figure&#xa0;2</bold>
</xref>, which includes the hyperspectral orthomosaic, OSAVI image, and non-vegetation pixel mask during flowering time.</p>
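<p>A minimal sketch of the empirical line step: a per-band linear fit maps radiance to the known reflectances of the three calibrated panels, and an OSAVI threshold then masks non-vegetation pixels. The panel radiances, the single test pixel, and the 0.2 threshold below are illustrative assumptions, not values from the paper.</p>

```python
import numpy as np

def empirical_line(radiance, panel_radiance, panel_reflectance):
    """Per-band linear mapping radiance -> reflectance, fitted on the
    calibrated panels (empirical line method).
    radiance:          (rows, cols, bands) cube
    panel_radiance:    (n_panels, bands) mean radiance over each panel
    panel_reflectance: (n_panels,) known reflectances, e.g. 0.11/0.30/0.56"""
    reflectance = np.empty(radiance.shape, dtype=float)
    for b in range(radiance.shape[-1]):
        gain, offset = np.polyfit(panel_radiance[:, b], panel_reflectance, 1)
        reflectance[..., b] = gain * radiance[..., b] + offset
    return reflectance

def osavi(nir, red):
    """Optimized Soil-Adjusted Vegetation Index."""
    return (nir - red) / (nir + red + 0.16)

# Illustrative, perfectly linear sensor: panel radiances for two bands
panel_rad = np.array([[10.0, 22.0], [29.0, 60.0], [55.0, 112.0]])
panel_ref = np.array([0.11, 0.30, 0.56])
cube = np.array([[[29.0, 60.0]]])                    # one pixel, two bands
refl = empirical_line(cube, panel_rad, panel_ref)
print(refl)                                          # pixel matches the 30% panel
veg_mask = osavi(refl[..., 1], refl[..., 0]) > 0.2   # gray panel -> not vegetation
```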
</sec>
<sec id="s3_3_2">
<label>3.3.2</label>
<title>LiDAR and RGB data</title>
<p>The LiDAR point clouds were processed using the estimated mounting parameters aided by the GNSS/INS trajectory (<xref ref-type="bibr" rid="B33">Lin et&#xa0;al., 2019</xref>). For this study, the high-resolution RGB orthophotos were used to extract the row/plot boundaries of each experiment, using the method described in <xref ref-type="bibr" rid="B65">Yang et&#xa0;al. (2021)</xref>. An example of the reconstructed LiDAR point cloud during flowering time is shown in <xref ref-type="supplementary-material" rid="SF3">
<bold>Supplementary Figure&#xa0;3</bold>
</xref>.</p>
</sec>
<sec id="s3_3_3">
<label>3.3.3</label>
<title>Dates for analysis</title>
<p>Data were collected throughout the growing season, targeting weekly acquisitions. During model development, dates were selected to capture the essential temporal dynamics that affect crop yield. These dates and the corresponding physiological stages of the plants are summarized in <xref ref-type="table" rid="T2">
<bold>Table&#xa0;2</bold>
</xref>.</p>
<table-wrap id="T2" position="float">
<label>Table&#xa0;2</label>
<caption>
<p>Dates of remote sensing data acquisition in 2020 and 2021.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="middle" rowspan="2" align="center">Data Type</th>
<th valign="middle" rowspan="2" align="center">Vegetative<break/>Stage</th>
<th valign="middle" colspan="2" align="center">Field 54 2020</th>
<th valign="middle" colspan="2" align="center">Field 42 2021</th>
</tr>
<tr>
<th valign="middle" align="center">Experiment Dates</th>
<th valign="top" align="center">Growing Degree Days</th>
<th valign="middle" align="center">Experiment Dates</th>
<th valign="top" align="center">Growing Degree Days</th>
</tr>
</thead>
<tbody>
<tr>
<td valign="middle" align="center">LiDAR &amp;<break/>Hyperspectral</td>
<td valign="top" align="center">V8<break/>V12<break/>VT-R1<break/>R1<break/>R2<break/>R4<break/>R5</td>
<td valign="middle" align="center">June 17<sup>th</sup>
<break/>July 2<sup>nd</sup>
<break/>July 17<sup>th</sup>
<break/>July 28<sup>th</sup>
<break/>August 6<sup>th</sup>
<break/>August 13<sup>th</sup>
<break/>September 5<sup>th</sup>
</td>
<td valign="top" align="center">499<break/>857<break/>1222<break/>1494<break/>1660<break/>1803<break/>2288</td>
<td valign="top" align="center">June 17<sup>th</sup>
<break/>July 3<sup>rd</sup>
<break/>July 19<sup>th</sup>
<break/>July 27<sup>th</sup>
<break/>August 8<sup>th</sup>
<break/>August 16<sup>th</sup>
<break/>September 6<sup>th</sup>
</td>
<td valign="top" align="center">703<break/>1067<break/>1444<break/>1644<break/>1895<break/>2084<break/>2589</td>
</tr>
</tbody>
</table>
</table-wrap>
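<p>The growing degree days in Table 2 accumulate daily thermal time over the season. A standard maize formulation is sketched below; the 10 &#xb0;C base and 30 &#xb0;C cap are the conventional maize thresholds and are assumed here, since the paper does not state which values were used.</p>

```python
def daily_gdd(tmax_c, tmin_c, base=10.0, cap=30.0):
    """Daily maize growing degree days (Celsius basis):
    Tmax is capped and Tmin floored before averaging."""
    tmax = min(max(tmax_c, base), cap)
    tmin = min(max(tmin_c, base), cap)
    return (tmax + tmin) / 2.0 - base

def cumulative_gdd(daily_temps):
    """Accumulate GDD over a season from (tmax, tmin) pairs."""
    total, series = 0.0, []
    for tmax, tmin in daily_temps:
        total += daily_gdd(tmax, tmin)
        series.append(total)
    return series

# Three illustrative days; day 2's Tmax of 33 C is capped at 30 C
season = [(28.0, 16.0), (33.0, 20.0), (24.0, 12.0)]
print(cumulative_gdd(season))   # [12.0, 27.0, 35.0]
```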
</sec>
<sec id="s3_3_4">
<label>3.3.4</label>
<title>Feature extraction and feature selection</title>
<p>In each experiment, 40 cm was trimmed from each end of the rows to minimize the effects of human interaction, light differences, and treatments in neighboring plots. Both rows were used for the extraction of LiDAR and hyperspectral features, which were subsequently averaged to derive the plot-based value. Initial candidate spectral features, including vegetation indices, integration features, and derivative-based features, were investigated. The candidate LiDAR features comprised different percentiles of height, LiDAR canopy cover, volume, and plot-based height statistical features (<xref ref-type="bibr" rid="B39">Masjedi et&#xa0;al., 2020</xref>). In accordance with Section 2.2, feature selection was conducted using the DeepSHAP methodology (<xref ref-type="bibr" rid="B54">Toledo et&#xa0;al., 2022</xref>). Nine hyperspectral features and seven LiDAR features were chosen as the remote sensing inputs for the time series analysis in both years. The detailed descriptions of these features are included in <xref ref-type="table" rid="T3">
<bold>Table&#xa0;3</bold>
</xref>.</p>
<table-wrap id="T3" position="float">
<label>Table&#xa0;3</label>
<caption>
<p>Remote sensing input features for time series analysis in models.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="middle" align="center">Feature</th>
<th valign="top" align="center">Equation</th>
<th valign="middle" align="center">Explanation</th>
</tr>
</thead>
<tbody>
<tr>
<th valign="top" colspan="3" align="center">Hyperspectral</th>
</tr>
<tr>
<td valign="middle" align="left">Integration features of bands in the 670-780 nm bands</td>
<td valign="middle" rowspan="3" align="center">
<italic>Intg</italic>(<italic>&#x3bb;<sub>a</sub>,&#x3bb;<sub>b</sub>
</italic>) = <inline-formula>
<mml:math display="inline" id="im1">
<mml:mrow>
<mml:msubsup>
<mml:mo>&#x222b;</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext>&#x3bb;</mml:mtext>
<mml:mi>a</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mtext>&#x3bb;</mml:mtext>
<mml:mtext>b</mml:mtext>
</mml:msub>
</mml:mrow>
</mml:msubsup>
<mml:mi>S</mml:mi>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>&#x3bb;</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
<mml:mi>d</mml:mi>
<mml:mi>&#x3bb;</mml:mi>
</mml:mrow>
</mml:math>
</inline-formula> = area under the spectral curve for a given range [<italic>&#x3bb;<sub>a</sub>
</italic>, <italic>&#x3bb;<sub>b</sub>
</italic>]; where <italic>S</italic>(<italic>&#x3bb;</italic>) is the reflectance at wavelength <italic>&#x3bb;</italic> (nm).</td>
<td valign="middle" rowspan="3" align="center">Related to the increase in the NIR signature in the early season followed by a reduction after the maximum value, typically at flowering</td>
</tr>
<tr>
<td valign="middle" align="left">Integration features of bands in the 910-1000 nm bands</td>
</tr>
<tr>
<td valign="middle" align="left">Integration of the first derivative of the NIR</td>
</tr>
<tr>
<td valign="middle" align="left">VOG3 (<xref ref-type="bibr" rid="B57">Vogelmann et&#xa0;al., 1993</xref>)</td>
<td valign="top" align="center">
<inline-formula>
<mml:math display="inline" id="im2">
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>734</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>747</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>715</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>720</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td valign="middle" rowspan="3" align="center">Chlorophyll-related indices</td>
</tr>
<tr>
<td valign="middle" align="left">NDRE (<xref ref-type="bibr" rid="B6">Barnes et&#xa0;al., 2000</xref>)</td>
<td valign="top" align="center">
<inline-formula>
<mml:math display="inline" id="im3">
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>790</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>720</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>790</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>720</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
</tr>
<tr>
<td valign="middle" align="left">MCARI2 (<xref ref-type="bibr" rid="B11">Daughtry, 2000</xref>)</td>
<td valign="top" align="left">
<inline-formula>
<mml:math display="inline" id="im4">
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>750</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>705</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>0.2</mml:mn>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>700</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>550</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
<mml:mo stretchy="false">]</mml:mo>
<mml:mo>*</mml:mo>
<mml:mo stretchy="false">(</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>750</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>705</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
</tr>
<tr>
<td valign="middle" align="left">DATT3 (<xref ref-type="bibr" rid="B10">Datt, 1999</xref>)</td>
<td valign="top" align="center">
<inline-formula>
<mml:math display="inline" id="im5">
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>754</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>704</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td valign="top" align="center">Chlorophyll-related index with high sensitivity to nitrogen</td>
</tr>
<tr>
<td valign="middle" align="left">PSRI (<xref ref-type="bibr" rid="B41">Merzlyak et&#xa0;al., 1999</xref>)</td>
<td valign="top" align="center">
<inline-formula>
<mml:math display="inline" id="im6">
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>678</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>500</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>750</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td valign="top" align="center">Plant senescence index</td>
</tr>
<tr>
<td valign="middle" align="left">RDVI (<xref ref-type="bibr" rid="B45">Roujean and Breon, 1995</xref>)</td>
<td valign="middle" align="center">
<inline-formula>
<mml:math display="inline" id="im7">
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>800</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>670</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>800</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:msub>
<mml:mi>&#x3c1;</mml:mi>
<mml:mrow>
<mml:mn>670</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td valign="top" align="center">Similar to NDVI, but less sensitive to the effects of soil and sun viewing geometry</td>
</tr>
<tr>
<th valign="top" colspan="3" align="center">LiDAR</th>
</tr>
<tr>
<td valign="middle" align="left">75<sup>th</sup> height Percentile</td>
<td valign="middle" rowspan="2" align="center">Height of the non-ground points at <italic>i<sup>th</sup>
</italic> percentile</td>
<td valign="top" rowspan="2" align="center">Represents vertical distribution of the LiDAR points in each plot</td>
</tr>
<tr>
<td valign="middle" align="left">90<sup>th</sup> height Percentile</td>
</tr>
<tr>
<td valign="middle" align="left">Height Quadratic Mean</td>
<td valign="middle" align="center">Square root of the mean of the squared heights <inline-formula>
<mml:math display="inline" id="im8">
<mml:mrow>
<mml:mi>Q</mml:mi>
<mml:mo>=</mml:mo>
<mml:mo>&#xa0;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mn>1</mml:mn>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>+</mml:mo>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mn>2</mml:mn>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>+</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>+</mml:mo>
<mml:msubsup>
<mml:mi>x</mml:mi>
<mml:mi>n</mml:mi>
<mml:mn>2</mml:mn>
</mml:msubsup>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:mfrac>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
</inline-formula>
</td>
<td valign="middle" align="center">Represents the quadratic (root-mean-square) mean of LiDAR heights at the plot level</td>
</tr>
<tr>
<td valign="middle" align="left">Volume</td>
<td valign="middle" align="center">Sum of voxel volumes over an 8 cm &#xd7; 8 cm grid</td>
<td valign="top" align="center">Aggregated voxel volume at the plot level</td>
</tr>
<tr>
<td valign="middle" align="left">Canopy Cover at 20<sup>th</sup> percentile height</td>
<td valign="middle" rowspan="3" align="center">Fraction of points above specified percentile</td>
<td valign="middle" rowspan="3" align="center">Proportion of canopy above a specified height percentile</td>
</tr>
<tr>
<td valign="middle" align="left">Canopy Cover at 50<sup>th</sup> percentile height</td>
</tr>
<tr>
<td valign="middle" align="left">Canopy Cover at 75<sup>th</sup> percentile height</td>
</tr>
</tbody>
</table>
</table-wrap>
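<p>The integration and band-ratio features of Table 3 reduce each plot spectrum to scalars. A sketch of two of them, assuming a uniformly sampled spectrum (the flat 0.5 reflectance curve is purely synthetic):</p>

```python
import numpy as np

def intg(wl, refl, a, b):
    """Intg(a, b): area under the reflectance curve S(lambda) on [a, b] nm,
    via the trapezoidal rule over the sampled bands."""
    m = (wl >= a) & (wl <= b)
    w, r = wl[m], refl[m]
    return float(np.sum(0.5 * (r[1:] + r[:-1]) * np.diff(w)))

def ndre(rho_790, rho_720):
    """Normalized Difference Red Edge index (Table 3)."""
    return (rho_790 - rho_720) / (rho_790 + rho_720)

wl = np.arange(400.0, 1000.0, 2.2)      # nominal 2.2 nm band spacing
refl = np.full_like(wl, 0.5)            # flat synthetic spectrum
area = intg(wl, refl, 670.0, 780.0)     # 0.5 x width of the sampled range
print(round(area, 1), round(ndre(0.6, 0.3), 4))   # 53.9 0.3333
```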
</sec>
</sec>
<sec id="s3_4">
<label>3.4</label>
<title>Weather variables</title>
<p>Three weather-related variables were included in the analysis: cumulative radiation, precipitation, and growing degree days (GDD) from the beginning of each growing season (2020&#x2013;2021), as shown in <xref ref-type="fig" rid="f4">
<bold>Figure&#xa0;4</bold>
</xref>. The precipitation and growing degree days were obtained from the Indiana State Climate Mesonet weather station located at ACRE. All the data are publicly available (<xref ref-type="bibr" rid="B19">Indiana State Climate Office, 2022</xref>).</p>
<fig id="f4" position="float">
<label>Figure&#xa0;4</label>
<caption>
<p>Accumulated values of weather variables through the growing season.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-15-1408047-g004.tif"/>
</fig>
</sec>
<sec id="s3_5">
<label>3.5</label>
<title>Deep learning models of maize yield</title>
<p>As noted previously, three RNN-based architectures were implemented: (a) a vanilla stacked LSTM, (b) a stacked LSTM with an attention mechanism, and (c) a multi-modal network for the different RS modalities. For this study, the temporal attention mechanism was based on the Bahdanau attention mechanism (<xref ref-type="bibr" rid="B5">Bahdanau et&#xa0;al., 2016</xref>). <xref ref-type="fig" rid="f5">
<bold>Figure&#xa0;5</bold>
</xref> displays the stacked LSTM models described in the following sub-sections. The hyperparameters were determined experimentally; the Adam optimizer (<xref ref-type="bibr" rid="B25">Kingma and Ba, 2017</xref>) was used for weight updating with a learning rate of 0.001. The mean squared error (MSE) served as the loss function and as the criterion for terminating model training. The models were developed using 5-fold cross-validation with an 80%/20% training/testing split and a further 90%/10% training/validation split of the training data, based on the 500 plots in each fold. All the networks were implemented in TensorFlow on an NVIDIA Quadro P400 GPU with 68 GB RAM.</p>
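<p>The nested splitting scheme (5-fold cross-validation with an 80/20 train/test partition, then a 90/10 train/validation split of the training portion) can be sketched as below. The total plot count and random permutation are illustrative, and the genetic-cluster stratification of Section 3.6 is omitted for brevity.</p>

```python
import numpy as np

def nested_cv_splits(n_plots, n_folds=5, val_frac=0.10, seed=0):
    """Yield (train, val, test) index arrays: each fold holds out 1/n_folds
    of the plots for testing, then 10% of the remainder for validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_plots)
    folds = np.array_split(idx, n_folds)
    for k in range(n_folds):
        test = folds[k]
        rest = np.concatenate([folds[j] for j in range(n_folds) if j != k])
        n_val = int(round(val_frac * len(rest)))
        yield rest[n_val:], rest[:n_val], test

splits = list(nested_cv_splits(2500))    # illustrative total plot count
train, val, test = splits[0]
print(len(train), len(val), len(test))   # 1800 200 500
```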
<fig id="f5" position="float">
<label>Figure&#xa0;5</label>
<caption>
<p>Stacked LSTM-based networks explored for prediction of maize yield (<inline-formula>
<mml:math display="inline" id="im9">
<mml:mrow>
<mml:msub>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mrow>
<mml:mi>V</mml:mi>
<mml:mi>N</mml:mi>
<mml:mi>I</mml:mi>
<mml:mi>R</mml:mi>
<mml:mo>+</mml:mo>
<mml:mi>L</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>D</mml:mi>
<mml:mi>A</mml:mi>
<mml:mi>R</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>. <bold>(A)</bold> Vanilla stacked LSTM with early fusion of both concatenated features from VNIR, LiDAR and weather data. <bold>(B)</bold> LSTM with attention mechanism, with early fusion of the modalities. <bold>(C)</bold> Multi-modal network with separate networks for each RS modality concatenated with weather, adding a late fusion module.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-15-1408047-g005.tif"/>
</fig>
<sec id="s3_5_1">
<label>3.5.1</label>
<title>Vanilla stacked LSTM</title>
<p>The stacked network utilizes early fusion of the modalities: the raw features from both RS modalities, along with weather data, are concatenated and input into the LSTM-based recurrent neural network. The number of LSTM cells was determined experimentally. A dropout layer with rate 0.2 was added after each LSTM cell to prevent over-fitting.</p>
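<p>Early fusion here amounts to concatenating the per-date feature vectors of the modalities into a single input sequence for the LSTM. A minimal sketch, using the nine hyperspectral and seven LiDAR features of Table 3 plus the three weather variables of Section 3.4 (the plot and date counts are illustrative):</p>

```python
import numpy as np

def early_fusion(hyper, lidar, weather):
    """Concatenate per-time-step features into one LSTM input tensor.
    hyper:   (plots, dates, 9)  hyperspectral features
    lidar:   (plots, dates, 7)  LiDAR features
    weather: (plots, dates, 3)  cumulative radiation, precipitation, GDD
    returns: (plots, dates, 19) fused input sequence"""
    assert hyper.shape[:2] == lidar.shape[:2] == weather.shape[:2]
    return np.concatenate([hyper, lidar, weather], axis=-1)

x = early_fusion(np.zeros((500, 7, 9)),
                 np.zeros((500, 7, 7)),
                 np.zeros((500, 7, 3)))
print(x.shape)   # (500, 7, 19)
```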
</sec>
<sec id="s3_5_2">
<label>3.5.2</label>
<title>Stacked LSTM with attention mechanism</title>
<p>The traditional LSTM network was coupled with an attention mechanism with a mid-season gate, which enhances performance and serves as a source of explainability. The attention mechanism computes a context vector that represents the relationship between the output of the hidden states of each time-step and each feature of the input vector (<xref ref-type="bibr" rid="B43">Niu et&#xa0;al., 2021</xref>), as depicted in <xref ref-type="fig" rid="f5">
<bold>Figure&#xa0;5B</bold>
</xref>. This comparison is typically accomplished using a weighted sum of a similarity score between the decoder state and each time-step&#x2019;s hidden representation. The attention weights indicate how much importance the model assigns to each time-step when making a prediction: higher attention weights suggest that a particular time-step has a stronger influence on the current prediction, while lower weights indicate less relevance. The attention weights can be visualized to gain insight into which time-steps the model considers most important for the forecast. Examining the weights reveals the patterns or trends in the input sequence that the model relies on to make predictions. A key advantage of attention mechanisms is their ability to adaptively adjust the attention weights for each prediction; the model can give more weight to recent or relevant time-steps while down-weighting less relevant ones. Growth stage importance can therefore be interpreted from the temporal attention weights. The model was implemented using individual sensing modality inputs (e.g., hyperspectral features and LiDAR features in isolation), in addition to early fusion multi-modality, allowing interpretation of the temporal attention weights for each modality in order to determine the dates to be used in the different scenarios.</p>
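<p>A numpy sketch of the additive (Bahdanau-style) scoring just described: a scalar score per time-step, a softmax that turns scores into attention weights, and a context vector formed as the weighted sum of hidden states. The weight matrices below are random placeholders rather than trained values, and the decoder-state term is folded into the score function for brevity.</p>

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def bahdanau_attention(H, W, v):
    """H: (T, d) hidden states for T time-steps.
    score_t = v . tanh(W @ h_t); weights = softmax(scores);
    context = sum_t weights[t] * h_t."""
    scores = np.array([v @ np.tanh(W @ h) for h in H])
    weights = softmax(scores)
    context = weights @ H                  # (d,) weighted sum of states
    return context, weights

rng = np.random.default_rng(1)
T, d = 7, 16                               # e.g., seven acquisition dates
H = rng.standard_normal((T, d))
W = rng.standard_normal((d, d))
v = rng.standard_normal(d)
context, weights = bahdanau_attention(H, W, v)
print(weights.round(3))                    # a distribution over the dates
```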
</sec>
<sec id="s3_5_3">
<label>3.5.3</label>
<title>Multi-modal network for the different RS modalities</title>
<p>The multi-modal network consists of two modules of the type described in Section 3.5.2, one for each RS modality, plus a fusion module, as proposed in (Wang et&#xa0;al., 2020a). Gradient blending (Wang et&#xa0;al., 2020a) is used to combine the loss functions of the modules and to avoid over-fitting by choosing the scaling of the loss weights, since each module converges after a different number of epochs. In the fusion module, as seen in <xref ref-type="fig" rid="f5">
<bold>Figure&#xa0;5C</bold>
</xref>, two dense layers are added using a sigmoid activation function to generate a single prediction value.</p>
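<p>The gradient-blending idea can be illustrated with made-up loss curves: each module's loss weight grows with its validation improvement and shrinks with the growth of its overfitting gap between two checkpoints. The exact functional form in Wang et al. (2020a) differs in detail; this is a simplified sketch with hypothetical numbers.</p>

```python
def blending_weight(train_losses, val_losses):
    """Weight ~ generalization gain / (growth of overfitting gap)^2,
    measured between two checkpoints (Wang et al., 2020a style)."""
    dG = val_losses[0] - val_losses[1]        # validation improvement
    o0 = val_losses[0] - train_losses[0]      # overfitting gap, early
    o1 = val_losses[1] - train_losses[1]      # overfitting gap, late
    dO = max(o1 - o0, 1e-8)                   # growth of the gap
    return max(dG, 0.0) / dO ** 2

# Hypothetical (train, val) losses at two checkpoints for each module
modules = {
    "hyperspectral": ([1.0, 0.4], [1.1, 0.6]),
    "lidar":         ([1.0, 0.2], [1.1, 0.8]),   # improves less, overfits more
}
raw = {m: blending_weight(tr, va) for m, (tr, va) in modules.items()}
total = sum(raw.values())
weights = {m: w / total for m, w in raw.items()}
print(weights)   # the better-generalizing module gets the larger weight
```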
</sec>
</sec>
<sec id="s3_6">
<label>3.6</label>
<title>Genetic clustering, stratified sampling, and evaluation metrics</title>
<p>Six principal components derived from the original genetic marker data explained 35% of the variance of the high-dimensional genetic marker matrix. They were clustered via k-means unsupervised classification to develop balanced groupings for stratified sampling into the training, validation, and testing datasets. Performance was evaluated using R<sup>2</sup>
<italic>
<sub>ref</sub>
</italic> (relative to the one-to-one reference line), calculated as <xref ref-type="disp-formula" rid="eq1">Equation (1)</xref>, and the root mean squared error (RMSE), calculated as <xref ref-type="disp-formula" rid="eq2">Equation (2)</xref>:</p>
<disp-formula id="eq1">
<label>(1)</label>
<mml:math display="block" id="M1">
<mml:mrow>
<mml:msubsup>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>r</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msubsup>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">^</mml:mo>
</mml:mover>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#xaf;</mml:mo>
</mml:mover>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="eq2">
<label>(2)</label>
<mml:math display="block" id="M2">
<mml:mrow>
<mml:mi>R</mml:mi>
<mml:mi>M</mml:mi>
<mml:mi>S</mml:mi>
<mml:mi>E</mml:mi>
<mml:mo>=</mml:mo>
<mml:mtext>&#xa0;</mml:mtext>
<mml:msqrt>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">^</mml:mo>
</mml:mover>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:mfrac>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <inline-formula>
<mml:math display="inline" id="im10">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the observed yield reference value, <inline-formula>
<mml:math display="inline" id="im11">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="true">^</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula> is the predicted yield value, and <inline-formula>
<mml:math display="inline" id="im12">
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#xaf;</mml:mo>
</mml:mover>
</mml:math>
</inline-formula> is the mean observed value. The total number of samples is denoted by <italic>n</italic>.</p>
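<p>In code, the two metrics read as below; note that R<sup>2</sup><sub>ref</sub> takes its residuals against the one-to-one line (predictions used as-is, not a refit regression), in the conventional one-minus-ratio form. The yield values are illustrative.</p>

```python
import numpy as np

def r2_ref(y, y_hat):
    """R^2 relative to the one-to-one reference line (Equation 1):
    residuals are taken against y_hat directly, not a fitted line."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def rmse(y, y_hat):
    """Root mean squared error (Equation 2)."""
    y, y_hat = np.asarray(y, float), np.asarray(y_hat, float)
    return float(np.sqrt(np.mean((y_hat - y) ** 2)))

y     = [100.0, 120.0, 140.0, 160.0]   # observed yields (illustrative)
y_hat = [102.0, 118.0, 143.0, 155.0]   # predicted yields
print(round(r2_ref(y, y_hat), 3), round(rmse(y, y_hat), 2))   # 0.979 3.24
```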
</sec>
</sec>
<sec id="s4">
<label>4</label>
<title>Experimental results and discussion</title>
<sec id="s4_1">
<label>4.1</label>
<title>Growth stage importance inferred from individual-modality attention weights</title>
<p>Plots of the attention weights within each time-step show the relative importance of the time periods for predicting end-of-year yields. The visualization also provides a useful connection between growth stages and RS data inputs. A heatmap plot of the attention weights was obtained by summing the feature weights within each time-step. Individual RS attention weights for each plot are shown in <xref ref-type="supplementary-material" rid="SF4">
<bold>Supplementary Figure&#xa0;4</bold>
</xref>. Although the values of the attention weights vary across individual plot-level predictions, a distinct trend emerges in the plot of the temporal weights averaged over all the plots: LiDAR features are most important during the early season, while the impact of hyperspectral features on yield prediction is greater from mid-season onward. The merged representation, given by the average weights of the testing dataset at each time-step, is plotted in <xref ref-type="fig" rid="f6">
<bold>Figure&#xa0;6</bold>
</xref>. The average flowering date for the doubled haploid varieties is denoted by the dashed line. The LiDAR attention weights peak around flowering time, while the attention weights for hyperspectral imagery reach their maximum during the early grain-filling stages, consistent with the physiological growth of maize. In the early stages of growth, the plant prioritizes nutrients for biomass accumulation; as it reaches the flowering stage, it undergoes remobilization, redirecting resources toward grain filling. This chemistry-related transition can be observed in the hyperspectral imagery.</p>
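<p>The heatmap and the averaged trend are obtained by collapsing the attention tensor in two steps: summing the feature weights within each time-step for every plot, then averaging over plots. A sketch with a synthetic attention tensor (plot, date, and feature counts are illustrative):</p>

```python
import numpy as np

def temporal_attention_profile(att):
    """att: (plots, time_steps, features) attention weights.
    Returns (per_plot, averaged): per-plot time-step weights (heatmap rows)
    and their average over all plots (the trend curve)."""
    per_plot = att.sum(axis=-1)              # sum feature weights per time-step
    return per_plot, per_plot.mean(axis=0)   # average across plots

rng = np.random.default_rng(2)
att = rng.random((500, 7, 16))               # 500 plots, 7 dates, 16 features
att /= att.sum(axis=(1, 2), keepdims=True)   # normalize per plot
per_plot, avg = temporal_attention_profile(att)
print(per_plot.shape, avg.shape)             # (500, 7) (7,)
```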
<fig id="f6" position="float">
<label>Figure&#xa0;6</label>
<caption>
<p>Average values of the attention weights in the time domain in <bold>(A)</bold> LiDAR and <bold>(B)</bold> hyperspectral modalities.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-15-1408047-g006.tif"/>
</fig>
<p>Based on the importance indicated by the attention weights, four scenarios were investigated for the yield prediction models: (a) using all six dates in both modalities; (b) using only the three dates prior to mid-season in each modality; (c) using the first four dates of LiDAR data and the middle four dates of hyperspectral data; and (d) using three mid-season LiDAR dates and four mid-season hyperspectral dates. The goal was to investigate the contributions of the two sources of RS data throughout the growing season while reducing the size of the network to the most meaningful inputs.</p>
</sec>
<sec id="s4_2">
<label>4.2</label>
<title>Maize grain yield predictions</title>
<p>The yield forecasts based on inputs from the individual modalities indicate that RS data can function effectively as time series inputs to deep learning models during the growing season. Integrating the modalities, whether through early fusion in the initial deep learning models or through late fusion in the multi-modal network, leads to a significant improvement in prediction accuracy. The same hyperparameters described in Section 3.5 were used to train and test all the scenarios in both years. <xref ref-type="fig" rid="f7">
<bold>Figure&#xa0;7</bold>
</xref> displays a comparison of the projected grain yields for Scenario 1 and the ground reference data. These results represent the best model from cross-validation; the results shown in <xref ref-type="table" rid="T4">
<bold>Table&#xa0;4</bold>
</xref> include the sample mean and standard deviations from all the cross-validation predictions.</p>
<fig id="f7" position="float">
<label>Figure&#xa0;7</label>
<caption>
<p>Model performance in deep learning networks developed for maize grain yield using the full season (<xref ref-type="table" rid="T2">
<bold>Table&#xa0;2</bold>
</xref>) of RS data.</p>
</caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fpls-15-1408047-g007.tif"/>
</fig>
<table-wrap id="T4" position="float">
<label>Table&#xa0;4</label>
<caption>
<p>Performance comparison of deep learning models for different scenarios.</p>
</caption>
<table frame="hsides">
<thead>
<tr>
<th valign="bottom" rowspan="3" align="center"/>
<th valign="bottom" colspan="4" align="center">GxE 2020 Field 54</th>
<th valign="bottom" colspan="4" align="center">GxE 2021 Field 42</th>
</tr>
<tr>
<th valign="middle" colspan="2" align="center">Independent Testing Data</th>
<th valign="middle" colspan="2" align="center">Complete Dataset</th>
<th valign="middle" colspan="2" align="center">Independent Testing Data</th>
<th valign="middle" colspan="2" align="center">Complete Dataset</th>
</tr>
<tr>
<th valign="bottom" align="center">R<sup>2</sup>
<sub>ref</sub>
</th>
<th valign="bottom" align="center">RMSE</th>
<th valign="bottom" align="center">R<sup>2</sup>
<sub>ref</sub>
</th>
<th valign="bottom" align="center">RMSE</th>
<th valign="bottom" align="center">R<sup>2</sup>
<sub>ref</sub>
</th>
<th valign="bottom" align="center">RMSE</th>
<th valign="bottom" align="center">R<sup>2</sup>
<sub>ref</sub>
</th>
<th valign="bottom" align="center">RMSE</th>
</tr>
</thead>
<tbody>
<tr>
<th valign="bottom" colspan="9" align="center">Scenario 1: All Dates</th>
</tr>
<tr>
<td valign="bottom" align="center">
<bold>Stacked LSTM</bold>
</td>
<td valign="bottom" align="center">0.82 &#xb1; 0.13</td>
<td valign="bottom" align="center">9.50 &#xb1; 4.09</td>
<td valign="bottom" align="center">0.91 &#xb1; 0.06</td>
<td valign="bottom" align="center">6.57 &#xb1; 1.09</td>
<td valign="bottom" align="center">0.74 &#xb1; 0.22</td>
<td valign="bottom" align="center">11.88 &#xb1; 4.44</td>
<td valign="bottom" align="center">0.87 &#xb1; 0.04</td>
<td valign="bottom" align="center">8.71 &#xb1; 1.31</td>
</tr>
<tr>
<td valign="bottom" align="center">
<bold>Attention Network</bold>
</td>
<td valign="middle" align="center">0.84 &#xb1; 0.10</td>
<td valign="middle" align="center">8.36 &#xb1; 2.35</td>
<td valign="middle" align="center">0.94 &#xb1; 0.05</td>
<td valign="middle" align="center">8.06 &#xb1; 1.05</td>
<td valign="middle" align="center">0.80 &#xb1; 0.14</td>
<td valign="middle" align="center">11.46 &#xb1; 3.87</td>
<td valign="middle" align="center">0.83 &#xb1; 0.09</td>
<td valign="middle" align="center">8.42 &#xb1; 2.20</td>
</tr>
<tr>
<td valign="bottom" align="center">
<bold>Multi-modal Network</bold>
</td>
<td valign="middle" align="center">0.89 &#xb1; 0.16</td>
<td valign="middle" align="center">6.58 &#xb1; 3.04</td>
<td valign="middle" align="center">0.96 &#xb1; 0.03</td>
<td valign="middle" align="center">3.25 &#xb1; 1.25</td>
<td valign="middle" align="center">0.87 &#xb1; 0.17</td>
<td valign="middle" align="center">8.97 &#xb1; 4.02</td>
<td valign="middle" align="center">0.96 &#xb1; 0.06</td>
<td valign="middle" align="center">3.21 &#xb1; 1.99</td>
</tr>
<tr>
<th valign="bottom" colspan="9" align="center">Scenario 2: Predictions Based on Mid-season Data</th>
</tr>
<tr>
<td valign="bottom" align="center">
<bold>Stacked LSTM</bold>
</td>
<td valign="bottom" align="center">0.64 &#xb1; 0.14</td>
<td valign="bottom" align="center">15.51 &#xb1; 2.49</td>
<td valign="bottom" align="center">0.80 &#xb1; 0.10</td>
<td valign="bottom" align="center">12.68 &#xb1; 1.32</td>
<td valign="bottom" align="center">0.63 &#xb1; 0.26</td>
<td valign="bottom" align="center">13.63 &#xb1; 2.78</td>
<td valign="bottom" align="center">0.83 &#xb1; 0.04</td>
<td valign="bottom" align="center">9.98 &#xb1; 1.31</td>
</tr>
<tr>
<td valign="bottom" align="center">
<bold>Attention Network</bold>
</td>
<td valign="middle" align="center">0.79 &#xb1; 0.20</td>
<td valign="middle" align="center">11.10 &#xb1; 5.94</td>
<td valign="middle" align="center">0.89 &#xb1; 0.03</td>
<td valign="middle" align="center">7.09 &#xb1; 1.91</td>
<td valign="middle" align="center">0.74 &#xb1; 0.20</td>
<td valign="middle" align="center">12.28 &#xb1; 5.44</td>
<td valign="middle" align="center">0.78 &#xb1; 0.04</td>
<td valign="middle" align="center">10.31 &#xb1; 1.32</td>
</tr>
<tr>
<td valign="bottom" align="center">
<bold>Multi-modal Network</bold>
</td>
<td valign="middle" align="center">0.86 &#xb1; 0.06</td>
<td valign="middle" align="center">8.02 &#xb1; 1.13</td>
<td valign="middle" align="center">0.90 &#xb1; 0.04</td>
<td valign="middle" align="center">6.60 &#xb1; 1.16</td>
<td valign="middle" align="center">0.81 &#xb1; 0.15</td>
<td valign="middle" align="center">10.02 &#xb1; 3.64</td>
<td valign="middle" align="center">0.90 &#xb1; 0.04</td>
<td valign="middle" align="center">6.94 &#xb1; 2.29</td>
</tr>
<tr>
<th valign="bottom" colspan="9" align="center">Scenario 3: Predictions based on Temporally Shifted LiDAR and Hyperspectral Datasets</th>
</tr>
<tr>
<td valign="bottom" align="center">
<bold>Stacked LSTM</bold>
</td>
<td valign="bottom" align="center">0.77 &#xb1; 0.15</td>
<td valign="bottom" align="center">11.91 &#xb1; 4.64</td>
<td valign="bottom" align="center">0.88 &#xb1; 0.08</td>
<td valign="bottom" align="center">8.57 &#xb1; 1.59</td>
<td valign="bottom" align="center">0.71 &#xb1; 0.23</td>
<td valign="bottom" align="center">13.88 &#xb1; 6.22</td>
<td valign="bottom" align="center">0.87 &#xb1; 0.14</td>
<td valign="bottom" align="center">7.98 &#xb1; 1.31</td>
</tr>
<tr>
<td valign="bottom" align="center">
<bold>Attention Network</bold>
</td>
<td valign="middle" align="center">0.82 &#xb1; 0.10</td>
<td valign="middle" align="center">11.36 &#xb1; 2.35</td>
<td valign="middle" align="center">0.92 &#xb1; 0.06</td>
<td valign="middle" align="center">9.06 &#xb1; 2.05</td>
<td valign="middle" align="center">0.78 &#xb1; 0.18</td>
<td valign="middle" align="center">10.86 &#xb1; 3.87</td>
<td valign="middle" align="center">0.83 &#xb1; 0.09</td>
<td valign="middle" align="center">11.02 &#xb1; 1.20</td>
</tr>
<tr>
<td valign="bottom" align="center">
<bold>Multi-modal Network</bold>
</td>
<td valign="middle" align="center">0.87 &#xb1; 0.16</td>
<td valign="middle" align="center">7.68 &#xb1; 3.04</td>
<td valign="middle" align="center">0.94 &#xb1; 0.02</td>
<td valign="middle" align="center">5.75 &#xb1; 2.25</td>
<td valign="middle" align="center">0.85 &#xb1; 0.12</td>
<td valign="middle" align="center">8.01 &#xb1; 4.02</td>
<td valign="middle" align="center">0.96 &#xb1; 0.06</td>
<td valign="middle" align="center">3.21 &#xb1; 1.99</td>
</tr>
<tr>
<td valign="bottom" colspan="9" align="center">LiDAR: {6/17/20, 7/2/20, 7/17/20, 7/28/20}; {7/3/21, 7/19/21, 7/27/21, 8/16/21}<break/>Hyperspectral: {7/2/20, 7/17/20, 7/28/20, 8/13/20}; {7/19/21, 7/27/21, 8/16/21, 9/6/21}</td>
</tr>
<tr>
<th valign="bottom" colspan="9" align="center">Scenario 4: Predictions based on 3 Midseason LiDAR and 4 Midseason Hyperspectral Datasets</th>
</tr>
<tr>
<td valign="bottom" align="center">
<bold>Stacked LSTM</bold>
</td>
<td valign="bottom" align="center">0.76 &#xb1; 0.21</td>
<td valign="bottom" align="center">12.02 &#xb1; 2.04</td>
<td valign="bottom" align="center">0.85 &#xb1; 0.09</td>
<td valign="bottom" align="center">7.64 &#xb1; 2.67</td>
<td valign="bottom" align="center">0.71 &#xb1; 0.05</td>
<td valign="bottom" align="center">13.24 &#xb1; 3.01</td>
<td valign="bottom" align="center">0.85 &#xb1; 0.09</td>
<td valign="bottom" align="center">9.05 &#xb1; 2.45</td>
</tr>
<tr>
<td valign="bottom" align="center">
<bold>Attention Network</bold>
</td>
<td valign="middle" align="center">0.83 &#xb1; 0.16</td>
<td valign="middle" align="center">10.94 &#xb1; 3.05</td>
<td valign="middle" align="center">0.91 &#xb1; 0.10</td>
<td valign="middle" align="center">9.65 &#xb1; 3.54</td>
<td valign="middle" align="center">0.75 &#xb1; 0.02</td>
<td valign="middle" align="center">12.01 &#xb1; 3.87</td>
<td valign="middle" align="center">0.84 &#xb1; 0.12</td>
<td valign="middle" align="center">10.74 &#xb1; 2.45</td>
</tr>
<tr>
<td valign="bottom" align="center">
<bold>Multi-modal Network</bold>
</td>
<td valign="middle" align="center">0.85 &#xb1; 0.10</td>
<td valign="middle" align="center">8.87 &#xb1; 2.45</td>
<td valign="middle" align="center">0.95 &#xb1; 0.16</td>
<td valign="middle" align="center">4.98 &#xb1; 3.01</td>
<td valign="middle" align="center">0.84 &#xb1; 0.09</td>
<td valign="middle" align="center">8.85 &#xb1; 2.74</td>
<td valign="middle" align="center">0.94 &#xb1; 0.15</td>
<td valign="middle" align="center">5.01 &#xb1; 3.45</td>
</tr>
<tr>
<td valign="bottom" colspan="9" align="center">LiDAR: {6/17/20, 7/2/20, 7/17/20}; {7/3/21, 7/19/21, 7/27/21}<break/>Hyperspectral: {6/17/20, 7/2/20, 7/17/20, 7/28/20}; {7/3/21, 7/19/21, 7/27/21, 8/16/21}</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>As shown in <xref ref-type="fig" rid="f7">
<bold>Figure&#xa0;7</bold>
</xref>, the baseline vanilla stacked LSTM model with full-season data from both remote sensing modalities had significantly worse performance in both the RMSE and R<sup>2</sup>
<sub>ref</sub> values compared to the other LSTM-based models. Accuracy increased as the model architecture was enhanced with attention mechanisms. Based on the RMSE and R<sup>2</sup>
<sub>ref</sub> metrics, the multi-modal architecture, which also integrated attention mechanisms, successfully established temporal relationships and captured inter-modal connections through independent processing of the RS sources.</p>
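The temporal-attention step described above can be sketched as follows. This is a minimal NumPy illustration of weighting LSTM hidden states over time, not the authors' implementation; the scoring vector, dimensions, and random inputs are assumptions for demonstration:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the time dimension
    e = np.exp(x - x.max())
    return e / e.sum()

def temporal_attention(H, v):
    """Score each time-step's hidden state, normalize the scores into
    attention weights, and return the weighted summary (context) vector.

    H : (T, d) one LSTM hidden state per RS acquisition date
    v : (d,)   learned scoring vector (random here for illustration)
    """
    scores = H @ v           # relevance score per time-step
    alpha = softmax(scores)  # attention weights; these are what Figure 6 averages
    context = alpha @ H      # weighted summary passed to the yield regressor
    return alpha, context

# Six time-steps (RS dates), 8-dimensional hidden states
rng = np.random.default_rng(0)
H = rng.normal(size=(6, 8))
v = rng.normal(size=8)
alpha, context = temporal_attention(H, v)
```

The weights `alpha` are non-negative and sum to one, so averaging them across plots gives a direct measure of how much each acquisition date contributes to the prediction.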
<p>Values of the evaluation metrics from all the scenarios are listed in <xref ref-type="table" rid="T4">
<bold>Table&#xa0;4</bold>
</xref>. Comparing the two years, there was a slight decrease in model performance from 2020 to 2021 based on explained variance. One plausible reason could be the difference in growing degree days (GDD) between the RS dates in the two years. There was a minor misalignment between the dates of the RS datasets and the plants&#x2019; growth stages in 2021 due to a heavy precipitation period during tasseling; remote sensing data acquisition was not possible on multiple days because of adverse weather conditions. This small reduction in model performance occurred across all the architectures. Alternatively, the difference can be attributed to the variation in grain yield performance observed in the same hybrids in the two years, as depicted in <xref ref-type="fig" rid="f2">
<bold>Figure&#xa0;2B</bold>
</xref>. Despite identical hybrids being grown in both years, there is a noticeable disparity in the distributions of grain yield values. The significant discrepancy in precipitation, with 2021 experiencing above-average levels, may be the underlying factor contributing to the difference in grain yield. To provide a more comprehensive representation of the metrics, <xref ref-type="supplementary-material" rid="SF5">
<bold>Supplementary Figure&#xa0;5</bold>
</xref> includes a plot of the error bars derived from <xref ref-type="table" rid="T4">
<bold>Table&#xa0;4</bold>
</xref>.</p>
<p>Results of the scenarios based on different dates for remote sensing data acquisitions shown in <xref ref-type="table" rid="T4">
<bold>Table&#xa0;4</bold>
</xref> demonstrate that inclusion of all the dates in the networks yielded the most accurate predictions for all the model architectures. However, the prediction accuracies in most networks did not decrease significantly for the other scenarios that were based on subsets of time periods. The second scenario, which includes both modalities until mid-season, had a significant decrease in the value of R<sup>2</sup>
<sub>ref</sub>. Although the crop begins redistributing nutrients from biomass to grain filling by mid-season, the final grain yield is still influenced by key environmental conditions and plant genetics during the reproductive stages. These results also illustrate the contribution of attention mechanisms in the networks, as they enable the model to learn data patterns and retain crucial features at each time-step. Among the scenarios based on subsets of the whole-season data, Scenario 3 had the best performance. It used the time-steps with the highest average attention weights shown in the visualization. Compared to Scenario 2, the third scenario indicates that combining LiDAR and hyperspectral data acquired during the periods when their individual explanatory capability is greatest, together with the multi-modal network architecture, can provide accurate predictions of maize grain yield. The results reinforce the complementary capability of the two technologies and are consistent with the crop physiology.</p>
</sec>
<sec id="s4_3">
<label>4.3</label>
<title>Integrative multi-modal RS for precision phenology: matching maize growth stages and dynamics</title>
<p>The results in Section 4.2 indicate that using multi-modal RS time series with either early fusion or late fusion techniques can effectively mimic the maize growing season by capturing sequential phenological features of the crop over time, corresponding to the different stages of growth. However, incorporating late fusion resulted in enhanced accuracy and provided flexibility in remote sensing data collection. This could also improve model generalization, as not all remote sensing modalities are required at each growth stage for future model implementations, while still yielding adequate results.</p>
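The late-fusion idea can be sketched as follows. This is a minimal illustration, assuming each modality branch (e.g., the LiDAR and hyperspectral LSTM branches) has already produced a fixed-length summary vector; the linear fusion head, vector sizes, and random values are assumptions for demonstration only:

```python
import numpy as np

def late_fusion_predict(branch_outputs, W, b):
    """Late fusion: each modality is processed independently by its own
    branch, and only the branch summaries are combined for the final
    yield estimate.

    branch_outputs : dict mapping modality name -> summary vector
    W, b           : weights of the (illustrative) linear fusion head
    """
    # Sort keys so the concatenation order is deterministic
    fused = np.concatenate([branch_outputs[m] for m in sorted(branch_outputs)])
    return float(W @ fused + b)  # predicted plot-level grain yield

rng = np.random.default_rng(1)
summaries = {"lidar": rng.normal(size=16), "hyperspectral": rng.normal(size=16)}
W, b = rng.normal(size=32), 0.0
yhat = late_fusion_predict(summaries, W, b)
```

Because fusion happens only at the summary level, a branch can be retrained or its input dates changed without touching the other modality's branch, which is the flexibility discussed above.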
<p>Results of the study also support the following conclusions relative to LiDAR and hyperspectral RS data:</p>
<list list-type="bullet">
<list-item>
<p>Planting and emergence: LiDAR can capture the initial DTM that is important for estimating plant heights based on point clouds from later dates. Once the plants have emerged, hyperspectral imagery captures small green shoots, resulting in changes in the spectral signature of the field.</p>
</list-item>
<list-item>
<p>Vegetative growth: As maize enters the vegetative stage, the increase in chlorophyll content and leaf area leads to a stronger absorption of energy in the red portion of the spectrum and greater reflection in the near-infrared, as shown in the hyperspectral indices. The increase in plant material during this time is also clearly indicated in the LiDAR metrics.</p>
</list-item>
<list-item>
<p>Reproductive stage: The transition from vegetative to reproductive stages (tasseling, silking, and pollination) involves changes in both plant structure and chlorophyll. These changes can be detected through shifts in the spectral signatures captured in the time series data.</p>
</list-item>
<list-item>
<p>Maturity: As maize reaches maturity, the plants undergo senescence, where the chlorophyll content in the leaves decreases and they transition from a green color to a more yellow-brown hue. During this stage, the plant nutrients begin to break down, and nitrogen, for example, is transferred from the leaves to support the filling of the grain. These changes are visible, particularly in hyperspectral imagery.</p>
</list-item>
</list>
</sec>
</sec>
<sec id="s5">
<label>5</label>
<title>Final discussion and conclusions</title>
<p>This study investigated plot-level maize grain yield predictions through three LSTM-based RNN deep learning models, over two years of GxE experiments in Indiana. The models leveraged genotypic, remote sensing, and weather data in their predictions.</p>
<p>The advantages of integrating multi-modality remote sensing become evident when comparing the outcomes of single modality to those achieved by networks utilizing early fusion or late fusion multi-modality remote sensing data. The <italic>R<sup>2</sup>
<sub>ref</sub>
</italic> values ranged from 0.6 to 0.95, showcasing the models&#x2019; ability to effectively model time-series remote sensing and weather data. The multi-modal network provided the best results, especially when compared to the traditional vanilla stacked LSTM. Temporal attention allowed these models to focus on specific times during the growing season. Incorporating attention weights to assess the relevance of each time-step provided a more comprehensive understanding of the model&#x2019;s prediction mechanism. This insight can lead to more accurate forecasting and identify plot experiments where <italic>in situ</italic> reference data collection can potentially be increased and models enhanced. Although all scenarios provided predictions that are potentially useful for breeders in selecting specific varieties, Scenario 3 is notable, with accuracies exceeding 0.8 <italic>R<sup>2</sup>
<sub>ref</sub>
</italic> while using remote sensing data only from dates that align closely with the physiological stages of maize. Furthermore, the last remote sensing date was scheduled for early August, which could provide additional information for late-season testing (e.g., prioritizing in-depth nutrient studies or a stay-green study on the most successful hybrids).</p>
<p>In the context of multi-modal architectures, RS data acquired by different sensing modalities enable more comprehensive data analysis and interpretation. The contributions of sensors in capturing important characteristics of crop physiology vary throughout the season. Combining complementary modalities through either early fusion or late fusion allows the weaknesses inherent in one modality to be mitigated by the strengths of another, ultimately resulting in more accurate and reliable predictions. For instance, optical data may encounter difficulties when there is cloud cover, whereas LiDAR data are not affected by clouds, so data collection can proceed under those conditions. The model&#x2019;s flexibility is a benefit, as it does not have to include data from every modality in each time-step. To conclude, utilization of multi-modal RS data provides a synergistic framework that enhances the capabilities of individual sensor types, ultimately leading to a more nuanced and thorough comprehension of the observed processes, which is useful in both research and operational environments.</p>
<p>Through the G2F initiative, the GxE experiments offer a unique opportunity to develop predictive models by leveraging the genetic data and multiple environmental setups. The networks proposed for predicting maize grain yield are designed to provide end-of-season outcomes for individual years. Given the multiple geographic and environmental conditions encountered, current research is applying domain adaptation to forecast maize grain yield for a different year, and potentially a different location, using semi-supervised approaches.</p>
</sec>
<sec id="s6" sec-type="data-availability">
<title>Data availability statement</title>
<p>The raw data supporting the conclusions of this article will be made available by the authors on request.</p>
</sec>
<sec id="s7" sec-type="author-contributions">
<title>Author contributions</title>
<p>CA: Conceptualization, Formal analysis, Methodology, Writing &#x2013; original draft, Writing &#x2013; review &amp; editing. MC: Conceptualization, Formal analysis, Methodology, Supervision, Writing &#x2013; review &amp; editing. MT: Supervision, Writing &#x2013; review &amp; editing.</p>
</sec>
</body>
<back>
<sec id="s8" sec-type="funding-information">
<title>Funding</title>
<p>The author(s) declare financial support was received for the research, authorship, and/or publication of this article. This work was partially supported by the Advanced Research Projects Agency-Energy (ARPA-E), U.S. Department of Energy, under Grant DE-AR0000593, and by the National Science Foundation (NSF) under NSF Award Number EEC-1941529. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or any agency thereof.</p>
</sec>
<sec id="s9" sec-type="COI-statement">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="s10" sec-type="disclaimer">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec id="s11" sec-type="supplementary-material">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fpls.2024.1408047/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fpls.2024.1408047/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Image_1.jpeg" id="SF1" mimetype="image/jpeg">
<label>Supplementary Figure&#xa0;1</label>
<caption>
<p>Scree plot of the explained variance of the individual principal components used to determine the appropriate number of principal components.</p>
</caption>
</supplementary-material>
<supplementary-material xlink:href="Image_2.jpeg" id="SF2" mimetype="image/jpeg">
<label>Supplementary Figure&#xa0;2</label>
<caption>
<p>Example of a hyperspectral orthomosaic, OSAVI image, and the non-vegetation pixel mask during flowering time.</p>
</caption>
</supplementary-material>
<supplementary-material xlink:href="Image_3.jpeg" id="SF3" mimetype="image/jpeg">
<label>Supplementary Figure&#xa0;3</label>
<caption>
<p>Example of the reconstructed LiDAR point cloud during flowering time.</p>
</caption>
</supplementary-material>
<supplementary-material xlink:href="Image_4.jpeg" id="SF4" mimetype="image/jpeg">
<label>Supplementary Figure&#xa0;4</label>
<caption>
<p>Heatmap plot of the attention weights, obtained by summing the feature weights within each time-step, for the <bold>(A)</bold> LiDAR and <bold>(B)</bold> hyperspectral modalities.</p>
</caption>
</supplementary-material>
<supplementary-material xlink:href="Image_5.jpeg" id="SF5" mimetype="image/jpeg">
<label>Supplementary Figure&#xa0;5</label>
<caption>
<p>Plot of the error bars derived from <xref ref-type="table" rid="T4">
<bold>Table&#xa0;4</bold>
</xref>.</p>
</caption>
</supplementary-material>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Aghighi</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Azadbakht</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Ashourloo</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Shahrabi</surname> <given-names>H. S.</given-names>
</name>
<name>
<surname>Radiom</surname> <given-names>S.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Machine learning regression techniques for the silage maize yield prediction using time-series images of landsat 8 OLI</article-title>. <source>IEEE J. Sel. Top. Appl. Earth Obs. Remote Sens.</source> <volume>11</volume>, <fpage>4563</fpage>&#x2013;<lpage>4577</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/JSTARS.2018.2823361</pub-id>
</citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Akhter</surname> <given-names>R.</given-names>
</name>
<name>
<surname>Sofi</surname> <given-names>S. A.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Precision agriculture using IoT data analytics and machine learning</article-title>. <source>J. King Saud Univ. - Comput. Inf. Sci.</source> <volume>34</volume>, <fpage>5602</fpage>&#x2013;<lpage>5618</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.jksuci.2021.05.013</pub-id>
</citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ali</surname> <given-names>A. M.</given-names>
</name>
<name>
<surname>Abouelghar</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Belal</surname> <given-names>A. A.</given-names>
</name>
<name>
<surname>Saleh</surname> <given-names>N.</given-names>
</name>
<name>
<surname>Yones</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Selim</surname> <given-names>A. I.</given-names>
</name>
<etal/>
</person-group>. (<year>2022</year>). <article-title>Crop yield prediction using multi sensors remote sensing (Review article)</article-title>. <source>Egypt. J. Remote Sens. Space Sci.</source> <volume>25</volume>, <fpage>711</fpage>&#x2013;<lpage>716</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.ejrs.2022.04.006</pub-id>
</citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>AlKhalifah</surname> <given-names>N.</given-names>
</name>
<name>
<surname>Campbell</surname> <given-names>D. A.</given-names>
</name>
<name>
<surname>Falcon</surname> <given-names>C. M.</given-names>
</name>
<name>
<surname>Gardiner</surname> <given-names>J. M.</given-names>
</name>
<name>
<surname>Miller</surname> <given-names>N. D.</given-names>
</name>
<name>
<surname>Romay</surname> <given-names>M. C.</given-names>
</name>
<etal/>
</person-group>. (<year>2018</year>). <article-title>Maize Genomes to Fields: 2014 and 2015 field season genotype, phenotype, environment, and inbred ear image datasets</article-title>. <source>BMC Res. Notes</source> <volume>11</volume>, <fpage>452</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1186/s13104-018-3508-1</pub-id>
</citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bahdanau</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Cho</surname> <given-names>K.</given-names>
</name>
<name>
<surname>Bengio</surname> <given-names>Y.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Neural machine translation by jointly learning to align and translate</article-title>. <source>ArXiv</source>. <fpage>1409</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.48550/arXiv.1409.0473</pub-id>
</citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Barnes</surname> <given-names>E. M.</given-names>
</name>
<name>
<surname>Clarke</surname> <given-names>T. R.</given-names>
</name>
<name>
<surname>Richards</surname> <given-names>S. E.</given-names>
</name>
<name>
<surname>Colaizzi</surname> <given-names>P. D.</given-names>
</name>
<name>
<surname>Haberland</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Kostrzewski</surname> <given-names>M.</given-names>
</name>
<etal/>
</person-group>. (<year>2000</year>). <article-title>Coincident detection of crop water stress, nitrogen status and canopy density using ground based multispectral data</article-title>, in <conf-name>Proceedings of the&#xa0;Fifth International Conference on Precision Agriculture, Bloomington, MN, USA</conf-name>. Available at: <uri xlink:href="https://search.nal.usda.gov/discovery/delivery/01NAL_INST:MAIN/12284717660007426">https://search.nal.usda.gov/discovery/delivery/01NAL_INST:MAIN/12284717660007426</uri>.</citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cattell</surname> <given-names>R. B.</given-names>
</name>
</person-group> (<year>1966</year>). <article-title>The scree test for the number of factors</article-title>. <source>Multivar. Behav. Res.</source> <volume>1</volume>, <fpage>245</fpage>&#x2013;<lpage>276</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1207/s15327906mbr0102_10</pub-id>
</citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Tian</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Zhu</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Cao</surname> <given-names>W.</given-names>
</name>
<etal/>
</person-group>. (<year>2023</year>). <article-title>Improving yield prediction based on spatio-temporal deep learning approaches for winter wheat: A case study in Jiangsu Province, China</article-title>. <source>Comput. Electron. Agric.</source> <volume>213</volume>, <elocation-id>108201</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.compag.2023.108201</pub-id>
</citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Danilevicz</surname> <given-names>M. F.</given-names>
</name>
<name>
<surname>Bayer</surname> <given-names>P. E.</given-names>
</name>
<name>
<surname>Boussaid</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Bennamoun</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Edwards</surname> <given-names>D.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Maize yield prediction at an early developmental stage using multispectral images and genotype data for preliminary hybrid selection</article-title>. <source>Remote Sens.</source> <volume>13</volume>, <elocation-id>3976</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/rs13193976</pub-id>
</citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Datt</surname> <given-names>B.</given-names>
</name>
</person-group> (<year>1999</year>). <article-title>Remote sensing of water content in eucalyptus leaves</article-title>. <source>Aust. J. Bot.</source> <volume>47</volume>, <fpage>909</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1071/BT98042</pub-id>
</citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Daughtry</surname> <given-names>C.</given-names>
</name>
</person-group> (<year>2000</year>). <article-title>Estimating corn leaf chlorophyll concentration from leaf and canopy reflectance</article-title>. <source>Remote Sens. Environ.</source> <volume>74</volume>, <fpage>229</fpage>&#x2013;<lpage>239</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/S0034-4257(00)00113-9</pub-id>
</citation>
</ref>
<ref id="B12">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Eckhoff</surname> <given-names>S. R.</given-names>
</name>
<name>
<surname>Paulsen</surname> <given-names>M. R.</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>S. C.</given-names>
</name>
</person-group> (<year>2003</year>). &#x201c;<article-title>MAIZE</article-title>,&#x201d; in <source>Encyclopedia of Food Sciences and Nutrition</source>, <edition>2nd ed</edition>. Ed. <person-group person-group-type="editor">
<name>
<surname>Caballero</surname> <given-names>B.</given-names>
</name>
</person-group> (<publisher-name>Academic Press</publisher-name>, <publisher-loc>Oxford</publisher-loc>), <fpage>3647</fpage>&#x2013;<lpage>3653</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/B0-12-227055-X/00725-2</pub-id>
</citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gamboa</surname> <given-names>J. C. B.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Deep learning for time-series analysis</article-title>. doi:&#xa0;<pub-id pub-id-type="doi">10.48550/arXiv.1701.01887</pub-id>
</citation>
</ref>
<ref id="B14">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Gangopadhyay</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Shook</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Singh</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Sarkar</surname> <given-names>S.</given-names>
</name>
</person-group> (<year>2020</year>). &#x201c;<article-title>Interpreting the impact of weather on crop yield using attention</article-title>,&#x201d; in <conf-name>NeurIPS Workshop on AI for Earth Sciences</conf-name> (<publisher-name>NeurIPS</publisher-name>).</citation>
</ref>
<ref id="B15">
<citation citation-type="book">
<person-group person-group-type="author">
<collab>Genomes To Fields</collab>
</person-group> (<year>2023</year>). <source>Genomes to Fields genotypic data from 2014 to 2023</source>. doi:&#xa0;<pub-id pub-id-type="doi">10.25739/RAGT-7213</pub-id>
</citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gharibi</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Habib</surname> <given-names>A.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>True orthophoto generation from aerial frame images and LiDAR data: an update</article-title>. <source>Remote Sens.</source> <volume>10</volume>, <elocation-id>581</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/rs10040581</pub-id>
</citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guo</surname> <given-names>M.-H.</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>T.-X.</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>J.-J.</given-names>
</name>
<name>
<surname>Liu</surname> <given-names>Z.-N.</given-names>
</name>
<name>
<surname>Jiang</surname> <given-names>P.-T.</given-names>
</name>
<name>
<surname>Mu</surname> <given-names>T.-J.</given-names>
</name>
<etal/>
</person-group>. (<year>2022</year>). <article-title>Attention mechanisms in computer vision: A survey</article-title>. <source>Comput. Vis. Media</source> <volume>8</volume>, <fpage>331</fpage>&#x2013;<lpage>368</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1007/s41095-022-0271-y</pub-id>
</citation>
</ref>
<ref id="B18">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Hu</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Shen</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>G.</given-names>
</name>
</person-group> (<year>2018</year>). &#x201c;<article-title>Squeeze-and-excitation networks</article-title>,&#x201d; in <conf-name>2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition</conf-name>. <fpage>7132</fpage>&#x2013;<lpage>7141</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/CVPR.2018.00745</pub-id>
</citation>
</ref>
<ref id="B19">
<citation citation-type="web">
<person-group person-group-type="author">
<collab>Indiana State Climate Office</collab>
</person-group> (<year>2022</year>). Available online at: <uri xlink:href="https://ag.purdue.edu/Indiana-state-climate/">https://ag.purdue.edu/Indiana-state-climate/</uri>.</citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Itti</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Koch</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Niebur</surname> <given-names>E.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>A model of saliency-based visual attention for rapid scene analysis</article-title>. <source>IEEE Trans. Pattern Anal. Mach. Intell.</source> <volume>20</volume>, <fpage>1254</fpage>&#x2013;<lpage>1259</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/34.730558</pub-id>
</citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jain</surname> <given-names>N.</given-names>
</name>
<name>
<surname>Ray</surname> <given-names>S. S.</given-names>
</name>
<name>
<surname>Singh</surname> <given-names>J. P.</given-names>
</name>
<name>
<surname>Panigrahy</surname> <given-names>S.</given-names>
</name>
</person-group> (<year>2007</year>). <article-title>Use of hyperspectral data to assess the effects of different nitrogen applications on a potato crop</article-title>. <source>Precis. Agric.</source> <volume>8</volume>, <fpage>225</fpage>&#x2013;<lpage>239</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1007/s11119-007-9042-0</pub-id>
</citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jiang</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Hu</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Zhong</surname> <given-names>R.</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Xu</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Huang</surname> <given-names>J.</given-names>
</name>
<etal/>
</person-group>. (<year>2020</year>). <article-title>A deep learning approach to conflating heterogeneous geospatial data for corn yield estimation: A case study of the US Corn Belt at the county level</article-title>. <source>Glob. Change Biol.</source> <volume>26</volume>, <fpage>1754</fpage>&#x2013;<lpage>1766</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1111/gcb.14885</pub-id>
</citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Khaki</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Pham</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>L.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Simultaneous corn and soybean yield prediction from remote sensing data using deep transfer learning</article-title>. <source>Sci. Rep.</source> <volume>11</volume>, <fpage>11132</fpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/s41598-021-89779-z</pub-id>
</citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Khaki</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>L.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Crop yield prediction using deep neural networks</article-title>. <source>Front. Plant Sci.</source> <volume>10</volume>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fpls.2019.00621</pub-id>
</citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kingma</surname> <given-names>D. P.</given-names>
</name>
<name>
<surname>Ba</surname> <given-names>J.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Adam: A method for stochastic optimization</article-title>. doi:&#xa0;<pub-id pub-id-type="doi">10.48550/arXiv.1412.6980</pub-id>
</citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kong</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Cui</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Xia</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Lv</surname> <given-names>H.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Convolution and long short-term memory hybrid deep neural networks for remaining useful life prognostics</article-title>. <source>Appl. Sci.</source> <volume>9</volume>, <elocation-id>4156</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/app9194156</pub-id>
</citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kumar</surname> <given-names>N.</given-names>
</name>
<name>
<surname>Dheenadayalan</surname> <given-names>K.</given-names>
</name>
<name>
<surname>Reddy</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Kulkarni</surname> <given-names>S.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Multimodal neural network for demand forecasting</article-title>. doi:&#xa0;<pub-id pub-id-type="doi">10.48550/arXiv.2210.11502</pub-id>
</citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>LaForest</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Hasheminasab</surname> <given-names>S. M.</given-names>
</name>
<name>
<surname>Zhou</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Flatt</surname> <given-names>J. E.</given-names>
</name>
<name>
<surname>Habib</surname> <given-names>A.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>New strategies for time delay estimation during system calibration for UAV-based GNSS/INS-assisted imaging systems</article-title>. <source>Remote Sens.</source> <volume>11</volume>, <elocation-id>1811</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/rs11151811</pub-id>
</citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>LeCun</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Bengio</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Hinton</surname> <given-names>G.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Deep learning</article-title>. <source>Nature</source> <volume>521</volume>, <fpage>436</fpage>&#x2013;<lpage>444</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1038/nature14539</pub-id>
</citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Cheng</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>Duan</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Sui</surname> <given-names>R.</given-names>
</name>
<name>
<surname>Huang</surname> <given-names>X.</given-names>
</name>
<etal/>
</person-group>. (<year>2022</year>). <article-title>UAV-based hyperspectral and ensemble machine learning for predicting yield in winter wheat</article-title>. <source>Agronomy</source> <volume>12</volume>, <elocation-id>202</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/agronomy12010202</pub-id>
</citation>
</ref>
<ref id="B31">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Li</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>H.-L.</given-names>
</name>
<name>
<surname>Song</surname> <given-names>B.-W.</given-names>
</name>
<name>
<surname>Zhu</surname> <given-names>P.-L.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>H.-W.</given-names>
</name>
</person-group> (<year>2016</year>). &#x201c;<article-title>Vegetation pixels extraction based on red-band enhanced normalized difference vegetation index</article-title>,&#x201d; in <conf-name>Eighth International Conference on Digital Image Processing (ICDIP 2016)</conf-name> (<publisher-name>SPIE</publisher-name>), <fpage>731</fpage>&#x2013;<lpage>737</lpage>.</citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Tian</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>Gao</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>S.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Global-local temporal representations for video person re-identification</article-title>. <source>IEEE Trans. Image Process.</source> <volume>29</volume>, <fpage>4461</fpage>&#x2013;<lpage>4473</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/TIP.2020.2972108</pub-id>
</citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lin</surname> <given-names>Y.-C.</given-names>
</name>
<name>
<surname>Cheng</surname> <given-names>Y.-T.</given-names>
</name>
<name>
<surname>Zhou</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Ravi</surname> <given-names>R.</given-names>
</name>
<name>
<surname>Hasheminasab</surname> <given-names>S. M.</given-names>
</name>
<name>
<surname>Flatt</surname> <given-names>J. E.</given-names>
</name>
<etal/>
</person-group>. (<year>2019</year>). <article-title>Evaluation of UAV LiDAR for mapping coastal environments</article-title>. <source>Remote Sens.</source> <volume>11</volume>, <elocation-id>2893</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/rs11242893</pub-id>
</citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lin</surname> <given-names>Y.-C.</given-names>
</name>
<name>
<surname>Habib</surname> <given-names>A.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Quality control and crop characterization framework for multi-temporal UAV LiDAR data over mechanized agricultural fields</article-title>. <source>Remote Sens. Environ.</source> <volume>256</volume>, <elocation-id>112299</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.rse.2021.112299</pub-id>
</citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lipton</surname> <given-names>Z. C.</given-names>
</name>
<name>
<surname>Berkowitz</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Elkan</surname> <given-names>C.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>A critical review of recurrent neural networks for sequence learning</article-title>. doi:&#xa0;<pub-id pub-id-type="doi">10.48550/arXiv.1506.00019</pub-id>
</citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Liu</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Chen</surname> <given-names>L.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Multi-modal fusion emotion recognition method of speech expression based on deep learning</article-title>. <source>Front. Neurorobotics</source> <volume>15</volume>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fnbot.2021.697634</pub-id>
</citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Maimaitijiang</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Sagan</surname> <given-names>V.</given-names>
</name>
<name>
<surname>Sidike</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Hartling</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Esposito</surname> <given-names>F.</given-names>
</name>
<name>
<surname>Fritschi</surname> <given-names>F. B.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Soybean yield prediction from UAV using multimodal data fusion and deep learning</article-title>. <source>Remote Sens. Environ.</source> <volume>237</volume>, <elocation-id>111599</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.rse.2019.111599</pub-id>
</citation>
</ref>
<ref id="B38">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Masjedi</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Carpenter</surname> <given-names>N. R.</given-names>
</name>
<name>
<surname>Crawford</surname> <given-names>M. M.</given-names>
</name>
<name>
<surname>Tuinstra</surname> <given-names>M. R.</given-names>
</name>
</person-group> (<year>2019</year>). &#x201c;<article-title>Prediction of sorghum biomass using UAV time series data and recurrent neural networks</article-title>,&#x201d; in <conf-name>2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)</conf-name>, <conf-loc>Long Beach, CA, USA</conf-loc>. <fpage>2695</fpage>&#x2013;<lpage>2702</lpage> (<publisher-name>IEEE</publisher-name>). doi:&#xa0;<pub-id pub-id-type="doi">10.1109/CVPRW.2019.00327</pub-id>
</citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Masjedi</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Crawford</surname> <given-names>M. M.</given-names>
</name>
<name>
<surname>Carpenter</surname> <given-names>N. R.</given-names>
</name>
<name>
<surname>Tuinstra</surname> <given-names>M. R.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Multi-temporal predictive modelling of sorghum biomass using UAV-based hyperspectral and LiDAR data</article-title>. <source>Remote Sens.</source> <volume>12</volume>, <elocation-id>3587</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/rs12213587</pub-id>
</citation>
</ref>
<ref id="B40">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Masjedi</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Zhao</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Thompson</surname> <given-names>A. M.</given-names>
</name>
<name>
<surname>Yang</surname> <given-names>K.-W.</given-names>
</name>
<name>
<surname>Flatt</surname> <given-names>J. E.</given-names>
</name>
<name>
<surname>Crawford</surname> <given-names>M. M.</given-names>
</name>
<etal/>
</person-group>. (<year>2018</year>). &#x201c;<article-title>Sorghum biomass prediction using UAV-based remote sensing data and crop model simulation</article-title>,&#x201d; in <conf-name>IGARSS 2018 - 2018 IEEE International Geoscience and Remote Sensing Symposium</conf-name>. <fpage>7719</fpage>&#x2013;<lpage>7722</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/IGARSS.2018.8519034</pub-id>
</citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Merzlyak</surname> <given-names>M. N.</given-names>
</name>
<name>
<surname>Gitelson</surname> <given-names>A. A.</given-names>
</name>
<name>
<surname>Chivkunova</surname> <given-names>O. B.</given-names>
</name>
<name>
<surname>Rakitin</surname> <given-names>V. Yu.</given-names>
</name>
</person-group> (<year>1999</year>). <article-title>Non-destructive optical detection of pigment changes during leaf senescence and fruit ripening</article-title>. <source>Physiol. Plant</source> <volume>106</volume>, <fpage>135</fpage>&#x2013;<lpage>141</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1034/j.1399-3054.1999.106119.x</pub-id>
</citation>
</ref>
<ref id="B42">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mnih</surname> <given-names>V.</given-names>
</name>
<name>
<surname>Heess</surname> <given-names>N.</given-names>
</name>
<name>
<surname>Graves</surname> <given-names>A.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Recurrent models of visual attention</article-title>. <source>Adv. Neural Inf. Process. Syst.</source> <volume>27</volume>. doi:&#xa0;<pub-id pub-id-type="doi">10.5555/2969033.2969073</pub-id>
</citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Niu</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Zhong</surname> <given-names>G.</given-names>
</name>
<name>
<surname>Yu</surname> <given-names>H.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>A review on the attention mechanism of deep learning</article-title>. <source>Neurocomputing</source> <volume>452</volume>, <fpage>48</fpage>&#x2013;<lpage>62</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.neucom.2021.03.091</pub-id>
</citation>
</ref>
<ref id="B44">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Razzaq</surname> <given-names>A.</given-names>
</name>
<name>
<surname>Kaur</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Akhter</surname> <given-names>N.</given-names>
</name>
<name>
<surname>Wani</surname> <given-names>S. H.</given-names>
</name>
<name>
<surname>Saleem</surname> <given-names>F.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Next-generation breeding strategies for climate-ready crops</article-title>. <source>Front. Plant Sci.</source> <volume>12</volume>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fpls.2021.620420</pub-id>
</citation>
</ref>
<ref id="B45">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Roujean</surname> <given-names>J.-L.</given-names>
</name>
<name>
<surname>Breon</surname> <given-names>F.-M.</given-names>
</name>
</person-group> (<year>1995</year>). <article-title>Estimating PAR absorbed by vegetation from bidirectional reflectance measurements</article-title>. <source>Remote Sens. Environ.</source> <volume>51</volume>, <fpage>375</fpage>&#x2013;<lpage>384</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/0034-4257(94)00114-3</pub-id>
</citation>
</ref>
<ref id="B46">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Serrano</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Smith</surname> <given-names>N. A.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Is attention interpretable?</article-title> doi:&#xa0;<pub-id pub-id-type="doi">10.48550/arXiv.1906.03731</pub-id>
</citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shen</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Mercatoris</surname> <given-names>B.</given-names>
</name>
<name>
<surname>Cao</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Kwan</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Guo</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Yao</surname> <given-names>H.</given-names>
</name>
<etal/>
</person-group>. (<year>2022</year>). <article-title>Improving wheat yield prediction accuracy using LSTM-RF framework based on UAV thermal infrared and multispectral imagery</article-title>. <source>Agriculture</source> <volume>12</volume>, <elocation-id>892</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/agriculture12060892</pub-id>
</citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shih</surname> <given-names>S.-Y.</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>F.-K.</given-names>
</name>
<name>
<surname>Lee</surname> <given-names>H.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Temporal pattern attention for multivariate time series forecasting</article-title>. <source>Mach. Learn.</source> <volume>108</volume>, <fpage>1421</fpage>&#x2013;<lpage>1441</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1007/s10994-019-05815-0</pub-id>
</citation>
</ref>
<ref id="B49">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shook</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Gangopadhyay</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Wu</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Ganapathysubramanian</surname> <given-names>B.</given-names>
</name>
<name>
<surname>Sarkar</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Singh</surname> <given-names>A. K.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Crop yield prediction integrating genotype and weather variables using deep learning</article-title>. <source>PloS One</source> <volume>16</volume>, <elocation-id>e0252402</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1371/journal.pone.0252402</pub-id>
</citation>
</ref>
<ref id="B50">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Sujatha</surname> <given-names>R.</given-names>
</name>
<name>
<surname>Isakki</surname> <given-names>P.</given-names>
</name>
</person-group> (<year>2016</year>). &#x201c;<article-title>A study on crop yield forecasting using classification techniques</article-title>,&#x201d; in <conf-name>2016 International Conference on Computing Technologies and Intelligent Data Engineering (ICCTIDE'16)</conf-name>. <fpage>1</fpage>&#x2013;<lpage>4</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/ICCTIDE.2016.7725357</pub-id>
</citation>
</ref>
<ref id="B51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sun</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Di</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Shen</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Lai</surname> <given-names>Z.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>County-level soybean yield prediction using deep CNN-LSTM model</article-title>. <source>Sensors</source> <volume>19</volume>, <elocation-id>4363</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/s19204363</pub-id>
</citation>
</ref>
<ref id="B52">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tian</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>P.</given-names>
</name>
<name>
<surname>Tansey</surname> <given-names>K.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>H.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>An LSTM neural network for improving wheat yield estimates by integrating remote sensing data and meteorological data in the Guanzhong Plain, PR China</article-title>. <source>Agric. For. Meteorol.</source> <volume>310</volume>, <elocation-id>108629</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.agrformet.2021.108629</pub-id>
</citation>
</ref>
<ref id="B53">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Toledo</surname> <given-names>C. A.</given-names>
</name>
<name>
<surname>Crawford</surname> <given-names>M.</given-names>
</name>
</person-group> (<year>2023</year>). &#x201c;<article-title>Deep learning models using multi-modal remote sensing for prediction of maize yield in plant breeding experiments</article-title>,&#x201d; in <conf-name>IGARSS 2023 - 2023 IEEE International Geoscience and Remote Sensing Symposium</conf-name>. <fpage>487</fpage>&#x2013;<lpage>490</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/IGARSS52108.2023.10281741</pub-id>
</citation>
</ref>
<ref id="B54">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Toledo</surname> <given-names>C. A.</given-names>
</name>
<name>
<surname>Crawford</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Vyn</surname> <given-names>T.</given-names>
</name>
</person-group> (<year>2022</year>). &#x201c;<article-title>Maize yield prediction based on multi-modality remote sensing and LSTM models in nitrogen management practice trials</article-title>,&#x201d; in <conf-name>2022 12th Workshop on Hyperspectral Imaging and Signal Processing: Evolution in Remote Sensing (WHISPERS)</conf-name>. <fpage>1</fpage>&#x2013;<lpage>7</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/WHISPERS56178.2022.9955086</pub-id>
</citation>
</ref>
<ref id="B55">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tolley</surname> <given-names>S. A.</given-names>
</name>
<name>
<surname>Brito</surname> <given-names>L. F.</given-names>
</name>
<name>
<surname>Wang</surname> <given-names>D. R.</given-names>
</name>
<name>
<surname>Tuinstra</surname> <given-names>M. R.</given-names>
</name>
</person-group> (<year>2023</year>). <article-title>Genomic prediction and association mapping of maize grain yield in multi-environment trials based on reaction norm models</article-title>. <source>Front. Genet.</source> <volume>14</volume>, <elocation-id>1221751</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fgene.2023.1221751</pub-id>
</citation>
</ref>
<ref id="B56">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ullah</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Rahman</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Muhammad</surname> <given-names>N.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Evaluation of maize hybrids for maturity and related traits</article-title>. <source>Sarhad J. Agric.</source> <volume>33</volume>, <fpage>624</fpage>&#x2013;<lpage>629</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.17582/journal.sja/2017/33.4.624.629</pub-id>
</citation>
</ref>
<ref id="B57">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Vogelmann</surname> <given-names>J. E.</given-names>
</name>
<name>
<surname>Rock</surname> <given-names>B. N.</given-names>
</name>
<name>
<surname>Moss</surname> <given-names>D. M.</given-names>
</name>
</person-group> (<year>1993</year>). <article-title>Red edge spectral measurements from sugar maple leaves</article-title>. <source>Int. J. Remote Sens.</source> <volume>14</volume>, <fpage>1563</fpage>&#x2013;<lpage>1575</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1080/01431169308953986</pub-id>
</citation>
</ref>
<ref id="B58">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wan</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Cen</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Zhu</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Zhu</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Sun</surname> <given-names>D.</given-names>
</name>
<etal/>
</person-group>. (<year>2020</year>). <article-title>Grain yield prediction of rice using multi-temporal UAV-based RGB and multispectral images and model transfer &#x2013; a case study of small farmlands in the South of China</article-title>. <source>Agric. For. Meteorol.</source> <volume>291</volume>, <elocation-id>108096</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.agrformet.2020.108096</pub-id>
</citation>
</ref>
<ref id="B59">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Crawford</surname> <given-names>M. M.</given-names>
</name>
</person-group> (<year>2021</year>). &#x201c;<article-title>Multi-year sorghum biomass prediction with UAV-based remote sensing data</article-title>,&#x201d; in <conf-name>2021 IEEE International Geoscience and Remote Sensing Symposium IGARSS</conf-name>. <fpage>4312</fpage>&#x2013;<lpage>4315</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1109/IGARSS47720.2021.9554313</pub-id>
</citation>
</ref>
<ref id="B60">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>T.</given-names>
</name>
<name>
<surname>Crawford</surname> <given-names>M. M.</given-names>
</name>
<name>
<surname>Tuinstra</surname> <given-names>M. R.</given-names>
</name>
</person-group> (<year>2023</year>). <article-title>A novel transfer learning framework for sorghum biomass prediction using UAV-based remote sensing data and genetic markers</article-title>. <source>Front. Plant Sci.</source> <volume>14</volume>, <elocation-id>1138479</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3389/fpls.2023.1138479</pub-id>
</citation>
</ref>
<ref id="B61">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Huang</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Feng</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>Yin</surname> <given-names>D.</given-names>
</name>
</person-group> (<year>2020</year>c). <article-title>Winter wheat yield prediction at county level and uncertainty analysis in main wheat-producing regions of China with deep learning approaches</article-title>. <source>Remote Sens.</source> <volume>12</volume>, <elocation-id>1744</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/rs12111744</pub-id>
</citation>
</ref>
<ref id="B62">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Wang</surname> <given-names>W.</given-names>
</name>
<name>
<surname>Tran</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Feiszli</surname> <given-names>M.</given-names>
</name>
</person-group> (<year>2020</year>a). &#x201c;<article-title>What makes training multi-modal classification networks hard</article-title>?,&#x201d; in <conf-name>Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition</conf-name>. <fpage>12695</fpage>&#x2013;<lpage>12705</lpage>.</citation>
</ref>
<ref id="B63">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Xian</surname> <given-names>Q.</given-names>
</name>
<name>
<surname>Liang</surname> <given-names>W.</given-names>
</name>
</person-group> (<year>2022</year>). &#x201c;<article-title>A multi-modal time series intelligent prediction model</article-title>,&#x201d; in <source>Proceeding of 2021 International Conference on Wireless Communications, Networking and Applications</source>. Eds. <person-group person-group-type="editor">
<name>
<surname>Qian</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Jabbar</surname> <given-names>M. A.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>X.</given-names>
</name>
</person-group> (<publisher-name>Springer Nature</publisher-name>, <publisher-loc>Singapore</publisher-loc>), <fpage>1150</fpage>&#x2013;<lpage>1157</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1007/978-981-19-2456-9_115</pub-id>
</citation>
</ref>
<ref id="B64">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Zheng</surname> <given-names>H.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Olsen</surname> <given-names>M. S.</given-names>
</name>
<etal/>
</person-group>. (<year>2022</year>). <article-title>Smart breeding driven by big data, artificial intelligence, and integrated genomic-enviromic prediction</article-title>. <source>Mol. Plant</source> <volume>15</volume>, <fpage>1664</fpage>&#x2013;<lpage>1695</lpage>. doi:&#xa0;<pub-id pub-id-type="doi">10.1016/j.molp.2022.09.001</pub-id>
</citation>
</ref>
<ref id="B65">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Yang</surname> <given-names>C.</given-names>
</name>
<name>
<surname>Baireddy</surname> <given-names>S.</given-names>
</name>
<name>
<surname>Cai</surname> <given-names>E.</given-names>
</name>
<name>
<surname>Crawford</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Delp</surname> <given-names>E. J.</given-names>
</name>
</person-group> (<year>2021</year>). &#x201c;<article-title>Field-based plot extraction using UAV RGB images</article-title>,&#x201d; in <conf-name>2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW)</conf-name>, <conf-loc>Montreal, BC, Canada</conf-loc>. <fpage>1390</fpage>&#x2013;<lpage>1398</lpage> (<publisher-name>IEEE</publisher-name>). doi:&#xa0;<pub-id pub-id-type="doi">10.1109/ICCVW54120.2021.00160</pub-id>
</citation>
</ref>
<ref id="B66">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>You</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Li</surname> <given-names>X.</given-names>
</name>
<name>
<surname>Low</surname> <given-names>M.</given-names>
</name>
<name>
<surname>Lobell</surname> <given-names>D.</given-names>
</name>
<name>
<surname>Ermon</surname> <given-names>S.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Deep Gaussian process for crop yield prediction based on remote sensing data</article-title>. <source>Proc. AAAI Conf. Artif. Intell.</source> <volume>31</volume>. doi:&#xa0;<pub-id pub-id-type="doi">10.1609/aaai.v31i1.11172</pub-id>
</citation>
</ref>
<ref id="B67">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname> <given-names>L.</given-names>
</name>
<name>
<surname>Zhang</surname> <given-names>Z.</given-names>
</name>
<name>
<surname>Luo</surname> <given-names>Y.</given-names>
</name>
<name>
<surname>Cao</surname> <given-names>J.</given-names>
</name>
<name>
<surname>Tao</surname> <given-names>F.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Combining optical, fluorescence, thermal satellite, and environmental data to predict county-level maize yield in China using machine learning approaches</article-title>. <source>Remote Sens.</source> <volume>12</volume>, <elocation-id>21</elocation-id>. doi:&#xa0;<pub-id pub-id-type="doi">10.3390/rs12010021</pub-id>
</citation>
</ref>
</ref-list>
</back>
</article>