<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="review-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Energy Res.</journal-id>
<journal-title>Frontiers in Energy Research</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Energy Res.</abbrev-journal-title>
<issn pub-type="epub">2296-598X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">788320</article-id>
<article-id pub-id-type="doi">10.3389/fenrg.2021.788320</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Energy Research</subject>
<subj-group>
<subject>Methods</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>A Hybrid Forecasting Model Based on CNN and Informer for Short-Term Wind Power</article-title>
<alt-title alt-title-type="left-running-head">Wang et&#x20;al.</alt-title>
<alt-title alt-title-type="right-running-head">Hybrid Short-Term Wind Power Prediction</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Wang</surname>
<given-names>Hai-Kun</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1500966/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Song</surname>
<given-names>Ke</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1553563/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Cheng</surname>
<given-names>Yi</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>School of Artificial Intelligence</institution>, <institution>Chongqing University of Technology</institution>, <addr-line>Chongqing</addr-line>, <country>China</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>Chongqing Industrial Big Data Innovation Center Co., Ltd.</institution>, <addr-line>Chongqing</addr-line>, <country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1065277/overview">Sofiane Khadraoui</ext-link>, University of Sharjah, United Arab Emirates</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1043909/overview">Neeraj Dhanraj Bokde</ext-link>, Aarhus University, Denmark</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1121848/overview">Yushuai Li</ext-link>, University of Oslo, Norway</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Hai-Kun Wang, <email>hkwang@cqut.edu.cn</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Wind Energy, a section of the journal Frontiers in Energy Research</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>24</day>
<month>01</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>9</volume>
<elocation-id>788320</elocation-id>
<history>
<date date-type="received">
<day>02</day>
<month>10</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>31</day>
<month>12</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2022 Wang, Song and Cheng.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Wang, Song and Cheng</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>Wind power prediction reduces the uncertainty of an entire energy system, which is very important for balancing energy supply and demand. To improve the prediction accuracy, an average wind power prediction method based on a convolutional neural network and a model named Informer is proposed. The original data cover only one time scale and therefore carry little time information and few trends. A 2-D convolutional neural network is employed to extract additional time features and trend information. To improve the accuracy of long sequence input prediction, Informer is applied to predict the average wind power. The proposed model was trained and tested on a dataset from a real wind farm in a region of China. The evaluation metrics were MAE, MSE, RMSE, and MAPE. The experimental results show that the proposed method achieves good performance and effectively improves the accuracy of average wind power prediction.</p>
</abstract>
<kwd-group>
<kwd>average wind power prediction</kwd>
<kwd>long sequence input prediction</kwd>
<kwd>convolution</kwd>
<kwd>informer</kwd>
<kwd>A hybrid method</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<title>Introduction</title>
<p>With the rapid development of the global economy, people&#x2019;s living standards and the global energy demand are continuously increasing, while fossil-fuel energy sources have declined (<xref ref-type="bibr" rid="B4">Chakraborty et&#x20;al., 2018</xref>; <xref ref-type="bibr" rid="B32">Tu et&#x20;al., 2019</xref>). Wind power generation, which has the advantages of being clean, low-cost and in ample supply, is an indispensable aspect of developing new global energy (<xref ref-type="bibr" rid="B6">Chen and Yu, 2014</xref>; <xref ref-type="bibr" rid="B19">Hu et&#x20;al., 2021</xref>; <xref ref-type="bibr" rid="B26">Oh and Son, 2020</xref>). The installed capacity of wind generation worldwide reached 644.5&#xa0;GW in 2018, which was 17.4% higher than in the previous year (<xref ref-type="bibr" rid="B39">Zhang et&#x20;al., 2020</xref>). The Global Wind Energy Development Report 2019 shows that the newly installed capacity of global wind turbines in 2019 was 60.4&#xa0;GW. The instability of wind power is the main problem faced by the grid-connected operation technology of wind power (<xref ref-type="bibr" rid="B3">Chai et&#x20;al., 2015</xref>; <xref ref-type="bibr" rid="B20">Jiang et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B23">Li et&#x20;al., 2019</xref>; <xref ref-type="bibr" rid="B18">Hu et&#x20;al., 2020</xref>). As the number of large-capacity wind farms increases, once their grid penetration surpasses a certain limit, the randomness and low energy density of wind energy will seriously affect the stability of the power system and may even threaten the safety of the whole power grid (<xref ref-type="bibr" rid="B5">Chang, 2014</xref>; <xref ref-type="bibr" rid="B15">Hazari et&#x20;al., 2018</xref>).
Therefore, only more accurate forecasting of wind power generation can guarantee the effective operation of the whole mechanism and enhance the stability of the whole system (<xref ref-type="bibr" rid="B16">Hong and Rioflorido, 2019</xref>; <xref ref-type="bibr" rid="B38">Zhang et&#x20;al., 2019</xref>).</p>
<p>Currently, the main wind power forecasting methods include physical methods, statistical methods, and artificial intelligence methods. The physical forecasting method was the first to be applied in wind power forecasting. It mainly comprises three technical steps: the introduction of numerical weather prediction (NWP) data, the acquisition of wind speed and direction at the hub height of a wind turbine, and wind speed-power conversion (<xref ref-type="bibr" rid="B12">Feng et&#x20;al., 2010</xref>). Men (<xref ref-type="bibr" rid="B25">Men et&#x20;al., 2016</xref>) used the Gauss hybrid model to construct the mapping relationship between measured wind speed and NWP data and employed this model to correct the NWP wind speed. The accuracy of both the corrected NWP data and the power prediction was greatly improved. Cassola (<xref ref-type="bibr" rid="B2">Cassola and Burlando, 2012</xref>) used the Kalman filter algorithm to filter the NWP output, which effectively reduced the systematic error of weather forecasting and significantly improved the accuracy of the NWP model. However, physical prediction models that directly use NWP data often cannot meet the accuracy required in applications. In addition, because NWP data are updated infrequently, these models struggle to meet the requirements of 0&#x2013;3&#xa0;h forecasting. The statistical method does not require the introduction of historical wind information from wind farms. Instead, it extrapolates the wind power output of a wind farm at a particular time in the future from the historical sequence characteristics (such as autocorrelation, partial correlation, and standard deviation) of the power generated by the wind farm.
Erdem (<xref ref-type="bibr" rid="B11">Erdem and Shi, 2011</xref>) decomposed the wind speed into horizontal and vertical components according to the direction of the wind and constructed an ARMA model to predict the components separately, which improved the prediction results. Pan (<xref ref-type="bibr" rid="B27">Pan et&#x20;al., 2008</xref>) combined the time series analysis method with the Kalman filter to dynamically correct the prediction model, which improved the prediction accuracy at the next moment. Dong (<xref ref-type="bibr" rid="B9">Dong et&#x20;al., 2008</xref>) utilized the phase space theory of chaotic time series to construct a wind power neural network prediction&#x20;model.</p>
<p>The artificial intelligence (AI) method mainly uses one or more AI algorithms to learn from historical power data and then predict future wind power. Kariniotakis (<xref ref-type="bibr" rid="B21">Kariniotakis et&#x20;al., 1996</xref>) proposed ultrashort-term wind power prediction using an artificial neural network (ANN). Shukur and Lee (<xref ref-type="bibr" rid="B31">Shukur and Lee, 2015</xref>) proposed a hybrid Kalman filter (KF)-ANN system to predict the wind speed sequences of Malaysia and Iraq. Chen (<xref ref-type="bibr" rid="B8">Chen and Folly, 2021</xref>) proposed a mixed input features-based cascade-connected artificial neural network (MIF-CANN). The method can train on input features from many neighbouring stations without the overfitting that a large number of input features would otherwise cause. In the first layer of the MIF-CANN model, multiple ANNs are trained on different combinations of input features to produce preliminary results, which are then cascaded into the second phase of the model as inputs. Hu (<xref ref-type="bibr" rid="B17">Hu et&#x20;al., 2014</xref>) applied Bayes theory to optimize the traditional SVM loss function and established a v-SVM model, which improved the accuracy of short-term wind speed prediction. With the development of big data technology, AI prediction methods have gradually developed from machine learning algorithms to deep learning algorithms (<xref ref-type="bibr" rid="B34">Wang et&#x20;al., 2017</xref>). Haq (<xref ref-type="bibr" rid="B14">Haq and Zhen, 2019</xref>) proposed the improved empirical mode decomposition (IEMD) to decompose the load demand time series and selected T-Copula to incorporate the effect of exogenous variables by performing correlation analysis. Recently, many advanced models based on deep learning have also been reported (<xref ref-type="bibr" rid="B36">Wu et&#x20;al., 2019</xref>). Khodayar (<xref ref-type="bibr" rid="B22">Khodayar and Wang, 2019</xref>) presented an algorithm for deep neural networks (DNNs).
Zhu (<xref ref-type="bibr" rid="B43">Zhu et&#x20;al., 2017</xref>) used long short-term memory (LSTM) to model multivariable time series to achieve wind power prediction. Chen (<xref ref-type="bibr" rid="B7">Chen et&#x20;al., 2019</xref>) conducted correlation research on wind speed prediction based on extreme learning machines (ELMs), Elman neural networks, and LSTM networks. Han (<xref ref-type="bibr" rid="B13">Han et&#x20;al., 2019</xref>) proposed a model based on the copula function and LSTM, which achieved better prediction results. Zhou (<xref ref-type="bibr" rid="B41">Zhou et&#x20;al., 2019</xref>) proposed a K-means-long short-term memory (K-means-LSTM) neural network to classify wind power factors and establish a sub-prediction model. Peng (<xref ref-type="bibr" rid="B29">Peng et&#x20;al., 2021</xref>) proposed a new neural-network prediction model named encoder attention BiLSTM-quantile regression (EALSTM-QR), which was developed for wind-power prediction considering the input of NWP and the deep-learning method. The combined inputs contain historical wind-power data and features extracted from the NWP through the encoder and attention levels. The bidirectional LSTM was utilized to generate wind-power time-series probability prediction results. The QR method and confidence interval limits were applied to obtain the final prediction intervals. Hu (<xref ref-type="bibr" rid="B19">Hu et&#x20;al., 2021</xref>) proposed an improved deep belief network forecasting method for wind power, which employed a Gaussian-Bernoulli restricted Boltzmann machine. Wang (<xref ref-type="bibr" rid="B35">Wang et&#x20;al., 2021</xref>) applied a convolutional neural network for feature reconfiguration with temporal information, which increased the proportion of valid data, reduced the influence of outliers, and helped the neural network capture features and regularities from the historical dataset.
Zhang (<xref ref-type="bibr" rid="B40">Zhang et&#x20;al., 2021</xref>) proposed a power prediction method for a wind farm cluster based on spatiotemporal correlations. Pandey (<xref ref-type="bibr" rid="B28">Pandey et&#x20;al., 2021</xref>) proposed two hybrid models for water demand forecasting. The first approach is based on the hybridization of ensemble empirical mode decomposition (EEMD) and difference pattern sequence forecasting (DPSF), and the second approach is based on the hybridization of EEMD with DPSF and autoregressive integrated moving average (ARIMA). The EEMD-DPSF approach provides better results, whereas the EEMD-DPSF-ARIMA approach requires shorter computational times. Shi (<xref ref-type="bibr" rid="B30">Shi et&#x20;al., 2021</xref>) proposed a hybrid neural network model for short-term load forecasting based on a temporal convolutional network (TCN) and gated recurrent unit (GRU) and utilized the state-of-the-art AdaBelief optimizer and an attention mechanism to enhance the prediction accuracy and efficiency. Dong (<xref ref-type="bibr" rid="B10">Dong et&#x20;al., 2021</xref>) proposed a regional wind power probabilistic forecasting model comprising an improved kernel density estimation (IKDE), regular vine copulas, and ensemble learning. Wu (<xref ref-type="bibr" rid="B37">Wu et&#x20;al., 2020</xref>) utilized a Transformer to predict time series data. This method applied the self-attention mechanism to learn complex patterns and dynamics from time series data. However, some problems, such as high time and memory complexity and limited input and output sequence lengths, were still encountered. Zhou (<xref ref-type="bibr" rid="B42">Zhou et&#x20;al., 2021</xref>) proposed Informer, a more effective time series prediction model than Transformer (<xref ref-type="bibr" rid="B33">Vaswani et&#x20;al., 2017</xref>). Some hybrid models of wind power prediction are summarized in <xref ref-type="table" rid="T1">Table&#x20;1</xref> for reference.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Recent studies for wind power forecasting based on hybrid models.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Authors</th>
<th align="center">Year</th>
<th align="center">Approach</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Men (<xref ref-type="bibr" rid="B25">Men et&#x20;al., 2016</xref>)</td>
<td align="center">2016</td>
<td align="left">Gauss Hybrid Model</td>
</tr>
<tr>
<td align="left">Zhu (<xref ref-type="bibr" rid="B43">Zhu et&#x20;al., 2017</xref>)</td>
<td align="center">2017</td>
<td align="left">LSTM</td>
</tr>
<tr>
<td align="left">Chen (<xref ref-type="bibr" rid="B7">Chen et&#x20;al., 2019</xref>)</td>
<td align="center">2019</td>
<td align="left">ELM-LSTM</td>
</tr>
<tr>
<td align="left">Han (<xref ref-type="bibr" rid="B13">Han et&#x20;al., 2019</xref>)</td>
<td align="center">2019</td>
<td align="left">Copula-LSTM</td>
</tr>
<tr>
<td align="left">Zhou (<xref ref-type="bibr" rid="B41">Zhou et&#x20;al., 2019</xref>)</td>
<td align="center">2019</td>
<td align="left">K-means-LSTM</td>
</tr>
<tr>
<td align="left">Haq (<xref ref-type="bibr" rid="B14">Haq and Zhen, 2019</xref>)</td>
<td align="center">2019</td>
<td align="left">IEMD-T-Copula</td>
</tr>
<tr>
<td align="left">Khodayar (<xref ref-type="bibr" rid="B22">Khodayar and Wang, 2019</xref>)</td>
<td align="center">2019</td>
<td align="left">DNN</td>
</tr>
<tr>
<td align="left">Zhang (<xref ref-type="bibr" rid="B40">Zhang et&#x20;al., 2021</xref>)</td>
<td align="center">2021</td>
<td align="left">Spatiotemporal Correlations</td>
</tr>
<tr>
<td align="left">Wu (<xref ref-type="bibr" rid="B37">Wu et&#x20;al., 2020</xref>)</td>
<td align="center">2020</td>
<td align="left">Transformer</td>
</tr>
<tr>
<td align="left">Chen (<xref ref-type="bibr" rid="B8">Chen and Folly, 2021</xref>)</td>
<td align="center">2021</td>
<td align="left">MIF-CANN</td>
</tr>
<tr>
<td align="left">Hu (<xref ref-type="bibr" rid="B19">Hu et&#x20;al., 2021</xref>)</td>
<td align="center">2021</td>
<td align="left">Improved-DBN</td>
</tr>
<tr>
<td align="left">Pandey (<xref ref-type="bibr" rid="B28">Pandey et&#x20;al., 2021</xref>)</td>
<td align="center">2021</td>
<td align="left">EEMD-DPSF-ARIMA</td>
</tr>
<tr>
<td align="left">Shi (<xref ref-type="bibr" rid="B30">Shi et&#x20;al., 2021</xref>)</td>
<td align="center">2021</td>
<td align="left">TCN-GRU</td>
</tr>
<tr>
<td align="left">Wang (<xref ref-type="bibr" rid="B35">Wang et&#x20;al., 2021</xref>)</td>
<td align="center">2021</td>
<td align="left">CNN Feature Extraction</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>In summary, most recent research on wind power prediction is based on machine learning (ML), artificial neural networks (ANNs), convolutional neural networks (CNNs), and recurrent neural networks (RNNs). These methods can effectively predict wind power. However, when the amount of input data grows and the output sequence becomes long, their performance is less satisfactory. Large amounts of data are now used in practical applications, so forecasting wind power more accurately in a big-data environment is a problem that needs to be solved.</p>
<p>This paper presents a CNN-Informer method for short-term average wind power prediction. The average wind power reflects the overall trend of wind power over a given period, and the total wind power generation over a future period can be obtained from the predicted average power for that period. To overcome the limited time series information contained in the historical power generation of a wind generator set at a single time scale, a convolutional neural network is used to divide the original data into time series at different time scales, and the resulting sub-sequences are input into the Informer model for training. The outputs are fused to obtain the final wind power prediction.</p>
<p>The main contributions of this paper are presented as follows:</p>
<p>Wind power prediction is a long time-series prediction problem. Therefore, to handle long sequence inputs, Informer is used to predict wind power in this&#x20;paper.</p>
<p>To fully capture the time-series features contained in the wind power data, this paper uses a convolutional neural network to extract features from the original wind power data, which addresses the limitation that the original data cover only a single time scale.</p>
<p>This paper is organized as follows: <italic>Methodology of Wind Power Prediction</italic> Section describes convolution, Informer, and the structure of the proposed CNN-Informer model. <italic>Experiment of Wind Power Prediction</italic> Section describes the datasets of wind power and illustrates the results of the experiment in this paper. The conclusions are summarized in <italic>Conclusion</italic> Section.</p>
</sec>
<sec id="s2">
<title>Methodology of Wind Power Prediction</title>
<p>This paper proposes a hybrid network model based on a convolutional neural network and Informer to forecast wind&#x20;power.</p>
<p>The convolutional neural network can extract sufficient features from time series data, and Informer can more accurately predict long sequence inputs. The proposed model can effectively combine the advantages of these deep learning networks.</p>
<p>This section introduces the convolutional neural network, Informer, and the proposed&#x20;model.</p>
<sec id="s2-1">
<title>Description of Convolutional Layers</title>
<p>Historical wind power data at a single time scale contain little time information and cannot fully reflect the time sequence information and trend. Therefore, more time sequence features need to be extracted from the original wind power data. Because convolutional neural networks can effectively extract useful features, this paper adopts a convolutional neural network to extract different time sequence features from the original wind power data. The original wind power sequence is convolved into wind power sequences at different scales by two-dimensional convolution as follows:<disp-formula id="e1">
<mml:math id="m1">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="italic">Conv</mml:mi>
<mml:mn>2</mml:mn>
<mml:mi>d</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>p</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(1)</label>
</disp-formula>
<inline-formula id="inf1">
<mml:math id="m2">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> represents the sequence of wind power generated by convolution at different time scales, and <inline-formula id="inf2">
<mml:math id="m3">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>p</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> represents the original historical sequence of wind power. The network structure diagram of this part is shown in <xref ref-type="fig" rid="F1">Figure&#x20;1</xref>.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Structure of convolution layers.</p>
</caption>
<graphic xlink:href="fenrg-09-788320-g001.tif"/>
</fig>
<p>Two-dimensional convolutions with convolution kernel sizes of 15&#xd7;1, 30&#xd7;1, 60&#xd7;1, 90&#xd7;1, and 120&#xd7;1 are employed to extract features of different time scales. Five convolution kernels are selected to divide the original wind power sequence into five sub-sequences with time scales of 15, 30, 60, 90, and 120&#xa0;min.</p>
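<p>The multi-scale split can be illustrated with a minimal Python sketch. It is not the trained model: it assumes a 1-min sampling interval, uniform averaging weights in place of the learned kernel weights, and a stride equal to the kernel length, none of which are specified in this section. The function names conv_kx1 and multi_scale_subsequences are hypothetical.</p>
<preformat><![CDATA[
```python
def conv_kx1(series, weights, stride):
    """Valid (k x 1) convolution of a 1-D series with the given stride."""
    k = len(weights)
    out = []
    for start in range(0, len(series) - k + 1, stride):
        window = series[start:start + k]
        out.append(sum(w * x for w, x in zip(weights, window)))
    return out


def multi_scale_subsequences(series, scales=(15, 30, 60, 90, 120)):
    """Return one sub-sequence per time scale, keyed by kernel length.

    Assumption: uniform averaging weights and stride == kernel length
    stand in for the learned Conv2d kernels of the actual model.
    """
    result = {}
    for k in scales:
        weights = [1.0 / k] * k
        result[k] = conv_kx1(series, weights, stride=k)
    return result


# Hypothetical 1-min power readings: 360 samples, i.e. 6 hours.
power = [float(i % 60) for i in range(360)]
subs = multi_scale_subsequences(power)
for k, seq in subs.items():
    print(k, len(seq))
```
]]></preformat>
<p>Under these assumptions, a 360-sample input yields sub-sequences of 24, 12, 6, 4, and 3 points for the five kernels. Longer kernels give shorter and smoother sub-sequences, which is the sense in which each kernel exposes a different time scale.</p>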
</sec>
<sec id="s2-2">
<title>Description of Informer</title>
<p>Informer (<xref ref-type="bibr" rid="B42">Zhou et&#x20;al., 2021</xref>) is an attention-based network structure that improves on three aspects of the Transformer: the quadratic computational complexity of the self-attention mechanism, multilayer network stacking, and the step-by-step decoding method. Informer mainly targets the prediction of long series data; its overall architecture is shown in <xref ref-type="fig" rid="F2">Figure&#x20;2</xref>.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Architecture of Informer.</p>
</caption>
<graphic xlink:href="fenrg-09-788320-g002.tif"/>
</fig>
<p>In the encoder part of the model, ProbSparse self-attention (<xref ref-type="bibr" rid="B42">Zhou et&#x20;al., 2021</xref>) replaces canonical self-attention, and self-attention distilling is used to reduce the size of the network. The decoder receives the long sequence of inputs, pads the target elements with zeros, and predicts all outputs at once in a generative manner rather than step by step.</p>
<p>ProbSparse Self-attention: The <inline-formula id="inf3">
<mml:math id="m4">
<mml:mi>i</mml:mi>
</mml:math>
</inline-formula>-th query&#x2019;s attention on all the keys is defined as probability <inline-formula id="inf4">
<mml:math id="m5">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>k</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo>&#x7c;</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, and the output is its composition with values <inline-formula id="inf5">
<mml:math id="m6">
<mml:mi>v</mml:mi>
</mml:math>
</inline-formula> in this model (<xref ref-type="bibr" rid="B42">Zhou et&#x20;al., 2021</xref>). The similarity between <inline-formula id="inf6">
<mml:math id="m7">
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>k</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo>&#x7c;</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula> and the uniform distribution <inline-formula id="inf7">
<mml:math id="m8">
<mml:mrow>
<mml:mi>q</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>k</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
<mml:mo>&#x7c;</mml:mo>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</inline-formula> is measured with an approximation derived from the Kullback&#x2013;Leibler divergence, which gives the following measurement:<disp-formula id="e2">
<mml:math id="m9">
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>M</mml:mi>
<mml:mo>&#xaf;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>K</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:munder>
<mml:mrow>
<mml:mi>max</mml:mi>
</mml:mrow>
<mml:mi>j</mml:mi>
</mml:munder>
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msubsup>
<mml:mi>k</mml:mi>
<mml:mi>j</mml:mi>
<mml:mi>T</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mrow>
<mml:msqrt>
<mml:mi>d</mml:mi>
</mml:msqrt>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo>}</mml:mo>
</mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mrow>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
</mml:munderover>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:msubsup>
<mml:mi>k</mml:mi>
<mml:mi>j</mml:mi>
<mml:mi>T</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mrow>
<mml:msqrt>
<mml:mi>d</mml:mi>
</mml:msqrt>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
<label>(2)</label>
</disp-formula>
</p>
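<p>As a minimal sketch of Equation 2, the following Python function scores a single query against all keys as the maximum scaled dot product minus the mean scaled dot product. The helper names dot and sparsity_measure are hypothetical, and the sketch evaluates every key exactly; the original Informer paper estimates this quantity from a sampled subset of keys, which is omitted here.</p>
<preformat><![CDATA[
```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def sparsity_measure(q, keys):
    """Max-mean measurement of Eq. 2 for one query q against all keys."""
    scale = math.sqrt(len(q))
    scores = [dot(q, k) / scale for k in keys]
    return max(scores) - sum(scores) / len(scores)


# Hypothetical 2-dimensional keys.
keys = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]]
flat_query = [0.0, 0.0]    # attends to every key equally
peaky_query = [2.0, 0.0]   # favours keys with a large first component
print(sparsity_measure(flat_query, keys))
print(sparsity_measure(peaky_query, keys))
```
]]></preformat>
<p>A query whose scores are identical for all keys has a measurement of zero, which corresponds to uniform attention. The second query has a strictly positive measurement because its scores are spread unevenly across the keys.</p>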
<p>If the <inline-formula id="inf8">
<mml:math id="m10">
<mml:mi>i</mml:mi>
</mml:math>
</inline-formula>-th query gains a larger <inline-formula id="inf9">
<mml:math id="m11">
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>M</mml:mi>
<mml:mo>&#xaf;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>K</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, its attention probability <inline-formula id="inf10">
<mml:math id="m12">
<mml:mi>p</mml:mi>
</mml:math>
</inline-formula> is more &#x201c;diverse&#x201d; and has a high chance of containing the dominant dot-product pairs in the header field of the long tail self-attention distribution (<xref ref-type="bibr" rid="B42">Zhou et&#x20;al., 2021</xref>). According to this measurement, Informer only focuses on top-<inline-formula id="inf11">
<mml:math id="m13">
<mml:mi>u</mml:mi>
</mml:math>
</inline-formula> dominant queries for each <inline-formula id="inf12">
<mml:math id="m14">
<mml:mi>k</mml:mi>
</mml:math>
</inline-formula> value:<disp-formula id="e3">
<mml:math id="m15">
<mml:mrow>
<mml:mi>A</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi>Q</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>K</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>V</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="italic">Softmax</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>Q</mml:mi>
<mml:mo>&#xaf;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:msup>
<mml:mi>K</mml:mi>
<mml:mi>T</mml:mi>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:msqrt>
<mml:mi>d</mml:mi>
</mml:msqrt>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mi>V</mml:mi>
</mml:mrow>
</mml:math>
<label>(3)</label>
</disp-formula>
<inline-formula id="inf13">
<mml:math id="m16">
<mml:mrow>
<mml:msub>
<mml:mi>q</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is <inline-formula id="inf14">
<mml:math id="m17">
<mml:mi>Q</mml:mi>
</mml:math>
</inline-formula>&#x2019;s value in the <inline-formula id="inf15">
<mml:math id="m18">
<mml:mi>i</mml:mi>
</mml:math>
</inline-formula>-th row, <inline-formula id="inf16">
<mml:math id="m19">
<mml:mrow>
<mml:msub>
<mml:mi>k</mml:mi>
<mml:mi>j</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is <inline-formula id="inf17">
<mml:math id="m20">
<mml:mi>K</mml:mi>
</mml:math>
</inline-formula>&#x2019;s value in the <inline-formula id="inf18">
<mml:math id="m21">
<mml:mi>j</mml:mi>
</mml:math>
</inline-formula>-th row, and <inline-formula id="inf19">
<mml:math id="m22">
<mml:mi>d</mml:mi>
</mml:math>
</inline-formula> is the input dimension. <inline-formula id="inf20">
<mml:math id="m23">
<mml:mrow>
<mml:mover accent="true">
<mml:mi>Q</mml:mi>
<mml:mo>&#xaf;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula> is a sparse matrix that contains only the top-<inline-formula id="inf21">
<mml:math id="m24">
<mml:mi>u</mml:mi>
</mml:math>
</inline-formula> queries.</p>
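<p>Equation 3 can be sketched in plain Python as follows. This is an illustrative reading of the formula, not the reference implementation: the sparse query matrix is built by zeroing every row outside the top-u set, so those rows receive uniform softmax weights and their output is the mean of the value vectors. The function names are hypothetical, and the measurement is computed exactly over all keys rather than on a sample.</p>
<preformat><![CDATA[
```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def sparsity_measure(q, keys):
    """Max-mean measurement of Eq. 2 for one query."""
    scale = math.sqrt(len(q))
    scores = [dot(q, k) / scale for k in keys]
    return max(scores) - sum(scores) / len(scores)


def probsparse_attention(Q, K, V, u):
    """Eq. 3: Softmax(Q_bar K^T / sqrt(d)) V with only top-u queries kept."""
    d = len(Q[0])
    ranked = sorted(range(len(Q)),
                    key=lambda i: sparsity_measure(Q[i], K),
                    reverse=True)
    top = set(ranked[:u])
    # Q_bar has the same shape as Q; rows outside the top-u set are zero.
    Q_bar = [row if i in top else [0.0] * d for i, row in enumerate(Q)]
    out = []
    for q in Q_bar:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in K])
        out.append([sum(w * v[c] for w, v in zip(weights, V))
                    for c in range(len(V[0]))])
    return out


# Hypothetical toy inputs: three queries, two keys, two values.
Q = [[4.0, 0.0], [0.1, 0.1], [0.0, 0.0]]
K = [[1.0, 0.0], [0.0, 1.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
for row in probsparse_attention(Q, K, V, u=1):
    print(row)
```
]]></preformat>
<p>With u set to 1, only the first query keeps its own attention pattern, and the other two rows of the output equal the mean of the value vectors. Restricting full attention to u queries is what reduces the cost relative to canonical self-attention, in which every query is scored against every key.</p>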
<p>Self-attention distilling: The model uses the distilling operation to privilege the superior features with dominating features and to construct a focused self-attention feature map in the next layer (<xref ref-type="bibr" rid="B42">Zhou et&#x20;al., 2021</xref>). This distilling procedure forwards from the <inline-formula id="inf22">
<mml:math id="m25">
<mml:mi>j</mml:mi>
</mml:math>
</inline-formula>-th layer to the <inline-formula id="inf23">
<mml:math id="m26">
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>j</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>-th layer as:<disp-formula id="e4">
<mml:math id="m27">
<mml:mrow>
<mml:msubsup>
<mml:mi>X</mml:mi>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>t</mml:mi>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="italic">MaxPooling</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="italic">ELU</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="italic">Conv</mml:mi>
<mml:mn>1</mml:mn>
<mml:mi mathvariant="italic">d</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>X</mml:mi>
<mml:mi>j</mml:mi>
<mml:mi>t</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
<label>(4)</label>
</disp-formula>where <inline-formula id="inf24">
<mml:math id="m28">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mo>&#xb7;</mml:mo>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>a</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> represents the attention block. After each convolutional layer, the distilling step adds a max-pooling layer with stride 2, which downsamples <inline-formula id="inf25">
<mml:math id="m29">
<mml:mrow>
<mml:msubsup>
<mml:mi>X</mml:mi>
<mml:mi>j</mml:mi>
<mml:mi>t</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> to half its length. The overall memory usage is thereby reduced to <inline-formula id="inf26">
<mml:math id="m30">
<mml:mrow>
<mml:mi>O</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>&#x3bb;</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mi>L</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>L</mml:mi>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>
</inline-formula>, where <inline-formula id="inf27">
<mml:math id="m31">
<mml:mi>&#x3bb;</mml:mi>
</mml:math>
</inline-formula> is a small number.</p>
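<p>The distilling step in <xref ref-type="disp-formula" rid="e4">Equation 4</xref> can be pictured with a minimal single-channel sketch. The fixed kernel weights (0.25, 0.5, 0.25), the pooling width of 3, and all function names are assumptions chosen for illustration; in the actual network the convolution weights are learned and the input has many channels.</p>
<preformat>
```python
import math


def elu(x, alpha=1.0):
    """Exponential linear unit."""
    return x if x > 0 else alpha * (math.exp(x) - 1.0)


def conv1d_same(seq, kernel):
    """1-D convolution with zero padding; assumes an odd kernel length."""
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(seq) + [0.0] * pad
    return [sum(w * padded[i + j] for j, w in enumerate(kernel))
            for i in range(len(seq))]


def max_pool_stride2(seq, width=3):
    """Max pooling with stride 2; the output has ceil(len(seq) / 2) values."""
    pad = width // 2
    padded = [-math.inf] * pad + list(seq) + [-math.inf] * pad
    return [max(padded[i:i + width]) for i in range(0, len(seq), 2)]


def distill(seq, kernel=(0.25, 0.5, 0.25)):
    """Conv1d, then ELU, then max pooling with stride 2."""
    activated = [elu(v) for v in conv1d_same(seq, kernel)]
    return max_pool_stride2(activated)
```
</preformat>
<p>Applying distill to a sequence of length 8 yields a sequence of length 4, which is the halving described above.</p>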
<p>Generative Inference: The model feeds the decoder with the following vectors:<disp-formula id="e5">
<mml:math id="m32">
<mml:mrow>
<mml:msubsup>
<mml:mi>X</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>d</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
<mml:mi>t</mml:mi>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="italic">Concat</mml:mi>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>X</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>k</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
<mml:mi>t</mml:mi>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mi>X</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mi>t</mml:mi>
</mml:msubsup>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mi>&#x211d;</mml:mi>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>k</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>L</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>y</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mo>&#xd7;</mml:mo>
<mml:msub>
<mml:mi>d</mml:mi>
<mml:mrow>
<mml:mi>mod</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:math>
<label>(5)</label>
</disp-formula>where <inline-formula id="inf28">
<mml:math id="m33">
<mml:mrow>
<mml:msubsup>
<mml:mi>X</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>d</mml:mi>
<mml:mi>e</mml:mi>
</mml:mrow>
<mml:mi>t</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> is the <inline-formula id="inf29">
<mml:math id="m34">
<mml:mi>i</mml:mi>
</mml:math>
</inline-formula>-th input sequence of the decoder, <inline-formula id="inf30">
<mml:math id="m35">
<mml:mrow>
<mml:msubsup>
<mml:mi>X</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mi>o</mml:mi>
<mml:mi>k</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
<mml:mi>t</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> is the start token of the <inline-formula id="inf31">
<mml:math id="m36">
<mml:mi>i</mml:mi>
</mml:math>
</inline-formula>-th sequence and <inline-formula id="inf32">
<mml:math id="m37">
<mml:mrow>
<mml:msubsup>
<mml:mi>X</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mi>t</mml:mi>
</mml:msubsup>
</mml:mrow>
</mml:math>
</inline-formula> is a placeholder for the target sequence of the <inline-formula id="inf33">
<mml:math id="m38">
<mml:mi>i</mml:mi>
</mml:math>
</inline-formula>-th sequence, whose entries are set to a scalar such as 0. The model decodes generatively: the decoder predicts the whole output in a single forward pass rather than step by step.</p>
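<p>The decoder input of <xref ref-type="disp-formula" rid="e5">Equation 5</xref> can be sketched as follows. The function name build_decoder_input and the choice of taking the start token from the last token_len steps of the known sequence are assumptions for illustration; only the concatenation with a zero placeholder is stated in the text.</p>
<preformat>
```python
def build_decoder_input(known_seq, token_len, pred_len, placeholder=0.0):
    """Concatenate a start token with a placeholder for the target steps.

    known_seq is a list of time steps, each a list of d_model features.
    The result has token_len + pred_len rows of the same width.
    """
    if token_len > len(known_seq) or 1 > token_len:
        raise ValueError("token_len must be between 1 and len(known_seq)")
    width = len(known_seq[0])
    # Assumption: the start token is the most recent slice of known data.
    start_token = [list(step) for step in known_seq[-token_len:]]
    # Placeholder rows for the steps to be predicted, filled with a scalar.
    target_placeholder = [[placeholder] * width for _ in range(pred_len)]
    return start_token + target_placeholder
```
</preformat>
<p>Because the placeholder rows are filled in by one forward pass, the prediction length only changes the number of rows passed to the decoder, not the number of decoding steps.</p>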
</sec>
<sec id="s2-3">
<title>Proposed Model</title>
<p>In the proposed model, the original wind power series is first convolved by a convolutional neural network (CNN) with kernels of different sizes, which extracts features at different time scales. The resulting sub-sequences, one per time scale, are taken as the inputs of the Informer network, which generates five outputs. These outputs are fed to a concatenation layer for feature fusion, and the final forecast is produced by a fully connected layer. The overall framework of the proposed model is shown in <xref ref-type="fig" rid="F3">Figure&#x20;3</xref>.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>Overall framework of the proposed&#x20;model.</p>
</caption>
<graphic xlink:href="fenrg-09-788320-g003.tif"/>
</fig>
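<p>To make the idea of multi-scale sub-sequences concrete, the sketch below aggregates a 1-minute series over windows of 15, 30, 60, 90, and 120 points, the five scales selected in the experiments. It is only a stand-in for the CNN front end: the fixed averaging weights, the non-overlapping windows (stride equal to the kernel size), and the name multi_scale_views are assumptions, whereas the real convolution kernels are learned.</p>
<preformat>
```python
def multi_scale_views(series, kernel_sizes=(15, 30, 60, 90, 120)):
    """Return one coarser sub-sequence per kernel size.

    Each view averages non-overlapping windows of k points, which mimics a
    1-D convolution with kernel size k, stride k and uniform weights.
    """
    views = {}
    for k in kernel_sizes:
        n_windows = len(series) // k
        views[k] = [sum(series[i * k:(i + 1) * k]) / k
                    for i in range(n_windows)]
    return views
```
</preformat>
<p>For a 6-hour input of 360 points, this yields views of 24, 12, 6, 4, and 3 points, one per scale.</p>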
</sec>
</sec>
<sec id="s3">
<title>Experiment of Wind Power Prediction</title>
<sec id="s3-1">
<title>Description of Wind Power Datasets</title>
<p>This study uses historical wind power data from a region in China, recorded from March 1, 2020, to April 30, 2020, at a sampling interval of 1&#xa0;minute. The data were collected by a supervisory control and data acquisition (SCADA) system. <xref ref-type="fig" rid="F4">Figure&#x20;4</xref> shows the historical wind power curve of the region. The wind power ranges from 0 to 21&#xa0;MW and fluctuates strongly.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Historical wind&#x20;power.</p>
</caption>
<graphic xlink:href="fenrg-09-788320-g004.tif"/>
</fig>
<p>
<xref ref-type="table" rid="T2">Table&#x20;2</xref> gives descriptive statistics of the measured values: the minimum, mean, maximum, and median are selected to describe the characteristics of the distribution. They are 0.03717, 6.68971, 20.4642, and 6.32673&#xa0;MW, respectively. <xref ref-type="table" rid="T2">Table&#x20;2</xref> shows that the mean and median of the dataset are similar.</p>
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>Statistical elements of the historical wind&#x20;power.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Statistic</th>
<th align="center">Value (MW)</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Minimum</td>
<td align="char" char=".">0.03717</td>
</tr>
<tr>
<td align="left">Mean</td>
<td align="char" char=".">6.68971</td>
</tr>
<tr>
<td align="left">Maximum</td>
<td align="char" char=".">20.4642</td>
</tr>
<tr>
<td align="left">Median</td>
<td align="char" char=".">6.32673</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s3-2">
<title>Average Wind Power Prediction</title>
<p>The average value of real wind power data reflects the central tendency of wind power over a period, and this general trend can be used to assess the generation status of wind power. Therefore, this paper forecasts the mean wind power to capture the central tendency of the generated power over the next 3&#xa0;hours. A 3-hour power curve is shown in <xref ref-type="fig" rid="F5">Figure&#x20;5</xref>. Over this window, the wind power ranges from 2 to 5.5&#xa0;MW, and its mean value is 4.2421&#xa0;MW.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Three-hour wind power and&#x20;mean.</p>
</caption>
<graphic xlink:href="fenrg-09-788320-g005.tif"/>
</fig>
</sec>
<sec id="s3-3">
<title>Data Standardization and Missing Value Processing</title>
<p>Because actual wind power fluctuates over a wide range, large raw values can cause numerical problems during training. To speed up the convergence of gradient descent, this paper standardizes the original power data before constructing the model, as shown in <xref ref-type="disp-formula" rid="e6">Equation 6</xref>, and converts the model outputs back to the original scale to obtain the final predictions, as shown in <xref ref-type="disp-formula" rid="e7">Equation 7</xref>.<disp-formula id="e6">
<mml:math id="m39">
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>&#x2032;</mml:mo>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
<label>(6)</label>
</disp-formula>
<disp-formula id="e7">
<mml:math id="m40">
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo>&#x2032;</mml:mo>
<mml:mo>&#x2217;</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
<label>(7)</label>
</disp-formula>
<inline-formula id="inf34">
<mml:math id="m41">
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> is the standardized variable, <inline-formula id="inf35">
<mml:math id="m42">
<mml:mi>x</mml:mi>
</mml:math>
</inline-formula> is the original variable, <inline-formula id="inf36">
<mml:math id="m43">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the mean of the variable, and <inline-formula id="inf37">
<mml:math id="m44">
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>d</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> is the standard deviation of the variable. Missing values in the wind power datasets are filled by mean interpolation.</p>
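<p>A minimal sketch of <xref ref-type="disp-formula" rid="e6">Equations 6</xref> and <xref ref-type="disp-formula" rid="e7">7</xref> and of the missing-value step is given below. The text does not specify the details, so three choices here are assumptions: missing readings are marked as None, they are replaced by the mean of the observed values, and the population standard deviation is used. The function names are illustrative.</p>
<preformat>
```python
import statistics


def fill_missing_with_mean(values):
    """Replace None entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None]
    mean = statistics.mean(observed)
    return [mean if v is None else v for v in values]


def standardize(values):
    """Equation 6: x' = (x - x_mean) / x_std. Also returns x_mean and x_std."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)
    if std == 0:
        raise ValueError("a constant series cannot be standardized")
    return [(v - mean) / std for v in values], mean, std


def inverse_standardize(scaled, mean, std):
    """Equation 7: x = x' * x_std + x_mean."""
    return [v * std + mean for v in scaled]
```
</preformat>
<p>The mean and standard deviation returned by standardize must be kept so that the model outputs can be mapped back to megawatts with inverse_standardize.</p>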
</sec>
<sec id="s3-4">
<title>Division of Datasets</title>
<p>Partitioning the dataset is a prerequisite for training on wind power data. To obtain reasonable forecasting results, the wind power dataset is divided into a training set, a testing set, and a validation set at a ratio of 8.5:1:0.5. As shown in <xref ref-type="fig" rid="F6">Figure&#x20;6</xref>, the training set and validation set are used to train the model. The testing set is then fed into the trained model for prediction.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>Partition of wind power datasets.</p>
</caption>
<graphic xlink:href="fenrg-09-788320-g006.tif"/>
</fig>
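<p>The 8.5:1:0.5 partition corresponds to fractions of 0.85 (training), 0.10 (testing), and 0.05 (validation). The sketch below assumes contiguous chronological blocks in the order training, validation, testing; that ordering and the function name chronological_split are assumptions, since the text gives only the ratio.</p>
<preformat>
```python
def chronological_split(series, train_frac=0.85, val_frac=0.05):
    """Split a series into training, validation and testing blocks.

    The testing block receives the remaining samples (0.10 by default).
    """
    n = len(series)
    train_end = round(n * train_frac)
    val_end = train_end + round(n * val_frac)
    train = series[:train_end]
    val = series[train_end:val_end]
    test = series[val_end:]
    return train, val, test
```
</preformat>
<p>Keeping the blocks contiguous avoids leaking future measurements into the training data, which matters for time series.</p>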
</sec>
<sec id="s3-5">
<title>Evaluation Metrics</title>
<p>The forecasting task uses 6&#xa0;hours of historical wind power to forecast the average wind power over the next 3&#xa0;hours.</p>
<p>To evaluate the predictive performance of the model, this paper uses four evaluation metrics: the mean absolute error (MAE), mean square error (MSE), root mean square error (RMSE), and mean absolute percentage error (MAPE). The MAE is the mean of the absolute differences between the true values and the predicted values. The MSE is the mean of the squared errors between the true values and the predicted values. The RMSE is the square root of the MSE. The MAPE is the mean of the absolute errors relative to the true values, expressed as a percentage. The four metrics are defined in <xref ref-type="disp-formula" rid="e8">Eqs 8</xref>&#x2013;<xref ref-type="disp-formula" rid="e11">11</xref>.<disp-formula id="e8">
<mml:math id="m45">
<mml:mrow>
<mml:mi mathvariant="italic">MAE</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>n</mml:mi>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
<label>(8)</label>
</disp-formula>
<disp-formula id="e9">
<mml:math id="m46">
<mml:mrow>
<mml:mi mathvariant="italic">MSE</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>n</mml:mi>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
<label>(9)</label>
</disp-formula>
<disp-formula id="e10">
<mml:math id="m47">
<mml:mrow>
<mml:mi mathvariant="italic">RMSE</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi>n</mml:mi>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mn>2</mml:mn>
</mml:msup>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:math>
<label>(10)</label>
</disp-formula>
<disp-formula id="e11">
<mml:math id="m48">
<mml:mrow>
<mml:mi mathvariant="italic">MAPE</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>100</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:mfrac>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo>&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:mo>&#x7c;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:mo>&#x7c;</mml:mo>
</mml:mrow>
</mml:mstyle>
</mml:mrow>
</mml:math>
<label>(11)</label>
</disp-formula>where <inline-formula id="inf38">
<mml:math id="m49">
<mml:mi>n</mml:mi>
</mml:math>
</inline-formula> represents the number of predicted points, <inline-formula id="inf39">
<mml:math id="m50">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mi>y</mml:mi>
<mml:mo>&#x5e;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> represents the predicted value, and <inline-formula id="inf40">
<mml:math id="m51">
<mml:mrow>
<mml:msub>
<mml:mi>y</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:math>
</inline-formula> represents the real&#x20;value.</p>
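<p>The four metrics in <xref ref-type="disp-formula" rid="e8">Eqs 8</xref>&#x2013;<xref ref-type="disp-formula" rid="e11">11</xref> translate directly into code. The sketch below is a plain-Python transcription with illustrative function names; it assumes that no true value is zero, since MAPE is undefined in that case.</p>
<preformat>
```python
import math


def mae(y_true, y_pred):
    """Equation 8: mean absolute error."""
    return sum(abs(p - t) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Equation 9: mean square error."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Equation 10: square root of the MSE."""
    return math.sqrt(mse(y_true, y_pred))


def mape(y_true, y_pred):
    """Equation 11: mean absolute percentage error, in percent."""
    total = sum(abs((p - t) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)
```
</preformat>
<p>MAE, MSE, and RMSE depend on the scale of the data, whereas MAPE is scale-free and therefore easier to compare across datasets.</p>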
</sec>
<sec id="s3-6">
<title>Experimental Environment and Strategies</title>
<p>In this paper, the experimental code is written in Python 3.7, the deep learning framework is PyTorch 1.8, and the experiments are run on a PC (Windows 10 operating system, Intel(R) Core(TM) i7-9750H CPU at 2.6&#xa0;GHz, 16&#xa0;GB RAM, and NVIDIA GeForce RTX 3070&#x20;GPU).</p>
<p>This paper adopts the cross-validation (<xref ref-type="bibr" rid="B1">Bokde et&#x20;al., 2020</xref>) training strategy. In the experiments of our study, we divide the training data into a training set and a validation set and perform 100 iterations in each epoch. We take the average loss over the 100 iterations as the final loss value of each epoch. We then test the model on the testing set to obtain the final forecasting results. GELU is used as the activation function of the model, MSE as the loss function, and Adam as the optimizer. Adam does not require a stationary objective and tolerates a loss that changes over time, so it handles noisy samples well. In the experiments, the batch size was 16, and early stopping and learning-rate reduction were adopted to prevent overfitting.</p>
<p>All simulation results presented in this study use a 3-h-ahead forecasting horizon: 6&#xa0;h of historical wind power data are used to predict the average wind power over the next 3&#xa0;h.</p>
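<p>At a 1-minute sampling interval, 6&#xa0;h of history corresponds to 360 points and the 3-h horizon to 180 points. The following sketch shows one way such input and target pairs could be built; the sliding step of one sample and the name make_samples are assumptions, as the text does not describe how the windows are generated.</p>
<preformat>
```python
def make_samples(series, input_len=360, horizon=180, step=1):
    """Pair each input window with the mean of the following horizon.

    Returns a list of (window, target_mean) tuples.
    """
    samples = []
    last_start = len(series) - input_len - horizon
    for start in range(0, last_start + 1, step):
        split = start + input_len
        window = series[start:split]
        future = series[split:split + horizon]
        samples.append((window, sum(future) / horizon))
    return samples
```
</preformat>
<p>A series shorter than 540 points produces no samples, because at least one full input window and one full horizon are needed.</p>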
</sec>
<sec id="s3-7">
<title>Comparison of the Proposed Model</title>
<p>To achieve the best predictive performance, this paper compares CNN-Informer models with different time scales. The original wind power data are divided into four types of time-scale combinations. The first type is 15 and 30&#xa0;min; the second type is 15, 30, and 60&#xa0;min; the third type is 15, 30, 60, and 90&#xa0;min; and the fourth type is 15, 30, 60, 90, and 120&#xa0;min. As shown in <xref ref-type="fig" rid="F7">Figure&#x20;7</xref>, the error metrics were highest for the second type (15, 30, and 60&#xa0;min) and lowest for the fourth type.</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>Metrics of the proposed models: <bold>(A)</bold>, MAE, <bold>(B)</bold>, MSE, <bold>(C)</bold>, RMSE, and <bold>(D)</bold>, MAPE.</p>
</caption>
<graphic xlink:href="fenrg-09-788320-g007.tif"/>
</fig>
<p>As shown in <xref ref-type="fig" rid="F8">Figure&#x20;8</xref>, the CNN-Informer models perform similarly, but the fourth type fluctuates less and its forecast is closer to the true value than those of the other types. Furthermore, the convergence speed of the model slows as the number of convolution kernels increases, and adding more convolution kernels yields only minimal improvement. Therefore, this paper selects the fourth type (15, 30, 60, 90, and 120&#xa0;min) as the proposed&#x20;model.</p>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption>
<p>Predictive results of CNN-Informer models.</p>
</caption>
<graphic xlink:href="fenrg-09-788320-g008.tif"/>
</fig>
</sec>
<sec id="s3-8">
<title>Comparison of the Previous Model</title>
<p>To verify the comprehensive performance of the proposed CNN-Informer model, five models are implemented and compared: the proposed model, Informer, Long Short-Term Memory (LSTM), DeepAR, and Recurrent Neural Network (RNN). The hyperparameters and network topologies of the comparison models have been optimized and are summarized in <xref ref-type="table" rid="T3">Table&#x20;3</xref>.</p>
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>Hyperparameters of these methods.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Method</th>
<th align="center">Parameters</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Proposed</td>
<td align="left">Kernel sizes: 15&#xd7;1, 30&#xd7;1, 60&#xd7;1, 90&#xd7;1, and 120&#xd7;1</td>
</tr>
<tr>
<td align="left">DeepAR</td>
<td align="left">LSTM units: 16; LSTM layers: 1</td>
</tr>
<tr>
<td align="left">LSTM</td>
<td align="left">LSTM units: 16; LSTM layers: 2</td>
</tr>
<tr>
<td align="left">RNN</td>
<td align="left">RNN units: 16; RNN layers: 2</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Six hours of historical wind power data are used to predict the mean wind power over the next 3&#xa0;hours. <xref ref-type="fig" rid="F9">Figure&#x20;9</xref> plots the forecasts of the proposed CNN-Informer, Informer, DeepAR, LSTM, and RNN models. The proposed model performs best, slightly better than Informer, whereas RNN and LSTM perform poorly and fall far behind the proposed CNN-Informer, Informer, and DeepAR.</p>
<fig id="F9" position="float">
<label>FIGURE 9</label>
<caption>
<p>Curve of the forecast results: <bold>(A)</bold>, Proposed model, <bold>(B)</bold>, Informer, <bold>(C)</bold>, DeepAR, <bold>(D)</bold>, LSTM, and <bold>(E)</bold>, RNN.</p>
</caption>
<graphic xlink:href="fenrg-09-788320-g009.tif"/>
</fig>
<p>The error results and convergence times of the proposed model, Informer, LSTM, RNN, and DeepAR are shown in <xref ref-type="table" rid="T4">Table&#x20;4</xref>, where the minimal error results and the shortest convergence time are shown in bold. For the proposed model, the MAE, MSE, RMSE, MAPE, and convergence time are 0.063611, 0.007379, 0.085901, 1.118828%, and 672.23&#xa0;s, respectively. For the Informer, they are 0.088493, 0.011234, 0.105994, 1.709026%, and 668.47&#xa0;s. Although the convergence time of the proposed model is slightly longer than that of Informer, its errors are lower on all four metrics. Compared with the traditional models (DeepAR, LSTM, and RNN), the proposed method significantly improves both the prediction performance and the convergence&#x20;time.</p>
<table-wrap id="T4" position="float">
<label>TABLE 4</label>
<caption>
<p>Metrics of five models.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Method</th>
<th align="center">MAE</th>
<th align="center">MSE</th>
<th align="center">RMSE</th>
<th align="center">MAPE (%)</th>
<th align="center">Time (s)</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Proposed</td>
<td align="char" char=".">
<bold>0.063611</bold>
</td>
<td align="char" char=".">
<bold>0.007379</bold>
</td>
<td align="char" char=".">
<bold>0.085901</bold>
</td>
<td align="char" char=".">
<bold>1.118828</bold>
</td>
<td align="char" char=".">672.23</td>
</tr>
<tr>
<td align="left">Informer</td>
<td align="char" char=".">0.088493</td>
<td align="char" char=".">0.011234</td>
<td align="char" char=".">0.105994</td>
<td align="char" char=".">1.709026</td>
<td align="char" char=".">
<bold>668.47</bold>
</td>
</tr>
<tr>
<td align="left">DeepAR</td>
<td align="char" char=".">0.351596</td>
<td align="char" char=".">0.182385</td>
<td align="char" char=".">0.427006</td>
<td align="char" char=".">4.724828</td>
<td align="char" char=".">780.59</td>
</tr>
<tr>
<td align="left">LSTM</td>
<td align="char" char=".">0.815108</td>
<td align="char" char=".">1.155223</td>
<td align="char" char=".">1.074813</td>
<td align="char" char=".">11.213216</td>
<td align="char" char=".">1278.05</td>
</tr>
<tr>
<td align="left">RNN</td>
<td align="char" char=".">0.711205</td>
<td align="char" char=".">0.794341</td>
<td align="char" char=".">0.891258</td>
<td align="char" char=".">10.223156</td>
<td align="char" char=".">1423.82</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<fn>
<p>The minimal error results and the shortest convergence time are shown in bold.</p>
</fn>
</table-wrap-foot>
</table-wrap>
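<p>The gaps in <xref ref-type="table" rid="T4">Table&#x20;4</xref> are easier to read as relative reductions. The short calculation below copies the table values and computes, for each baseline, the percentage by which the proposed model lowers each quantity; the helper names are illustrative, and the percentages are derived here rather than reported by the study.</p>
<preformat>
```python
# Values copied from Table 4: MAE, MSE, RMSE, MAPE (%), convergence time (s).
TABLE4 = {
    "Proposed": {"MAE": 0.063611, "MSE": 0.007379, "RMSE": 0.085901,
                 "MAPE": 1.118828, "Time": 672.23},
    "Informer": {"MAE": 0.088493, "MSE": 0.011234, "RMSE": 0.105994,
                 "MAPE": 1.709026, "Time": 668.47},
    "DeepAR": {"MAE": 0.351596, "MSE": 0.182385, "RMSE": 0.427006,
               "MAPE": 4.724828, "Time": 780.59},
    "LSTM": {"MAE": 0.815108, "MSE": 1.155223, "RMSE": 1.074813,
             "MAPE": 11.213216, "Time": 1278.05},
    "RNN": {"MAE": 0.711205, "MSE": 0.794341, "RMSE": 0.891258,
            "MAPE": 10.223156, "Time": 1423.82},
}


def relative_reduction(baseline, proposed):
    """Percentage by which proposed is lower than baseline (negative if higher)."""
    return 100.0 * (baseline - proposed) / baseline


def compare(baseline_name, table=TABLE4):
    """Relative reduction of every Table 4 column against one baseline."""
    base = table[baseline_name]
    prop = table["Proposed"]
    return {metric: relative_reduction(base[metric], prop[metric])
            for metric in prop}
```
</preformat>
<p>With these values, compare("Informer") gives an MAE about 28% lower for the proposed model and a convergence time about 0.6% longer, which matches the trade-off described above.</p>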
<p>In conclusion, convolving the original wind power series to a certain extent can improve the predictive performance of the model. The model performs best when the original wind power sequence is convolved to time scales of 15, 30, 60, 90, and 120&#xa0;min.</p>
</sec>
</sec>
<sec sec-type="conclusion" id="s4">
<title>Conclusion</title>
<p>Wind power generation is unstable and intermittent in a complex environment. To make better use of historical wind power data, this paper proposes a composite network composed of a convolutional neural network and Informer and uses it to improve the prediction accuracy of wind power. The historical wind power data of a wind farm in China are employed for verification, and the proposed model is compared with Informer, LSTM, RNN, and DeepAR. The contributions of this paper are as follows:</p>
<p>The original historical wind power data are divided into multiple time scales by a convolutional neural network, so that more time series features are extracted. This method makes better use of historical wind power&#x20;data.</p>
<p>Based on the Informer network, this paper establishes a wind power prediction model that can input a long time series and predict the average power in the next 3&#xa0;hours. Compared with Informer, LSTM, RNN, and DeepAR, the proposed CNN-Informer model can more accurately predict wind&#x20;power.</p>
<p>Several limitations deserve further study. The proposed model has a large number of parameters; in future research, we intend to design a lightweight network. We also hope to establish a more effective method for extracting temporal features. Finally, short-term wind power prediction places high demands on both convergence speed and accuracy, which requires the algorithm to balance time cost and accuracy. How to optimize the model to achieve this balance is worthy of further research.</p>
</sec>
</body>
<back>
<sec id="s5">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/Supplementary Materials; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s6">
<title>Author Contributions</title>
<p>H-KW contributed to conception and design of the study. KS organized the database, performed the statistical analysis, and wrote the first draft of the manuscript. YC wrote sections of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.</p>
</sec>
<sec id="s7">
<title>Funding</title>
<p>This study was supported by the Scientific and Technological Research Program of Chongqing Municipal Education Commission (KJQN202001142), the Chongqing Research Program of Basic Research and Frontier Technology (Grant No. cstc2020jcyj-msxmX0352), the fellowship of China Postdoctoral Science Foundation (2021M700616), and the Chongqing University of Technology (2019ZD118).</p>
</sec>
<sec sec-type="COI-statement" id="s8">
<title>Conflict of Interest</title>
<p>H-KW was employed by the company Chongqing Industrial Big Data Innovation Center Co.,&#x20;Ltd.</p>
<p>The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bokde</surname>
<given-names>N. D.</given-names>
</name>
<name>
<surname>Yaseen</surname>
<given-names>Z. M.</given-names>
</name>
<name>
<surname>Andersen</surname>
<given-names>G. B.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>ForecastTB-An R Package as a Test-Bench for Time Series Forecasting-Application of Wind Speed and Solar Radiation Modeling</article-title>. <source>Energies</source> <volume>13</volume>, <fpage>2578</fpage>. <pub-id pub-id-type="doi">10.3390/en13102578</pub-id> </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Cassola</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Burlando</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Wind Speed and Wind Energy Forecast through Kalman Filtering of Numerical Weather Prediction Model Output</article-title>. <source>Appl. Energ.</source> <volume>99</volume>, <fpage>154</fpage>&#x2013;<lpage>166</lpage>. <pub-id pub-id-type="doi">10.1016/j.apenergy.2012.03.054</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Chai</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Lai</surname>
<given-names>L. L.</given-names>
</name>
<name>
<surname>Wong</surname>
<given-names>K. P.</given-names>
</name>
</person-group> (<year>2015</year>). &#x201c;<article-title>An Overview on Wind Power Forecasting Methods</article-title>,&#x201d; in <conf-name>Proceedings of the 2015 International Conference on Machine Learning and Cybernetics (ICMLC)</conf-name>, <conf-loc>Guangzhou, China</conf-loc>, <conf-date>July 2015</conf-date> (<publisher-name>IEEE</publisher-name>), <fpage>765</fpage>&#x2013;<lpage>770</lpage>. <pub-id pub-id-type="doi">10.1109/ICMLC.2015.7340651</pub-id> </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chakraborty</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Watson</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Rodgers</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Automatic Generation Control Using an Energy Storage System in a Wind Park</article-title>. <source>IEEE Trans. Power Syst.</source> <volume>33</volume>, <fpage>198</fpage>&#x2013;<lpage>205</lpage>. <pub-id pub-id-type="doi">10.1109/tpwrs.2017.2702102</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chang</surname>
<given-names>W.-Y.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>A Literature Review of Wind Forecasting Methods</article-title>. <source>J.&#x20;Power Energ. Eng.</source> <volume>02</volume>, <fpage>161</fpage>&#x2013;<lpage>168</lpage>. <pub-id pub-id-type="doi">10.4236/jpee.2014.24023</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Short-term Wind Speed Prediction Using an Unscented Kalman Filter Based State-Space Support Vector Regression Approach</article-title>. <source>Appl. Energ.</source> <volume>113</volume>, <fpage>690</fpage>&#x2013;<lpage>705</lpage>. <pub-id pub-id-type="doi">10.1016/j.apenergy.2013.08.025</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>M.-R.</given-names>
</name>
<name>
<surname>Zeng</surname>
<given-names>G.-Q.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>K.-D.</given-names>
</name>
<name>
<surname>Weng</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>A Two-Layer Nonlinear Combination Method for Short-Term Wind Speed Prediction Based on ELM, ENN, and LSTM</article-title>. <source>IEEE Internet Things J.</source> <volume>6</volume>, <fpage>6997</fpage>&#x2013;<lpage>7010</lpage>. <pub-id pub-id-type="doi">10.1109/JIOT.2019.2913176</pub-id> </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Folly</surname>
<given-names>K. A.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Short-Term Wind Power Forecasting Using Mixed Input Feature-Based Cascade-connected Artificial Neural Networks</article-title>. <source>Front. Energ. Res.</source> <volume>9</volume>, <fpage>1</fpage>&#x2013;<lpage>12</lpage>. <pub-id pub-id-type="doi">10.3389/fenrg.2021.634639</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dong</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Liao</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>Power Prediction Modeling and Research of Large Wind Farms Based on Chaotic Time Series</article-title>. <source>J.&#x20;Electr. Technol.</source> <volume>23</volume>, <fpage>125</fpage>&#x2013;<lpage>129</lpage>. <pub-id pub-id-type="doi">10.3321/j.issn:1000-6753.2008.12.020</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Dong</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Tan</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Regional Wind Power Probabilistic Forecasting Based on an Improved Kernel Density Estimation, Regular Vine Copulas, and Ensemble Learning</article-title>. <source>Energy</source> <volume>238</volume>, <fpage>122045</fpage>. <pub-id pub-id-type="doi">10.1016/j.energy.2021.122045</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Erdem</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2011</year>). <article-title>ARMA Based Approaches for Forecasting the Tuple of Wind Speed and Direction</article-title>. <source>Appl. Energ.</source> <volume>88</volume>, <fpage>1405</fpage>&#x2013;<lpage>1414</lpage>. <pub-id pub-id-type="doi">10.1016/j.apenergy.2010.10.031</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Feng</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Dai</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Research on Physical Methods of Wind Farm Power Prediction</article-title>. <source>J.&#x20;China Electra. Eng.</source> <volume>30</volume>, <fpage>1</fpage>&#x2013;<lpage>6</lpage>. <pub-id pub-id-type="doi">10.13334/j.0258-8013.pcsee</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Han</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Qiao</surname>
<given-names>Y.-h.</given-names>
</name>
<name>
<surname>Yan</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Y.-q.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Mid-to-long Term Wind and Photovoltaic Power Generation Prediction Based on Copula Function and Long Short Term Memory Network</article-title>. <source>Appl. Energ.</source> <volume>239</volume>, <fpage>181</fpage>&#x2013;<lpage>191</lpage>. <pub-id pub-id-type="doi">10.1016/j.apenergy.2019.01.193</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Haq</surname>
<given-names>M. R.</given-names>
</name>
<name>
<surname>Ni</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>A New Hybrid Model for Short-Term Electricity Load Forecasting</article-title>. <source>IEEE Access</source> <volume>7</volume>, <fpage>125413</fpage>&#x2013;<lpage>125423</lpage>. <pub-id pub-id-type="doi">10.1109/ACCESS.2019.2937222</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hazari</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Mannan</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Muyeen</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Umemura</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Takahashi</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Tamura</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Stability Augmentation of a Grid-Connected Wind Farm by Fuzzy-Logic-Controlled DFIG-Based Wind Turbines</article-title>. <source>Appl. Sci.</source> <volume>8</volume>, <fpage>20</fpage>. <pub-id pub-id-type="doi">10.3390/app8010020</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hong</surname>
<given-names>Y.-Y.</given-names>
</name>
<name>
<surname>Rioflorido</surname>
<given-names>C. L. P. P.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>A Hybrid Deep Learning-Based Neural Network for 24-h Ahead Wind Power Forecasting</article-title>. <source>Appl. Energ.</source> <volume>250</volume>, <fpage>530</fpage>&#x2013;<lpage>539</lpage>. <pub-id pub-id-type="doi">10.1016/j.apenergy.2019.05.044</pub-id> </citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hu</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Xie</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Mi</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Wan</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Noise Model Based &#x3bd;-support Vector Regression with its Application to Short-Term Wind Speed Forecasting</article-title>. <source>Neural Networks</source> <volume>57</volume>, <fpage>1</fpage>&#x2013;<lpage>11</lpage>. <pub-id pub-id-type="doi">10.1016/j.neunet.2014.05.003</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hu</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Guo</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Very Short-Term Spatial and Temporal Wind Power Forecasting: A Deep Learning Approach</article-title>. <source>CSEE J.&#x20;Power Energ. Syst.</source> <volume>6</volume>, <fpage>434</fpage>&#x2013;<lpage>443</lpage>. <pub-id pub-id-type="doi">10.17775/CSEEJPES.2018.00010</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hu</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Xiang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Huo</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Jawad</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>An Improved Deep Belief Network Based Hybrid Forecasting Method for Wind Power</article-title>. <source>Energy</source> <volume>224</volume>, <fpage>120185</fpage>. <pub-id pub-id-type="doi">10.1016/j.energy.2021.120185</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jiang</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Heng</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>A Hybrid Forecasting System Based on Fuzzy Time Series and Multi-Objective Optimization for Wind Speed Forecasting</article-title>. <source>Appl. Energ.</source> <volume>235</volume>, <fpage>786</fpage>&#x2013;<lpage>801</lpage>. <pub-id pub-id-type="doi">10.1016/j.apenergy.2018.11.012</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kariniotakis</surname>
<given-names>G. N.</given-names>
</name>
<name>
<surname>Stavrakakis</surname>
<given-names>G. S.</given-names>
</name>
<name>
<surname>Nogaret</surname>
<given-names>E. F.</given-names>
</name>
</person-group> (<year>1996</year>). <article-title>Wind Power Forecasting Using Advanced Neural Networks Models</article-title>. <source>IEEE Trans. Energy Convers.</source> <volume>11</volume>, <fpage>762</fpage>&#x2013;<lpage>767</lpage>. <pub-id pub-id-type="doi">10.1109/60.556376</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Khodayar</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Spatio-Temporal Graph Deep Neural Network for Short-Term Wind Speed Forecasting</article-title>. <source>IEEE Trans. Sustain. Energ.</source> <volume>10</volume>, <fpage>670</fpage>&#x2013;<lpage>681</lpage>. <pub-id pub-id-type="doi">10.1109/TSTE.2018.2844102</pub-id> </citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Xue</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Saeed</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Hu</surname>
<given-names>X.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Short-term Wind Speed Interval Prediction Based on Ensemble GRU Model</article-title>. <source>IEEE Trans. Sustain. Energ.</source> <volume>11</volume>, <fpage>1370</fpage>&#x2013;<lpage>1380</lpage>. <pub-id pub-id-type="doi">10.1109/TSTE.2019.2926147</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Men</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Yee</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Lien</surname>
<given-names>F.-S.</given-names>
</name>
<name>
<surname>Wen</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Short-term Wind Speed and Power Forecasting Using an Ensemble of Mixture Density Neural Networks</article-title>. <source>Renew. Energ.</source> <volume>87</volume>, <fpage>203</fpage>&#x2013;<lpage>211</lpage>. <pub-id pub-id-type="doi">10.1016/j.renene.2015.10.014</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Oh</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Son</surname>
<given-names>S.-Y.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Theoretical Energy Storage System Sizing Method and Performance Analysis for Wind Power Forecast Uncertainty Management</article-title>. <source>Renew. Energ.</source> <volume>155</volume>, <fpage>1060</fpage>&#x2013;<lpage>1069</lpage>. <pub-id pub-id-type="doi">10.1016/j.renene.2020.03.170</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pan</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>A Wind Speed Forecasting Optimization Model for Wind Farms Based on Time Series Analysis and Kalman Filter Algorithm</article-title>. <source>Power Sys. Technol.</source> <volume>32</volume>, <fpage>82</fpage>&#x2013;<lpage>86</lpage>. <pub-id pub-id-type="doi">10.13335/j.1000-3673.pst.2008.07.012</pub-id> </citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pandey</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Bokde</surname>
<given-names>N. D.</given-names>
</name>
<name>
<surname>Dongre</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Gupta</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Hybrid Models for Water Demand Forecasting</article-title>. <source>J.&#x20;Water Resour. Plann. Manage.</source> <volume>147</volume> (<issue>2</issue>), <fpage>0733</fpage>&#x2013;<lpage>9496</lpage>. <pub-id pub-id-type="doi">10.1061/(asce)wr.1943-5452.0001331</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Peng</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Lang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Z.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>EALSTM-QR: Interval Wind-Power Prediction Model Based on Numerical Weather Prediction and Deep Learning</article-title>. <source>Energy</source> <volume>220</volume>, <fpage>119692</fpage>. <pub-id pub-id-type="doi">10.1016/j.energy.2020.119692</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shi</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Scherer</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Wozniak</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Wei</surname>
<given-names>W.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Short-Term Load Forecasting Based on Adabelief Optimized Temporal Convolutional Network and Gated Recurrent Unit Hybrid Neural Network</article-title>. <source>IEEE Access</source> <volume>9</volume>, <fpage>66965</fpage>&#x2013;<lpage>66981</lpage>. <pub-id pub-id-type="doi">10.1109/ACCESS.2021.3076313</pub-id> </citation>
</ref>
<ref id="B31">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shukur</surname>
<given-names>O. B.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>M. H.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>Daily Wind Speed Forecasting through Hybrid KF-ANN Model Based on ARIMA</article-title>. <source>Renew. Energ.</source> <volume>76</volume>, <fpage>637</fpage>&#x2013;<lpage>647</lpage>. <pub-id pub-id-type="doi">10.1016/j.renene.2014.11.084</pub-id> </citation>
</ref>
<ref id="B32">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tu</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Betz</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Mo</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Fan</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Achieving Grid Parity of Wind Power in China - Present Levelized Cost of Electricity and Future Evolution</article-title>. <source>Appl. Energ.</source> <volume>250</volume>, <fpage>1053</fpage>&#x2013;<lpage>1064</lpage>. <pub-id pub-id-type="doi">10.1016/j.apenergy.2019.05.039</pub-id> </citation>
</ref>
<ref id="B33">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Vaswani</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Shazeer</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Parmar</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Uszkoreit</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Jones</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Gomez</surname>
<given-names>A. N.</given-names>
</name>
<etal/>
</person-group> (<year>2017</year>). <source>Attention Is All You Need</source>. <comment>arXiv:1706.03762</comment>. </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>H.-z.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>G.-q.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>G.-b.</given-names>
</name>
<name>
<surname>Peng</surname>
<given-names>J.-c.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Y.-t.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Deep Learning Based Ensemble Approach for Probabilistic Wind Power Forecasting</article-title>. <source>Appl. Energ.</source> <volume>188</volume>, <fpage>56</fpage>&#x2013;<lpage>70</lpage>. <pub-id pub-id-type="doi">10.1016/j.apenergy.2016.11.111</pub-id> </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Yao</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Short-Term Wind Power Prediction Based on Multidimensional Data Cleaning and Feature Reconfiguration</article-title>. <source>Appl. Energ.</source> <volume>292</volume>, <fpage>116851</fpage>. <pub-id pub-id-type="doi">10.1016/j.apenergy.2021.116851</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wu</surname>
<given-names>Y. X.</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>Q. B.</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>J.&#x20;Q.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Data-driven Wind Speed Forecasting Using Deep Feature Extraction and LSTM</article-title>. <source>IET Renew. Power Generation</source> <volume>13</volume>, <fpage>2062</fpage>&#x2013;<lpage>2069</lpage>. <pub-id pub-id-type="doi">10.1049/iet-rpg.2018.5917</pub-id> </citation>
</ref>
<ref id="B37">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Wu</surname>
<given-names>N.</given-names>
</name>
<name>
<surname>Green</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Xue</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>O&#x27;Banion</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2020</year>). <source>Deep Transformer Models for Time Series Forecasting: The Influenza Prevalence Case</source>. <comment>arXiv:2001.08317v1</comment>. </citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Tong</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Research on the Impact of Large-Scale Wind Power Integration on Power Quality</article-title>. <source>Henan Sci. Technol.</source> <volume>678</volume>, <fpage>143</fpage>&#x2013;<lpage>144</lpage>. <pub-id pub-id-type="doi">10.3969/j.issn.1003-5168.2019.16.051</pub-id> </citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yan</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Han</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Long</surname>
<given-names>Q.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Improved Deep Mixture Density Network for Regional Wind Power Probabilistic Forecasting</article-title>. <source>IEEE Trans. Power Syst.</source> <volume>35</volume>, <fpage>2549</fpage>&#x2013;<lpage>2560</lpage>. <pub-id pub-id-type="doi">10.1109/TPWRS.2020.2971607</pub-id> </citation>
</ref>
<ref id="B40">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Han</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Dong</surname>
<given-names>C.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <article-title>Power Prediction of a Wind Farm Cluster Based on Spatiotemporal Correlations</article-title>. <source>Appl. Energ.</source> <volume>302</volume>, <fpage>117568</fpage>. <pub-id pub-id-type="doi">10.1016/j.apenergy.2021.117568</pub-id> </citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Ma</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Wind Power Prediction Based on LSTM Networks and Nonparametric Kernel Density Estimation</article-title>. <source>IEEE Access</source> <volume>7</volume>, <fpage>165279</fpage>&#x2013;<lpage>165292</lpage>. <pub-id pub-id-type="doi">10.1109/access.2019.2952555</pub-id> </citation>
</ref>
<ref id="B42">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Peng</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Xiong</surname>
<given-names>H.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <source>Informer: Beyond Efficient Transformer for Long Sequence Time-Series Forecasting</source>. <comment>arXiv:2012.07436v3</comment>. </citation>
</ref>
<ref id="B43">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhu</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Ultra-short Term Prediction of Wind Farm Power Generation Based on Long-Short Term Memory Network</article-title>. <source>Power Sys. Technol.</source> <volume>41</volume>, <fpage>3797</fpage>&#x2013;<lpage>3802</lpage>. <pub-id pub-id-type="doi">10.13335/j.1000-3673.pst.2017.1657</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>