<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="brief-report">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Energy Res.</journal-id>
<journal-title>Frontiers in Energy Research</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Energy Res.</abbrev-journal-title>
<issn pub-type="epub">2296-598X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fenrg.2021.666130</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Energy Research</subject>
<subj-group>
<subject>Brief Research Report</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Coordinated Cyber-Attack Detection Model of Cyber-Physical Power System Based on the Operating State Data Link</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Wang</surname> <given-names>Lei</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1113059/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Xu</surname> <given-names>Pengcheng</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Qu</surname> <given-names>Zhaoyang</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Bo</surname> <given-names>Xiaoyong</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Dong</surname> <given-names>Yunchang</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1166734/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Zhang</surname> <given-names>Zhenming</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1143458/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Li</surname> <given-names>Yang</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>School of Electrical Engineering, Northeast Electric Power University</institution>, <addr-line>Jilin</addr-line>, <country>China</country></aff>
<aff id="aff2"><sup>2</sup><institution>Jilin Engineering Technology Research Center of Intelligent Electric Power Big Data Processing</institution>, <addr-line>Jilin</addr-line>, <country>China</country></aff>
<aff id="aff3"><sup>3</sup><institution>Siping Power Supply Company of State Grid Jilin Electric Power Company Limited</institution>, <addr-line>Siping</addr-line>, <country>China</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Liang Chen, Nanjing University of Information Science and Technology, China</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Donglai Wang, Shenyang Institute of Engineering, China; Ruyi Dong, Jilin Institute of Chemical Technology, China; Shuaibing Lu, Beijing University of Technology, China</p></fn>
<corresp id="c001">&#x002A;Correspondence: Lei Wang, <email>752953593@qq.com</email></corresp>
<fn fn-type="other" id="fn004"><p>This article was submitted to Smart Grids, a section of the journal Frontiers in Energy Research</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>21</day>
<month>04</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>9</volume>
<elocation-id>666130</elocation-id>
<history>
<date date-type="received">
<day>09</day>
<month>02</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>24</day>
<month>02</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2021 Wang, Xu, Qu, Bo, Dong, Zhang and Li.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Wang, Xu, Qu, Bo, Dong, Zhang and Li</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Existing coordinated cyber-attack detection methods have low detection accuracy and efficiency and poor generalization ability because they struggle with unbalanced attack data samples, high data dimensionality, and noisy data sets. This paper proposes a model for cyber and physical data fusion using a data link for detecting attacks on a Cyber&#x2013;Physical Power System (CPPS). A two-step principal component analysis (PCA) clustering algorithm is used to classify the system&#x2019;s operating states. An adaptive synthetic sampling algorithm is used to reduce the sample imbalance among the categories. The loss function is improved according to the feature intensity difference of the attack event, and an integrated classifier is established using a classification algorithm based on the cost-sensitive gradient boosting decision tree (CS-GBDT). The simulation results show that the proposed method provides higher accuracy, recall, and F-Score than comparable algorithms.</p>
</abstract>
<kwd-group>
<kwd>cyber-physical power system</kwd>
<kwd>coordinated cyber-attack</kwd>
<kwd>cluster analysis</kwd>
<kwd>oversampling</kwd>
<kwd>gradient boosting decision tree</kwd>
</kwd-group>
<counts>
<fig-count count="4"/>
<table-count count="0"/>
<equation-count count="17"/>
<ref-count count="30"/>
<page-count count="9"/>
<word-count count="0"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1">
<title>Introduction</title>
<p>In recent years, a new type of coordinated cyber-physical attack has caused power grid blackouts and disrupted power systems. The main reason is that the coordinated attacks were not detected in time, so effective measures to prevent major accidents could not be taken at the optimal moment (<xref ref-type="bibr" rid="B7">Haes Alhelou et al., 2019</xref>; <xref ref-type="bibr" rid="B13">Lai et al., 2019</xref>). In the 2015 attack on the Ukrainian power grid, the attack point was not the power infrastructure, and no zero-day vulnerability was used. Its cost was significantly lower than that of Stuxnet, Equation, and other attacks, yet it was more effective (<xref ref-type="bibr" rid="B30">Zhang et al., 2016</xref>; <xref ref-type="bibr" rid="B11">Koopman et al., 2019</xref>). Traditional security protection methods for power systems therefore have limitations, and research on detection and defense methods that identify the types and intentions of coordinated attacks on the Cyber&#x2013;Physical Power System (CPPS) is urgently needed. Establishing a comprehensive active defense system is crucial to ensuring the security of power systems (<xref ref-type="bibr" rid="B2">Chen et al., 2011</xref>; <xref ref-type="bibr" rid="B3">Dai et al., 2019</xref>; <xref ref-type="bibr" rid="B28">Wang X. et al., 2019</xref>).</p>
<p>Many scholars have investigated the detection and identification of coordinated attacks on the CPPS. The coupling relationship between the cyber side and the physical side has been considered in several studies (<xref ref-type="bibr" rid="B6">Drayer and Routtenberg, 2019</xref>; <xref ref-type="bibr" rid="B24">Shen et al., 2019</xref>), which focused on the fusion of the attack path on the information side and the attack object on the physical side. <xref ref-type="bibr" rid="B29">Xu and Abur (2017)</xref> combined <italic>a priori</italic> and <italic>a posteriori</italic> bad data detection and proposed a new decomposition method to solve the state estimation data corruption in cyber-attacks. <xref ref-type="bibr" rid="B12">Kurt et al. (2018)</xref> and <xref ref-type="bibr" rid="B1">Basin et al. (2016)</xref> used a dynamic equation of the measured variables with a joint transformation to detect false data injection (FDI) attacks in real time to improve the detection accuracy.</p>
<p>In summary, existing detection methods for cyber-attacks on the CPPS have the following limitations: (1) The cyberspace and the physical space are closely coupled and interact with each other, so attack detection from the cyber side or the physical side alone is insufficient (<xref ref-type="bibr" rid="B15">Lin et al., 2016</xref>; <xref ref-type="bibr" rid="B18">Nath et al., 2019</xref>). (2) Attack detection methods based on physical power grid data ignore the impact of cyber network attacks on the performance of smart grids. The effects of power grid failures and cyber-attacks on the physical side are similar, and it is difficult to distinguish them based on data characteristics (<xref ref-type="bibr" rid="B16">Liu et al., 2016</xref>; <xref ref-type="bibr" rid="B8">Huang and Zhu, 2020</xref>). (3) Cyber-attack data are characterized by unbalanced attack samples, high dimensionality, noise, and long-tailed distributions, which typically lead to low detection accuracy and low real-time detection efficiency (<xref ref-type="bibr" rid="B20">Osanaiye et al., 2018</xref>; <xref ref-type="bibr" rid="B25">Tian et al., 2019</xref>).</p>
<p>In this paper, the cyber-side alarm data and the physical-side measurement data are merged to establish a cyber-physical coupling state chain. A clustering method is designed to classify and distinguish different operating states of the CPPS. An oversampling algorithm is used to reduce the imbalance in the operating states&#x2019; samples. Subsequently, a coordinated cyber-attack detection algorithm based on the improved gradient boosting decision tree (GBDT) is proposed. The algorithm optimizes the cost-sensitive (CS) loss function, minimizing the error associated with the small sample size of attack data and providing high accuracy of attack detection and a high recall rate and F1-score.</p>
</sec>
<sec id="S2">
<title>Detection Model for Coordinated Cyber-Attacks on the Cyber&#x2013;Physical Power System</title>
<p>The framework of the coordinated cyber-attack detection model is shown in <xref ref-type="fig" rid="F1">Figure 1</xref>. The model exploits the data characteristics of the CPPS in different states, such as normal operation, fault operation, and coordinated attack. First, the data link of the cyber&#x2013;physical operating state is established according to the coupling relationship. A clustering algorithm is used to classify the state data link, and a feature set is obtained for each operating condition. Then, the adaptive synthetic sampling algorithm (ADASYN) is used to balance the majority-class and minority-class samples in the different state data sets. Finally, cost-sensitive (CS) conditions are added to the loss function of the gradient boosting decision tree (GBDT) to detect different coordinated cyber-attacks.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>Framework of the coordinated cyber-attack detection model for the Cyber&#x2013;Physical Power System (CPPS). CS-GBDT, cost-sensitive gradient boosting decision tree; PCA, principal component analysis.</p></caption>
<graphic xlink:href="fenrg-09-666130-g001.tif"/>
</fig>
</sec>
<sec id="S3">
<title>Establishment of the Data Link of the Cyber&#x2013;Physical Operating State</title>
<sec id="S3.SS1">
<title>Data Link of the Operating State of the Physical Power Grid</title>
<p>The physical grid measurement data reflect the real-time operating status of the grid under different working conditions, and the measurement data of each section of the grid reflect the operating status at that moment. Regardless of the cause of a change in the grid state (a cyber-attack or a general equipment failure), the state can be described over a specific interval &#x0394;<italic>t</italic>(<italic>t</italic><sub>1</sub>&#x223C;<italic>t</italic><sub><italic>n</italic></sub>). According to the acquisition sequence, the state data fragments <italic>S</italic><sub><italic>p</italic></sub>(<italic>t</italic><sub><italic>i</italic></sub>) that make up the physical grid operating data link <italic>Q</italic><sub><italic>p</italic></sub>(&#x0394;<italic>t</italic>) are defined as follows:</p>
<disp-formula id="S3.E1">
<label>(1)</label>
<mml:math id="M1">
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mtable displaystyle="true" rowspacing="0pt">
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>p</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="normal">&#x2026;</mml:mi>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>h</mml:mi>
</mml:msub>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>p</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">&#x2026;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>m</mml:mi>
</mml:msub>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>Q</mml:mi>
<mml:mi>p</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">&#x0394;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>p</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>p</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">&#x2026;</mml:mi>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>p</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mi>n</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mi/>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>X</italic><sub><italic>p</italic></sub>(<italic>t</italic><sub><italic>i</italic></sub>) represents the <italic>h</italic> measured attributes obtained from the physical-side device <italic>X</italic><sub><italic>p</italic></sub> at time <italic>t</italic><sub><italic>i</italic></sub>, including the voltage, current, phase angle, active power, and reactive power; <italic>S</italic><sub><italic>p</italic></sub>(<italic>t</italic><sub><italic>i</italic></sub>) represents all the measurement data collected by <italic>m</italic> devices on the physical side at time <italic>t</italic><sub><italic>i</italic></sub>.</p>
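<p>The nesting in Eq. (1) can be illustrated with a minimal Python sketch. The class names, the <italic>device_id</italic> and <italic>timestamp</italic> fields, and the function <italic>build_physical_link</italic> are hypothetical names introduced here for illustration only; the sketch assumes that ordering by acquisition sequence means sorting the fragments by timestamp.</p>
<preformat preformat-type="code"><![CDATA[
```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class DeviceMeasurement:
    """X_p(t_i): the h measured attributes of one physical-side device."""
    device_id: str            # hypothetical identifier, not from the paper
    values: Dict[str, float]  # e.g. voltage, current, phase angle, P, Q


@dataclass(frozen=True)
class StateFragment:
    """S_p(t_i): measurements of all m devices collected at time t_i."""
    timestamp: float
    devices: List[DeviceMeasurement]


def build_physical_link(fragments: List[StateFragment]) -> List[StateFragment]:
    """Q_p(Delta t): state fragments ordered by acquisition time (assumed)."""
    return sorted(fragments, key=lambda fragment: fragment.timestamp)


if __name__ == "__main__":
    later = StateFragment(2.0, [DeviceMeasurement("X1", {"voltage": 1.01})])
    earlier = StateFragment(1.0, [DeviceMeasurement("X1", {"voltage": 0.99})])
    link = build_physical_link([later, earlier])
    print([fragment.timestamp for fragment in link])
```
]]></preformat>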
</sec>
<sec id="S3.SS2">
<title>Data Link of the Operating State of the Cyber Network</title>
<p>The transmission delay and data packet loss rate typically reflect the performance status of a cyber network. When the proportion of control signals or status information lost during data packet transmission exceeds the allowable limit, control of the device is considered lost due to a network attack (<xref ref-type="bibr" rid="B4">Davarikia and Barati, 2018</xref>; <xref ref-type="bibr" rid="B27">Wang Q. et al., 2019</xref>). Three indicators (delay ratio, packet loss ratio, and threat degree) are established to characterize the operating data link of the cyber network.</p>
<list list-type="simple">
<list-item>
<label>1.</label>
<p>The packet loss ratio (PR) is defined as follows:</p>
</list-item>
</list>
<disp-formula id="S3.E2">
<label>(2)</label>
<mml:math id="M2">
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mtext>pr</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mo largeop="true" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:msubsup>
<mml:mi>P</mml:mi>
<mml:mi>k</mml:mi>
<mml:mo>&#x2032;</mml:mo>
</mml:msubsup>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mfrac>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mi>P</mml:mi>
<mml:mi>T</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>n</mml:mi>
</mml:mfrac>
<mml:mo>&#x00D7;</mml:mo>
<mml:mrow>
<mml:mn>100</mml:mn>
<mml:mo lspace="0pt" rspace="3.5pt">%</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>n</italic> is the number of communication links transmitting data, <italic>P</italic>&#x2032;<sub><italic>k</italic></sub> is the number of data packets lost on link <italic>k</italic>, <italic>P</italic><sub><italic>k</italic></sub> is the number of data packets sent by link <italic>k</italic>, and <italic>P</italic><sub><italic>T</italic></sub> is the threshold of the packet loss rate of link <italic>k</italic>.</p>
<list list-type="simple">
<list-item>
<label>2.</label>
<p>The delay ratio (DR) is defined as follows:</p>
</list-item>
</list>
<disp-formula id="S3.E3">
<label>(3)</label>
<mml:math id="M3">
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>r</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>m</mml:mi>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mo largeop="true" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:msubsup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mi>T</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mtext>send</mml:mtext>
</mml:mrow>
</mml:msubsup>
<mml:mo>-</mml:mo>
<mml:msubsup>
<mml:mi>T</mml:mi>
<mml:mi>l</mml:mi>
<mml:mrow>
<mml:mtext>receive</mml:mtext>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mi>m</mml:mi>
</mml:mfrac>
<mml:mo>&#x00D7;</mml:mo>
<mml:mrow>
<mml:mn>100</mml:mn>
<mml:mo lspace="0pt" rspace="3.5pt">%</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>m</italic> is the number of devices that send information, <italic>T</italic><sub><italic>l</italic></sub><sup><italic>send</italic></sup> is the sending time of data packet <italic>l</italic>, and <italic>T</italic><sub><italic>l</italic></sub><sup><italic>receive</italic></sup> is the receiving time of data packet <italic>l</italic>.</p>
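<p>As a worked illustration, Eqs. (2) and (3) can be evaluated with the following Python sketch, which follows the formulas term by term as printed. The function names <italic>eq2_indicator</italic> and <italic>eq3_indicator</italic> and the sample values are hypothetical. Note that Eq. (3), as printed, takes the sending time minus the receiving time, so the result is non-positive whenever packets are received after they are sent.</p>
<preformat preformat-type="code"><![CDATA[
```python
from typing import Sequence


def eq2_indicator(lost: Sequence[float], sent: Sequence[float],
                  threshold: float) -> float:
    """Eq. (2): mean over n links of |P'_k / P_k - P_T|, times 100 (%)."""
    if len(lost) != len(sent) or len(sent) == 0:
        raise ValueError("need one (lost, sent) pair per link, n >= 1")
    n = len(sent)
    total = sum(abs(p_lost / p_sent - threshold)
                for p_lost, p_sent in zip(lost, sent))
    return total / n * 100.0


def eq3_indicator(send_times: Sequence[float],
                  receive_times: Sequence[float]) -> float:
    """Eq. (3): mean over m entries of (T_send - T_receive), times 100 (%)."""
    if len(send_times) != len(receive_times) or len(send_times) == 0:
        raise ValueError("need one (send, receive) pair per entry, m >= 1")
    m = len(send_times)
    total = sum(t_send - t_recv
                for t_send, t_recv in zip(send_times, receive_times))
    return total / m * 100.0


if __name__ == "__main__":
    # Two links: loss fractions 0.1 and 0.3 against a threshold of 0.1.
    print(eq2_indicator([1, 3], [10, 10], 0.1))
    # Two packets delayed by 0.5 and 0.25 time units.
    print(eq3_indicator([0.0, 1.0], [0.5, 1.25]))
```
]]></preformat>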
<list list-type="simple">
<list-item>
<label>3.</label>
<p>Threat degree <italic>W</italic><sub><italic>th</italic></sub> (<italic>a</italic><sub><italic>i,j</italic></sub>). Assuming that <italic>n</italic> alarm events are generated within the sampling time window &#x0394;<italic>t</italic>, the address set of the information equipment is {IP<sub>1</sub>, IP<sub>2</sub>, &#x2026;, IP<sub><italic>m</italic></sub>}, and <italic>a</italic><sub><italic>i,j</italic></sub> denotes the <italic>j</italic>th alarm event of IP<sub><italic>i</italic></sub>. The intrusion detection system (IDS) deployed in the power cyber network assigns each alarm event an original threat degree <italic>W</italic>. The threat degree is redefined as follows to determine the impact of alarm events on the attack risk of the entire system:</p>
</list-item>
</list>
<disp-formula id="S3.E4">
<label>(4)</label>
<mml:math id="M4">
<mml:mrow>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>h</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mo largeop="true" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:msubsup>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mover accent="true">
<mml:mi>w</mml:mi>
<mml:mo>&#x00AF;</mml:mo>
</mml:mover>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:msub>
<mml:mi>n</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mfrac>
<mml:mo>&#x00D7;</mml:mo>
<mml:mrow>
<mml:mn>100</mml:mn>
<mml:mo lspace="0pt" rspace="3.5pt">%</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>w</italic><sub><italic>ij</italic></sub> is the threat degree of alarm event <italic>a</italic><sub><italic>i,j</italic></sub>, <overline><italic>w</italic></overline><sub><italic>i</italic></sub> is the average threat degree of all alarm events in IP<sub><italic>i</italic></sub>, and <italic>n</italic><sub><italic>i</italic></sub> is the number of alarm events in IP<sub><italic>i</italic></sub>.</p>
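<p>A minimal Python sketch of Eq. (4) is given below. It assumes that the sum runs over the alarm events of a single address IP<sub><italic>i</italic></sub>, which is what the definitions of the mean threat degree and of <italic>n</italic><sub><italic>i</italic></sub> imply; the function name <italic>threat_degree</italic> and the sample values are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
from typing import Sequence


def threat_degree(alarm_threats: Sequence[float]) -> float:
    """Eq. (4): mean absolute deviation of the IDS threat degrees w_ij of
    one address IP_i from their average, times 100 (%)."""
    n_i = len(alarm_threats)
    if n_i == 0:
        raise ValueError("IP_i must have at least one alarm event")
    mean_w = sum(alarm_threats) / n_i
    return sum(abs(w - mean_w) for w in alarm_threats) / n_i * 100.0


if __name__ == "__main__":
    # Two alarms with original threat degrees 1 and 3 (mean 2, deviation 1).
    print(threat_degree([1.0, 3.0]))
    # Identical alarms deviate from their mean by zero.
    print(threat_degree([2.0, 2.0, 2.0]))
```
]]></preformat>
<p>Under this reading, the indicator grows with the spread of the alarm threat degrees of one device rather than with their absolute level.</p>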
<p>The three performance indicators of the operating status of the cyber network are used to establish the cyber system operating data link <italic>Q</italic><sub><italic>c</italic></sub>(&#x0394;<italic>t</italic>) in the interval &#x0394;<italic>t</italic>(<italic>t</italic><sub>1</sub>&#x223C;<italic>t</italic><sub><italic>n</italic></sub>):</p>
<disp-formula id="S3.E5">
<label>(5)</label>
<mml:math id="M5">
<mml:mrow>
<mml:mo>{</mml:mo>
<mml:mtable displaystyle="true" rowspacing="0pt">
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>Y</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>d</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>r</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo rspace="4.2pt">,</mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>p</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>r</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo rspace="4.2pt">,</mml:mo>
<mml:msub>
<mml:mi>W</mml:mi>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>h</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:msub>
<mml:mi>Y</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo rspace="4.2pt">,</mml:mo>
<mml:msub>
<mml:mi>Y</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo rspace="4.2pt">,</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">&#x2026;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>Y</mml:mi>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="left">
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>Q</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">&#x0394;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>1</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mn>2</mml:mn>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="normal">&#x2026;</mml:mi>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>S</mml:mi>
<mml:mi>c</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>t</mml:mi>
<mml:mi>n</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
<mml:mi/>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>Y</italic><sub><italic>c</italic></sub>(<italic>t</italic><sub><italic>i</italic></sub>) represents the <italic>R</italic><sub><italic>dr</italic></sub>, <italic>R</italic><sub><italic>pr</italic></sub>, and <italic>W</italic><sub><italic>th</italic></sub> obtained from the cyber-side device <italic>Y</italic><sub><italic>c</italic></sub> at time <italic>t</italic><sub><italic>i</italic></sub>; <italic>S</italic><sub><italic>c</italic></sub>(<italic>t</italic><sub><italic>i</italic></sub>) represents the status data obtained from <italic>k</italic> devices on the cyber side at time <italic>t</italic><sub><italic>i</italic></sub>.</p>
</sec>
<sec id="S3.SS3">
<title>Coupled Mapping of the Operating State of the Cyber&#x2013;Physical System</title>
<p>We use topological mapping to couple and map the data links of the two heterogeneous networks to form a data link of the cyber&#x2013;physical operating state. The grid can be divided into <italic>m</italic> areas according to the physical grid connectivity, and each area has <italic>n</italic> transmission lines. It is assumed that a line consists of <italic>k</italic> electrical components {<italic>X</italic><sub>1</sub>, <italic>X</italic><sub>2</sub>, &#x2026;, <italic>X</italic><sub><italic>k</italic></sub>}, each line is connected to <italic>n</italic> communication devices {<italic>Y</italic><sub>1</sub>, <italic>Y</italic><sub>2</sub>, &#x2026;, <italic>Y</italic><sub><italic>n</italic></sub>}, and each communication device has a unique IP address {IP<sub>1</sub>, IP<sub>2</sub>, &#x2026;, IP<sub><italic>n</italic></sub>} in the cyber network. We sequentially link the electrical component number, line number, and connected area in the data chain to create an index table with entries of the form &#x003C;Area, Line, ID, IP&#x003E;, that is, the connected area number, line number, electrical component ID number, and information component IP address. The cyber network operating data link <italic>Q</italic><sub><italic>c</italic></sub> and the physical power grid operating data link <italic>Q</italic><sub><italic>p</italic></sub> in the interval &#x0394;<italic>t</italic> are matched against the index table, and the data are stored under the corresponding index.</p>
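<p>The index-table coupling described above can be sketched in Python as follows. The names <italic>IndexEntry</italic> and <italic>couple_by_index</italic> are hypothetical, and the sketch assumes that physical records are keyed by electrical component ID and cyber records by IP address, and that an entry is kept only when both sides have data.</p>
<preformat preformat-type="code"><![CDATA[
```python
from typing import Dict, List, NamedTuple, Tuple


class IndexEntry(NamedTuple):
    """One row of the index table: area, line, component ID, IP address."""
    area: int
    line: int
    component_id: str
    ip: str


def couple_by_index(
    index: List[IndexEntry],
    physical: Dict[str, List[float]],  # component ID -> measurements
    cyber: Dict[str, List[float]],     # IP address -> cyber indicators
) -> Dict[IndexEntry, Tuple[List[float], List[float]]]:
    """Store the physical and cyber records under the matching index entry.

    Entries with data missing on either side are skipped (an assumption).
    """
    coupled = {}
    for entry in index:
        if entry.component_id in physical and entry.ip in cyber:
            coupled[entry] = (physical[entry.component_id], cyber[entry.ip])
    return coupled


if __name__ == "__main__":
    table = [IndexEntry(1, 1, "X1", "10.0.0.1"),
             IndexEntry(1, 2, "X2", "10.0.0.2")]
    q_p = {"X1": [1.0, 2.0]}
    q_c = {"10.0.0.1": [0.1, 0.2, 0.3], "10.0.0.2": [0.0, 0.0, 0.0]}
    for entry, pair in couple_by_index(table, q_p, q_c).items():
        print(entry, pair)
```
]]></preformat>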
<p>The cyber network clock with a collection period of <italic>T</italic> is used, and we set the sampling time window to &#x03B5; = [<italic>T</italic>&#x2212;&#x03B1;<italic>T</italic>&#x2032;, <italic>T</italic>], where &#x03B1; is the window size parameter; the larger its value, the longer the period covered by the window. Within the sampling time window &#x03B5;, many identical state events may occur in the cyber&#x2013;physical coupling state chain. Therefore, these repetitive events are filtered and compressed to form the cyber&#x2013;physical operating state data link, which is expressed as follows:</p>
<disp-formula id="S3.Ex1">
<label>(6)</label>
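<mml:math id="M6">
<mml:mrow>
<mml:mrow>
<mml:mi>Q</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>&#x03B5;</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:msub><mml:mi>x</mml:mi><mml:mn>1</mml:mn></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mrow><mml:msub><mml:mi>Q</mml:mi><mml:mi>p</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow><mml:msub><mml:mi>Q</mml:mi><mml:mi>c</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:msub><mml:mi>x</mml:mi><mml:mn>2</mml:mn></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mrow><mml:msub><mml:mi>Q</mml:mi><mml:mi>p</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow><mml:msub><mml:mi>Q</mml:mi><mml:mi>c</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="normal">&#x2026;</mml:mi>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:msub><mml:mi>x</mml:mi><mml:mi>n</mml:mi></mml:msub>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mrow><mml:msub><mml:mi>Q</mml:mi><mml:mi>p</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow><mml:msub><mml:mi>Q</mml:mi><mml:mi>c</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:math>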
</disp-formula>
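<p>The filtering and compression that lead to Eq. (6) can be illustrated with a minimal sketch. It assumes that each state event is a tuple of a time stamp, a physical state, and a cyber state, that T&#x2032; acts as a time unit (the hypothetical parameter <monospace>t_unit</monospace>), and that "repetitive" means consecutive events with an identical state pair. The function name and the sample events are illustrative only and are not taken from the experiments.</p>
<preformat preformat-type="code"><![CDATA[
```python
def compress_window(events, T, alpha, t_unit=1.0):
    """Keep events inside [T - alpha * t_unit, T] and drop repeated states.

    events: iterable of (t, q_p, q_c) tuples, assumed sorted by time t.
    Consecutive events with an identical (q_p, q_c) pair are collapsed
    to their first occurrence (an assumed reading of "filtered and compressed").
    """
    start = T - alpha * t_unit
    compressed = []
    last_state = None
    for t, q_p, q_c in events:
        if t < start or t > T:
            continue  # outside the sampling window
        state = (q_p, q_c)
        if state == last_state:
            continue  # repeated state event: filtered out
        compressed.append((t, q_p, q_c))
        last_state = state
    return compressed


# Hypothetical events; the state names are placeholders.
events = [
    (1.0, "normal", "normal"),
    (6.0, "normal", "normal"),
    (7.0, "normal", "normal"),
    (8.0, "overcurrent", "normal"),
    (9.0, "overcurrent", "normal"),
    (10.0, "overcurrent", "packet_loss"),
]
link = compress_window(events, T=10.0, alpha=5, t_unit=1.0)
```
]]></preformat>
<p>With these inputs the window is [5, 10], so the event at time 1 is discarded and the repeated states at times 7 and 9 are removed, leaving three entries in the compressed data link.</p>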
</sec>
</sec>
<sec id="S4">
<title>Coordinated Cyber-Attack Detection Model of the Cyber&#x2013;Physical Power System</title>
<sec id="S4.SS1">
<title>Operating State Clustering Based on the Two-Step Principal Component Analysis</title>
<p>There are no labels for the different state categories in the original cyber&#x2013;physical operating state data link <italic>Q</italic>(&#x03B5;), so the state categories must be distinguished using cluster analysis. In this paper, a two-step principal component analysis (PCA) clustering algorithm is proposed. The PCA algorithm is used to cluster, transform, and filter the correlated attributes to extract linearly uncorrelated attributes (<xref ref-type="bibr" rid="B10">Jian et al., 2004</xref>). The two-step algorithm is then used to cluster the attribute set; it reduces the computational complexity and provides high clustering accuracy (<xref ref-type="bibr" rid="B5">Dom et al., 2003</xref>; <xref ref-type="bibr" rid="B19">Northrup et al., 2004</xref>; <xref ref-type="bibr" rid="B21">Phelps et al., 2009</xref>). The algorithm steps are as follows:</p>
<p>Input: cyber&#x2013;physical operating state data link <italic>Q</italic>(&#x03B5;) = {<italic>x</italic><sub>1</sub>, <italic>x</italic><sub>2</sub>, &#x2026;, <italic>x</italic><sub><italic>n</italic></sub>}.</p>
<p>Output: <italic>D</italic> = {<italic>x</italic><sub><italic>i</italic></sub>, <italic>C</italic><sub><italic>i</italic></sub>}, where <italic>C</italic><sub><italic>i</italic></sub> is the operating state cluster assigned to <italic>x</italic><sub><italic>i</italic></sub>, taken from the clusters <italic>C</italic> = {<italic>C</italic><sub>1</sub>, <italic>C</italic><sub>2</sub>, &#x2026;, <italic>C</italic><sub><italic>k</italic></sub>}.</p>
<p>Step 1: Feature selection for clustering. The PCA algorithm is used to map the <italic>n</italic> attributes in the data link to <italic>m</italic> dimensions (<italic>m</italic> &#x003C; <italic>n</italic>). The correlated attributes are filtered using an orthogonal transformation to obtain <italic>m</italic>-dimensional new features, <italic>A</italic> = {<italic>A</italic><sub>1</sub>, <italic>A</italic><sub>2</sub>, &#x2026;, <italic>A</italic><sub><italic>m</italic></sub>}. The samples are mean-centered as <inline-formula><mml:math id="INEQ6"><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mtext>i</mml:mtext></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mtext>i</mml:mtext></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:mi>n</mml:mi></mml:mfrac><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:msubsup><mml:mo largeop="true" symmetric="true">&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>n</mml:mi></mml:msubsup><mml:msub><mml:mi>x</mml:mi><mml:mrow><mml:mtext>i</mml:mtext></mml:mrow></mml:msub></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula>, the covariance matrix <italic>XX</italic><sup>T</sup> is derived from the centered data, and its eigenvalues and eigenvectors are obtained. The data link set <italic>Q</italic>&#x2019;(&#x03B5;) is obtained after dimensionality reduction.</p>
<p>Step 2: Calculate the number of category clusters in the operating state. After the traversal process, the clustering feature (CF) tree growth of the balanced iterative reducing and clustering using hierarchies (BIRCH) algorithm is applied to the data link set <italic>Q</italic>&#x2019;(&#x03B5;). The data points in the data set are evaluated one by one to collect all data points in the dense area while generating the CF tree. The log-likelihood distance <italic>d</italic>(<italic>C</italic><sub><italic>s</italic></sub>,<italic>C</italic><sub><italic>t</italic></sub>) = &#x03B6;<sub><italic>s</italic></sub> + &#x03B6;<sub><italic>t</italic></sub>&#x2212;&#x03B6;<sub>&#x003C; <italic>s</italic>,<italic>t</italic> &#x003E;</sub> between two clusters is used to create many small subclusters. The Bayesian information criterion (BIC) is used to calculate the number of possible division schemes for the state category.</p>
<p>Step 3: Determine the number of categories <italic>C</italic><sub><italic>J</italic></sub> in <italic>Q</italic>&#x2019;(&#x03B5;). The agglomerative hierarchical clustering (AHC) method is used to merge the subclusters one by one, and the desired number of clusters is reached according to the distance ratio <italic>R</italic>(<italic>k</italic>):</p>
<disp-formula id="S4.E7">
<label>(7)</label>
<mml:math id="M7">
<mml:mrow>
<mml:mrow>
<mml:mi>R</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mi>d</mml:mi>
<mml:mi>min</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>d</mml:mi>
<mml:mi>min</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>+</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>C</italic><sub><italic>k</italic></sub> and <italic>C</italic><sub><italic>k</italic>+1</sub> are the partition schemes with <italic>k</italic> and <italic>k</italic>+1 clusters, respectively, and <italic>d</italic><sub><italic>min</italic></sub>(<italic>C</italic><sub><italic>k</italic></sub>) and <italic>d</italic><sub><italic>min</italic></sub>(<italic>C</italic><sub><italic>k</italic>+1</sub>) are the minimum distances between two clusters in the respective schemes.</p>
<p>Step 4: Label the sample data in each operating state cluster. To determine the data points in each cluster, each data point <italic>x</italic><sub><italic>i</italic></sub> in the state data link set <italic>Q</italic>&#x2019;(&#x03B5;) is regarded as a single-point cluster, and the log-likelihood distance between <italic>x</italic><sub><italic>i</italic></sub> and each cluster in the clustering result <italic>C</italic><sub><italic>J</italic></sub> is calculated. Given the distance <italic>d</italic>({<italic>x</italic><sub><italic>i</italic></sub>}, <italic>C</italic><sub><italic>J</italic></sub>), <italic>x</italic><sub><italic>i</italic></sub> is placed into the nearest cluster, and labels are generated for each operating state category <italic>C</italic> = {<italic>C</italic><sub>1</sub>, <italic>C</italic><sub>2</sub>, &#x2026;, <italic>C</italic><sub><italic>k</italic></sub>}.</p>
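<p>The ratio in Eq. (7) can be illustrated with a short sketch. The text does not state how <italic>R</italic>(<italic>k</italic>) is turned into a decision, so the rule used here, choosing the <italic>k</italic> with the largest ratio, is an assumption, and the distance values are hypothetical placeholders rather than experimental results.</p>
<preformat preformat-type="code"><![CDATA[
```python
def distance_ratios(d_min):
    """Compute R(k) = d_min(C_k) / d_min(C_{k+1}) as in Eq. (7).

    d_min maps a cluster count k to the minimum inter-cluster distance
    of the partition scheme with k clusters.
    """
    ratios = {}
    for k in sorted(d_min):
        if k + 1 in d_min and d_min[k + 1] > 0:
            ratios[k] = d_min[k] / d_min[k + 1]
    return ratios


def pick_cluster_count(d_min):
    """Assumed selection rule: the k with the largest ratio R(k)."""
    ratios = distance_ratios(d_min)
    return max(ratios, key=ratios.get)


# Hypothetical distances for schemes with 2 to 7 clusters (not from the paper).
d_min = {2: 9.0, 3: 8.0, 4: 7.5, 5: 6.0, 6: 1.5, 7: 1.2}
ratios = distance_ratios(d_min)
best_k = pick_cluster_count(d_min)
```
]]></preformat>
<p>In this example the minimum distance drops sharply when moving from five to six clusters, so <italic>R</italic>(5) is the largest ratio and five clusters are selected under the assumed rule.</p>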
</sec>
<sec id="S4.SS2">
<title>Algorithm to Reduce the Imbalance of the Operating State Classes</title>
<p>A coordinated cyber-attack on the CPPS is a low-probability, high-risk event. In the data link <italic>Q</italic>(&#x03B5;), normal operation data account for the largest proportion, whereas the proportion of attack data is relatively small, resulting in unbalanced data. Therefore, the ADASYN algorithm is used to deal with the imbalance of the operating state classes (<xref ref-type="bibr" rid="B22">Qu et al., 2018</xref>; <xref ref-type="bibr" rid="B26">Wang et al., 2020</xref>). A balanced data distribution is obtained by adaptive synthetic oversampling: different minority samples are given different weights to generate different numbers of samples. The algorithm process is as follows:</p>
<p>Input: <italic>D</italic> = {<italic>x</italic><sub><italic>i</italic></sub>, <italic>C</italic><sub><italic>i</italic></sub>}, where <italic>x</italic><sub><italic>i</italic></sub> is a sample of the cyber&#x2013;physical operating state data link <italic>Q</italic>(&#x03B5;), <italic>C</italic><sub><italic>i</italic></sub> is its class label, &#x03B1; is the imbalance threshold, <italic>C</italic><sub><italic>k</italic></sub> is a minority class, and <italic>C</italic><sub><italic>l</italic></sub> is the majority class.</p>
<p>Output: Balanced data set <italic>D</italic>&#x2032;.</p>
<p>Step 1: Calculate the class imbalance, <inline-formula><mml:math id="INEQ12"><mml:mrow><mml:mi>Imbalance</mml:mi><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mpadded width="+2.8pt"><mml:mi>Large</mml:mi></mml:mpadded><mml:mo>&#x2062;</mml:mo><mml:mi>num</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi mathvariant="normal">C</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mpadded width="+2.8pt"><mml:mi>Small</mml:mi></mml:mpadded><mml:mo>&#x2062;</mml:mo><mml:mi>num</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi mathvariant="normal">C</mml:mi><mml:mi>k</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow></mml:math></inline-formula>. Then calculate the number of samples to be synthesized based on the degree of imbalance, <italic>G</italic> = (<italic>Large</italic> <italic>num</italic>(C<sub><italic>l</italic></sub>)&#x2212;<italic>Small</italic> <italic>num</italic>(C<sub><italic>k</italic></sub>))&#x00D7;&#x03B2;, &#x03B2;&#x2208;[0,1].</p>
<p>Step 2: For each minority sample, calculate the proportion of the majority class among its <italic>K</italic>-nearest neighbors (KNNs): <italic>r</italic><sub><italic>i</italic></sub> = &#x0394;<sub><italic>i</italic></sub>/<italic>K</italic>, where &#x0394;<sub><italic>i</italic></sub> is the number of majority-class samples among the KNNs.</p>
<p>Step 3: Normalize <italic>r</italic><sub><italic>i</italic></sub> to obtain the relative proportion of the majority class surrounding each minority sample.</p>
<disp-formula id="S4.E8">
<label>(8)</label>
<mml:math id="M8">
<mml:mrow>
<mml:mover>
<mml:mpadded lspace="2.8pt" width="+2.8pt">
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mpadded>
<mml:mrow>
<mml:mi mathvariant="normal">&#x2227;</mml:mi>
<mml:mphantom>
<mml:mi>i</mml:mi>
</mml:mphantom>
</mml:mrow>
</mml:mover>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:msubsup>
<mml:mo largeop="true" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mtext>i</mml:mtext>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mpadded width="+2.8pt">
<mml:mrow>
<mml:mtext>Small</mml:mtext>
</mml:mrow>
</mml:mpadded>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mtext>num</mml:mtext>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Step 4: Calculate the number of samples that need to be generated for each minority sample in <italic>C</italic><sub><italic>k</italic></sub>.</p>
<disp-formula id="S4.E9">
<label>(9)</label>
<mml:math id="M9">
<mml:mrow>
<mml:msub>
<mml:mi>g</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mover>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mrow>
<mml:mi mathvariant="normal">&#x2227;</mml:mi>
<mml:mphantom>
<mml:mi>i</mml:mi>
</mml:mphantom>
</mml:mrow>
</mml:mover>
<mml:mo>&#x00D7;</mml:mo>
<mml:mi>G</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Step 5: For each minority sample, randomly select a minority sample among its <italic>K</italic> nearest neighbors and synthesize a new sample using Eq. (10). Repeat the synthesis <italic>g</italic><sub><italic>i</italic></sub> times for the <italic>i</italic>-th minority sample.</p>
<disp-formula id="S4.E10">
<label>(10)</label>
<mml:math id="M10">
<mml:mrow>
<mml:msub>
<mml:mi>s</mml:mi>
<mml:mrow>
<mml:mtext>i</mml:mtext>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mrow>
<mml:mi>z</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:msub>
<mml:mi>X</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x00D7;</mml:mo>
<mml:mi mathvariant="normal">&#x03B7;</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>s</italic><sub><italic>i</italic></sub> is the synthesized sample, <italic>X</italic><sub><italic>i</italic></sub> is the <italic>i</italic>-th sample of the minority class, <italic>X</italic><sub><italic>z</italic><italic>i</italic></sub> is a randomly selected minority sample among the KNNs of <italic>X</italic><sub><italic>i</italic></sub>, and &#x03B7; &#x2208; [0,1] is a random number.</p>
<p>The synthesis is repeated until the total number of synthesized samples <italic>G</italic> calculated in Step 1 has been obtained.</p>
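<p>The five steps above can be summarized in a minimal sketch that uses only the Python standard library. It assumes Euclidean distance, a single minority class, rounding of <italic>g</italic><sub><italic>i</italic></sub> to the nearest integer, and a uniformly distributed random factor in Eq. (10). The function name <monospace>adasyn</monospace>, its parameters, and the sample points are hypothetical and do not reproduce the configuration used in the experiments.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math
import random


def adasyn(minority, majority, beta=1.0, K=3, seed=0):
    """Return synthetic minority samples as lists of floats."""
    rng = random.Random(seed)
    # Step 1: number of samples to synthesize, G = (Large - Small) * beta.
    G = (len(majority) - len(minority)) * beta

    labelled = [(x, 0) for x in minority] + [(x, 1) for x in majority]

    def neighbours(x, pool):
        # K nearest points in pool, excluding x itself (compared by identity).
        others = [p for p in pool if p[0] is not x]
        others.sort(key=lambda p: math.dist(x, p[0]))
        return others[:K]

    # Step 2: r_i = share of majority samples among the K nearest neighbours.
    r = [sum(lbl for _, lbl in neighbours(x, labelled)) / K for x in minority]

    # Step 3: normalize so that the r_hat values sum to one (Eq. 8).
    total = sum(r)
    if total == 0:
        return []  # no minority sample borders the majority class
    r_hat = [ri / total for ri in r]

    # Step 4: g_i = r_hat_i * G (Eq. 9), rounded to an integer count.
    g = [int(round(rh * G)) for rh in r_hat]

    # Step 5: s_i = X_i + (X_zi - X_i) * eta (Eq. 10).
    minority_pool = [(x, 0) for x in minority]
    synthetic = []
    for x, gi in zip(minority, g):
        near = neighbours(x, minority_pool)
        if not near:
            continue  # a lone minority sample has no minority neighbour
        for _ in range(gi):
            xz = rng.choice(near)[0]
            eta = rng.random()
            synthetic.append([a + (b - a) * eta for a, b in zip(x, xz)])
    return synthetic


# Hypothetical two-dimensional samples: 3 minority and 9 majority points.
minority = [[0.0, 0.0], [0.2, 0.1], [0.1, 0.3]]
majority = [[1.0, 1.0], [0.9, 1.1], [1.2, 0.8], [0.3, 0.2], [1.1, 1.0],
            [0.8, 0.9], [1.0, 1.2], [1.3, 1.1], [0.7, 1.2]]
new_samples = adasyn(minority, majority, beta=1.0, K=3)
```
]]></preformat>
<p>Because every synthetic point lies on a segment between two minority samples, the new samples stay inside the region already occupied by the minority class. Rounding <italic>g</italic><sub><italic>i</italic></sub> means that the total may differ slightly from <italic>G</italic> for other inputs.</p>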
</sec>
<sec id="S4.SS3">
<title>Classification Algorithm of Coordinated Cyber-Attacks Based on Cost-Sensitive Gradient Boosting Decision Tree</title>
<p>The purpose of attack detection is to minimize the harm to the power grid caused by the attack. The harm caused by misinterpreting an attack as a normal event is far greater than that caused by misinterpreting a normal event as an attack (<xref ref-type="bibr" rid="B9">Huang et al., 2018</xref>). We therefore propose using a cost-sensitive (CS) function to improve the gradient boosting decision tree (GBDT) (<xref ref-type="bibr" rid="B23">Sakhnovich, 2011</xref>; <xref ref-type="bibr" rid="B14">Liao et al., 2016</xref>). The CS loss function replaces the standard loss function to reduce the misclassification of attack events. The improved CS loss function is defined as follows:</p>
<disp-formula id="S4.E11">
<label>(11)</label>
<mml:math id="M11">
<mml:mrow>
<mml:mrow>
<mml:mtext>Loss</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mi>C</mml:mi>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">f</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi mathvariant="normal">x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:munderover>
<mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>K</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mi>w</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mi>log</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>K</italic> is the number of attack classes, <italic>C</italic><sub><italic>k</italic></sub> is the sample of the <italic>k</italic>-th attack, <italic>p</italic><sub><italic>k</italic></sub>(<italic>x</italic>) is the probability of the <italic>k</italic>-th attack, and <italic>w</italic><sub><italic>k</italic></sub> is the CS function. The CS function is divided into two costs, i.e., the missed detection cost <inline-formula><mml:math id="INEQ19"><mml:mrow><mml:mrow><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mo>-</mml:mo><mml:mo>,</mml:mo><mml:mo>+</mml:mo><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:mrow><mml:mi>p</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mi>p</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>&#x2265;</mml:mo><mml:mfrac><mml:msub><mml:mi>w</mml:mi><mml:mo>-</mml:mo></mml:msub><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mo>+</mml:mo></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>w</mml:mi><mml:mo>-</mml:mo></mml:msub></mml:mrow></mml:mfrac></mml:mrow></mml:math></inline-formula> and the misdetection cost <inline-formula><mml:math id="INEQ20"><mml:mrow><mml:mrow><mml:mrow><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mo>+</mml:mo><mml:mo>,</mml:mo><mml:mo>-</mml:mo><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mi>p</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mi>p</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>x</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>&lt;</mml:mo><mml:mfrac><mml:msub><mml:mi>w</mml:mi><mml:mo>-</mml:mo></mml:msub><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mo>+</mml:mo></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>w</mml:mi><mml:mo>-</mml:mo></mml:msub></mml:mrow></mml:mfrac></mml:mrow><mml:mo>.</mml:mo></mml:mrow></mml:math></inline-formula></p>
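<p>The effect of the class weights in Eq. (11) can be seen in a small sketch. For simplicity it treats each <italic>w</italic><sub><italic>k</italic></sub> as a fixed constant instead of the piecewise, probability-dependent cost described above. This simplification, the weight values, and the function name <monospace>cs_loss</monospace> are assumptions made only for illustration.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def cs_loss(one_hot, probs, weights):
    """Cost-weighted cross-entropy: -sum_k w_k * C_k * log(p_k(x))."""
    eps = 1e-12  # guards against log(0)
    return -sum(w * c * math.log(max(p, eps))
                for w, c, p in zip(weights, one_hot, probs))


# Hypothetical costs: class 0 is "normal"; classes 1 and 2 are attacks whose
# misclassification is assumed to be five times as costly.
weights = [1.0, 5.0, 5.0]
probs = [0.7, 0.2, 0.1]  # predicted class probabilities for one sample

loss_normal = cs_loss([1, 0, 0], probs, weights)  # true class: normal
loss_attack = cs_loss([0, 1, 0], probs, weights)  # true class: attack 1
```
]]></preformat>
<p>For the same predicted probabilities, the loss is much larger when the true class is an attack that received a low probability, which is the behavior the CS function is intended to produce.</p>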
<p>Coordinated cyber-attack detection is a multi-classification task. A total of <italic>K</italic> types of attacks are assumed. The sample <italic>x</italic> in the cyber&#x2013;physical operating state set is obtained, and the CS-GBDT algorithm is used to determine which class the <italic>x</italic> sample belongs to. The specific steps of the algorithm are as follows:</p>
<p>Input: Balanced data set <inline-formula><mml:math id="INEQ21"><mml:mrow><mml:msup><mml:mi>D</mml:mi><mml:mo>&#x2032;</mml:mo></mml:msup><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:mi mathvariant="normal">&#x2026;</mml:mi><mml:mo>,</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>x</mml:mi><mml:mi>N</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>C</mml:mi><mml:mi>N</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula>, loss function Loss(<italic>C</italic><sub><italic>k</italic></sub>, <italic>f</italic><sub><italic>k</italic></sub>(<italic>x</italic>)), and the number of classifiers <italic>M</italic>.</p>
<p>Output: A strong learner for attack classification <italic>F</italic>(<italic>x</italic>).</p>
<p>Step 1: Initialize <italic>f</italic><sub><italic>k</italic>0</sub>(<italic>x</italic>) = 0 for each class <italic>k</italic> = 1, 2, &#x2026;, <italic>K</italic>.</p>
<p>Step 2: For <italic>t</italic> = 1 to <italic>M</italic>, repeat Steps 3 through 6; the <italic>M</italic> iterations build <italic>M</italic> classifiers in total.</p>
<p>Step 3: The one-hot code for each class <italic>y</italic><sub><italic>i</italic></sub> is generated, and the probability <italic>p</italic><sub><italic>k</italic></sub>(<italic>x</italic>) that sample <italic>x</italic> belongs to the <italic>k</italic>-th attack class is calculated:</p>
<disp-formula id="S4.E12">
<label>(12)</label>
<mml:math id="M12">
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
<mml:mrow>
<mml:msubsup>
<mml:mo largeop="true" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>K</mml:mi>
</mml:msubsup>
<mml:msup>
<mml:mi>e</mml:mi>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>l</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<p>Step 4: For <italic>k</italic> = 1 to <italic>K</italic>, repeat Steps 5 and 6 to generate <italic>K</italic> different CART classification trees <italic>f</italic><sub>1</sub>(<italic>x</italic>), <italic>f</italic><sub>2</sub>(<italic>x</italic>), &#x2026;, <italic>f</italic><sub><italic>K</italic></sub>(<italic>x</italic>).</p>
<p>Step 5: Calculate the negative gradient of the loss for each class and obtain the negative gradient error of the <italic>i</italic>-th sample corresponding to category <italic>k</italic> in the <italic>t</italic>-th iteration:</p>
<disp-formula id="S4.E13">
<label>(13)</label>
<mml:math id="M13">
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mpadded lspace="2.8pt" width="+2.8pt">
<mml:mi>r</mml:mi>
</mml:mpadded>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>p</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mrow>
<mml:mo>&#x2202;</mml:mo>
<mml:mo>&#x2061;</mml:mo>
<mml:mrow>
<mml:mi>L</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>o</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>s</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>s</mml:mi>
</mml:mrow>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo>&#x2202;</mml:mo>
<mml:mo>&#x2061;</mml:mo>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mi>k</mml:mi>
</mml:msub>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
<mml:mo>]</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="normal">&#x2026;</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>N</italic> is the number of samples.</p>
<p>We use the estimated residuals {(<italic>x</italic><sub>1</sub>, <italic>r</italic><sub><italic>k</italic>1</sub>), &#x2026;, (<italic>x</italic><sub><italic>N</italic></sub>, <italic>r</italic><sub><italic>kN</italic></sub>)} as the input to fit the <italic>m</italic>-th decision tree and calculate the output value of each of its leaf node regions:</p>
<disp-formula id="S4.E14">
<label>(14)</label>
<mml:math id="M14">
<mml:mrow>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>K</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>K</mml:mi>
</mml:mfrac>
<mml:mo>&#x2062;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mo largeop="true" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mo largeop="true" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mi>x</mml:mi>
<mml:mi>i</mml:mi>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>|</mml:mo>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mo>|</mml:mo>
<mml:msub>
<mml:mi>r</mml:mi>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>|</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>R</italic><sub><italic>mkj</italic></sub> is the <italic>j</italic>-th leaf node region of the <italic>m</italic>-th tree for class <italic>k</italic>, and <italic>K</italic> is the number of categories.</p>
<p>Step 6: Update the classifier <italic>f</italic><sub><italic>mk</italic></sub>(<italic>x</italic>).</p>
<disp-formula id="S4.E15">
<label>(15)</label>
<mml:math id="M15">
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mpadded lspace="2.8pt" width="+2.8pt">
<mml:mi>f</mml:mi>
</mml:mpadded>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mi>f</mml:mi>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mrow>
<mml:munderover>
<mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>J</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>I</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo rspace="5.3pt">,</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo rspace="5.3pt">&#x2208;</mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>J</italic> is the number of leaf nodes per tree.</p>
<p>Step 7: Build the final strong learner used for attack detection.</p>
<disp-formula id="S4.E16">
<label>(16)</label>
<mml:math id="M16">
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mpadded lspace="5.6pt" width="+5.6pt">
<mml:mi>F</mml:mi>
</mml:mpadded>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>x</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>=</mml:mo>
<mml:mrow>
<mml:munderover>
<mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>M</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:munderover>
<mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>=</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mi>J</mml:mi>
</mml:munderover>
<mml:mrow>
<mml:msub>
<mml:mi>C</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>I</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msub>
<mml:mi>R</mml:mi>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>k</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
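<p>The per-sample calculations in Steps 3 through 6 can be traced with a minimal sketch. Tree fitting is omitted: the leaf is assumed to contain a single sample, and the leaf output uses the standard multiclass gradient-boosting form with |<italic>r</italic>|(1&#x2212;|<italic>r</italic>|) in the denominator, which is an assumption of this sketch. The CS weights are also left out, and all function names are hypothetical.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def softmax(scores):
    """Eq. (12): p_k(x) = exp(f_k(x)) / sum_l exp(f_l(x))."""
    m = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def residuals(one_hot, scores):
    """Eq. (13): r_k = C_k - p_k(x) for one sample."""
    return [c - p for c, p in zip(one_hot, softmax(scores))]


def leaf_value(leaf_residuals, K):
    """Leaf output (K - 1) / K * sum(r) / sum(|r| * (1 - |r|)).

    Standard multiclass gradient-boosting form, assumed for this sketch.
    """
    num = sum(leaf_residuals)
    den = sum(abs(r) * (1.0 - abs(r)) for r in leaf_residuals)
    if den == 0.0:
        return 0.0
    return (K - 1) / K * num / den


K = 3
scores = [0.0, 0.0, 0.0]           # Step 1: f_k0(x) = 0 for every class
p = softmax(scores)                # uniform probabilities at the start
r = residuals([0, 1, 0], scores)   # sample whose true class is the second one
gamma = leaf_value([r[1]], K)      # a leaf holding only this sample
updated = scores[1] + gamma        # additive update in the spirit of Eq. (15)
```
]]></preformat>
<p>Starting from zero scores, the residual of the true class is positive and those of the other classes are negative, so the update raises the score of the true class and, after the next softmax, its predicted probability.</p>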
</sec>
</sec>
<sec id="S5">
<title>Experimental Analysis</title>
<sec id="S5.SS1">
<title>Experimental Environment and Data</title>
<p>We simulate the different fault states of the physical power grid caused by cyber-attacks on the IEEE 39-bus system in the RT-LAB and OPNET co-simulation environment. On the cyber side, we collect the DR, PR, and threat information at different times; on the physical side, the voltage, current, impedance, and other data are collected. Ten data sets are obtained. Each set contains 56 attributes, and the cumulative number of records is about 50,000, covering five types of operating states in the CPPS, as follows:</p>
<p>(1) Normal operating state (S1): there is no network attack on the cyber side, and the power grid on the physical side is operating normally. (2) Distributed denial of service (DDOS) attack state (S2): the data in the communication system are blocked by a DDOS attack, affecting the normal operation of the power system, measurement acquisition, and control commands. (3) Data injection attack state (S3): malicious data are injected into the physical power grid and disguised as a normal fault, so that the operator mistakenly assumes a short-circuit fault. (4) Protection device parameter tampering attack state (S4): the attacker tampers with the distance parameter of the protection device, so that the protection device fails to disconnect the fault area. (5) Fault operation state (S5): the physical power grid has a single-phase, two-phase, or short-circuit fault.</p>
</sec>
<sec id="S5.SS2">
<title>Results of the Operating State Classification of the Cyber&#x2013;Physical Power System</title>
<p>Data set 1, which contains 4,966 records, is selected for the experiment. After the two-step PCA clustering algorithm is applied, 89 outliers are identified and five operating states are obtained, as shown in <xref ref-type="fig" rid="F2">Figure 2A</xref>. The largest cluster is Cluster-3 (S2), with 1,325 records, accounting for 29.3% of the records; its clustering superiority is 0.93, and its clustering importance is 0.85. The smallest cluster is Cluster-4 (S5), with 59 records, accounting for 1.3% of all records; its clustering superiority is 0.97, and its clustering importance is 0.93. Clustering superiority is a measure of cluster separation (&#x2212;1 to 0.2, poor; 0.2 to 0.5, medium; 0.5 to 1, good), and clustering importance is a measure of cluster cohesion (0 to 0.2, poor; 0.2 to 0.6, medium; 0.6 to 1, good) (<xref ref-type="bibr" rid="B17">Nair and Narendran, 1997</xref>).</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>Clustering results of five states <bold>(A)</bold>, Current amplitude under DDOS attack <bold>(B)</bold>, Current amplitude under fault data injection attack <bold>(C)</bold>, Current amplitude under fault state <bold>(D)</bold>, Current amplitude under parameter tampering attack <bold>(E)</bold>, Performance comparison <bold>(F)</bold>.</p></caption>
<graphic xlink:href="fenrg-09-666130-g002.tif"/>
</fig>
<p>The negative sequence and zero sequence current amplitudes of each cluster yield the curves of the three attack states and the fault state, as shown in <xref ref-type="fig" rid="F2">Figures 2B&#x2013;E</xref>. Cluster-2 is significantly different from the other four states, while Cluster-4 and Cluster-5 are highly similar. The reason is that Cluster-2 corresponds to an attack that causes network blocking and delay, which differs markedly from the data tampering attacks. Cluster-3 and Cluster-5 are physical power grid failures caused by information tampering attacks, and they resemble the changes that occur during the normal power grid fault in Cluster-4.</p>
<p>The adjusted Rand index (ARI) is used to measure the accuracy of the clustering results. ARI &#x2208; [&#x2212;1,1], and the closer the value is to 1, the better the clustering performance. The index is calculated as follows:</p>
<disp-formula id="S5.E17">
<label>(17)</label>
<mml:math id="M17">
<mml:mrow>
<mml:mtext>ARI</mml:mtext>
<mml:mo>=</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mrow>
<mml:mi>R</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>I</mml:mi>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">RI</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mi>max</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">RI</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mi>E</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi mathvariant="italic">RI</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <italic>RI</italic> is the Rand coefficient, <italic>E</italic>(<italic>RI</italic>) is the expected value of <italic>RI</italic> under random cluster assignment, and max(<italic>RI</italic>) is its maximum value.</p>
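<p>As a minimal sketch of Eq. 17, the ARI can be computed from pair counts over the contingency table of two labelings. The function name adjusted_rand_index and the example labels are illustrative assumptions, not the implementation used in the experiments. The pair-counting form is used here because, for fixed cluster sizes, <italic>RI</italic> is an affine function of the number of record pairs grouped together in both labelings, so the ratio in Eq. 17 is unchanged.</p>
<preformat>
```python
from collections import Counter
from math import comb


def adjusted_rand_index(labels_true, labels_pred):
    """Adjusted Rand index of two labelings, computed from pair counts."""
    if len(labels_true) != len(labels_pred):
        raise ValueError("both labelings must cover the same records")
    total_pairs = comb(len(labels_true), 2)
    if total_pairs == 0:
        return 1.0  # zero or one record: the labelings trivially agree

    # Pairs of records grouped together in both labelings, in the
    # reference labeling, and in the predicted labeling.
    together_both = sum(
        comb(c, 2) for c in Counter(zip(labels_true, labels_pred)).values()
    )
    together_true = sum(comb(c, 2) for c in Counter(labels_true).values())
    together_pred = sum(comb(c, 2) for c in Counter(labels_pred).values())

    expected = together_true * together_pred / total_pairs  # chance level
    maximum = (together_true + together_pred) / 2           # upper bound
    if maximum == expected:
        return 1.0  # degenerate case, e.g. every record in a single cluster
    return (together_both - expected) / (maximum - expected)


# Hypothetical labels: cluster names differ, but the partition is identical.
states = ["S1", "S1", "S2", "S2", "S3"]
clusters = [2, 2, 0, 0, 1]
print(adjusted_rand_index(states, clusters))
```
</preformat>
<p>Because the index depends only on which records are grouped together, relabeling the clusters does not change its value, which is why it suits the comparison of cluster assignments with operating state labels.</p>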
<p>Four typical clustering algorithms are selected for performance comparison, i.e., K-means, density-based spatial clustering of applications with noise (DBSCAN), clustering using representatives (CURE), and BIRCH. In the experiment, the test samples are drawn at random, with the sample size ranging from 5 to 100% of the data set. The ARI values of the different algorithms are shown in <xref ref-type="fig" rid="F2">Figure 2F</xref>. As the proportion of the test data set increases, the ARI increases significantly. The accuracy of the proposed two-step PCA method is 97% for a sample size of 100%, demonstrating the excellent performance of this method. The K-means algorithm has the lowest ARI values.</p>
</sec>
<sec id="S5.SS3">
<title>Result of Balancing the Operation State Classes</title>
<p>The number of samples in each operating state class across the 10 data sets before balancing is shown in <xref ref-type="fig" rid="F3">Figure 3A</xref>. The samples are unevenly distributed over the operating states. The largest number of records (143,766) occurs in the S3 state, and the smallest number (3,080) is observed in the S5 state. The maximum class imbalance is 3.77. The joint data set contains multiple minority and majority categories, i.e., it exhibits multi-category imbalance.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p>Data set before balancing <bold>(A)</bold>, Data set after balancing <bold>(B)</bold>.</p></caption>
<graphic xlink:href="fenrg-09-666130-g003.tif"/>
</fig>
<p>The ADASYN algorithm is used to oversample the categories whose sample count falls below the threshold. We set the maximum imbalance threshold to 1.2. The results for the 10 data sets are shown in <xref ref-type="fig" rid="F3">Figure 3B</xref>. After balancing, each operating state class accounts for close to 20% of the records. The ADASYN algorithm uses local screening and sampling to reduce the influence of data imbalance on the false alarm rate of coordinated cyber-attack detection.</p>
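<p>The idea of adaptive oversampling can be sketched as follows. This is a simplified two-class version of ADASYN under stated assumptions, not the multi-class procedure used in the experiments: the function name adasyn_sketch, the neighbourhood size k, and the example points are hypothetical, and the imbalance is assumed to be the ratio of the majority count to the minority count. Minority records that have more majority records among their nearest neighbours receive a larger share of the synthetic samples.</p>
<preformat>
```python
import math
import random


def adasyn_sketch(minority, majority, max_imbalance=1.2, k=3, seed=0):
    """Generate synthetic minority points (simplified two-class ADASYN)."""
    rng = random.Random(seed)
    # New points needed so that len(majority) / len(minority) does not
    # exceed max_imbalance after oversampling (assumed imbalance measure).
    budget = max(0, math.ceil(len(majority) / max_imbalance) - len(minority))

    # Step 1: difficulty of a minority point = share of majority points
    # among its k nearest neighbours in the whole data set.
    labelled = [(q, 0) for q in minority] + [(q, 1) for q in majority]
    difficulty = []
    for p in minority:
        nearest = sorted(
            ((math.dist(p, q), label) for q, label in labelled if q is not p),
            key=lambda pair: pair[0],
        )[:k]
        difficulty.append(sum(label for _, label in nearest) / max(1, len(nearest)))

    # Step 2: share the budget in proportion to difficulty (uniformly if
    # no minority point has a majority neighbour).
    total = sum(difficulty)
    weights = [d / total if total > 0 else 1 / len(minority) for d in difficulty]

    # Step 3: interpolate between each point and random minority neighbours.
    synthetic = []
    for p, weight in zip(minority, weights):
        neighbours = sorted(
            (q for q in minority if q is not p), key=lambda q: math.dist(p, q)
        )[:k]
        if not neighbours:
            continue  # a single minority point cannot be interpolated
        for _ in range(round(weight * budget)):
            q = rng.choice(neighbours)
            gap = rng.random()
            synthetic.append(tuple(a + gap * (b - a) for a, b in zip(p, q)))
    return synthetic


# Hypothetical two-feature records: 3 minority and 12 majority points.
minority = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
majority = [(0.5 + 0.1 * i, 0.5 + 0.1 * j) for i in range(4) for j in range(3)]
new_points = adasyn_sketch(minority, majority)
print(len(new_points), "synthetic minority records")
```
</preformat>
<p>Because each share of the budget is rounded per record, the number of synthetic records only approximates the budget. In a multi-class setting, one option is to repeat the procedure for every class below the threshold, treating all other classes as the majority.</p>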
</sec>
<sec id="S5.SS4">
<title>Performance Verification of the Coordinated Cyber-Attack Detection in the Cyber&#x2013;Physical Power System</title>
<p>The balanced data set is divided into a training set (70% of the samples) and a test set (30% of the samples). The model loss parameters are set according to the improved CS loss function. The ensemble contains 130 base classifiers, and the depth of each independent tree (max_depth) is seven.</p>
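<p>A 70/30 split that preserves the class proportions can be sketched as below. The source does not state whether the split was stratified, so the stratification, the function name stratified_split, and the example labels are assumptions made for illustration.</p>
<preformat>
```python
import random
from collections import defaultdict


def stratified_split(labels, train_share=0.7, seed=0):
    """Return (train_indices, test_indices), split class by class."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    train, test = [], []
    for indices in by_class.values():
        rng.shuffle(indices)                      # random order within a class
        cut = round(train_share * len(indices))   # 70% of this class
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return train, test


# Hypothetical balanced labels for two of the five operating states.
labels = ["S1"] * 10 + ["S2"] * 10
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```
</preformat>
<p>The two reported hyperparameters would then be passed to the boosting model fitted on the training indices. In scikit-learn's GradientBoostingClassifier, for example, they correspond to the n_estimators and max_depth parameters; the source does not state which library was used, and the improved CS loss would need a custom implementation.</p>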
<p>The receiver operating characteristic (ROC) curve obtained by classifying the test data set is shown in <xref ref-type="fig" rid="F4">Figure 4A</xref>. The curves of the five categories are close to the (0,1) position, and the average area under the ROC curve (AUC) is 0.982. This result shows that the attack detection model has a low false alarm rate and high accuracy.</p>
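<p>For a multi-class detector, an average AUC is commonly obtained by treating each class in turn as the positive class (one-vs-rest) and averaging the per-class values. The sketch below assumes this unweighted averaging, which the source does not specify; the function names and scores are hypothetical. It uses the rank interpretation of the AUC: the probability that a randomly chosen positive record receives a higher score than a randomly chosen negative record, with ties counted as one half.</p>
<preformat>
```python
def one_vs_rest_auc(scores, labels, positive):
    """AUC for one class: chance that a positive outranks a negative."""
    pos = [s for s, y in zip(scores, labels) if y == positive]
    neg = [s for s, y in zip(scores, labels) if y != positive]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative record")
    # Count winning pairs; a tie contributes one half.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def macro_auc(class_scores, labels):
    """Unweighted mean of the one-vs-rest AUC over all classes."""
    aucs = [one_vs_rest_auc(s, labels, c) for c, s in class_scores.items()]
    return sum(aucs) / len(aucs)


# Hypothetical scores for the class "attack" on four records.
print(one_vs_rest_auc([0.9, 0.4, 0.35, 0.8],
                      ["attack", "attack", "normal", "normal"], "attack"))
```
</preformat>
<p>The pairwise count is quadratic in the number of records, which is acceptable for a sketch; a rank-based computation would be preferable for data sets of the size used here.</p>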
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption><p>Attack detection receiver operating characteristic (ROC) curve <bold>(A)</bold>, Precision-Recall curve <bold>(B)</bold>, Confusion matrix <bold>(C)</bold>, and Performance comparison <bold>(D)</bold>. AUC, area under the ROC curve; CS-GBDT, CS gradient boosting decision tree; KNN, <italic>K</italic>-nearest neighbors; SVM, support vector machine.</p></caption>
<graphic xlink:href="fenrg-09-666130-g004.tif"/>
</fig>
<p>The precision-recall curves obtained by classifying the test data set are shown in <xref ref-type="fig" rid="F4">Figure 4B</xref>. The curves are all close to the (1,1) position, indicating that the attack detection model has high recall and precision, even when the numbers of positive and negative samples differ greatly. Therefore, the proposed attack detection model has a high classification accuracy for unbalanced data.</p>
<p>The confusion matrix of the attack detection results is shown in <xref ref-type="fig" rid="F4">Figure 4C</xref>. The detection accuracy for the DDOS blocking attack (S2) is 98%, that of the data injection attack (S3) is 96%, that of the protection device parameter tampering attack (S4) is 97%, that of the normal operation (S1) is 99%, and that of the fault operation (S5) is 98%. These results demonstrate that the proposed coordinated cyber-attack detection model accurately detects coordinated attack events on the network and distinguishes attack states from the fault operation state, with a maximum false-positive rate of only 4%.</p>
<p>Finally, the proposed model is compared with typical classification algorithms, namely <italic>K</italic>-nearest neighbors (KNN), XGBoost, random forest, AdaBoost, and support vector machine (SVM). The overall accuracy, average recall, average precision, and average F1-score of the algorithms are shown in <xref ref-type="fig" rid="F4">Figure 4D</xref>. The recall and precision of the CS-GBDT algorithm are higher than 97%. The algorithm performs stably and detects the various attack events better than the comparison algorithms.</p>
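<p>The four summary metrics can be derived from a confusion matrix as sketched below. The sketch assumes that "average" means the unweighted (macro) mean over classes, which the source does not state; the function name macro_report and the example matrix are hypothetical and do not reproduce the reported results.</p>
<preformat>
```python
def macro_report(confusion):
    """Overall accuracy plus macro-averaged recall, precision and F1.

    confusion[i][j] counts records of true class i predicted as class j.
    """
    size = len(confusion)
    total = sum(sum(row) for row in confusion)
    recalls, precisions, f1_scores = [], [], []
    for i in range(size):
        hits = confusion[i][i]
        actual = sum(confusion[i])                             # row sum
        predicted = sum(confusion[r][i] for r in range(size))  # column sum
        recall = hits / actual if actual else 0.0
        precision = hits / predicted if predicted else 0.0
        f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
        recalls.append(recall)
        precisions.append(precision)
        f1_scores.append(f1)
    return {
        "accuracy": sum(confusion[i][i] for i in range(size)) / total,
        "recall": sum(recalls) / size,
        "precision": sum(precisions) / size,
        "f1": sum(f1_scores) / size,
    }


# Hypothetical three-class confusion matrix (rows: true, columns: predicted).
example = [[8, 2, 0],
           [1, 9, 0],
           [0, 0, 10]]
print(macro_report(example))
```
</preformat>
<p>Macro averaging gives every operating state the same weight regardless of its sample count, so a weak result on a rare state is not masked by the frequent ones.</p>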
</sec>
</sec>
<sec id="S6">
<title>Conclusion</title>
<p>In this paper, a cyber&#x2013;physical operating state data link was established using data fusion mapping. A two-step PCA clustering algorithm was proposed to label the different operating states of the network accurately. A coordinated cyber-attack classifier based on the CS-GBDT was established that accounts for the imbalance of the attack status categories and the cost sensitivity of attack events. The algorithm detects attacks on the CPPS and distinguishes different attack types. In the experiments, the proposed model achieved a low false alarm rate and high accuracy for attack detection. It is suitable for the detection of coordinated cyber-attack events with unbalanced attack sample data and high data dimensionality.</p>
</sec>
<sec id="S7">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/supplementary material; further inquiries can be directed to the corresponding author/s.</p>
</sec>
<sec id="S8">
<title>Author Contributions</title>
<p>LW designed the model framework of the manuscript and the experimental verification. PX contributed to the construction method of the cyber&#x2013;physical operation state data link. ZQ contributed to the design of the two-step PCA algorithm for operating state clustering. XB completed the simulation experiment of attack classification detection. YD performed the data collection and researched the balancing algorithm for the operating state classes. ZZ studied the classification algorithm for coordinated cyber-attacks based on the CS-GBDT. YL built the simulation environment and improved the grammar and sentence structure of the full manuscript. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>PX was employed by Siping Power Supply Company of State Grid Jilin Electric Power Company Limited, China. The remaining authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<fn-group>
<fn fn-type="financial-disclosure">
<p><bold>Funding.</bold> This manuscript was supported in part by the science and technology innovation development plan project of Jilin (20200401097GX).</p>
</fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Basin</surname> <given-names>D.</given-names></name> <name><surname>Cremers</surname> <given-names>C.</given-names></name> <name><surname>Kim</surname> <given-names>T. H. J.</given-names></name> <name><surname>Perrig</surname> <given-names>A.</given-names></name> <name><surname>Sasse</surname> <given-names>R.</given-names></name> <name><surname>Szalachowski</surname> <given-names>P.</given-names></name></person-group> (<year>2016</year>). <article-title>Design, analysis, and implementation of ARPKI: an attack-resilient public-key infrastructure.</article-title> <source><italic>IEEE Trans. Depend. Sec. Comput.</italic></source> <volume>15</volume> <fpage>393</fpage>&#x2013;<lpage>408</lpage>. <pub-id pub-id-type="doi">10.1109/tdsc.2016.2601610</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>T. M.</given-names></name> <name><surname>Sanchez-Aarnoutse</surname> <given-names>J. C.</given-names></name> <name><surname>Buford</surname> <given-names>J.</given-names></name></person-group> (<year>2011</year>). <article-title>Petri net modeling of cyber-physical attacks on smart grid.</article-title> <source><italic>IEEE Trans. Smart Grid</italic></source> <volume>2</volume> <fpage>741</fpage>&#x2013;<lpage>749</lpage>. <pub-id pub-id-type="doi">10.1109/tsg.2011.2160000</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dai</surname> <given-names>Q.</given-names></name> <name><surname>Shi</surname> <given-names>L.</given-names></name> <name><surname>Ni</surname> <given-names>Y.</given-names></name></person-group> (<year>2019</year>). <article-title>Risk assessment for cyberattack in active distribution systems considering the role of feeder automation.</article-title> <source><italic>IEEE Trans. Power Syst.</italic></source> <volume>34</volume> <fpage>3230</fpage>&#x2013;<lpage>3240</lpage>. <pub-id pub-id-type="doi">10.1109/tpwrs.2019.2899983</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Davarikia</surname> <given-names>H.</given-names></name> <name><surname>Barati</surname> <given-names>M.</given-names></name></person-group> (<year>2018</year>). <article-title>A tri-level programming model for attack-resilient control of power grids.</article-title> <source><italic>J. Modern Power Syst. Clean Energy</italic></source> <volume>6</volume> <fpage>918</fpage>&#x2013;<lpage>929</lpage>. <pub-id pub-id-type="doi">10.1007/s40565-018-0436-y</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dom</surname> <given-names>G.</given-names></name> <name><surname>Shaw-Jackson</surname> <given-names>C.</given-names></name> <name><surname>Matis</surname> <given-names>C.</given-names></name> <name><surname>Bouffioux</surname> <given-names>O.</given-names></name> <name><surname>Picard</surname> <given-names>J. J.</given-names></name> <name><surname>Prochiantz</surname> <given-names>A.</given-names></name><etal/></person-group> (<year>2003</year>). <article-title>Cellular uptake of Antennapedia Penetratin peptides is a two-step process in which phase transfer precedes a tryptophan-dependent translocation.</article-title> <source><italic>Nucleic Acids Res.</italic></source> <volume>31</volume> <fpage>556</fpage>&#x2013;<lpage>561</lpage>. <pub-id pub-id-type="doi">10.1093/nar/gkg160</pub-id> <pub-id pub-id-type="pmid">12527762</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Drayer</surname> <given-names>E.</given-names></name> <name><surname>Routtenberg</surname> <given-names>T.</given-names></name></person-group> (<year>2019</year>). <article-title>Detection of false data injection attacks in smart grids based on graph signal processing.</article-title> <source><italic>IEEE Syst. J.</italic></source> <volume>14</volume> <fpage>1886</fpage>&#x2013;<lpage>1896</lpage>. <pub-id pub-id-type="doi">10.1109/jsyst.2019.2927469</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Haes Alhelou</surname> <given-names>H.</given-names></name> <name><surname>Hamedani-Golshan</surname> <given-names>M. E.</given-names></name> <name><surname>Njenda</surname> <given-names>T. C.</given-names></name> <name><surname>Siano</surname> <given-names>P.</given-names></name></person-group> (<year>2019</year>). <article-title>A survey on power system blackout and cascading events: Research motivations and challenges.</article-title> <source><italic>Energies</italic></source> <volume>12</volume>:<issue>682</issue>. <pub-id pub-id-type="doi">10.3390/en12040682</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>L.</given-names></name> <name><surname>Zhu</surname> <given-names>Q.</given-names></name></person-group> (<year>2020</year>). <article-title>A dynamic games approach to proactive defense strategies against advanced persistent threats in cyber-physical systems.</article-title> <source><italic>Comput. Sec.</italic></source> <volume>89</volume>:<issue>101660</issue>. <pub-id pub-id-type="doi">10.1016/j.cose.2019.101660</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>T.</given-names></name> <name><surname>Satchidanandan</surname> <given-names>B.</given-names></name> <name><surname>Kumar</surname> <given-names>P. R.</given-names></name> <name><surname>Xie</surname> <given-names>L.</given-names></name></person-group> (<year>2018</year>). <article-title>An online detection framework for cyber-attacks on automatic generation control.</article-title> <source><italic>IEEE Trans. Power Syst.</italic></source> <volume>33</volume> <fpage>6816</fpage>&#x2013;<lpage>6827</lpage>. <pub-id pub-id-type="doi">10.1109/tpwrs.2018.2829743</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jian</surname> <given-names>Y.</given-names></name> <name><surname>David</surname> <given-names>Z.</given-names></name> <name><surname>Frangi</surname> <given-names>A. F.</given-names></name></person-group> (<year>2004</year>). <article-title>Two-dimensional PCA: a new approach to appearance-based face representation and recognition.</article-title> <source><italic>IEEE Trans. Pattern Analysis Machine Intelligence</italic></source> <volume>26</volume> <fpage>131</fpage>&#x2013;<lpage>137</lpage>. <pub-id pub-id-type="doi">10.1109/tpami.2004.1261097</pub-id> <pub-id pub-id-type="pmid">15382693</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Koopman</surname> <given-names>G.</given-names></name> <name><surname>Mortier</surname> <given-names>D.</given-names></name> <name><surname>Michels</surname> <given-names>S.</given-names></name> <name><surname>Hofman</surname> <given-names>S.</given-names></name> <name><surname>Fagrouch</surname> <given-names>Z.</given-names></name> <name><surname>Remarque</surname> <given-names>E. J.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Influenza virus infection as well as immunization with DNA encoding haemagglutinin protein induces potent antibody-dependent phagocytosis (ADP) and monocyte infection-enhancing responses in macaques.</article-title> <source><italic>J. Gen. Virol.</italic></source> <volume>100</volume> <fpage>738</fpage>&#x2013;<lpage>751</lpage>. <pub-id pub-id-type="doi">10.1099/jgv.0.001251</pub-id> <pub-id pub-id-type="pmid">30920368</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kurt</surname> <given-names>M. N.</given-names></name> <name><surname>Y&#x0131;lmaz</surname> <given-names>Y.</given-names></name> <name><surname>Wang</surname> <given-names>X.</given-names></name></person-group> (<year>2018</year>). <article-title>Distributed quickest detection of cyber-attacks in smart grid.</article-title> <source><italic>IEEE Trans. Inform. Forensics Sec.</italic></source> <volume>13</volume> <fpage>2015</fpage>&#x2013;<lpage>2030</lpage>. <pub-id pub-id-type="doi">10.1109/tifs.2018.2800908</pub-id></citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lai</surname> <given-names>K.</given-names></name> <name><surname>Illindala</surname> <given-names>M.</given-names></name> <name><surname>Subramaniam</surname> <given-names>K.</given-names></name></person-group> (<year>2019</year>). <article-title>A tri-level optimization model to mitigate coordinated attacks on electric power systems in a cyber-physical environment.</article-title> <source><italic>Appl. Energy</italic></source> <volume>235</volume> <fpage>204</fpage>&#x2013;<lpage>218</lpage>. <pub-id pub-id-type="doi">10.1016/j.apenergy.2018.10.077</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liao</surname> <given-names>Z.</given-names></name> <name><surname>Huang</surname> <given-names>Y.</given-names></name> <name><surname>Yue</surname> <given-names>X.</given-names></name> <name><surname>Lu</surname> <given-names>H.</given-names></name> <name><surname>Xuan</surname> <given-names>P.</given-names></name> <name><surname>Ju</surname> <given-names>Y.</given-names></name></person-group> (<year>2016</year>). <article-title>In silico prediction of gamma-aminobutyric acid type-A receptors using novel machine-learning-based SVM and GBDT approaches.</article-title> <source><italic>BioMed. Res. Int.</italic></source> <volume>2016</volume>:<issue>2375268</issue>.</citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lin</surname> <given-names>H.</given-names></name> <name><surname>Slagell</surname> <given-names>A.</given-names></name> <name><surname>Kalbarczyk</surname> <given-names>Z. T.</given-names></name> <name><surname>Sauer</surname> <given-names>P. W.</given-names></name> <name><surname>Iyer</surname> <given-names>R. K.</given-names></name></person-group> (<year>2016</year>). <article-title>Runtime semantic security analysis to detect and mitigate control-related attacks in power grids.</article-title> <source><italic>IEEE Trans. Smart Grid</italic></source> <volume>9</volume> <fpage>163</fpage>&#x2013;<lpage>178</lpage>. <pub-id pub-id-type="doi">10.1109/tsg.2016.2547742</pub-id></citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>X.</given-names></name> <name><surname>Li</surname> <given-names>Z.</given-names></name> <name><surname>Li</surname> <given-names>Z.</given-names></name></person-group> (<year>2016</year>). <article-title>Optimal protection strategy against false data injection attacks in power systems.</article-title> <source><italic>IEEE Trans. Smart Grid</italic></source> <volume>8</volume> <fpage>1802</fpage>&#x2013;<lpage>1810</lpage>. <pub-id pub-id-type="doi">10.1109/tsg.2015.2508449</pub-id></citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nair</surname> <given-names>G. J.</given-names></name> <name><surname>Narendran</surname> <given-names>T. T.</given-names></name></person-group> (<year>1997</year>). <article-title>Cluster goodness: a new measure of performance for cluster formation in the design of cellular manufacturing systems.</article-title> <source><italic>Int. J. Prod. Econ.</italic></source> <volume>48</volume> <fpage>49</fpage>&#x2013;<lpage>61</lpage>. <pub-id pub-id-type="doi">10.1016/s0925-5273(96)00067-9</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nath</surname> <given-names>S.</given-names></name> <name><surname>Akingeneye</surname> <given-names>I.</given-names></name> <name><surname>Wu</surname> <given-names>J.</given-names></name> <name><surname>Han</surname> <given-names>Z.</given-names></name></person-group> (<year>2019</year>). <article-title>Quickest detection of false data injection attacks in smart grid with dynamic models.</article-title> <source><italic>IEEE J. Emerg. Selected Top. Power Electron.</italic></source> <volume>99</volume>, <fpage>1</fpage>&#x2013;<lpage>10</lpage>.</citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Northrup</surname> <given-names>A. B.</given-names></name> <name><surname>Mangion</surname> <given-names>I. K.</given-names></name> <name><surname>Hettche</surname> <given-names>F.</given-names></name> <name><surname>MacMillan</surname> <given-names>D. W.</given-names></name></person-group> (<year>2004</year>). <article-title>Enantioselective organocatalytic direct aldol reactions of &#x03B1;&#x2212;oxyaldehydes: step one in a two-step synthesis of carbohydrates.</article-title> <source><italic>Angewandte Chem. Int. Edition</italic></source> <volume>43</volume> <fpage>2152</fpage>&#x2013;<lpage>2154</lpage>. <pub-id pub-id-type="doi">10.1002/anie.200453716</pub-id> <pub-id pub-id-type="pmid">15083470</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Osanaiye</surname> <given-names>O. A.</given-names></name> <name><surname>Alfa</surname> <given-names>A. S.</given-names></name> <name><surname>Hancke</surname> <given-names>G. P.</given-names></name></person-group> (<year>2018</year>). <article-title>Denial of service defence for resource availability in wireless sensor networks.</article-title> <source><italic>IEEE Access</italic></source> <volume>6</volume> <fpage>6975</fpage>&#x2013;<lpage>7004</lpage>. <pub-id pub-id-type="doi">10.1109/access.2018.2793841</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Phelps</surname> <given-names>R. A.</given-names></name> <name><surname>Chidester</surname> <given-names>S.</given-names></name> <name><surname>Dehghanizadeh</surname> <given-names>S.</given-names></name> <name><surname>Phelps</surname> <given-names>J.</given-names></name> <name><surname>Sandoval</surname> <given-names>I. T.</given-names></name> <name><surname>Rai</surname> <given-names>K.</given-names></name><etal/></person-group> (<year>2009</year>). <article-title>A two-step model for colon adenoma initiation and progression caused by APC loss.</article-title> <source><italic>Cell</italic></source> <volume>137</volume> <fpage>623</fpage>&#x2013;<lpage>634</lpage>. <pub-id pub-id-type="doi">10.1016/s9999-9994(09)00528-5</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Qu</surname> <given-names>Z.</given-names></name> <name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Qu</surname> <given-names>N.</given-names></name> <name><surname>Wang</surname> <given-names>L.</given-names></name> <name><surname>Li</surname> <given-names>Y.</given-names></name> <name><surname>Dong</surname> <given-names>Y.</given-names></name></person-group> (<year>2018</year>). <article-title>Method for quantitative estimation of the risk propagation threshold in electric power CPS based on seepage probability.</article-title> <source><italic>IEEE Access</italic></source> <volume>6</volume> <fpage>68813</fpage>&#x2013;<lpage>68823</lpage>. <pub-id pub-id-type="doi">10.1109/access.2018.2879488</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sakhnovich</surname> <given-names>A. L.</given-names></name></person-group> (<year>2011</year>). <article-title>The time-dependent Schr&#x00F6;dinger equation of dimension k+ 1: explicit and rational solutions via GBDT and multinodes.</article-title> <source><italic>J. Phys. A: Math. Theoret.</italic></source> <volume>44</volume>:<issue>475201</issue>. <pub-id pub-id-type="doi">10.1088/1751-8113/44/47/475201</pub-id></citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shen</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>W. A.</given-names></name> <name><surname>Ni</surname> <given-names>H.</given-names></name> <name><surname>Zhang</surname> <given-names>D.</given-names></name> <name><surname>Yu</surname> <given-names>L.</given-names></name></person-group> (<year>2019</year>). <article-title>Guaranteed cost control of networked control systems with DoS attack and time-varying delay.</article-title> <source><italic>Int. J. Control Automat. Syst.</italic></source> <volume>17</volume> <fpage>811</fpage>&#x2013;<lpage>821</lpage>. <pub-id pub-id-type="doi">10.1007/s12555-018-0324-2</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tian</surname> <given-names>W.</given-names></name> <name><surname>Ji</surname> <given-names>X.</given-names></name> <name><surname>Liu</surname> <given-names>W.</given-names></name> <name><surname>Liu</surname> <given-names>G.</given-names></name> <name><surname>Lin</surname> <given-names>R.</given-names></name> <name><surname>Zhai</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Defense strategies against network attacks in cyber-physical systems with analysis cost constraint based on honeypot game model.</article-title> <source><italic>Comput. Mater. Continua</italic></source> <volume>60</volume> <fpage>193</fpage>&#x2013;<lpage>211</lpage>. <pub-id pub-id-type="doi">10.32604/cmc.2019.05290</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>L.</given-names></name> <name><surname>Qu</surname> <given-names>Z.</given-names></name> <name><surname>Li</surname> <given-names>Y.</given-names></name> <name><surname>Hu</surname> <given-names>K.</given-names></name> <name><surname>Sun</surname> <given-names>J.</given-names></name> <name><surname>Xue</surname> <given-names>K.</given-names></name><etal/></person-group> (<year>2020</year>). <article-title>Method for extracting patterns of coordinated network attacks on electric power CPS based on temporal&#x2013;topological correlation.</article-title> <source><italic>IEEE Access</italic></source> <volume>8</volume> <fpage>57260</fpage>&#x2013;<lpage>57272</lpage>. <pub-id pub-id-type="doi">10.1109/access.2020.2982057</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>Q.</given-names></name> <name><surname>Tai</surname> <given-names>W.</given-names></name> <name><surname>Tang</surname> <given-names>Y.</given-names></name> <name><surname>Ni</surname> <given-names>M.</given-names></name> <name><surname>You</surname> <given-names>S.</given-names></name></person-group> (<year>2019</year>). <article-title>A two-layer game theoretical attack-defense model for a false data injection attack against power systems.</article-title> <source><italic>Int. J. Elect. Power Energy Syst.</italic></source> <volume>104</volume> <fpage>169</fpage>&#x2013;<lpage>177</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijepes.2018.07.007</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>X.</given-names></name> <name><surname>Tian</surname> <given-names>M.</given-names></name> <name><surname>Cao</surname> <given-names>M.</given-names></name> <name><surname>Li</surname> <given-names>X.</given-names></name> <name><surname>Zhao</surname> <given-names>Y.</given-names></name> <name><surname>Zhao</surname> <given-names>X.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Countermeasures to false data injection attacks on power system state estimation based on protecting measurements.</article-title> <source><italic>J. Nanoelectron. Optoelectron.</italic></source> <volume>14</volume> <fpage>626</fpage>&#x2013;<lpage>634</lpage>. <pub-id pub-id-type="doi">10.1166/jno.2019.2590</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>C.</given-names></name> <name><surname>Abur</surname> <given-names>A.</given-names></name></person-group> (<year>2017</year>). <article-title>A massively parallel framework for very large scale linear state estimation.</article-title> <source><italic>IEEE Trans. Power Syst.</italic></source> <volume>33</volume> <fpage>4407</fpage>&#x2013;<lpage>4413</lpage>. <pub-id pub-id-type="doi">10.1109/tpwrs.2017.2788360</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Xiang</surname> <given-names>Y.</given-names></name> <name><surname>Wang</surname> <given-names>L.</given-names></name></person-group> (<year>2016</year>). <article-title>Power system reliability assessment incorporating cyber-attacks against wind farm energy management systems.</article-title> <source><italic>IEEE Trans. Smart Grid</italic></source> <volume>8</volume> <fpage>2343</fpage>&#x2013;<lpage>2357</lpage>. <pub-id pub-id-type="doi">10.1109/tsg.2016.2523515</pub-id></citation></ref>
</ref-list>
</back>
</article>