<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Control. Eng.</journal-id>
<journal-title>Frontiers in Control Engineering</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Control. Eng.</abbrev-journal-title>
<issn pub-type="epub">2673-6268</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">1104745</article-id>
<article-id pub-id-type="doi">10.3389/fcteg.2023.1104745</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Control Engineering</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Graph neural networks for decentralized multi-agent perimeter defense</article-title>
<alt-title alt-title-type="left-running-head">Lee et al.</alt-title>
<alt-title alt-title-type="right-running-head">
<ext-link ext-link-type="uri" xlink:href="https://doi.org/10.3389/fcteg.2023.1104745">10.3389/fcteg.2023.1104745</ext-link>
</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Lee</surname>
<given-names>Elijah S.</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2104035/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Zhou</surname>
<given-names>Lifeng</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/2104098/overview"/>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Ribeiro</surname>
<given-names>Alejandro</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
<contrib contrib-type="author">
<name>
<surname>Kumar</surname>
<given-names>Vijay</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>GRASP Laboratory</institution>, <institution>University of Pennsylvania</institution>, <addr-line>Philadelphia</addr-line>, <addr-line>PA</addr-line>, <country>United States</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>Department of Electrical and Computer Engineering</institution>, <institution>Drexel University</institution>, <addr-line>Philadelphia</addr-line>, <addr-line>PA</addr-line>, <country>United States</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1835831/overview">Douglas Guimar&#xe3;es Macharet</ext-link>, Federal University of Minas Gerais, Brazil</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1117249/overview">Ziyang Meng</ext-link>, Tsinghua University, China</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1118310/overview">Christopher Nielsen</ext-link>, University of Waterloo, Canada</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Elijah S. Lee, <email>elslee@seas.upenn.edu</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Networked Control, a section of the journal Frontiers in Control Engineering</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>13</day>
<month>01</month>
<year>2023</year>
</pub-date>
<pub-date pub-type="collection">
<year>2023</year>
</pub-date>
<volume>4</volume>
<elocation-id>1104745</elocation-id>
<history>
<date date-type="received">
<day>22</day>
<month>11</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>03</day>
<month>01</month>
<year>2023</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2023 Lee, Zhou, Ribeiro and Kumar.</copyright-statement>
<copyright-year>2023</copyright-year>
<copyright-holder>Lee, Zhou, Ribeiro and Kumar</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>In this work, we study the problem of decentralized multi-agent perimeter defense, which asks for computing actions for defenders with local perceptions and communications so as to maximize the capture of intruders. One major challenge for practical implementations is making perimeter defense strategies scalable to large-scale problem instances. To this end, we leverage graph neural networks (GNNs) to develop an imitation learning framework that learns a mapping from defenders&#x2019; local perceptions and their communication graph to their actions. The proposed GNN-based learning network is trained by imitating a centralized expert algorithm such that the learned actions are close to those generated by the expert algorithm. We demonstrate that our proposed network performs close to the expert algorithm and is superior to other baseline algorithms in capturing more intruders. Our GNN-based network is trained at a small scale and can be generalized to large-scale cases. We run perimeter defense games in scenarios with different team sizes and configurations to demonstrate the performance of the learned network.</p>
</abstract>
<kwd-group>
<kwd>graph neural networks</kwd>
<kwd>perimeter defense</kwd>
<kwd>multi-agent systems</kwd>
<kwd>perception-action-communication loops</kwd>
<kwd>imitation learning</kwd>
</kwd-group>
<contract-num rid="cn001">W911NF-17-2-0181</contract-num>
<contract-sponsor id="cn001">Army Research Laboratory<named-content content-type="fundref-id">10.13039/100006754</named-content>
</contract-sponsor>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>The problem of perimeter defense games considers a scenario where the defenders are constrained to move along a perimeter and try to capture the intruders, while the intruders aim to reach the perimeter without being captured by the defenders (<xref ref-type="bibr" rid="B30">Shishika and Kumar, 2020</xref>). A number of previous works have solved this problem with engagements on a planar game space (<xref ref-type="bibr" rid="B31">Shishika and Kumar, 2018</xref>; <xref ref-type="bibr" rid="B3">Chen et al., 2021</xref>). However, in the real world, the perimeter may be represented by a three-dimensional shape, as the players (e.g., defenders and intruders) may have the ability to perform three-dimensional motions. For example, the perimeter of a building that defenders aim to protect can be enclosed by a hemisphere, so the defender robots should be able to move in three-dimensional space. Aerial robots have been well studied in various settings (<xref ref-type="bibr" rid="B15">Lee et al., 2016</xref>; <xref ref-type="bibr" rid="B11">Lee et al., 2020a</xref>; <xref ref-type="bibr" rid="B25">Nguyen et al., 2019</xref>; <xref ref-type="bibr" rid="B5">Chen et al., 2020</xref>), and all of these settings can be real-world use cases for perimeter defense, such as intruders attacking a military base in a forest while defenders aim to capture them.</p>
<p>In this work, we tackle the perimeter defense problem in a domain where multiple agents collaborate to accomplish a task. Multi-agent collaboration has been explored in many areas including environmental mapping (<xref ref-type="bibr" rid="B32">Thrun et al., 2000</xref>; <xref ref-type="bibr" rid="B19">Liu et al., 2022</xref>), search and rescue (<xref ref-type="bibr" rid="B2">Baxter et al., 2007</xref>; <xref ref-type="bibr" rid="B22">Miller et al., 2020</xref>), target tracking (<xref ref-type="bibr" rid="B14">Lee et al., 2022b</xref>; <xref ref-type="bibr" rid="B7">Ge et al., 2022</xref>), on-demand wireless infrastructure (<xref ref-type="bibr" rid="B23">Mox et al., 2020</xref>), transportation (<xref ref-type="bibr" rid="B24">Ng et al., 2022</xref>; <xref ref-type="bibr" rid="B36">Xu et al., 2022</xref>), and multi-agent learning (<xref ref-type="bibr" rid="B9">Kim et al., 2021</xref>). Our approach employs a team of robots that work collectively towards a common goal of defending a perimeter. We focus on developing decentralized strategies for a team of defenders for various reasons: i) the teammates can be dynamically added or removed without disrupting explicit hierarchy; ii) the centralized system may fail to cope with the high dimensionality of a team&#x2019;s joint state space; and iii) the defenders have a limited communication range and can only communicate locally.</p>
<p>To this end, we aim to develop a framework where a team of defenders collaborates to defend the perimeter using decentralized strategies based on local perceptions and communications. Specifically, we explore learning-based approaches to learn policies by imitating expert algorithms such as the maximum matching algorithm (<xref ref-type="bibr" rid="B4">Chen et al., 2014</xref>). The maximum matching algorithm, which runs an exhaustive search to find the best policy, is computationally intensive at large scales since it is combinatorial in nature and assumes global information. We utilize GNNs as the learning paradigm and demonstrate that the trained network can perform close to the expert algorithm. GNNs have a decentralized communication architecture that captures the neighboring interactions, and transferability that allows for generalization to previously unseen scenarios (<xref ref-type="bibr" rid="B28">Ruiz et al., 2021</xref>). We demonstrate that our proposed GNN-based network can be generalized to large scales in solving multi-robot perimeter defense games.</p>
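The expert here is the maximum matching algorithm of Chen et al. (2014), whose details are not reproduced in this paper. As a rough, hypothetical stand-in for the kind of centralized assignment such an expert computes, a one-to-one defender-to-intruder matching can be obtained from a payoff matrix with the Hungarian method (the matrix values below are invented for illustration):

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

# Hypothetical payoff matrix: payoff[i, j] is the value of assigning
# defender i to intruder j (e.g., a predicted capture outcome).
payoff = np.array([[3.0, 1.0, 2.0],
                   [1.0, 4.0, 1.5],
                   [2.0, 1.0, 3.5]])

# Maximize total payoff over a one-to-one defender-intruder matching.
rows, cols = linear_sum_assignment(payoff, maximize=True)
assignment = dict(zip(rows, cols))   # defender index -> intruder index
total = payoff[rows, cols].sum()
```

This centralized step requires the full payoff matrix, i.e., global information, which is exactly what the learned decentralized policy avoids at execution time.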
<p>With this insight, we make the following primary contributions in this paper:</p>
<p>Framework for decentralized perimeter defense using graph neural networks. We propose a novel learning framework that utilizes a graph-based representation for the perimeter defense game. To the best of our knowledge, we are the first to solve the decentralized hemisphere perimeter defense problem by learning decentralized strategies <italic>via</italic> graph neural networks.</p>
<p>Robust perimeter defense performance with scalability. We demonstrate that our methods perform close to an expert policy (i.e., maximum matching algorithm <xref ref-type="bibr" rid="B4">Chen et al. (2014)</xref>) and are superior to other baseline algorithms. Our proposed networks are trained at a small scale and can be generalized to large scales.</p>
</sec>
<sec id="s2">
<title>2 Related work</title>
<sec id="s2-1">
<title>2.1 Perimeter defense</title>
<p>In a perimeter defense game, defenders aim to capture intruders by moving along a perimeter while intruders try to reach the perimeter without being captured by the defenders. We refer to (<xref ref-type="bibr" rid="B30">Shishika and Kumar, 2020</xref>) for a detailed survey. Many previous works dealt with engagements on a planar game space (<xref ref-type="bibr" rid="B31">Shishika and Kumar, 2018</xref>; <xref ref-type="bibr" rid="B20">Macharet et al., 2020</xref>; <xref ref-type="bibr" rid="B1">Bajaj et al., 2021</xref>; <xref ref-type="bibr" rid="B3">Chen et al., 2021</xref>; <xref ref-type="bibr" rid="B8">Hsu et al., 2022</xref>). For example, a cooperative multiplayer perimeter-defense game was solved on a planar game space in (<xref ref-type="bibr" rid="B31">Shishika and Kumar, 2018</xref>). In addition, an adaptive partitioning strategy based on intruder arrival estimation was proposed in (<xref ref-type="bibr" rid="B20">Macharet et al., 2020</xref>). Later, a formulation of the perimeter defense problem as an instance of the flow networks was proposed in (<xref ref-type="bibr" rid="B3">Chen et al., 2021</xref>). Further, an engagement on a conical environment was discussed in (<xref ref-type="bibr" rid="B1">Bajaj et al., 2021</xref>), and a model with heterogeneous teams was addressed in (<xref ref-type="bibr" rid="B8">Hsu et al., 2022</xref>).</p>
<p>High-dimensional extensions of the perimeter defense problem have been recently explored in (<xref ref-type="bibr" rid="B12">Lee et al., 2020b</xref>; <xref ref-type="bibr" rid="B13">Lee et al., 2021</xref>; <xref ref-type="bibr" rid="B11">Lee et al., 2022a</xref>; <xref ref-type="bibr" rid="B16">Lee and Bakolas, 2021</xref>; <xref ref-type="bibr" rid="B37">Yan et al., 2022</xref>). For example, <xref ref-type="bibr" rid="B16">Lee and Bakolas (2021)</xref> analyzed the two-player differential game of guarding a closed convex target set from an attacker in high-dimensional Euclidean spaces. <xref ref-type="bibr" rid="B37">Yan et al. (2022)</xref> studied a 3D multiplayer reach-avoid game where multiple pursuers defend a goal region against multiple evaders. <xref ref-type="bibr" rid="B12">Lee et al. (2020b)</xref>, <xref ref-type="bibr" rid="B13">Lee et al. (2021)</xref>, and <xref ref-type="bibr" rid="B11">Lee et al. (2022a)</xref> considered a game played between an aerial defender and a ground intruder.</p>
<p>All of the aforementioned works focus on solving centralized perimeter defense problems, which assume that players have global knowledge of other players&#x2019; states. However, decentralized control becomes a necessity as the number of players grows large. To remedy this problem, <xref ref-type="bibr" rid="B34">Velhal et al. (2022)</xref> formulated the perimeter defense game as a decentralized multi-robot spatio-temporal multitask assignment problem on the perimeter of a convex shape. <xref ref-type="bibr" rid="B27">Paulos et al. (2019)</xref> proposed a neural network architecture for training decentralized agent policies on the perimeter of a unit circle, where defenders have simple binary action spaces. Different from the aforementioned works, we focus on a high-dimensional perimeter, specialized to a hemisphere, with a continuous action space. We solve multi-agent perimeter defense problems by learning decentralized strategies with graph neural networks.</p>
</sec>
<sec id="s2-2">
<title>2.2 Graph neural networks</title>
<p>We leverage graph neural networks as the learning paradigm because of their desirable properties: a decentralized architecture that captures the interactions between neighboring agents, and transferability that allows for generalization to previously unseen cases (<xref ref-type="bibr" rid="B6">Gama et al., 2019</xref>; <xref ref-type="bibr" rid="B28">Ruiz et al., 2021</xref>). In addition, GNNs have shown great success in various multi-robot problems such as formation control (<xref ref-type="bibr" rid="B33">Tolstaya et al., 2019</xref>), path planning (<xref ref-type="bibr" rid="B18">Li et al., 2021</xref>), task allocation (<xref ref-type="bibr" rid="B35">Wang and Gombolay, 2020</xref>), and multi-target tracking (<xref ref-type="bibr" rid="B40">Zhou et al., 2021</xref>; <xref ref-type="bibr" rid="B29">Sharma et al., 2022</xref>). In particular, <xref ref-type="bibr" rid="B33">Tolstaya et al. (2019)</xref> utilized a GNN to learn a decentralized flocking behavior for a swarm of mobile robots by imitating a centralized flocking controller with global information. Later, <xref ref-type="bibr" rid="B18">Li et al. (2021)</xref> implemented GNNs to find collision-free paths for multiple robots from start positions to goal positions in obstacle-rich environments. They demonstrated that their decentralized path planner achieves near-expert performance using only local observations and neighboring communication, and that it generalizes to larger networks of robots. The GNN-based approach was also employed to learn solutions to combinatorial optimization problems in a multi-robot task scheduling scenario (<xref ref-type="bibr" rid="B35">Wang and Gombolay, 2020</xref>) and multi-target tracking scenarios (<xref ref-type="bibr" rid="B40">Zhou et al., 2021</xref>; <xref ref-type="bibr" rid="B29">Sharma et al., 2022</xref>).</p>
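To make the decentralized aggregation concrete, one graph convolution layer combines each agent's own features with those of its communication neighbors; the minimal numpy sketch below illustrates this mechanism only and is not the architecture used in this paper:

```python
import numpy as np

def gnn_layer(X, A, W_self, W_neigh):
    """One graph convolution layer: each node combines its own features
    with the mean of its neighbors' features, then applies a ReLU.
    X: (n_agents, d_in) node features; A: (n_agents, n_agents) adjacency."""
    deg = np.maximum(A.sum(axis=1, keepdims=True), 1)  # avoid divide-by-zero
    neigh_mean = (A @ X) / deg                          # mean over neighbors
    return np.maximum(X @ W_self + neigh_mean @ W_neigh, 0.0)

# 4 agents on a line-graph communication topology: 0-1-2-3.
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))                 # per-agent local features
H = gnn_layer(X, A, rng.normal(size=(3, 8)), rng.normal(size=(3, 8)))
```

Because each row of the output depends only on a node and its graph neighbors, the same trained weights can be executed locally by every agent, which is the property exploited for decentralization and transfer to larger teams.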
</sec>
</sec>
<sec id="s3">
<title>3 Problem formulation</title>
<sec id="s3-1">
<title>3.1 Motivation</title>
<p>Perimeter defense is a relatively new field of research. One particular challenge is that high-dimensional perimeters add spatial and algorithmic complexities for defenders to execute their optimal strategies. Although many previous works considered engagements on a planar game space and derived optimal strategies in 2D motions, the extension towards high-dimensional spaces is unavoidable for practical applications of perimeter defense games in real-world scenarios. For instance, the perimeter of a building that defenders aim to protect can be enclosed by a generic shape, such as a hemisphere. Since defenders cannot pass through the building and are assumed to stay close to it at all times, they are constrained to move along the surface of the dome, which leads to the &#x201c;hemisphere perimeter defense game.&#x201d; The intruder moves on the base plane of the hemisphere, which implies a constant altitude. The movement of the intruder is constrained to 2D since, in the real world, intruders may want to stay low in altitude to hide from the defenders.</p>
<p>It is worth noting that the hemisphere defense problem is more challenging to solve than a problem where both agents are allowed to move freely in 3D space. There were previous works in which both defenders and intruders could move in three-dimensional spaces (<xref ref-type="bibr" rid="B37">Yan et al., 2022</xref>; <xref ref-type="bibr" rid="B38">Yan et al., 2019</xref>; <xref ref-type="bibr" rid="B39">Yan et al., 2020</xref>). In all cases, the authors were able to explicitly derive the optimal solutions even in multi-robot scenarios. Although our problem limits the dynamics of the defenders to the surface of the hemisphere, these constraints make finding an optimal solution intractable and challenging.</p>
</sec>
<sec id="s3-2">
<title>3.2 Hemisphere perimeter defense</title>
<p>We consider a hemispherical dome with radius <italic>R</italic> as the perimeter. The hemispherical constraint allows the defenders to move safely around the protected structure (e.g., a building). In this game, consider two sets of players: <inline-formula id="inf1">
<mml:math id="m1">
<mml:mi mathvariant="bold">D</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> denoting <italic>N</italic> defenders, and <inline-formula id="inf2">
<mml:math id="m2">
<mml:mi mathvariant="bold">A</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> denoting <italic>N</italic> intruders. A defender <italic>D</italic>
<sub>
<italic>i</italic>
</sub> is constrained to move on the surface of the dome while an intruder <italic>A</italic>
<sub>
<italic>j</italic>
</sub> is constrained to move on the ground plane. We will drop the indices <italic>i</italic> and <italic>j</italic> when they are irrelevant. An instance of 10 vs. 10 perimeter defense is shown in <xref ref-type="fig" rid="F1">Figure 1</xref>. The positions of the players in spherical coordinates are: <bold>z</bold>
<sub>
<italic>D</italic>
</sub> &#x3d; [<italic>&#x3c8;</italic>
<sub>
<italic>D</italic>
</sub>, &#x3d5;<sub>
<italic>D</italic>
</sub>, <italic>R</italic>] and <bold>z</bold>
<sub>
<italic>A</italic>
</sub> &#x3d; [<italic>&#x3c8;</italic>
<sub>
<italic>A</italic>
</sub>, 0, <italic>r</italic>], where <italic>&#x3c8;</italic> and <italic>&#x3d5;</italic> are the azimuth and elevation angles, which gives the relative position as: <bold>z</bold> &#x225c; [<italic>&#x3c8;</italic>, <italic>&#x3d5;</italic>, <italic>r</italic>], where <italic>&#x3c8;</italic> &#x225c; <italic>&#x3c8;</italic>
<sub>
<italic>A</italic>
</sub>&#x2212;<italic>&#x3c8;</italic>
<sub>
<italic>D</italic>
</sub> and <italic>&#x3d5;</italic> &#x225c; &#x3d5;<sub>
<italic>D</italic>
</sub>. The positions of the players can also be described in Cartesian coordinates as: <bold>x</bold>
<sub>
<italic>D</italic>
</sub> and <bold>x</bold>
<sub>
<italic>A</italic>
</sub>. All agents move at unit speed, defenders capture intruders by closing within a small distance <italic>&#x3f5;</italic>, and both defender and intruder are consumed during capture. An intruder wins if it reaches the perimeter (i.e., <italic>r</italic>(<italic>t</italic>
<sub>
<italic>f</italic>
</sub>) &#x3d; <italic>R</italic>) at time <italic>t</italic>
<sub>
<italic>f</italic>
</sub> without being captured by any defenders (i.e., <inline-formula id="inf3">
<mml:math id="m3">
<mml:mo stretchy="false">&#x2016;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo stretchy="false">&#x2016;</mml:mo>
<mml:mo>&#x3e;</mml:mo>
<mml:mi>&#x3f5;</mml:mi>
<mml:mo>,</mml:mo>
<mml:mo>&#x2200;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="bold">D</mml:mi>
<mml:mo>,</mml:mo>
<mml:mo>&#x2200;</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x3c;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula>). A defender wins by capturing an intruder or preventing it from scoring indefinitely (i.e., <italic>&#x3d5;</italic>(<italic>t</italic>) &#x3d; <italic>&#x3c8;</italic>(<italic>t</italic>) &#x3d; 0, <italic>r</italic>(<italic>t</italic>) &#x3e; <italic>R</italic>). The main interest of this work is to maximize the number of captures by defenders, given a set of initial configurations.</p>
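The capture condition can be evaluated by converting the spherical coordinates above to Cartesian positions; a minimal sketch (the function names are ours, and the convention that &#x3d5; is elevation measured from the base plane follows the definitions above):

```python
import math

def defender_xyz(psi_D, phi_D, R):
    """Defender on the hemisphere surface: spherical -> Cartesian,
    with psi the azimuth and phi the elevation angle."""
    return (R * math.cos(phi_D) * math.cos(psi_D),
            R * math.cos(phi_D) * math.sin(psi_D),
            R * math.sin(phi_D))

def intruder_xyz(psi_A, r):
    """Intruder on the ground plane (zero elevation)."""
    return (r * math.cos(psi_A), r * math.sin(psi_A), 0.0)

def captured(psi_D, phi_D, R, psi_A, r, eps):
    """Capture: defender within distance eps of the intruder."""
    x_D = defender_xyz(psi_D, phi_D, R)
    x_A = intruder_xyz(psi_A, r)
    return math.dist(x_D, x_A) <= eps

# Defender at the base of the dome, same azimuth as an intruder at r = R:
result = captured(psi_D=0.3, phi_D=0.0, R=1.0, psi_A=0.3, r=1.0, eps=1e-6)
```

In the sample configuration the two positions coincide, so the capture test succeeds; an intruder on the opposite side of the dome would not be captured.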
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Instance of 10 vs. 10 perimeter defense. Defenders are constrained to move on the surface of the dome while intruders are constrained to move on the ground plane.</p>
</caption>
<graphic xlink:href="fcteg-04-1104745-g001.tif"/>
</fig>
</sec>
<sec id="s3-3">
<title>3.3 Optimal breaching point</title>
<p>Given <bold>z</bold>
<sub>
<italic>D</italic>
</sub>, <bold>z</bold>
<sub>
<italic>A</italic>
</sub>, we define the <italic>breaching point</italic> as the point on the perimeter at which the intruder tries to reach the target, shown as <italic>B</italic> in <xref ref-type="fig" rid="F2">Figure 2</xref>. We call the azimuth angle that locates the breaching point the <italic>breaching angle</italic>, denoted by <italic>&#x3b8;</italic>, and call the angle between (<bold>z</bold>
<sub>
<italic>A</italic>
</sub>&#x2212;<bold>z</bold>
<sub>
<italic>B</italic>
</sub>) and the tangent line at <italic>B</italic> the <italic>approach angle</italic>, denoted by <italic>&#x3b2;</italic>. It is proved in (<xref ref-type="bibr" rid="B12">Lee et al., 2020b</xref>) that given the current positions of the defender <bold>z</bold>
<sub>
<italic>D</italic>
</sub> and intruder <bold>z</bold>
<sub>
<italic>A</italic>
</sub> as point particles, there exists a unique breaching point such that the optimal strategy for both defender and intruder is to move towards it, known as <italic>optimal breaching point</italic>. The breaching angle and approach angle corresponding to the optimal breaching point are known as <italic>optimal breaching angle</italic>, denoted by <italic>&#x3b8;</italic>&#x2a;, and <italic>optimal approach angle</italic>, denoted by <italic>&#x3b2;</italic>&#x2a;. As stated in (<xref ref-type="bibr" rid="B12">Lee et al., 2020b</xref>), although there exists no closed-form solution for <italic>&#x3b8;</italic>&#x2a; and <italic>&#x3b2;</italic>&#x2a;, they can be computed at any time by solving two governing equations:<disp-formula id="e1">
<mml:math id="m4">
<mml:msup>
<mml:mrow>
<mml:mi>&#x3b2;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2a;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi>cos</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>&#x3bd;</mml:mi>
<mml:mfrac>
<mml:mrow>
<mml:mi>cos</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3d5;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>sin</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2a;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:msqrt>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi>cos</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3d5;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2061;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi>cos</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:msup>
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2a;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msqrt>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
</mml:math>
<label>(1)</label>
</disp-formula>
</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Coordinates and relevant variables in the 1 vs. 1 hemisphere defense game.</p>
</caption>
<graphic xlink:href="fcteg-04-1104745-g002.tif"/>
</fig>
<p>and<disp-formula id="e2">
<mml:math id="m5">
<mml:msup>
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2a;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c8;</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi>&#x3b2;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2a;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi>cos</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>cos</mml:mi>
<mml:msup>
<mml:mrow>
<mml:mi>&#x3b2;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2a;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mrow>
<mml:mi>r</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
</mml:math>
<label>(2)</label>
</disp-formula>
</p>
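Since Eqs 1, 2 admit no closed-form solution, &#x3b8;&#x2a; and &#x3b2;&#x2a; must be found numerically. One simple approach is to eliminate &#x3b2; using Eq. 1 and bisect on the residual of Eq. 2; the sketch below assumes &#x3d5;<sub>D</sub> &#x2260; 0, reads &#x3bd; as a speed-ratio constant (set to 1 here, matching the unit-speed assumption), and uses sample values of our own choosing:

```python
import math

def beta_of_theta(theta, phi_D, nu):
    """Eq. 1: optimal approach angle as a function of the breaching angle."""
    num = nu * math.cos(phi_D) * math.sin(theta)
    den = math.sqrt(1.0 - math.cos(phi_D) ** 2 * math.cos(theta) ** 2)
    return math.acos(num / den)

def residual(theta, psi, phi_D, r, nu):
    """Eq. 2 rearranged so that the root is the optimal breaching angle."""
    beta = beta_of_theta(theta, phi_D, nu)
    return theta - (psi - beta + math.acos(math.cos(beta) / r))

def solve_theta(psi, phi_D, r, nu, lo=1e-6, hi=math.pi / 2, iters=80):
    """Bisection, assuming the residual changes sign on [lo, hi]."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if residual(lo, psi, phi_D, r, nu) * residual(mid, psi, phi_D, r, nu) <= 0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)

theta_star = solve_theta(psi=1.0, phi_D=0.5, r=1.5, nu=1.0)
beta_star = beta_of_theta(theta_star, phi_D=0.5, nu=1.0)
```

Any standard root finder would do equally well; bisection is shown only because it is self-contained and robust once a sign change is bracketed.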
</sec>
<sec id="s3-4">
<title>3.4 Target time and payoff function</title>
<p>We define the <italic>target time</italic> as the time to reach <italic>B</italic> and define <italic>&#x3c4;</italic>
<sub>
<italic>D</italic>
</sub>(<bold>z</bold>
<sub>
<italic>D</italic>
</sub>, <bold>z</bold>
<sub>
<italic>B</italic>
</sub>) as the <italic>defender target time</italic>, <italic>&#x3c4;</italic>
<sub>
<italic>A</italic>
</sub>(<bold>z</bold>
<sub>
<italic>A</italic>
</sub>, <bold>z</bold>
<sub>
<italic>B</italic>
</sub>) as the <italic>intruder target time</italic>, and the following as <italic>payoff</italic> function:<disp-formula id="e3">
<mml:math id="m6">
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c4;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3c4;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>B</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<label>(3)</label>
</disp-formula>
</p>
<p>The defender reaches <italic>B</italic> faster if <italic>p</italic> &#x3c; 0 and the intruder reaches <italic>B</italic> faster if <italic>p</italic> &#x3e; 0. Thus, the defender aims to minimize <italic>p</italic> while the intruder aims to maximize it.</p>
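Under unit speeds, the payoff of Eq. 3 reduces to a difference of travel distances. The sketch below adopts one plausible model of the target times (defender along a great-circle arc on the dome, intruder along a straight line on the ground plane); this modeling choice is ours, not stated in the text:

```python
import math

def tau_D(x_D, x_B, R):
    """Defender target time: great-circle arc length on the dome, unit speed."""
    cos_ang = sum(a * b for a, b in zip(x_D, x_B)) / R ** 2
    return R * math.acos(max(-1.0, min(1.0, cos_ang)))

def tau_A(x_A, x_B):
    """Intruder target time: straight-line distance on the ground, unit speed."""
    return math.dist(x_A, x_B)

def payoff(x_D, x_A, x_B, R):
    """Eq. 3: p < 0 favors the defender, p > 0 favors the intruder."""
    return tau_D(x_D, x_B, R) - tau_A(x_A, x_B)

# Breaching point on the rim, defender at the top of the dome,
# intruder approaching along the x-axis from r = 3:
R = 1.0
x_B = (1.0, 0.0, 0.0)
x_D = (0.0, 0.0, 1.0)
x_A = (3.0, 0.0, 0.0)
p = payoff(x_D, x_A, x_B, R)  # pi/2 - 2 < 0: the defender arrives first
```

Here the defender's quarter-arc (length &#x3c0;/2) is shorter than the intruder's straight-line run (length 2), so p &#x3c; 0 and the defender wins the race to B.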
</sec>
<sec id="s3-5">
<title>3.5 Optimal strategies and Nash equilibrium</title>
<p>It is proven in (<xref ref-type="bibr" rid="B12">Lee et al., 2020b</xref>) that the optimal strategies for both defender and intruder are to move towards the optimal breaching point at their maximum speed at any time. Let &#x3a9; and &#x393; be the continuous <italic>v</italic>
<sub>
<italic>D</italic>
</sub> and <italic>v</italic>
<sub>
<italic>A</italic>
</sub> that lead to <italic>B</italic> so that <italic>&#x3c4;</italic>
<sub>
<italic>D</italic>
</sub>(<bold>z</bold>
<sub>
<italic>D</italic>
</sub>, &#x3a9;) &#x225c; <italic>&#x3c4;</italic>
<sub>
<italic>D</italic>
</sub>(<bold>z</bold>
<sub>
<italic>D</italic>
</sub>, <bold>z</bold>
<sub>
<italic>B</italic>
</sub>) and <italic>&#x3c4;</italic>
<sub>
<italic>A</italic>
</sub>(<bold>z</bold>
<sub>
<italic>A</italic>
</sub>, &#x393;) &#x225c; <italic>&#x3c4;</italic>
<sub>
<italic>A</italic>
</sub>(<bold>z</bold>
<sub>
<italic>A</italic>
</sub>, <bold>z</bold>
<sub>
<italic>B</italic>
</sub>), and let &#x3a9;&#x2a; and &#x393;&#x2a; be the optimal strategies that minimize <italic>&#x3c4;</italic>
<sub>
<italic>D</italic>
</sub>(<bold>z</bold>
<sub>
<italic>D</italic>
</sub>, &#x3a9;) and <italic>&#x3c4;</italic>
<sub>
<italic>A</italic>
</sub>(<bold>z</bold>
<sub>
<italic>A</italic>
</sub>, &#x393;), respectively. The optimality in the game is then given as a Nash equilibrium:<disp-formula id="e4">
<mml:math id="m7">
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="normal">&#x3a9;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2a;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="normal">&#x393;</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2264;</mml:mo>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="normal">&#x3a9;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2a;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="normal">&#x393;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2a;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2264;</mml:mo>
<mml:mi>p</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="normal">&#x3a9;</mml:mi>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="normal">&#x393;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2a;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
<label>(4)</label>
</disp-formula>
</p>
</sec>
<sec id="s3-6">
<title>3.6 Problem definition</title>
<p>To maximize the number of captures during <italic>N</italic> vs. <italic>N</italic> defense, we first recall the dynamics of a 1 vs. 1 perimeter defense game. It is proven in (<xref ref-type="bibr" rid="B12">Lee et al., 2020b</xref>) that the best action for the defender in a one-on-one game is to move towards the <italic>optimal breaching point</italic> (defined in <xref ref-type="sec" rid="s3-3">Section 3.3</xref>). The defender reaches the optimal breaching point faster than the intruder does if the <italic>payoff</italic> <italic>p</italic> (defined in <xref ref-type="sec" rid="s3-4">Section 3.4</xref>) is negative, and the intruder reaches it faster if <italic>p</italic> &#x3e; 0. From this, we infer that maximizing the number of captures in <italic>N</italic> vs. <italic>N</italic> defense is equivalent to finding a matching between the defenders and intruders that maximizes the number of assigned pairs with negative payoff. In an optimal matching, the number of negative payoffs stays the same throughout the overall game since the optimality in each game of defender-intruder pairs is given as a <italic>Nash equilibrium</italic> (see <xref ref-type="sec" rid="s3-5">Section 3.5</xref>).</p>
<p>The expert assignment policy is a <italic>maximum matching</italic> (<xref ref-type="bibr" rid="B4">Chen et al., 2014</xref>; <xref ref-type="bibr" rid="B31">Shishika and Kumar, 2018</xref>). To execute this algorithm, we generate a bipartite graph with <bold>D</bold> and <bold>A</bold> as two sets of nodes (i.e., <inline-formula id="inf4">
<mml:math id="m8">
<mml:mi mathvariant="script">V</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mn>2</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>), and define the potential assignments between defenders and intruders as the edges. For each defender/node <italic>D</italic>
<sub>
<italic>i</italic>
</sub> in <bold>D</bold>, we find all the intruders/nodes <italic>A</italic>
<sub>
<italic>j</italic>
</sub> in <bold>A</bold> that can be sensed by the defender and compute the corresponding payoffs <italic>p</italic>
<sub>
<italic>ij</italic>
</sub> for all the pairs. We say that <italic>D</italic>
<sub>
<italic>i</italic>
</sub> is <italic>strongly assigned</italic> to <italic>A</italic>
<sub>
<italic>j</italic>
</sub> if <italic>p</italic>
<sub>
<italic>ij</italic>
</sub> &#x3c; 0. Using the edge set <inline-formula id="inf5">
<mml:math id="m9">
<mml:mi mathvariant="script">E</mml:mi>
</mml:math>
</inline-formula> given by maximum matching, we can maximize the number of strongly assigned pairs. For uniqueness, we choose a matching that minimizes the <italic>value of the game</italic>, which is defined as<disp-formula id="e5">
<mml:math id="m10">
<mml:mi>V</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:munder>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="script">E</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2a;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:math>
<label>(5)</label>
</disp-formula>where <inline-formula id="inf6">
<mml:math id="m11">
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="script">E</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2a;</mml:mo>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula> is the subset of <inline-formula id="inf7">
<mml:math id="m12">
<mml:mi mathvariant="script">E</mml:mi>
</mml:math>
</inline-formula> with negative payoff (i.e., <inline-formula id="inf8">
<mml:math id="m13">
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="script">E</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2a;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="script">E</mml:mi>
<mml:mo stretchy="false">&#x7c;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3c;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>). This unique assignment ensures that the number of captures is maximized at the earliest possible time. However, an exhaustive search using the maximum matching algorithm becomes very expensive as the team size increases. This method is combinatorial in nature and assumes centralized information with full observability. Instead, we aim to find decentralized strategies that use local perceptions <inline-formula id="inf9">
<mml:math id="m14">
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="script">V</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> (see <xref ref-type="sec" rid="s4-1">Section 4.1</xref>). To this end, we formalize the main problem of this paper as follows.</p>
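As a concrete illustration, the expert assignment can be reproduced for small teams by a brute-force search over all one-to-one assignments: first maximize the number of strongly assigned pairs, then break ties by minimizing the value of the game in Eq. 5. This is a minimal sketch under our own naming (the actual expert uses a polynomial-time maximum matching algorithm, not this exponential search):

```python
from itertools import permutations

def expert_assignment(payoffs):
    """Brute-force sketch of the expert maximum-matching policy.

    payoffs[i][j] is the payoff p_ij of assigning defender D_i to
    intruder A_j.  Among all one-to-one assignments, pick one that
    maximizes the number of strongly assigned pairs (p_ij < 0) and,
    for uniqueness, minimizes the value of the game
    V = sum of p_ij over strongly assigned pairs (Eq. 5).
    Exponential in N -- illustrative only.
    """
    n = len(payoffs)
    best = None  # ((-captures, value), assignment)
    for perm in permutations(range(n)):
        strong = [(i, j) for i, j in enumerate(perm) if payoffs[i][j] < 0]
        captures = len(strong)
        value = sum(payoffs[i][j] for i, j in strong)
        key = (-captures, value)
        if best is None or key < best[0]:
            best = (key, perm)
    (neg_captures, value), perm = best
    return -neg_captures, value, list(perm)
```

For instance, with payoffs [[-1.0, 2.0], [0.5, -0.3]], assigning D1 to A1 and D2 to A2 yields two strongly assigned pairs with game value -1.3.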
<p>Problem 1 (Decentralized Perimeter Defense with Graph Neural Networks). <italic>Design a GNN-based learning framework to learn a mapping</italic> <inline-formula id="inf10">
<mml:math id="m15">
<mml:mi mathvariant="script">M</mml:mi>
</mml:math>
</inline-formula> <italic>from the defenders&#x2019; local perceptions</italic> <inline-formula id="inf11">
<mml:math id="m16">
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="script">V</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> <italic>and their communication graph</italic> <inline-formula id="inf12">
<mml:math id="m17">
<mml:mi mathvariant="script">G</mml:mi>
</mml:math>
</inline-formula> <italic>to their actions</italic> <inline-formula id="inf13">
<mml:math id="m18">
<mml:mi mathvariant="script">U</mml:mi>
</mml:math>
</inline-formula>
<italic>, i.e.,</italic> <inline-formula id="inf14">
<mml:math id="m19">
<mml:mi mathvariant="script">U</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="script">M</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="script">V</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="script">G</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>
<italic>, such that</italic> <inline-formula id="inf15">
<mml:math id="m20">
<mml:mi mathvariant="script">U</mml:mi>
</mml:math>
</inline-formula> <italic>is as close as possible to action set</italic> <inline-formula id="inf16">
<mml:math id="m21">
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="script">U</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="monospace">g</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula> <italic>selected by a centralized expert algorithm.</italic>
</p>
<p>We describe in detail our learning architecture for solving Problem 1 in the following section.</p>
</sec>
</sec>
<sec sec-type="methods" id="s4">
<title>4 Methods</title>
<p>In this paper, we learn decentralized strategies for perimeter defense using graph neural networks. Our approach performs inference in real time, which makes it scalable to a large number of agents. We use an expert assignment policy to train a team of defenders who share information through communication channels. In <xref ref-type="sec" rid="s4-1">Section 4.1</xref>, we introduce the perception module for processing the features that are input to the GNN. Learning the decentralized algorithm through the GNN and planning the candidate matching for the defenders are discussed in <xref ref-type="sec" rid="s4-2">Section 4.2</xref>. The control of the defender team is explained in <xref ref-type="sec" rid="s4-3">Section 4.3</xref>, and the training procedure is detailed in <xref ref-type="sec" rid="s4-4">Section 4.4</xref>. The overall framework is shown in <xref ref-type="fig" rid="F3">Figure 3</xref>. For the choice of architecture, we decouple the control module from the learning framework since directly learning the actions is unnecessary. Learning an assignment between agents is sufficient, and the best actions can be computed by the optimal strategies (<xref ref-type="sec" rid="s3-5">Section 3.5</xref>).</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>Overall framework. Perception module collects local information. Learning &#x26; Planning module processes the collected information using GNN through <italic>K</italic>-hop neighboring communications. Control module computes the optimal strategies and executes the controller to close the loop.</p>
</caption>
<graphic xlink:href="fcteg-04-1104745-g003.tif"/>
</fig>
<sec id="s4-1">
<title>4.1 Perception</title>
<p>In this section, we assume <italic>N</italic> aerial defenders and <italic>N</italic> ground intruders. Each defender <italic>D</italic>
<sub>
<italic>i</italic>
</sub> is equipped with a sensor and faces outward from the perimeter with a field of view <italic>FOV</italic>. The defenders&#x2019; horizontal field of view is chosen as <italic>&#x3c0;</italic>, assuming a fisheye-type camera.</p>
<sec id="s4-1-1">
<title>4.1.1 Intruder features</title>
<p>For each <italic>i</italic>, a defender observes the set of intruders <italic>A</italic>
<sub>
<italic>j</italic>
</sub>, and the relative positions in spherical coordinates between <italic>D</italic>
<sub>
<italic>i</italic>
</sub> and <italic>A</italic>
<sub>
<italic>j</italic>
</sub> are represented by <inline-formula id="inf17">
<mml:math id="m22">
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="script">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold">z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> where <inline-formula id="inf18">
<mml:math id="m23">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> is the number of intruder features. The number of input features <inline-formula id="inf19">
<mml:math id="m24">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> and <inline-formula id="inf20">
<mml:math id="m25">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> are selected as the fixed numbers of closest detected and neighboring agents, respectively. Although a defender can detect any number of intruders within its sensing range, a fixed number of detections is selected so that the system is scalable. In a decentralized setting, a defender should be able to decide its action based on its local perceptions. We experimentally chose this number to be 10, since the expert algorithm (i.e., maximum matching) would always assign a defender to an intruder among its 10 closest.</p>
</sec>
<sec id="s4-1-2">
<title>4.1.2 Defender features</title>
<p>To make the system scalable, we build communication with a fixed number of closest defenders. Each defender <italic>D</italic>
<sub>
<italic>i</italic>
</sub> communicates with nearby defenders <italic>D</italic>
<sub>
<italic>j</italic>
</sub> within its communication range <italic>r</italic>
<sub>
<italic>c</italic>
</sub>. For each <italic>i</italic>, the relative positions between <italic>D</italic>
<sub>
<italic>i</italic>
</sub> and <italic>D</italic>
<sub>
<italic>j</italic>
</sub> are represented by <inline-formula id="inf21">
<mml:math id="m26">
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="script">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold">z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> where <inline-formula id="inf22">
<mml:math id="m27">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> is the number of defender features. We select this number as 3: communicating with many other robots would give every defender full information of the environment (i.e., an effectively centralized setting), while 3 is the minimum number with which the robots can collect information in every direction, assuming the robots are scattered. If there are fewer than 10 detected intruders or 3 neighboring defenders, we pad the perception input matrix with dummy values. Keeping the input feature size constant is important since the neural network cannot handle varying feature sizes.</p>
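A minimal sketch of how the fixed-size perception input can be assembled, assuming 3-dimensional relative positions and zeros as the dummy padding value (both are our assumptions; the paper does not specify the placeholder):

```python
import numpy as np

N_A_F = 10   # number of intruder features (closest detections)
N_D_F = 3    # number of defender features (closest neighbors)
DUMMY = 0.0  # assumed placeholder for missing detections

def perception_input(intruder_pos, defender_pos, dim=3):
    """Build the fixed-size local perception Z_i for one defender.

    intruder_pos / defender_pos are lists of relative-position
    vectors; each list is truncated to the closest N_A_F / N_D_F
    entries (by distance) and padded with dummy rows so that the
    input shape is always (N_A_F + N_D_F, dim).
    """
    def fix(rows, k):
        rows = sorted(rows, key=np.linalg.norm)[:k]  # keep closest k
        out = np.full((k, dim), DUMMY)
        if rows:
            out[:len(rows)] = np.asarray(rows)
        return out
    return np.concatenate([fix(intruder_pos, N_A_F),
                           fix(defender_pos, N_D_F)])
```

With one detected intruder and two neighboring defenders, the remaining rows are simply filled with the dummy value.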
</sec>
<sec id="s4-1-3">
<title>4.1.3 Feature extraction</title>
<p>Feature extraction is performed by concatenating the relative positions of observed intruders and communicated defenders, forming the local perceptions <inline-formula id="inf23">
<mml:math id="m28">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="script">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="script">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>. The extracted features are fed into a multi-layer perceptron (MLP) to generate the post-processed feature vector <bold>x</bold>
<sub>
<italic>i</italic>
</sub>, which will be exchanged among neighbors through communications.</p>
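A minimal sketch of this feature-extraction step, with illustrative weight shapes since the paper does not specify the MLP's layer sizes:

```python
import numpy as np

def mlp_features(Z_i, W1, W2):
    """Post-process the flattened local perception Z_i into the
    feature vector x_i that is exchanged with neighbors.

    A one-hidden-layer MLP sketch; W1 and W2 are illustrative
    weight matrices (biases omitted for brevity).
    """
    h = np.maximum(0.0, Z_i.ravel() @ W1)  # ReLU hidden layer
    return h @ W2                          # post-processed x_i
```

For a (13, 3) perception matrix, the flattened input has 39 entries, so a hidden layer of, say, 16 units and an output of 8 units gives an 8-dimensional x_i.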
</sec>
</sec>
<sec id="s4-2">
<title>4.2 Learning and planning</title>
<p>We employ graph neural networks with <italic>K</italic>-hop communications. Defenders communicate their perceived features with neighboring robots. The communication graph <inline-formula id="inf24">
<mml:math id="m29">
<mml:mi mathvariant="script">G</mml:mi>
</mml:math>
</inline-formula> is formed by connecting the nearby defenders within the communication range <italic>r</italic>
<sub>
<italic>c</italic>
</sub>, and the resulting adjacency matrix <bold>S</bold> is given to the graph neural network.</p>
<sec id="s4-2-1">
<title>4.2.1 Graph shift operation</title>
<p>We consider that each defender <inline-formula id="inf25">
<mml:math id="m30">
<mml:mi>i</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="script">V</mml:mi>
</mml:math>
</inline-formula> has a feature vector <inline-formula id="inf26">
<mml:math id="m31">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>F</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula>, indicating the post-processed information from <italic>D</italic>
<sub>
<italic>i</italic>
</sub>. By collecting the feature vectors <bold>x</bold>
<sub>
<italic>i</italic>
</sub> from all defenders, we have the feature matrix for the defender team <bold>D</bold> as:<disp-formula id="e6">
<mml:math id="m32">
<mml:mi mathvariant="bold">X</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mtable class="matrix">
<mml:mtr>
<mml:mtd columnalign="center">
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="sans-serif">T</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="center">
<mml:mo>&#x22ee;</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="center">
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="sans-serif">T</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>F</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>F</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
</mml:math>
<label>(6)</label>
</disp-formula>where <inline-formula id="inf27">
<mml:math id="m33">
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
<mml:mi>f</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>F</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> is the collection of the feature <italic>f</italic> across all defenders; i.e., <inline-formula id="inf28">
<mml:math id="m34">
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">[</mml:mo>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
<mml:mo stretchy="false">]</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="sans-serif">T</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula> with <inline-formula id="inf29">
<mml:math id="m35">
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> denoting the feature <italic>f</italic> of <inline-formula id="inf30">
<mml:math id="m36">
<mml:msub>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="script">V</mml:mi>
</mml:math>
</inline-formula>. We conduct a <italic>graph shift operation</italic> for each <italic>D</italic>
<sub>
<italic>i</italic>
</sub> by a linear combination of its neighboring features, i.e., <inline-formula id="inf31">
<mml:math id="m37">
<mml:msub>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula>. Hence, for all defenders <bold>D</bold> with graph <inline-formula id="inf32">
<mml:math id="m38">
<mml:mi mathvariant="script">G</mml:mi>
</mml:math>
</inline-formula>, the feature matrix <bold>X</bold> after the shift operation becomes <bold>SX</bold> with:<disp-formula id="e7">
<mml:math id="m39">
<mml:msub>
<mml:mrow>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mi mathvariant="bold">S</mml:mi>
<mml:mi mathvariant="bold">X</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:munderover accentunder="false" accent="true">
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:msub>
<mml:mrow>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mi mathvariant="bold">S</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msubsup>
<mml:mrow>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mi mathvariant="bold">X</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:munder>
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:munder>
<mml:msub>
<mml:mrow>
<mml:mi>s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mi>j</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold">x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>j</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
</mml:math>
<label>(7)</label>
</disp-formula>where the adjacency matrix <bold>S</bold> is called the <italic>Graph Shift Operator</italic> (GSO) (<xref ref-type="bibr" rid="B6">Gama et al., 2019</xref>).</p>
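On a toy three-defender line graph (D1&#x2013;D2&#x2013;D3, an assumed example topology), the graph shift operation of Eq. 7 is simply the matrix product SX, which replaces each defender's features with the sum of its neighbors' features:

```python
import numpy as np

# Toy line graph D1 -- D2 -- D3: adjacency matrix used as the GSO S.
S = np.array([[0., 1., 0.],
              [1., 0., 1.],
              [0., 1., 0.]])

X = np.array([[1., 2.],   # feature vector x_1
              [3., 4.],   # feature vector x_2
              [5., 6.]])  # feature vector x_3

# Graph shift (Eq. 7): [SX]_{if} = sum_{j in N_i} s_ij x_j^f.
SX = S @ X
```

Row 1 of SX is x_2 (D1's only neighbor), while row 2 is x_1 + x_3, exactly the neighborhood sums of Eq. 7.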
</sec>
<sec id="s4-2-2">
<title>4.2.2 Graph convolution</title>
<p>With the shift operation, we define the <italic>graph convolution</italic> by a linear combination of the <italic>shifted features</italic> on graph <inline-formula id="inf33">
<mml:math id="m40">
<mml:mi mathvariant="script">G</mml:mi>
</mml:math>
</inline-formula> <italic>via</italic> <italic>K</italic>-hop communication exchanges (<xref ref-type="bibr" rid="B6">Gama et al., 2019</xref>; <xref ref-type="bibr" rid="B17">Li et al., 2020</xref>):<disp-formula id="e8">
<mml:math id="m41">
<mml:mi mathvariant="script">H</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold">X</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi mathvariant="bold">S</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:munderover accentunder="false" accent="true">
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>K</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mi mathvariant="bold">X</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:math>
<label>(8)</label>
</disp-formula>where <inline-formula id="inf34">
<mml:math id="m42">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>F</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>G</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula> represents the coefficients combining <italic>F</italic> features of the defenders in the shifted feature matrix <bold>S</bold>
<sup>
<italic>k</italic>
</sup>
<bold>X</bold>, with <italic>F</italic> and <italic>G</italic> denoting the input and output dimensions of the graph convolution. Note that <bold>S</bold>
<sup>
<italic>k</italic>
</sup>
<bold>X</bold> &#x3d; <bold>S</bold>(<bold>S</bold>
<sup>
<italic>k</italic>&#x2212;1</sup>
<bold>X</bold>) is computed by means of <italic>k</italic> communication exchanges with 1-hop neighbors.</p>
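A minimal sketch of the graph convolution in Eq. 8, computing each S^k X recursively by repeated 1-hop exchanges exactly as described above:

```python
import numpy as np

def graph_convolution(X, S, H):
    """Graph convolution of Eq. 8: sum_{k=0}^{K} S^k X H_k.

    X : (N, F) feature matrix, S : (N, N) graph shift operator,
    H : list of K+1 coefficient matrices H_k, each of shape (F, G).
    S^k X is computed as S (S^{k-1} X), i.e., one additional
    communication exchange with 1-hop neighbors per term.
    """
    out = 0.0
    shifted = X                # S^0 X = X
    for H_k in H:
        out = out + shifted @ H_k
        shifted = S @ shifted  # one more 1-hop exchange
    return out
```

With S = 0 (no communication) and identity coefficient matrices, the convolution reduces to X itself; with S = I it doubles X, matching a term-by-term evaluation of Eq. 8.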
</sec>
<sec id="s4-2-3">
<title>4.2.3 Graph neural network</title>
<p>Applying a point-wise non-linearity <inline-formula id="inf35">
<mml:math id="m43">
<mml:mi>&#x3c3;</mml:mi>
<mml:mo>:</mml:mo>
<mml:mi mathvariant="double-struck">R</mml:mi>
<mml:mo>&#x2192;</mml:mo>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:math>
</inline-formula> as the activation function to the graph convolution (Eq. <xref ref-type="disp-formula" rid="e8">8</xref>), we define <italic>graph perception</italic> as:<disp-formula id="e9">
<mml:math id="m44">
<mml:mi mathvariant="script">H</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold">X</mml:mi>
<mml:mo>;</mml:mo>
<mml:mi mathvariant="bold">S</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:munderover accentunder="false" accent="true">
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>K</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold">S</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mi mathvariant="bold">X</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>k</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
<label>(9)</label>
</disp-formula>
</p>
<p>Then, we define a GNN module by cascading <italic>L</italic> layers of graph perceptions (Eq. <xref ref-type="disp-formula" rid="e9">9</xref>):<disp-formula id="e10">
<mml:math id="m45">
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>&#x2113;</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="script">H</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>&#x2113;</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>&#x2113;</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>;</mml:mo>
<mml:mi mathvariant="bold">S</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mspace width="1em"/>
<mml:mtext>for</mml:mtext>
<mml:mspace width="1em"/>
<mml:mi>&#x2113;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mi>L</mml:mi>
<mml:mo>,</mml:mo>
</mml:math>
<label>(10)</label>
</disp-formula>where the output feature of the previous layer <italic>&#x2113;</italic>&#x2212;1, <inline-formula id="inf36">
<mml:math id="m46">
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>&#x2113;</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi>F</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>&#x2113;</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula>, is taken as input to the current layer <italic>&#x2113;</italic> to generate the output feature of layer <italic>&#x2113;</italic>, <bold>X</bold>
<sup>
<italic>&#x2113;</italic>
</sup>. Recall that the input to the first layer is <bold>X</bold>
<sup>0</sup> &#x3d; <bold>X</bold> (Eq. <xref ref-type="disp-formula" rid="e6">6</xref>). The output feature of the last layer <inline-formula id="inf37">
<mml:math id="m47">
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="bold">X</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2208;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="double-struck">R</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mo>&#xd7;</mml:mo>
<mml:mi>G</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula>, obtained <italic>via</italic> <italic>K</italic>-hop communications, represents the exchanged and fused information of the defender team <bold>D</bold>.</p>
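The layer cascade of Eq. 10 can be sketched as below; dimensions, seeds, and the names graph_perception and gnn_forward are our illustrative assumptions. With ReLU, the outer activation of Eq. 10 can be absorbed into the perception, since ReLU(ReLU(x)) = ReLU(x).

```python
import numpy as np

def graph_perception(X, S, H_taps):
    # one graph perception (Eq. 9), with ReLU as the activation
    Z = sum(np.linalg.matrix_power(S, k) @ X @ H_k
            for k, H_k in enumerate(H_taps))
    return np.maximum(Z, 0.0)

def gnn_forward(X, S, layers):
    """Eq. 10: cascade of L graph perceptions, X^0 = X and
    X^l = sigma[H^l(X^{l-1}; S)] for l = 1, ..., L."""
    for H_taps in layers:
        X = graph_perception(X, S, H_taps)
    return X

rng = np.random.default_rng(1)
N = 5
X0 = rng.normal(size=(N, 3))
A = (rng.random((N, N)) < 0.4).astype(float)
S = np.triu(A, 1)
S = S + S.T                     # symmetric adjacency, no self-loops
# two layers with feature widths 3 -> 4 -> 2 and K = 1 hop each
layers = [[rng.normal(size=(3, 4)) for _ in range(2)],
          [rng.normal(size=(4, 2)) for _ in range(2)]]
XL = gnn_forward(X0, S, layers)
```

The final X^L has one G-dimensional row per defender, the fused team information used downstream.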
</sec>
<sec id="s4-2-4">
<title>4.2.4 Candidate matching</title>
<p>The output of the GNN, which represents the fused information from the <italic>K</italic>-hop communications, is then processed with another MLP to provide a candidate matching for each defender. <xref ref-type="fig" rid="F3">Figure 3</xref> shows a candidate matching instance with <inline-formula id="inf38">
<mml:math id="m48">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>6</mml:mn>
</mml:math>
</inline-formula>. Given a defender <italic>D</italic>
<sub>
<italic>i</italic>
</sub>, we find the <inline-formula id="inf39">
<mml:math id="m49">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> closest intruders and number them from 1 to <inline-formula id="inf40">
<mml:math id="m50">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> clockwise. We number the nearby intruders clockwise so that the feature outputs of our networks can be interpreted, i.e., so that we can identify which intruders are matched with which defenders. The numbering could equally be counterclockwise or any other fixed order. Since each defender learns a decentralized strategy, it must specify an intruder to capture given its local perception. Because the intruders carry no global IDs, we assign the IDs clockwise without loss of generality. The output of the multi-layer perceptron is an assignment likelihood <inline-formula id="inf41">
<mml:math id="m51">
<mml:mi mathvariant="script">L</mml:mi>
</mml:math>
</inline-formula>, which gives the probability that each of the <inline-formula id="inf42">
<mml:math id="m52">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> intruder candidates is matched with the given defender. For instance, an expert assignment likelihood <inline-formula id="inf43">
<mml:math id="m53">
<mml:msubsup>
<mml:mrow>
<mml:mi>L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> for <italic>D</italic>
<sub>
<italic>i</italic>
</sub> in <xref ref-type="fig" rid="F3">Figure 3</xref> would be [0.01,0.01,0.95,0.01,0.01,0.01] if the third intruder (i.e., <italic>A</italic>
<sub>3</sub>) is matched with <italic>D</italic>
<sub>
<italic>i</italic>
</sub> by the expert policy (i.e., maximum matching). The planning module selects the intruder candidate <italic>A</italic>
<sub>
<italic>j</italic>
</sub> so that the matching pair (<italic>D</italic>
<sub>
<italic>i</italic>
</sub>, <italic>A</italic>
<sub>
<italic>j</italic>
</sub>) resembles the expert policy with the highest probability. It is worth noting that our approach yields a decentralized assignment policy, since only neighboring information is exchanged.</p>
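A minimal sketch of this candidate-selection step follows. All coordinates and the function name match_candidate are hypothetical, and "clockwise" is realized here as decreasing bearing angle, which is one possible convention.

```python
import numpy as np

def match_candidate(defender_xy, intruder_xy, likelihood, n_f=6):
    """Sketch of candidate matching: take the n_f intruders closest
    to the defender, number them clockwise by bearing, and select
    the candidate with the highest assignment likelihood."""
    d = np.linalg.norm(intruder_xy - defender_xy, axis=1)
    closest = np.argsort(d)[:n_f]                    # n_f nearest intruders
    rel = intruder_xy[closest] - defender_xy
    bearing = np.arctan2(rel[:, 1], rel[:, 0])
    order = closest[np.argsort(-bearing)]            # clockwise numbering
    return order[int(np.argmax(likelihood[:len(order)]))]

defender = np.array([0.0, 0.0])
intruders = np.array([[1, 0], [0, 1], [-1, 0], [0, -1], [2, 2], [3, -3]],
                     dtype=float)
# expert-style likelihood concentrating on the third clockwise candidate
likelihood = np.array([0.01, 0.01, 0.95, 0.01, 0.01, 0.01])
j = match_candidate(defender, intruders, likelihood)
```

Here the third clockwise candidate carries the highest likelihood, so the defender is paired with that intruder, mirroring the [0.01, 0.01, 0.95, 0.01, 0.01, 0.01] example above.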
</sec>
<sec id="s4-2-5">
<title>4.2.5 Permutation equivalence</title>
<p>It is worth noting that our proposed GNN-based learning approach is scalable due to permutation equivalence. That is, a decentralized defender can decide its action from local perceptions that contain an arbitrary number of unnumbered intruders. An instance of a perimeter defense game illustrating this property is shown in <xref ref-type="fig" rid="F4">Figure 4</xref>. The plots focus on a single defender while intruders gradually approach the perimeter as time passes. Each intruder keeps the same color across time stamps. Notice that a new light-blue intruder enters the field of view of the defender at <italic>t</italic> &#x3d; 2, and a purple intruder begins to appear at <italic>t</italic> &#x3d; 3. Although an arbitrary number of intruders is detected at each time stamp, our system assigns the intruders IDs, shown as blue numbers in <xref ref-type="fig" rid="F4">Figure 4</xref>. We number them clockwise but could have chosen any permutation (e.g., counterclockwise), because graph neural networks perform label-independent processing. The numbering serves only to read off, from the network outputs, which intruders would be matched with which defenders. Without loss of generality, we assign the IDs clockwise, noting that the IDs are arbitrary and can change across time stamps. For instance, the yellow intruder has ID 2 at <italic>t</italic> &#x3d; 1 but ID 3 at <italic>t</italic> &#x3d; 2, 3. Similarly, the red intruder has ID 3 at <italic>t</italic> &#x3d; 1 but ID 4 at <italic>t</italic> &#x3d; 2 and ID 5 at <italic>t</italic> &#x3d; 3. In this way, we accommodate an arbitrary number of intruders, and thus our system is permutation equivalent.</p>
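This label-independence can be checked numerically on the graph perception of Eq. 9: relabeling the agents permutes the output rows in exactly the same way. A small NumPy sketch, with illustrative dimensions and a random relabeling:

```python
import numpy as np

def graph_perception(X, S, H_taps):
    # graph perception (Eq. 9) with ReLU as the activation
    Z = sum(np.linalg.matrix_power(S, k) @ X @ H_k
            for k, H_k in enumerate(H_taps))
    return np.maximum(Z, 0.0)

rng = np.random.default_rng(2)
N = 5
X = rng.normal(size=(N, 3))
A = (rng.random((N, N)) < 0.5).astype(float)
S = np.triu(A, 1)
S = S + S.T                        # symmetric adjacency
H_taps = [rng.normal(size=(3, 2)) for _ in range(2)]

perm = rng.permutation(N)          # an arbitrary relabeling of the agents
P = np.eye(N)[perm]                # the corresponding permutation matrix
out = graph_perception(X, S, H_taps)
out_perm = graph_perception(P @ X, P @ S @ P.T, H_taps)
# relabeled inputs produce identically relabeled outputs: out_perm == P @ out
```

This holds because P S^k P^T (P X) = P (S^k X) for any permutation matrix P and the elementwise ReLU commutes with row permutations.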
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Instance of perimeter defense game at different time stamps. The plots focus on a single defender and its local perceptions.</p>
</caption>
<graphic xlink:href="fcteg-04-1104745-g004.tif"/>
</fig>
</sec>
</sec>
<sec id="s4-3">
<title>4.3 Control</title>
<p>The output of the planning module (<xref ref-type="sec" rid="s4-2">Section 4.2</xref>) is fed to the defender strategy module in <xref ref-type="fig" rid="F3">Figure 3</xref>. This module handles all the matched pairs (<italic>D</italic>
<sub>
<italic>i</italic>
</sub>, <italic>A</italic>
<sub>
<italic>j</italic>
</sub>) and computes the optimal breaching point for each one-on-one hemisphere perimeter defense game (see <xref ref-type="sec" rid="s3-3">Section 3.3</xref>). The defender strategy module collectively outputs the position commands, which steer the defenders toward the optimal breaching points. The SO(3) command (<xref ref-type="bibr" rid="B21">Mellinger and Kumar, 2011</xref>) that consists of thrust and moment to control the robot at a low level is then passed to the defender team <bold>D</bold> for control. The state dynamics of a defender-intruder pair are detailed in (<xref ref-type="bibr" rid="B12">Lee et al., 2020b</xref>). The defenders move based on the commands to close the perception-action loop. Notably, the expert assignment likelihood <inline-formula id="inf44">
<mml:math id="m54">
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula> would result in the expert action set <inline-formula id="inf45">
<mml:math id="m55">
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="script">U</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula> (defined in Problem 1).</p>
</sec>
<sec id="s4-4">
<title>4.4 Training procedure</title>
<p>To train our proposed networks, we use imitation learning to mimic an expert policy given by maximum matching (explained in <xref ref-type="sec" rid="s3">Section 3</xref>), which provides the optimal assignment likelihood <inline-formula id="inf46">
<mml:math id="m56">
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula> (described in <xref ref-type="sec" rid="s4-2">Section 4.2</xref>) given the defenders&#x2019; local perceptions <inline-formula id="inf47">
<mml:math id="m57">
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="script">V</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> and the communication graph <inline-formula id="inf48">
<mml:math id="m58">
<mml:mi mathvariant="script">G</mml:mi>
</mml:math>
</inline-formula>. The training set <inline-formula id="inf49">
<mml:math id="m59">
<mml:mi mathvariant="script">D</mml:mi>
</mml:math>
</inline-formula> is generated as a collection of these data: <inline-formula id="inf50">
<mml:math id="m60">
<mml:mi mathvariant="script">D</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">{</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x2208;</mml:mo>
<mml:mi mathvariant="script">V</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mi mathvariant="script">G</mml:mi>
<mml:mo>,</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msup>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo stretchy="false">}</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula>. We train the mapping <inline-formula id="inf51">
<mml:math id="m61">
<mml:mi mathvariant="script">M</mml:mi>
</mml:math>
</inline-formula> (defined in Problem 1) to minimize the cross-entropy loss between <inline-formula id="inf52">
<mml:math id="m62">
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula> and <inline-formula id="inf53">
<mml:math id="m63">
<mml:mi mathvariant="script">L</mml:mi>
</mml:math>
</inline-formula>. We show that the trained <inline-formula id="inf54">
<mml:math id="m64">
<mml:mi mathvariant="script">M</mml:mi>
</mml:math>
</inline-formula> provides <inline-formula id="inf55">
<mml:math id="m65">
<mml:mi mathvariant="script">U</mml:mi>
</mml:math>
</inline-formula> that is close to <inline-formula id="inf56">
<mml:math id="m66">
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="script">U</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula>. The number of learnable parameters in our networks is independent of the team size <italic>N</italic>. Therefore, we can train our networks at a small scale and generalize the model to large scales, since defenders at any scale learn decentralized strategies based on local perceptions of a fixed number of agents.</p>
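The imitation objective can be sketched as follows; the NumPy loss function and the toy batch are illustrative (the actual training uses PyTorch), with the expert likelihood taken as the target distribution over intruder candidates.

```python
import numpy as np

def imitation_loss(pred_logits, expert_likelihood):
    """Cross-entropy between the expert assignment likelihood L^g and
    the softmax of the network's candidate scores (illustrative sketch)."""
    z = pred_logits - pred_logits.max(axis=-1, keepdims=True)
    log_softmax = z - np.log(np.exp(z).sum(axis=-1, keepdims=True))
    return float(-(expert_likelihood * log_softmax).sum(axis=-1).mean())

# toy batch: two defenders, six intruder candidates each
expert = np.array([[0.01, 0.01, 0.95, 0.01, 0.01, 0.01],
                   [0.95, 0.01, 0.01, 0.01, 0.01, 0.01]])
matched = np.log(expert)            # logits whose softmax equals L^g
uninformed = np.zeros_like(expert)  # uniform assignment scores
loss_matched = imitation_loss(matched, expert)
loss_uninformed = imitation_loss(uninformed, expert)
```

Minimizing this loss drives the predicted assignment likelihood toward the expert's maximum-matching assignments; logits that already match the expert incur a strictly lower loss than uninformative ones.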
<sec id="s4-4-1">
<title>4.4.1 Model architecture</title>
<p>Our model architecture consists of a 2-layer MLP with 16 and 8 hidden units to generate the post-processed feature vector <bold>x</bold>
<sub>
<italic>i</italic>
</sub>, a 2-layer GNN with 32 and 128 hidden units to exchange the information collected by the defenders, and a single-layer MLP to produce an assignment likelihood <inline-formula id="inf57">
<mml:math id="m67">
<mml:mi mathvariant="script">L</mml:mi>
</mml:math>
</inline-formula>. Each MLP and GNN layer is followed by a ReLU activation.</p>
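A back-of-the-envelope parameter count for this architecture makes the scale-independence explicit. The input width (10 intruder plus 3 defender relative positions in 2D, so 26), the use of K = 1 (two filter taps per GNN layer), and the omission of biases are our assumptions for illustration only.

```python
# Parameter-count sketch for the architecture in Section 4.4.1
# (weights only; F_in = 26 and K = 1 are illustrative assumptions).
def num_parameters(team_size, F_in=26, n_candidates=10, K=1):
    mlp = F_in * 16 + 16 * 8              # 2-layer MLP: 16 and 8 hidden units
    gnn = (K + 1) * (8 * 32 + 32 * 128)   # 2-layer GNN: 32 and 128 units
    head = 128 * n_candidates             # single-layer MLP head
    # team_size never enters the count, which is why a model trained
    # at a small scale transfers to large scales
    return mlp + gnn + head

counts = {N: num_parameters(N) for N in (2, 10, 100)}
```

The count is identical for every team size, depending only on the fixed local-feature widths.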
</sec>
<sec id="s4-4-2">
<title>4.4.2 Graph neural networks details</title>
<p>In implementing graph neural networks, we construct a 1-hop connectivity graph by connecting defenders within communication range <italic>r</italic>
<sub>
<italic>c</italic>
</sub> &#x3d; 1. Given that the default radius is <italic>R</italic> &#x3d; 1, we expect that three neighboring agents within one hop provide a sufficiently wide sensing region for the defenders. Accordingly, we assume that communications occur in real time with <inline-formula id="inf58">
<mml:math id="m68">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>3</mml:mn>
</mml:math>
</inline-formula>. Each defender gathers information as input features that consist of <inline-formula id="inf59">
<mml:math id="m69">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>10</mml:mn>
</mml:math>
</inline-formula> closest intruder positions and <inline-formula id="inf60">
<mml:math id="m70">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>3</mml:mn>
</mml:math>
</inline-formula> closest defender positions. The parameters used are summarized in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
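The 1-hop connectivity graph can be built directly from defender positions; the following sketch (with hypothetical positions) returns the adjacency matrix that serves as the graph shift operator.

```python
import numpy as np

def connectivity_graph(positions, r_c=1.0):
    """1-hop communication graph: defenders i and j are connected when
    their distance is at most r_c (Table 1: r_c = 1). Returns the
    adjacency matrix used as the graph shift operator S."""
    diff = positions[:, None, :] - positions[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    S = (dist <= r_c).astype(float)
    np.fill_diagonal(S, 0.0)       # no self-loops
    return S

# four defenders on a line with spacing 0.8: only consecutive pairs connect
pos = np.array([[0.0, 0.0], [0.8, 0.0], [1.6, 0.0], [2.4, 0.0]])
S = connectivity_graph(pos)
```

The resulting S is symmetric with a zero diagonal, matching the undirected communication graph assumed by the GNN.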
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Parameter setup in implementing graph neural networks.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="center">Parameter name</th>
<th align="center">Symbol</th>
<th align="center">Value</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="center">Capturing distance</td>
<td align="center">
<italic>&#x3f5;</italic>
</td>
<td align="center">0.02</td>
</tr>
<tr>
<td align="center">Field of view</td>
<td align="center">
<italic>FOV</italic>
</td>
<td align="center">
<italic>&#x3c0;</italic>
</td>
</tr>
<tr>
<td align="center">Number of intruder features</td>
<td align="center">
<inline-formula id="inf61">
<mml:math id="m71">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula>
</td>
<td align="center">10</td>
</tr>
<tr>
<td align="center">Number of defender features</td>
<td align="center">
<inline-formula id="inf62">
<mml:math id="m72">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>D</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula>
</td>
<td align="center">3</td>
</tr>
<tr>
<td align="center">Communication range</td>
<td align="center">
<italic>r</italic>
<sub>
<italic>c</italic>
</sub>
</td>
<td align="center">1</td>
</tr>
<tr>
<td align="center">Default team size</td>
<td align="center">
<italic>N</italic>
<sub>
<italic>def</italic>
</sub>
</td>
<td align="center">10</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4-4-3">
<title>4.4.3 Implementation details</title>
<p>The experiments are conducted using a 12-core 3.50&#xa0;GHz i9-9920X CPU and an Nvidia GeForce RTX 2080 Ti GPU. We implement the proposed networks using PyTorch v1.10.1 (<xref ref-type="bibr" rid="B26">Paszke et al., 2019</xref>) accelerated with CUDA v10.2 APIs. We use the Adam optimizer with a momentum of 0.5. The learning rate is scheduled to decay from 5 &#xd7; 10<sup>&#x2013;3</sup> to 10<sup>&#x2013;6</sup> within 1500 epochs with batch size 64, using cosine annealing. These hyperparameters were chosen for the best performance.</p>
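The cosine-annealed schedule above corresponds to the standard closed form, sketched here for the stated endpoints (5 &#xd7; 10<sup>&#x2013;3</sup> to 10<sup>&#x2013;6</sup> over 1500 epochs); the function name is ours.

```python
import math

def cosine_lr(epoch, total=1500, lr_max=5e-3, lr_min=1e-6):
    """Cosine annealing: lr(t) = lr_min + (lr_max - lr_min)
    * (1 + cos(pi * t / total)) / 2, as used in Section 4.4.3."""
    return lr_min + 0.5 * (lr_max - lr_min) * (1 + math.cos(math.pi * epoch / total))

# starts at lr_max, decays smoothly, and reaches lr_min at the last epoch
schedule = [cosine_lr(t) for t in (0, 750, 1500)]
```

In PyTorch this corresponds to `torch.optim.lr_scheduler.CosineAnnealingLR` with `T_max=1500` and `eta_min=1e-6`.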
</sec>
</sec>
</sec>
<sec id="s5">
<title>5 Experiments</title>
<sec id="s5-1">
<title>5.1 Datasets</title>
<p>We evaluate our decentralized networks using imitation learning where the expert assignment policy is the maximum matching. The perimeter is a hemisphere with a radius <italic>R</italic>, which is defined by <inline-formula id="inf63">
<mml:math id="m73">
<mml:mi>R</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:mi>N</mml:mi>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi mathvariant="italic">def</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msqrt>
</mml:math>
</inline-formula> where <italic>N</italic> is team size and <italic>N</italic>
<sub>
<italic>def</italic>
</sub> is a default team size. Since running the maximum matching is very expensive at large scales (e.g., <italic>N</italic> &#x3e; 10), we set the default team size <italic>N</italic>
<sub>
<italic>def</italic>
</sub> &#x3d; 10. In this way, <italic>R</italic> also represents the scale of the game; for instance, when <italic>N</italic> &#x3d; 40, <italic>R</italic> becomes 2, meaning that the problem setting is doubled in scale compared to the setting with <italic>R</italic> &#x3d; 1. Given the team size <italic>N</italic> &#x3d; 10, our experimental arena has dimensions of 10 &#xd7; 10 &#xd7; 1&#xa0;m. Offline, we randomly sample 10 million examples of the defenders&#x2019; local perceptions <inline-formula id="inf64">
<mml:math id="m74">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> and find corresponding <inline-formula id="inf65">
<mml:math id="m75">
<mml:mi mathvariant="script">G</mml:mi>
</mml:math>
</inline-formula> and <inline-formula id="inf66">
<mml:math id="m76">
<mml:msup>
<mml:mrow>
<mml:mi mathvariant="script">L</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>g</mml:mi>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula> to prepare the dataset, which is divided into a training set (60%), a validation set (20%), and a testing set (20%).</p>
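The radius scaling and the dataset split can be summarized in a few lines; the function names are ours, and the split uses integer arithmetic for exactness.

```python
import math

def perimeter_radius(N, N_def=10):
    """R = sqrt(N / N_def): the perimeter radius grows with the team
    size so that R doubles when N quadruples (Section 5.1)."""
    return math.sqrt(N / N_def)

def split(n_examples):
    # 60/20/20 train/validation/test split of the offline dataset
    n_train = n_examples * 6 // 10
    n_val = n_examples * 2 // 10
    return n_train, n_val, n_examples - n_train - n_val

radius_at_40 = perimeter_radius(40)     # scale doubles for N = 40
dataset_split = split(10_000_000)       # 10 million offline examples
```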
</sec>
<sec id="s5-2">
<title>5.2 Metrics</title>
<p>We are mainly interested in the percentage of intruders caught (i.e., number of captures/total number of intruders). At small scales (e.g., <italic>N</italic> &#x2264; 10), an expert policy (i.e., the maximum matching) can be run and a direct comparison between the expert policy and our policy is available. At large scales (e.g., <italic>N</italic> &#x3e; 10), the maximum matching is too expensive to run. Thus we compare our algorithm with other baseline approaches: <italic>greedy</italic>, <italic>random</italic>, and <italic>mlp</italic>, which will be explained in <xref ref-type="sec" rid="s5-3">Section 5.3</xref>. To observe the scalability on small and large scales, we run a total of five different algorithms for each scale: <italic>expert</italic>, <italic>gnn</italic>, <italic>greedy</italic>, <italic>random</italic>, and <italic>mlp</italic>. In all cases, we compute the <italic>absolute accuracy</italic>, which is defined by the number of captures divided by the team size, to verify if our network can be generalized to any team size. Furthermore, we also calculate the <italic>comparative accuracy</italic>, defined as the ratio of the number of captures by <italic>gnn</italic> to the number of captures by another algorithm, to observe comparative results.</p>
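The two metrics reduce to simple ratios; a sketch using the capture counts of the 20 vs. 20 instance reported later (12 for <italic>gnn</italic> and 7 for <italic>mlp</italic>) purely as example numbers.

```python
def absolute_accuracy(captures, team_size):
    # fraction of intruders caught: captures / N
    return captures / team_size

def comparative_accuracy(captures_gnn, captures_other):
    # ratio of gnn captures to another algorithm's captures
    return captures_gnn / captures_other

abs_acc = absolute_accuracy(12, 20)       # gnn catches 12 of 20 intruders
comp_acc = comparative_accuracy(12, 7)    # gnn vs. mlp on the same run
```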
</sec>
<sec id="s5-3">
<title>5.3 Compared algorithms</title>
<p>For a fair comparison, defenders in the baseline algorithms do not communicate their &#x201c;intentions&#x201d; of which intruders would be captured by which neighboring defenders, since the GNN does not share such information either. In the GNN framework, each defender perceives nearby intruders, and only the relative positions of the perceived intruders, not the &#x201c;intentions,&#x201d; are shared through GNN communications. The power of the GNNs lies in learning these &#x201c;intentions&#x201d; implicitly <italic>via</italic> <italic>K</italic>-hop communications. As a result, decentralized decision-making (for both GNN and baselines) may lead multiple defenders to aim at the same intruder, whereas a centralized planner knows the &#x201c;intentions&#x201d; of all the defenders and would avoid such a scenario.</p>
<sec id="s5-3-1">
<title>5.3.1 Greedy</title>
<p>The greedy algorithm runs in polynomial time and is thus a natural baseline for our GNN-based approach. For a fair comparison, we run a decentralized greedy algorithm based on the local perception <inline-formula id="inf67">
<mml:math id="m77">
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="script">Z</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> of <italic>D</italic>
<sub>
<italic>i</italic>
</sub>. We enable <italic>K</italic>-hop neighboring communications so that the sensing region of a defender is expanded as if the networking channels of the GNN were active. The defender <italic>D</italic>
<sub>
<italic>i</italic>
</sub> computes the payoff <italic>p</italic>
<sub>
<italic>ij</italic>
</sub> (see <xref ref-type="sec" rid="s3-4">Section 3.4</xref>) for each sensed intruder <italic>A</italic>
<sub>
<italic>j</italic>
</sub> and greedily chooses an assignment that minimizes the payoff <italic>p</italic>
<sub>
<italic>ij</italic>
</sub>.</p>
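The core of this baseline is a one-line minimization over the sensed intruders; the payoff values below are hypothetical (see Section 3.4 for the actual payoff definition).

```python
def greedy_assignment(payoffs):
    """Decentralized greedy baseline: defender D_i picks, among the
    intruders it can sense, the assignment minimizing its payoff p_ij.
    payoffs: dict mapping a sensed intruder id j -> p_ij."""
    return min(payoffs, key=payoffs.get)

# hypothetical payoffs of defender D_i for three sensed intruders
p_i = {2: 0.7, 5: 0.3, 7: 0.9}
chosen = greedy_assignment(p_i)   # intruder 5 has the lowest payoff
```

Because each defender minimizes its own payoff in isolation, two neighboring defenders can select the same intruder, which is exactly the failure mode the GNN's implicit intention-sharing mitigates.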
</sec>
<sec id="s5-3-2">
<title>5.3.2 Random</title>
<p>The random algorithm is similar to the greedy algorithm in that <italic>K</italic>-hop neighboring communications are enabled to expand the perception. Among the sensed intruders, a defender <italic>D</italic>
<sub>
<italic>i</italic>
</sub> randomly picks an intruder to determine the assignment.</p>
</sec>
<sec id="s5-3-3">
<title>5.3.3 MLP</title>
<p>For the MLP algorithm, we train only the MLPs of our proposed framework in isolation, excluding the GNN module. Comparing our GNN framework to this algorithm shows whether the GNN yields any improvement.</p>
</sec>
</sec>
<sec id="s5-4">
<title>5.4 Results</title>
<p>We run the perimeter defense game in various scenarios with different team sizes and initial configurations to evaluate the performance of the learned networks. In particular, we conduct the experiments at small (<italic>N</italic> &#x2264; 10) and large (<italic>N</italic> &#x3e; 10) scales. Snapshots of the simulated perimeter defense game in top view with our proposed networks for different team sizes are shown in <xref ref-type="fig" rid="F5">Figure 5</xref>. The perimeter, defender state, intruder state, and breaching point are marked in green, blue, red, and yellow, respectively. We observe that intruders try to reach the perimeter. Given the defender-intruder matches, the intruders execute their respective optimal strategies to move towards the optimal breaching points (see <xref ref-type="sec" rid="s3-5">Section 3.5</xref>). If an intruder successfully reaches the perimeter without being captured by any defender, the intruder is consumed and leaves a marker labeled &#x201c;Intrusion&#x201d;. If an intruder fails and is intercepted by a defender, both agents are consumed and leave a marker labeled &#x201c;Capture&#x201d;. The points on the perimeter aimed at by the intruders are marked as &#x201c;Breaching point&#x201d;. In all runs, the game ends at <italic>terminal time</italic> <italic>T</italic>
<sub>
<italic>f</italic>
</sub> when all the intruders are consumed. See the supplemental video for more results.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>
<bold>(A&#x2013;C)</bold> Snapshots of simulated perimeter defense in top view using the proposed method <italic>gnn</italic> for three different team sizes.</p>
</caption>
<graphic xlink:href="fcteg-04-1104745-g005.tif"/>
</fig>
<p>As mentioned in <xref ref-type="sec" rid="s5-2">Section 5.2</xref>, we run the five algorithms <italic>expert</italic>, <italic>gnn</italic>, <italic>greedy</italic>, <italic>random</italic>, and <italic>mlp</italic> at small scales, and run <italic>gnn</italic>, <italic>greedy</italic>, <italic>random</italic>, and <italic>mlp</italic> at large scales. As an example, snapshots of a simulated 20 vs. 20 perimeter defense game in top view at terminal time <italic>T</italic>
<sub>
<italic>f</italic>
</sub> using the four algorithms are displayed in <xref ref-type="fig" rid="F6">Figure 6</xref>. The four subfigures (A&#x2013;D) show that these algorithms exhibit different performance even though the game begins with the same initial configuration in all cases. The numbers of captures by <italic>gnn</italic>, <italic>greedy</italic>, <italic>random</italic>, and <italic>mlp</italic> are 12, 11, 10, and 7, respectively.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>
<bold>(A&#x2013;D)</bold> Snapshots of simulated 20 vs. 20 perimeter defense game in top view at terminal time <italic>T</italic>
<sub>
<italic>f</italic>
</sub> using the four algorithms <italic>gnn</italic>, <italic>greedy</italic>, <italic>random</italic>, and <italic>mlp</italic>. The number of captures using these algorithms are 12, 11, 10, and 7, respectively.</p>
</caption>
<graphic xlink:href="fcteg-04-1104745-g006.tif"/>
</fig>
<p>The overall results of the percentage of intruders caught by each of these methods are depicted in <xref ref-type="fig" rid="F7">Figure 7</xref>. It is observed that <italic>gnn</italic> outperforms other baselines in all cases, and performs close to <italic>expert</italic> at the small scales. In particular, given that our default team size <italic>N</italic>
<sub>
<italic>def</italic>
</sub> is 10, the performance of our proposed algorithm stays competitive with that of the expert policy near <italic>N</italic> &#x3d; 10.</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>Percentage of intruders caught (average and standard deviation over 10 trials) by different algorithms on small (<italic>N</italic> &#x2264; 10) and large (<italic>N</italic> &#x3e; 10) scales.</p>
</caption>
<graphic xlink:href="fcteg-04-1104745-g007.tif"/>
</fig>
<p>At large scales, the percentage of captures by <italic>gnn</italic> stays roughly constant, which indicates that the trained network generalizes well to large scales even though the training was performed at a small scale. The percentage of captures by <italic>greedy</italic> also appears constant, but <italic>greedy</italic> performs much worse than <italic>gnn</italic> as the team size grows. At small scales, only a few combinations are available in matching defender-intruder pairs, so the <italic>greedy</italic> algorithm performs similarly to the expert algorithm. As the number of agents increases, the number of possible matchings grows exponentially, and the <italic>greedy</italic> algorithm performs worse as the problem complexity rises. The <italic>random</italic> approach performs worse than all other algorithms at small scales, but <italic>mlp</italic> begins to perform worse than <italic>random</italic> once the team size exceeds 40. This tendency indicates that a policy trained only with an MLP does not scale. Since the training is done with 10 agents, the policy is optimal near <italic>N</italic> &#x3d; 10, but <italic>mlp</italic> fails at larger scales and even performs worse than the <italic>random</italic> algorithm. This confirms that adding the GNN to the MLP significantly improves performance. Overall, relative to the other algorithms, <italic>gnn</italic> performs better at large scales than at small scales, which validates that the GNN helps the network scale.</p>
<p>To quantitatively evaluate the proposed method, we report the <italic>absolute accuracy</italic> and <italic>comparative accuracy</italic> (defined in <xref ref-type="sec" rid="s5-2">Section 5.2</xref>) in <xref ref-type="table" rid="T2">Table 2</xref> and <xref ref-type="table" rid="T3">Table 3</xref>. As expected, the absolute accuracy reaches its maximum as the team size approaches <italic>N</italic> &#x3d; 10. The overall values of the absolute accuracy are fairly consistent except when <italic>N</italic> &#x3d; 2. We conjecture that with only two defenders little information is shared, and, depending on the initial configuration, a defender may sense no intruders at all.</p>
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>Accuracy for small scales.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="center">Team size</th>
<th align="center">2</th>
<th align="center">4</th>
<th align="center">6</th>
<th align="center">8</th>
<th align="center">10</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="center">Absolute accuracy (gnn vs. N)</td>
<td align="center">0.40</td>
<td align="center">0.50</td>
<td align="center">0.53</td>
<td align="center">0.63</td>
<td align="center">0.63</td>
</tr>
<tr>
<td align="center">Comparative accuracy (gnn vs. expert)</td>
<td align="center">0.80</td>
<td align="center">0.87</td>
<td align="center">0.89</td>
<td align="center">0.91</td>
<td align="center">0.95</td>
</tr>
<tr>
<td align="center">Comparative accuracy (gnn vs. greedy)</td>
<td align="center">1.14</td>
<td align="center">1.05</td>
<td align="center">1.14</td>
<td align="center">1.25</td>
<td align="center">1.21</td>
</tr>
<tr>
<td align="center">Comparative accuracy (gnn vs. random)</td>
<td align="center">1.33</td>
<td align="center">1.54</td>
<td align="center">1.88</td>
<td align="center">2.38</td>
<td align="center">1.91</td>
</tr>
<tr>
<td align="center">Comparative accuracy (gnn vs. mlp)</td>
<td align="center">1.14</td>
<td align="center">1.67</td>
<td align="center">1.60</td>
<td align="center">1.72</td>
<td align="center">1.58</td>
</tr>
</tbody>
</table>
</table-wrap>
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>Accuracy for large scales.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="center">Team size</th>
<th align="center">20</th>
<th align="center">40</th>
<th align="center">60</th>
<th align="center">80</th>
<th align="center">100</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="center">Absolute accuracy (gnn vs. N)</td>
<td align="center">0.53</td>
<td align="center">0.59</td>
<td align="center">0.53</td>
<td align="center">0.55</td>
<td align="center">0.54</td>
</tr>
<tr>
<td align="center">Comparative accuracy (gnn vs. greedy)</td>
<td align="center">1.13</td>
<td align="center">1.59</td>
<td align="center">1.42</td>
<td align="center">1.52</td>
<td align="center">1.51</td>
</tr>
<tr>
<td align="center">Comparative accuracy (gnn vs. random)</td>
<td align="center">1.71</td>
<td align="center">1.85</td>
<td align="center">1.63</td>
<td align="center">1.77</td>
<td align="center">1.93</td>
</tr>
<tr>
<td align="center">Comparative accuracy (gnn vs. mlp)</td>
<td align="center">1.20</td>
<td align="center">1.94</td>
<td align="center">2.55</td>
<td align="center">3.20</td>
<td align="center">3.37</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The comparative accuracy between <italic>gnn</italic> and <italic>expert</italic> shows that our trained policy approaches the expert policy as <italic>N</italic> approaches 10, and we expect the performance of <italic>gnn</italic> to remain close to that of <italic>expert</italic> even at large scales. The comparative accuracy between <italic>gnn</italic> and the other baselines shows that our trained networks perform much better than the baseline algorithms at large scales (<italic>N</italic> &#x2265; 40), with an average of 1.5 times more captures. The comparative accuracy between <italic>gnn</italic> and <italic>random</italic> is somewhat noisy across team sizes due to the nature of randomness, but we observe that our policy outperforms the random policy with an average of 1.8 times more captures at both small and large scales. We also observe that <italic>mlp</italic> performs much worse than the other algorithms at large scales.</p>
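<p>The two metrics reported in Tables 2, 3 can be sketched as simple ratios; the function names and the example capture counts below are illustrative, not taken from the paper&#x2019;s code.</p>

```python
def absolute_accuracy(captures: int, num_intruders: int) -> float:
    """Fraction of intruders caught by a policy in one game."""
    return captures / num_intruders

def comparative_accuracy(captures_gnn: int, captures_baseline: int) -> float:
    """Ratio of captures by the gnn policy to captures by a baseline;
    values above 1.0 mean gnn caught more intruders than the baseline."""
    return captures_gnn / captures_baseline

# Hypothetical game at N = 40: gnn catches 24 of 40, greedy catches 15.
print(absolute_accuracy(24, 40))      # 0.6
print(comparative_accuracy(24, 15))   # 1.6
```

<p>Under this reading, an entry such as 1.5 in the <italic>gnn</italic> vs. <italic>greedy</italic> row means the GNN policy caught 1.5 times as many intruders as the greedy baseline at that team size.</p>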
<p>Based on these comparisons, we demonstrate that our proposed networks, trained at a small scale, generalize to large scales. Intuitively, one might expect <italic>greedy</italic> to perform best in a decentralized setting, since each defender does its best to minimize the <italic>value of the game</italic> (defined in Eq. <xref ref-type="disp-formula" rid="e5">5</xref>). However, <italic>greedy</italic> does not know the intentions of nearby defenders (e.g., which intruders they will capture), so it cannot approach the performance of the centralized expert algorithm. Our method implements graph neural networks to exchange information among nearby defenders, each of which perceives its local features, and to plan the final actions of the defender team; implicit information about where nearby defenders are likely to move is thus transmitted to each neighboring defender. Since the centralized expert policy knows all the defenders&#x2019; intentions, our GNN-based policy learns these intentions through the communication channels. This collaboration within the defender team is the key reason our <italic>gnn</italic> outperforms the <italic>greedy</italic> approach. These results validate that the implemented GNNs are well suited to our problem, offering decentralized communication that captures neighboring interactions and transferability that allows generalization to unseen scenarios.</p>
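<p>The neighborhood information exchange described above can be sketched as one aggregation step of a graph neural network. This is a minimal illustration under our own assumptions (weight names, activation, and the sum aggregator are ours), not the paper&#x2019;s implementation.</p>

```python
import numpy as np

def gnn_layer(X, A, W_self, W_neigh):
    """One round of decentralized neighborhood aggregation.

    X: (N, d) local features, one row per defender.
    A: (N, N) communication adjacency matrix (A[i, j] = 1 if i hears j).
    Returns updated (N, d_out) features after exchanging neighbor info.
    """
    # Each defender combines its own features with the sum of its
    # neighbors' features; repeated layers spread implicit intent
    # (likely movements) across the communication graph.
    return np.tanh(X @ W_self + A @ X @ W_neigh)

rng = np.random.default_rng(0)
N, d = 5, 8                        # 5 defenders, 8 local features each
X = rng.normal(size=(N, d))
A = (rng.random((N, N)) < 0.4).astype(float)
np.fill_diagonal(A, 0.0)           # self term is handled by W_self
H = gnn_layer(X, A, rng.normal(size=(d, d)), rng.normal(size=(d, d)))
print(H.shape)  # (5, 8)
```

<p>Because each row of the output depends only on the defender&#x2019;s own features and those of its graph neighbors, the same computation can run locally on every robot, which is the decentralization property the text relies on.</p>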
</sec>
<sec id="s5-5">
<title>5.5 Further analysis</title>
<sec id="s5-5-1">
<title>5.5.1 Performance <italic>vs</italic>. number of expert demonstrations</title>
<p>To analyze the algorithm performance, we trained our GNN-based architecture with different numbers of expert demonstrations (e.g., 10 million, 1 million, 100&#xa0;k, and 10&#xa0;k). The percentage of intruders caught (average and standard deviation over 10 trials) for team sizes 10 &#x2264; <italic>N</italic> &#x2264; 50 is shown in <xref ref-type="fig" rid="F8">Figure 8</xref>. The plot confirms that our proposed network learns better with more demonstrations.</p>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption>
<p>Sample efficiency with different numbers of expert demonstrations.</p>
</caption>
<graphic xlink:href="fcteg-04-1104745-g008.tif"/>
</fig>
</sec>
<sec id="s5-5-2">
<title>5.5.2 Performance <italic>vs</italic>. perimeter radius</title>
<p>We tested the proposed GNN-based method with different perimeter radii. Intuitively, for a fixed number of agents, increasing the radius may eventually cause the defense to fail. We set the default defender team size to 40 and increase the perimeter radius until the percentage of intruders caught converges to zero. As shown in <xref ref-type="fig" rid="F9">Figure 9</xref>, the percentage decreases as the radius grows from 100&#xa0;m to 800&#xa0;m, converging to zero.</p>
<fig id="F9" position="float">
<label>FIGURE 9</label>
<caption>
<p>Percentage of intruders caught with various perimeter radii.</p>
</caption>
<graphic xlink:href="fcteg-04-1104745-g009.tif"/>
</fig>
</sec>
<sec id="s5-5-3">
<title>5.5.3 Performance <italic>vs</italic>. number of intruders sensed</title>
<p>The performance of our GNN-based approach with different numbers of intruders (e.g., <inline-formula id="inf68">
<mml:math id="m78">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula>) sensed is shown in <xref ref-type="fig" rid="F10">Figure 10</xref>. We have run the experiments with <inline-formula id="inf69">
<mml:math id="m79">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> set to 1, 3, 5, and 10, since no ground-truth expert policy is available to generate training data for values larger than 10. We observe that the more intruder features are sensed, the better the performance. Further, the performance gap tends to shrink as the team size grows. For some team sizes (e.g., 40), higher <inline-formula id="inf70">
<mml:math id="m80">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> performs much better, but this is expected based on the initial configuration of the game. For instance, if the initial configuration is very sparse, a defender will benefit from higher <inline-formula id="inf71">
<mml:math id="m81">
<mml:msubsup>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula>, and the percentage of intruders caught will be higher.</p>
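<p>A fixed sensing budget of this kind can be sketched as keeping only the nearest intruders in each defender&#x2019;s local observation; the function name and the nearest-first padding convention below are our own assumptions, not the paper&#x2019;s.</p>

```python
def sensed_intruder_features(defender, intruders, n_f):
    """Return the n_f intruder positions nearest to the defender,
    nearest first -- a stand-in for the sensed feature budget N_A^f."""
    ranked = sorted(intruders,
                    key=lambda p: (p[0] - defender[0]) ** 2
                                  + (p[1] - defender[1]) ** 2)
    return ranked[:n_f]

intruders = [(3.0, 0.0), (1.0, 1.0), (0.0, 2.5)]
print(sensed_intruder_features((0.0, 0.0), intruders, n_f=2))
# [(1.0, 1.0), (0.0, 2.5)]
```

<p>With a sparse initial configuration, raising the budget lets a defender see intruders it would otherwise miss, which is consistent with the benefit of higher sensing budgets noted above.</p>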
<fig id="F10" position="float">
<label>FIGURE 10</label>
<caption>
<p>Percentage of intruders caught with different numbers of intruders sensed.</p>
</caption>
<graphic xlink:href="fcteg-04-1104745-g010.tif"/>
</fig>
</sec>
</sec>
<sec id="s5-6">
<title>5.6 Limitations</title>
<p>As perimeter defense is a relatively new field of research, this work rests on several limiting assumptions. In the problem formulation, we assume the robots are point particles and, accordingly, that optimal trajectories obey first-order dynamics. Preliminary work (<xref ref-type="bibr" rid="B13">Lee et al., 2021</xref>) bridges the gap between point-particle assumptions and three-dimensional robots for one-on-one hemisphere perimeter defense, and we hope to extend its ideas to our multi-agent perimeter defense problem in the future. Another limitation is that no expert policy is available at large scales against which our proposed method can be compared. Running the maximum matching algorithm is very expensive at large scales, so we instead compare our GNN-based algorithm with the other baseline methods. Although the consistent performance of the tested algorithms across scales confirms that our trained networks generalize to large scales, we hope to explore an alternative algorithm that can replace maximum matching as the expert policy at large scales. One candidate is reinforcement learning, which would make expert performance at large scales available.</p>
</sec>
</sec>
<sec sec-type="conclusion" id="s6">
<title>6 Conclusion</title>
<p>This paper proposes a novel framework that employs graph neural networks to solve the decentralized multi-agent perimeter defense problem. Our learning framework takes the defenders&#x2019; local perceptions and the communication graph as inputs and returns actions that maximize the number of captures for the defender team. We train deep networks supervised by an expert policy based on the maximum matching algorithm. To validate the proposed method, we run the perimeter defense game at different team sizes using five algorithms: <italic>expert</italic>, <italic>gnn</italic>, <italic>greedy</italic>, <italic>random</italic>, and <italic>mlp</italic>. We demonstrate that our GNN-based policy stays close to the expert policy at small scales and that the trained networks generalize to large scales.</p>
<p>One direction for future work is to implement vision-based local sensing for the perception module, which would relax the assumption of perfect state estimation. Realizing multi-agent perimeter defense with vision-based perception and communication among the defenders is an end goal. Another future research direction is to find a centralized expert policy for multi-robot systems by utilizing reinforcement learning.</p>
</sec>
</body>
<back>
<sec sec-type="data-availability" id="s7">
<title>Data availability statement</title>
<p>The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.</p>
</sec>
<sec id="s8">
<title>Author contributions</title>
<p>EL, LZ, and VK contributed to conception and design of the study. EL and AR performed the statistical analysis. EL wrote the first draft of the manuscript. All authors contributed to manuscript revision, read and approved the submitted version.</p>
</sec>
<ack>
<p>We gratefully acknowledge the support from ARL DCIST CRA under Grant W911NF-17-2-0181, NSF under Grants CCR-2112665, CNS-1446592, and EEC-1941529, ONR under Grants N00014-20-1-2822 and N00014-20-S-B001, Qualcomm Research, NVIDIA, Lockheed Martin, and C-BRIC, a Semiconductor Research Corporation Joint University Microelectronics Program cosponsored by DARPA.</p>
</ack>
<sec sec-type="COI-statement" id="s9">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s10">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Bajaj</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Torng</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Bopardikar</surname>
<given-names>S. D.</given-names>
</name>
<name>
<surname>Von Moll</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Weintraub</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Garcia</surname>
<given-names>E.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). <source>Competitive perimeter defense of conical environments</source>. <comment>arXiv preprint arXiv:2110.04667</comment>.</citation>
</ref>
<ref id="B2">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Baxter</surname>
<given-names>J. L.</given-names>
</name>
<name>
<surname>Burke</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Garibaldi</surname>
<given-names>J. M.</given-names>
</name>
<name>
<surname>Norman</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2007</year>). &#x201c;<article-title>Multi-robot search and rescue: A potential field based approach</article-title>,&#x201d; in <source>Autonomous robots and agents</source> (<publisher-name>Springer</publisher-name>), <fpage>9</fpage>&#x2013;<lpage>16</lpage>.</citation>
</ref>
<ref id="B3">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>A. K.</given-names>
</name>
<name>
<surname>Macharet</surname>
<given-names>D. G.</given-names>
</name>
<name>
<surname>Shishika</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Pappas</surname>
<given-names>G. J.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2021</year>). &#x201c;<article-title>Optimal multi-robot perimeter defense using flow networks</article-title>,&#x201d; in <source>International symposium distributed autonomous robotic systems</source> (<publisher-name>Springer</publisher-name>), <fpage>282</fpage>&#x2013;<lpage>293</lpage>.</citation>
</ref>
<ref id="B4">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Tomlin</surname>
<given-names>C. J.</given-names>
</name>
</person-group> (<year>2014</year>). &#x201c;<article-title>Multiplayer reach-avoid games via low dimensional solutions and maximum matching</article-title>,&#x201d; in <conf-name>2014 American control conference</conf-name> (<publisher-name>IEEE</publisher-name>), <fpage>1444</fpage>&#x2013;<lpage>1449</lpage>.</citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Chen</surname>
<given-names>S. W.</given-names>
</name>
<name>
<surname>Nardari</surname>
<given-names>G. V.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>E. S.</given-names>
</name>
<name>
<surname>Qu</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Romero</surname>
<given-names>R. A. F.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Sloam: Semantic lidar odometry and mapping for forest inventory</article-title>. <source>IEEE Robotics Automation Lett.</source> <volume>5</volume>, <fpage>612</fpage>&#x2013;<lpage>619</lpage>. <pub-id pub-id-type="doi">10.1109/lra.2019.2963823</pub-id>
</citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gama</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Marques</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Leus</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Ribeiro</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2019</year>). <article-title>Convolutional neural network architectures for signals supported on graphs</article-title>. <source>IEEE Trans. Signal Process.</source> <volume>67</volume>, <fpage>1034</fpage>&#x2013;<lpage>1049</lpage>. <pub-id pub-id-type="doi">10.1109/tsp.2018.2887403</pub-id>
</citation>
</ref>
<ref id="B7">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Ge</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Radhakrishnan</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Loianno</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2022</year>). <source>Vision-based relative detection and tracking for teams of micro aerial vehicles</source>. <comment>
<italic>arXiv preprint arXiv:2207.08301</italic>
</comment>.</citation>
</ref>
<ref id="B8">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Hsu</surname>
<given-names>C. D.</given-names>
</name>
<name>
<surname>Haile</surname>
<given-names>M. A.</given-names>
</name>
<name>
<surname>Chaudhari</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2022</year>). <source>A model for perimeter-defense problems with heterogeneous teams</source>. <comment>
<italic>arXiv preprint arXiv:2208.01430</italic>
</comment>.</citation>
</ref>
<ref id="B9">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Kim</surname>
<given-names>D. K.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Riemer</surname>
<given-names>M. D.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Abdulhai</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Habibi</surname>
<given-names>G.</given-names>
</name>
<etal/>
</person-group> (<year>2021</year>). &#x201c;<article-title>A policy gradient algorithm for learning to learn in multiagent reinforcement learning</article-title>,&#x201d; in <conf-name>International Conference on Machine Learning (PMLR)</conf-name>, <fpage>5541</fpage>&#x2013;<lpage>5550</lpage>.</citation>
</ref>
<ref id="B10">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>E. S.</given-names>
</name>
<name>
<surname>Loianno</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Jayaraman</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2022a</year>). <source>Vision-based perimeter defense via multiview pose estimation</source>. <comment>
<italic>arXiv preprint arXiv:2209.12136</italic>
</comment>.</citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>E. S.</given-names>
</name>
<name>
<surname>Loianno</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Thakur</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2020a</year>). <article-title>Experimental evaluation and characterization of radioactive source effects on robot visual localization and mapping</article-title>. <source>IEEE Robotics Automation Lett.</source> <volume>5</volume>, <fpage>3259</fpage>&#x2013;<lpage>3266</lpage>. <pub-id pub-id-type="doi">10.1109/lra.2020.2975723</pub-id>
</citation>
</ref>
<ref id="B12">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>E. S.</given-names>
</name>
<name>
<surname>Shishika</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2020b</year>). &#x201c;<article-title>Perimeter-defense game between aerial defender and ground intruder</article-title>,&#x201d; in <conf-name>2020 59th IEEE Conference on Decision and Control (CDC)</conf-name> (<publisher-name>IEEE</publisher-name>), <fpage>1530</fpage>&#x2013;<lpage>1536</lpage>.</citation>
</ref>
<ref id="B13">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>E. S.</given-names>
</name>
<name>
<surname>Shishika</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Loianno</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2021</year>). &#x201c;<article-title>Defending a perimeter from a ground intruder using an aerial defender: Theory and practice</article-title>,&#x201d; in <source>2021 IEEE international symposium on safety, security, and rescue robotics (SSRR)</source> (<publisher-name>IEEE</publisher-name>), <fpage>184</fpage>&#x2013;<lpage>189</lpage>.</citation>
</ref>
<ref id="B14">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>E. S.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Ribeiro</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2022b</year>). <source>Learning decentralized strategies for a perimeter defense game with graph neural networks</source>. <comment>
<italic>arXiv preprint arXiv:2211.01757</italic>
</comment>.</citation>
</ref>
<ref id="B15">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Har</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Kum</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2016</year>). &#x201c;<article-title>Drone-assisted disaster management: Finding victims via infrared camera and lidar sensor fusion</article-title>,&#x201d; in <source>2016 3rd asia-pacific world congress on computer science and engineering (APWC on CSE)</source> (<publisher-name>IEEE</publisher-name>), <fpage>84</fpage>&#x2013;<lpage>89</lpage>.</citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lee</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Bakolas</surname>
<given-names>E.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Guarding a convex target set from an attacker in Euclidean spaces</article-title>. <source>IEEE Control Syst. Lett.</source> <volume>6</volume>, <fpage>1706</fpage>&#x2013;<lpage>1711</lpage>. <pub-id pub-id-type="doi">10.1109/lcsys.2021.3132083</pub-id>
</citation>
</ref>
<ref id="B17">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Gama</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Ribeiro</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Prorok</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2020</year>). &#x201c;<article-title>Graph neural networks for decentralized multi-robot path planning</article-title>,&#x201d; in <conf-name>2020 IEEE/RSJ Intl Conference on Intelligent Robots and Systems (IROS)</conf-name> (<publisher-name>IEEE</publisher-name>).</citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Li</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Lin</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Prorok</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Message-aware graph attention networks for large-scale multi-robot path planning</article-title>. <source>IEEE Robotics Automation Lett.</source> <volume>6</volume>, <fpage>5533</fpage>&#x2013;<lpage>5540</lpage>. <pub-id pub-id-type="doi">10.1109/lra.2021.3077863</pub-id>
</citation>
</ref>
<ref id="B19">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Prabhu</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Cladera</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>I. D.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Taylor</surname>
<given-names>C. J.</given-names>
</name>
<etal/>
</person-group> (<year>2022</year>). <source>Active metric-semantic mapping by multiple aerial robots</source>. <comment>arXiv preprint arXiv:2209.08465</comment>.</citation>
</ref>
<ref id="B20">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Macharet</surname>
<given-names>D. G.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>A. K.</given-names>
</name>
<name>
<surname>Shishika</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Pappas</surname>
<given-names>G. J.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2020</year>). &#x201c;<article-title>Adaptive partitioning for coordinated multi-agent perimeter defense</article-title>,&#x201d; in <conf-name>IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)</conf-name>.</citation>
</ref>
<ref id="B21">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Mellinger</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2011</year>). &#x201c;<article-title>Minimum snap trajectory generation and control for quadrotors</article-title>,&#x201d; in <conf-name>2011 IEEE international conference on robotics and automation</conf-name> (<publisher-name>IEEE</publisher-name>), <fpage>2520</fpage>&#x2013;<lpage>2525</lpage>.</citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Miller</surname>
<given-names>I. D.</given-names>
</name>
<name>
<surname>Cladera</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Cowley</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Shivakumar</surname>
<given-names>S. S.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>E. S.</given-names>
</name>
<name>
<surname>Jarin-Lipschitz</surname>
<given-names>L.</given-names>
</name>
<etal/>
</person-group> (<year>2020</year>). <article-title>Mine tunnel exploration using multiple quadrupedal robots</article-title>. <source>IEEE Robotics Automation Lett.</source> <volume>5</volume>, <fpage>2840</fpage>&#x2013;<lpage>2847</lpage>. <pub-id pub-id-type="doi">10.1109/lra.2020.2972872</pub-id>
</citation>
</ref>
<ref id="B23">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Mox</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Calvo-Fullana</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Gerasimenko</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Fink</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Ribeiro</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2020</year>). &#x201c;<article-title>Mobile wireless network infrastructure on demand</article-title>,&#x201d; in <conf-name>2020 IEEE International Conference on Robotics and Automation (ICRA)</conf-name> (<publisher-name>IEEE</publisher-name>), <fpage>7726</fpage>&#x2013;<lpage>7732</lpage>.</citation>
</ref>
<ref id="B24">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Ng</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Kennedy</surname>
<given-names>M.</given-names>
<suffix>III</suffix>
</name>
</person-group> (<year>2022</year>). <source>It takes two: Learning to plan for human-robot cooperative carrying</source>. <comment>
<italic>arXiv preprint arXiv:2209.12890</italic>
</comment>.</citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Nguyen</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Shivakumar</surname>
<given-names>S. S.</given-names>
</name>
<name>
<surname>Miller</surname>
<given-names>I. D.</given-names>
</name>
<name>
<surname>Keller</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>E. S.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>A.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Mavnet: An effective semantic segmentation micro-network for mav-based tasks</article-title>. <source>IEEE Robotics Automation Lett.</source> <volume>4</volume>, <fpage>3908</fpage>&#x2013;<lpage>3915</lpage>. <pub-id pub-id-type="doi">10.1109/lra.2019.2928734</pub-id>
</citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Paszke</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Gross</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Massa</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Lerer</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Bradbury</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Chanan</surname>
<given-names>G.</given-names>
</name>
<etal/>
</person-group> (<year>2019</year>). <article-title>Pytorch: An imperative style, high-performance deep learning library</article-title>. <source>Adv. neural Inf. Process. Syst.</source> <volume>32</volume>.</citation>
</ref>
<ref id="B27">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Paulos</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>S. W.</given-names>
</name>
<name>
<surname>Shishika</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2019</year>). &#x201c;<article-title>Decentralization of multiagent policies by learning what to communicate</article-title>,&#x201d; in <conf-name>2019 International Conference on Robotics and Automation (ICRA)</conf-name> (<publisher-name>IEEE</publisher-name>), <fpage>7990</fpage>&#x2013;<lpage>7996</lpage>.</citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ruiz</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Gama</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Ribeiro</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Graph neural networks: Architectures, stability, and transferability</article-title>. <source>Proc. IEEE</source> <volume>109</volume>, <fpage>660</fpage>&#x2013;<lpage>682</lpage>. <pub-id pub-id-type="doi">10.1109/jproc.2021.3055400</pub-id>
</citation>
</ref>
<ref id="B29">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Sharma</surname>
<given-names>V. D.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Tokekar</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2022</year>). <source>D2coplan: A differentiable decentralized planner for multi-robot coverage</source>. <comment>
<italic>arXiv preprint arXiv:2209.09292</italic>
</comment>.</citation>
</ref>
<ref id="B30">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Shishika</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2020</year>). &#x201c;<article-title>A review of multi agent perimeter defense games</article-title>,&#x201d; in <conf-name>International Conference on Decision and Game Theory for Security</conf-name> (<publisher-name>Springer</publisher-name>), <fpage>472</fpage>&#x2013;<lpage>485</lpage>.</citation>
</ref>
<ref id="B31">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Shishika</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2018</year>). &#x201c;<article-title>Local-game decomposition for multiplayer perimeter-defense problem</article-title>,&#x201d; in <conf-name>In 2018 IEEE Conference on Decision and Control (CDC)</conf-name> (<publisher-name>IEEE</publisher-name>), <fpage>2093</fpage>&#x2013;<lpage>2100</lpage>.</citation>
</ref>
<ref id="B32">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Thrun</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Burgard</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Fox</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2000</year>). &#x201c;<article-title>A real-time algorithm for mobile robot mapping with applications to multi-robot and 3d mapping</article-title>,&#x201d; in <conf-name>Proceedings 2000 ICRA. Millennium Conference. IEEE International Conference on Robotics and Automation. Symposia Proceedings</conf-name> (<publisher-name>IEEE</publisher-name>), <fpage>321</fpage>&#x2013;<lpage>328</lpage>. <comment>(Cat. No. 00CH37065)</comment>.</citation>
</ref>
<ref id="B33">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Tolstaya</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Gama</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Paulos</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Pappas</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>V.</given-names>
</name>
<name>
<surname>Ribeiro</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2019</year>). &#x201c;<article-title>Learning decentralized controllers for robot swarms with graph neural networks</article-title>,&#x201d; in <conf-name>Conference on Robot Learning 2019</conf-name> (<publisher-loc>Osaka, Japan</publisher-loc>: <publisher-name>Int. Found. Robotics Res.</publisher-name>).</citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Velhal</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Sundaram</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Sundararajan</surname>
<given-names>N.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>A decentralized multirobot spatiotemporal multitask assignment approach for perimeter defense</article-title>. <source>IEEE Trans. Robotics</source> <volume>38</volume>, <fpage>3085</fpage>&#x2013;<lpage>3096</lpage>. <pub-id pub-id-type="doi">10.1109/tro.2022.3158198</pub-id>
</citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Gombolay</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Learning scheduling policies for multi-robot coordination with graph attention networks</article-title>. <source>IEEE Robotics Automation Lett.</source> <volume>5</volume>, <fpage>4509</fpage>&#x2013;<lpage>4516</lpage>. <pub-id pub-id-type="doi">10.1109/lra.2020.3002198</pub-id>
</citation>
</ref>
<ref id="B36">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Xu</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>D&#x2019;Antonio</surname>
<given-names>D. S.</given-names>
</name>
<name>
<surname>Salda&#xf1;a</surname>
<given-names>D.</given-names>
</name>
</person-group> (<year>2022</year>). <source>Modular multi-rotors: From quadrotors to fully-actuated aerial vehicles</source>. <comment>
<italic>arXiv preprint arXiv:2202.00788</italic>
</comment>.</citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yan</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Duan</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhong</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Bullo</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>Matching-based capture strategies for 3d heterogeneous multiplayer reach-avoid differential games</article-title>. <source>Automatica</source> <volume>140</volume>, <fpage>110207</fpage>. <pub-id pub-id-type="doi">10.1016/j.automatica.2022.110207</pub-id>
</citation>
</ref>
<ref id="B38">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Yan</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhong</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2019</year>). &#x201c;<article-title>Construction of the barrier for reach-avoid differential games in three-dimensional space with four equal-speed players</article-title>,&#x201d; in <conf-name>2019 IEEE 58th Conference on Decision and Control (CDC)</conf-name> (<publisher-name>IEEE</publisher-name>), <fpage>4067</fpage>&#x2013;<lpage>4072</lpage>.</citation>
</ref>
<ref id="B39">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yan</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Shi</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Zhong</surname>
<given-names>Y.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Guarding a subspace in high-dimensional space with two defenders and one attacker</article-title>. <source>IEEE Trans. Cybern.</source> <volume>52</volume>, <fpage>3998</fpage>&#x2013;<lpage>4011</lpage>. <pub-id pub-id-type="doi">10.1109/tcyb.2020.3015031</pub-id>
</citation>
</ref>
<ref id="B40">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Sharma</surname>
<given-names>V. D.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Prorok</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Ribeiro</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Kumar</surname>
<given-names>V.</given-names>
</name>
</person-group> (<year>2021</year>). <source>Graph neural networks for decentralized multi-robot submodular action selection</source>. <comment>
<italic>arXiv preprint arXiv:2105.08601</italic>
</comment>.</citation>
</ref>
</ref-list>
</back>
</article>