<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Phys.</journal-id>
<journal-title>Frontiers in Physics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Phys.</abbrev-journal-title>
<issn pub-type="epub">2296-424X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">870273</article-id>
<article-id pub-id-type="doi">10.3389/fphy.2022.870273</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Physics</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Point-to-Point Navigation of a Fish-Like Swimmer in a Vortical Flow With Deep Reinforcement Learning</article-title>
<alt-title alt-title-type="left-running-head">Zhu et al.</alt-title>
<alt-title alt-title-type="right-running-head">Navigation of a Fish-Like Swimmer</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Zhu</surname>
<given-names>Yi</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1681450/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Pang</surname>
<given-names>Jian-Hua</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1744535/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Tian</surname>
<given-names>Fang-Bao</given-names>
</name>
<xref ref-type="aff" rid="aff3">
<sup>3</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1197611/overview"/>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Ocean Intelligence Technology Center</institution>, <institution>Shenzhen Institute of Guangdong Ocean University</institution>, <addr-line>Shenzhen</addr-line>, <country>China</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>College of Ocean Engineering</institution>, <institution>Guangdong Ocean University</institution>, <addr-line>Zhanjiang</addr-line>, <country>China</country>
</aff>
<aff id="aff3">
<sup>3</sup>
<institution>School of Engineering and Information Technology</institution>, <institution>University of New South Wales</institution>, <addr-line>Canberra</addr-line>, <addr-line>ACT</addr-line>, <country>Australia</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1320183/overview">Haibo Huang</ext-link>, University of Science and Technology of China, China</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1673038/overview">Chengwen Zhong</ext-link>, Northwestern Polytechnical University, China</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/904797/overview">Charles Reichhardt</ext-link>, Los Alamos National Laboratory (DOE), United States</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Jian-Hua Pang, <email>pangjianhua@gdou.edu.cn</email>; Fang-Bao Tian, <email>f.tian@adfa.edu.au</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Biophysics, a section of the journal Frontiers in Physics</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>09</day>
<month>05</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>10</volume>
<elocation-id>870273</elocation-id>
<history>
<date date-type="received">
<day>06</day>
<month>02</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>07</day>
<month>03</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2022 Zhu, Pang and Tian.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Zhu, Pang and Tian</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>Efficient navigation in complex flows is of crucial importance for robotic applications. This work presents a numerical study of the point-to-point navigation of a fish-like swimmer in a time-varying vortical flow using a hybrid method that couples deep reinforcement learning (DRL) with the immersed boundary&#x2013;lattice Boltzmann method (IB-LBM). The vortical flow is generated by placing four stationary cylinders in a uniform flow. The swimmer is trained to discover effective navigation strategies that help it reach a given destination point in the flow field, using only time-sequential information on its position, orientation, velocity and angular velocity. After training, the fish can reach its destination from random initial positions and orientations, demonstrating the effectiveness and robustness of the method. A detailed analysis shows that the fish uses highly subtle tail flapping to control its swimming orientation and takes advantage of the region of reduced streamwise flow to reach its destination, while avoiding the region of high flow velocity.</p>
</abstract>
<kwd-group>
<kwd>vortical flow</kwd>
<kwd>immersed boundary-lattice Boltzmann method</kwd>
<kwd>deep reinforcement learning</kwd>
<kwd>point-to-point navigation</kwd>
<kwd>robotic fish</kwd>
<kwd>target-directed swimming</kwd>
<kwd>fish swimming</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>Finding the time-optimal path between two given points in a complex flow is known as Zermelo&#x2019;s navigation problem [<xref ref-type="bibr" rid="B1">1</xref>]. This problem is a key issue for many robotic and engineering applications, including micro-swimmers [<xref ref-type="bibr" rid="B2">2</xref>,<xref ref-type="bibr" rid="B3">3</xref>], fish-like underwater vehicles [<xref ref-type="bibr" rid="B4">4</xref>], unmanned drones [<xref ref-type="bibr" rid="B5">5</xref>], and weather balloons [<xref ref-type="bibr" rid="B6">6</xref>]. In realistic environments, structures interact with disturbances such as wind, waves and currents, generating abundant vortices that can significantly affect the operation of these robots [<xref ref-type="bibr" rid="B7">7</xref>] and render predefined control algorithms ineffective. In this work, we tackle Zermelo&#x2019;s problem for the point-to-point navigation of a fish-like swimmer in a vortical flow environment. Typical application scenarios include oceanic supervision [<xref ref-type="bibr" rid="B8">8</xref>], fishery conservation, and intervention on offshore structures [<xref ref-type="bibr" rid="B9">9</xref>].</p>
<p>Naive control strategies are usually ineffective or inefficient in vortical environments [<xref ref-type="bibr" rid="B10">10</xref>], since vortices can easily push vehicles away from their desired path [<xref ref-type="bibr" rid="B11">11</xref>]. Numerous methods have been proposed to design a customized optimal path for a given environment, ranging from classical optimal control theory [<xref ref-type="bibr" rid="B12">12</xref>] to modern optimization approaches [<xref ref-type="bibr" rid="B13">13</xref>,<xref ref-type="bibr" rid="B14">14</xref>]. An important feature of these methods is that they require knowledge of the dynamics of the background flow [<xref ref-type="bibr" rid="B15">15</xref>]. However, in real-world applications, it is impractical to measure the entire flow environment in advance, as ocean and air currents are too variable to be fully measured [<xref ref-type="bibr" rid="B15">15</xref>]. In addition, the vehicles themselves can significantly alter the surrounding flow field, making it even less predictable [<xref ref-type="bibr" rid="B15">15</xref>].</p>
<p>Reinforcement learning (RL) offers a promising alternative for solving Zermelo&#x2019;s navigation problem in complex time-varying environments. Compared with classical methods, RL has two main advantages. First, it does not require any prior knowledge of the environment [<xref ref-type="bibr" rid="B16">16</xref>]; instead, it develops an understanding of the environment dynamics through trial and error. Second, the influence of historical states can easily be taken into account [<xref ref-type="bibr" rid="B17">17</xref>]. The correlation between an action and its effect can therefore be accurately captured even when there is a delay between them or when earlier actions have a measurable impact. Colabrese et al. [<xref ref-type="bibr" rid="B3">3</xref>] first demonstrated that reinforcement learning is an efficient way to address Zermelo&#x2019;s problem. They used it to train a point-like swimmer in an Arnold-Beltrami-Childress (ABC) flow to navigate vertically as quickly as possible. The swimmer was assumed to swim at constant speed, and its direction was determined by the combined effect of a shear-induced viscous torque and a torque applied by the swimmer to orient itself toward a desired direction. The applied torque was chosen based on measurements of the swimmer&#x2019;s instantaneous swimming direction and the local flow vorticity. The authors found that the smart swimmer can take advantage of upwelling flows to accelerate upward navigation and avoid being trapped in vortices. This work motivated a series of studies investigating point-to-point navigation in different flows and with different actions [<xref ref-type="bibr" rid="B7">7</xref>,<xref ref-type="bibr" rid="B10">10</xref>,<xref ref-type="bibr" rid="B15">15</xref>,<xref ref-type="bibr" rid="B18">18</xref>&#x2013;<xref ref-type="bibr" rid="B26">26</xref>].</p>
<p>The above studies demonstrated the potential of reinforcement learning for solving navigation problems in complex flows. However, they rely on several simplifications that ease comparison with traditional control methods. First, most of these studies adopted simplified flow models to avoid the actual complexity and unpredictability of a time-varying fluid flow. Second, idealized models of the swimmers and their actions were used. In most studies, the swimmer is treated as an infinitely small point that has negligible influence on the background flow. Moreover, the propulsion mechanisms of those swimmers are not modeled. Instead, it is assumed that the swimmers have full control of their own velocities. Those assumptions neglect the complex interaction between the swimmers and the environmental flows, such as time delays between sensing, actions and rewards. In this work, we investigate the point-to-point navigation of a fish-like swimmer in a vortical flow using a hybrid method that couples deep reinforcement learning (DRL) with the immersed boundary&#x2013;lattice Boltzmann method (IB-LBM). Compared with previous works, the present work uses a full model of both the flow and the swimmer. Specifically, the vortical flow is numerically generated with the IB-LBM by placing four cylinders in a uniform flow, and the fish-like swimmer propels itself by periodically undulating its body to push the surrounding fluid backward. This setup retains the complex nonlinear interaction between the swimmer and the flow.</p>
<p>The rest of the paper is organized as follows. The numerical methods are briefly introduced in <xref ref-type="sec" rid="s2">Section 2</xref>. The simulation results are discussed in <xref ref-type="sec" rid="s3">Section 3</xref>. Conclusions are provided in <xref ref-type="sec" rid="s4">Section 4</xref>.</p>
</sec>
<sec id="s2">
<title>2 Methodology</title>
<p>The methodology used here is almost the same as that in our previous work [<xref ref-type="bibr" rid="B27">27</xref>]. We briefly describe it here for completeness. More details of the method and its validation can be found in that work.</p>
<sec id="s2-1">
<title>2.1 Kinematic Model of the Fish</title>
<p>The half thickness of the body is mathematically approximated by<disp-formula id="e1">
<mml:math id="m1">
<mml:mfrac>
<mml:mrow>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0.2610</mml:mn>
<mml:msqrt>
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:msqrt>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>0.3112</mml:mn>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2b;</mml:mo>
<mml:mn>0.1371</mml:mn>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>0.0791</mml:mn>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>0.0078</mml:mn>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:mi>l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>4</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
</mml:math>
<label>(1)</label>
</disp-formula>where <italic>l</italic> is the arc length along the mid-line of the body, and <italic>L</italic> is the body length, which remains constant during swimming [<xref ref-type="bibr" rid="B28">28</xref>].</p>
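<p>As a minimal sketch of Eq. (1), the half thickness can be evaluated with a short Python function. The names <monospace>half_thickness</monospace> and <monospace>COEFFS</monospace> and the default <monospace>L = 1.0</monospace> are illustrative assumptions and are not taken from the authors&#x2019; solver.</p>
<preformat>
```python
import math

# Coefficients of Eq. (1), ordered as the terms sqrt(s), s, s^2, s^3, s^4,
# where s = l / L is the normalized arc length.
COEFFS = (0.2610, -0.3112, 0.1371, -0.0791, -0.0078)


def half_thickness(l, L=1.0):
    """Return the half thickness d at arc length l along the mid-line (Eq. 1)."""
    s = l / L
    if s > 1.0 or 0.0 > s:
        raise ValueError("arc length must lie between 0 and L")
    a, b, c, e, g = COEFFS
    return L * (a * math.sqrt(s) + b * s + c * s**2 + e * s**3 + g * s**4)


if __name__ == "__main__":
    for s in (0.0, 0.25, 0.5, 0.75, 1.0):
        print(f"l/L = {s:.2f}  d/L = {half_thickness(s):.4f}")
```
</preformat>
<p>With these coefficients the half thickness vanishes at the head (<italic>l</italic> &#x3d; 0) and, to within the rounding of the coefficients, at the tail tip (<italic>l</italic> &#x3d; <italic>L</italic>).</p>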
<p>The motion of the fish body is composed of the translation of the mass center, the body rotation around the mass center, and the body undulation in the local coordinate system (<xref ref-type="fig" rid="F1">Figure 1</xref>). The translational and rotational motions of the fish are determined by the fluid&#x2013;structure interaction (FSI) in the global coordinate system according to Newton&#x2019;s laws of motion. The FSI equations are solved by an explicit FSI coupling method as in Refs. [<xref ref-type="bibr" rid="B27">27</xref>,<xref ref-type="bibr" rid="B30">30</xref>]. The undulatory motion is controlled by the fish itself and can be regarded as the superposition of different waves propagating from head to tail. A polynomial-based waveform is adopted for each wave, and the kinematics of the newly generated wave can be changed every half cycle. In the <italic>n</italic>th half cycle, the deflection angle and lateral displacement of the mid-line are determined by<disp-formula id="e2">
<mml:math id="m2">
<mml:msub>
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mi>h</mml:mi>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mfrac>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3bb;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfrac>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2212;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi>l</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>L</mml:mi>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>&#x222b;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mi>sin</mml:mi>
<mml:mo>&#x2061;</mml:mo>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>l</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mi>d</mml:mi>
<mml:mi>l</mml:mi>
<mml:mo>,</mml:mo>
</mml:math>
<label>(2)</label>
</disp-formula>where <italic>&#x3b8;</italic>
<sub>
<italic>l</italic>
</sub> is the deflection angle of the mid-line with respect to axis <italic>x</italic>
<sub>
<italic>l</italic>
</sub> as shown in <xref ref-type="fig" rid="F1">Figure 1</xref>, <italic>&#x3bb;</italic>
<sub>
<italic>n</italic>
</sub> is the wavelength, <italic>T</italic>
<sub>
<italic>n</italic>
</sub> is the period, <italic>t</italic> is the time, <italic>t</italic>
<sub>0<italic>n</italic>
</sub> &#x3d; 0 for <italic>n</italic> &#x3d; 1 and <inline-formula id="inf1">
<mml:math id="m3">
<mml:msubsup>
<mml:mrow>
<mml:mo movablelimits="false" form="prefix">&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:msub>
<mml:mrow>
<mml:mi>T</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> for <italic>n</italic> &#x3e; 1, and <italic>h</italic> is the waveform function described by<disp-formula id="e3">
<mml:math id="m4">
<mml:mi>h</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>&#x3b6;</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>c</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>c</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mi>&#x3b6;</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>c</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:msup>
<mml:mrow>
<mml:mi>&#x3b6;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>c</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
</mml:mrow>
</mml:msub>
<mml:msup>
<mml:mrow>
<mml:mi>&#x3b6;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>3</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>c</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>4</mml:mn>
</mml:mrow>
</mml:msub>
<mml:msup>
<mml:mrow>
<mml:mi>&#x3b6;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>4</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>c</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>5</mml:mn>
</mml:mrow>
</mml:msub>
<mml:msup>
<mml:mrow>
<mml:mi>&#x3b6;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>5</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>,</mml:mo>
</mml:math>
<label>(3)</label>
</disp-formula>where <italic>c</italic>
<sub>0&#x2212;5</sub> can be determined by <inline-formula id="inf2">
<mml:math id="m5">
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula>, <inline-formula id="inf3">
<mml:math id="m6">
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3bb;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula>, <italic>h</italic>&#x2032;(0) &#x3d; <italic>h</italic>&#x2032;(<italic>&#x3bb;</italic>
<sub>
<italic>n</italic>
</sub>/2) &#x3d; 0, <inline-formula id="inf4">
<mml:math id="m7">
<mml:msup>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2032;</mml:mo>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>&#x3c0;</mml:mi>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3bb;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula>, and <inline-formula id="inf5">
<mml:math id="m8">
<mml:msup>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mo>&#x2032;</mml:mo>
<mml:mo>&#x2032;</mml:mo>
</mml:mrow>
</mml:msup>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3bb;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>h</mml:mi>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3bb;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mn>2</mml:mn>
<mml:mi>&#x3c0;</mml:mi>
<mml:mo>/</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3bb;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:math>
</inline-formula>. <inline-formula id="inf6">
<mml:math id="m9">
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>l</mml:mi>
<mml:mi>m</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> is the maximum deflection angle at the tail tip of the <italic>n</italic>th half wave.</p>
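<p>The six boundary conditions above fix the six coefficients of Eq. (3) uniquely. The following Python sketch solves for them in closed form, under the assumption that the maximum deflection angles are passed in as signed values; the function names <monospace>waveform_coeffs</monospace> and <monospace>h</monospace> and the argument names are hypothetical and do not come from the authors&#x2019; code.</p>
<preformat>
```python
import math


def waveform_coeffs(theta_prev, theta_next, lam_prev, lam):
    """Coefficients c0..c5 of the quintic h(zeta) in Eq. (3).

    Boundary conditions, as stated in the text:
      h(0) = theta_prev,   h(lam/2) = theta_next,
      h'(0) = h'(lam/2) = 0,
      h''(0)     = -theta_prev * (2*pi/lam_prev)**2,
      h''(lam/2) = -theta_next * (2*pi/lam)**2.
    """
    z1 = 0.5 * lam                                   # end of the half wave
    k0 = -theta_prev * (2.0 * math.pi / lam_prev) ** 2
    k1 = -theta_next * (2.0 * math.pi / lam) ** 2
    dh = theta_next - theta_prev
    # The conditions at zeta = 0 give the three lowest coefficients directly.
    c0, c1, c2 = theta_prev, 0.0, 0.5 * k0
    # The conditions at zeta = z1 give a 3x3 linear system, solved here
    # in closed form (h' vanishes at both ends, so no velocity terms appear).
    c3 = (20.0 * dh - (3.0 * k0 - k1) * z1**2) / (2.0 * z1**3)
    c4 = (-30.0 * dh + (3.0 * k0 - 2.0 * k1) * z1**2) / (2.0 * z1**4)
    c5 = (12.0 * dh - (k0 - k1) * z1**2) / (2.0 * z1**5)
    return (c0, c1, c2, c3, c4, c5)


def h(zeta, c):
    """Evaluate the quintic waveform with Horner's rule."""
    value = 0.0
    for ck in reversed(c):
        value = value * zeta + ck
    return value


if __name__ == "__main__":
    c = waveform_coeffs(0.3, -0.3, 1.0, 1.0)
    print("h(0) =", h(0.0, c), " h(lam/2) =", h(0.5, c))
```
</preformat>
<p>Matching the curvature at both ends to that of a sinusoid of the corresponding wavelength is what allows the amplitude and wavelength to change from one half cycle to the next without a jump in the deflection angle or its first two derivatives.</p>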
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>A schematic illustration of the motion of the fish (Adapted from Ref. [<xref ref-type="bibr" rid="B27">27</xref>,<xref ref-type="bibr" rid="B29">29</xref>]).</p>
</caption>
<graphic xlink:href="fphy-10-870273-g001.tif"/>
</fig>
</sec>
<sec id="s2-2">
<title>2.2 Immersed Boundary&#x2013;Lattice Boltzmann Method</title>
<p>The lattice Boltzmann method (LBM) is used to simulate the fluid dynamics [<xref ref-type="bibr" rid="B31">31</xref>,<xref ref-type="bibr" rid="B32">32</xref>]. Instead of solving the Navier-Stokes equations, the LBM solves the discrete lattice Boltzmann equation which governs the kinematics of the mesoscopic particles,<disp-formula id="e4">
<mml:math id="m10">
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">r</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">c</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mi mathvariant="normal">&#x394;</mml:mi>
<mml:mi>t</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mi mathvariant="normal">&#x394;</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">r</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="normal">&#x3a9;</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">r</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2b;</mml:mo>
<mml:mi mathvariant="normal">&#x394;</mml:mi>
<mml:mi>t</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>G</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi mathvariant="bold-italic">r</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>0</mml:mn>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:mn>8</mml:mn>
</mml:math>
<label>(4)</label>
</disp-formula>where <italic>f</italic> is the particle density distribution function, <bold>
<italic>r</italic>
</bold> &#x3d; (<italic>x</italic>, <italic>y</italic>) is the space coordinate, <bold>
<italic>c</italic>
</bold>
<sub>
<bold>
<italic>i</italic>
</bold>
</sub> is the discrete lattice velocity, &#x394;<italic>t</italic> is the time step, <italic>&#x3a9;</italic>
<sub>
<italic>i</italic>
</sub> is the collision operator, and <italic>G</italic>
<sub>
<italic>i</italic>
</sub> is the source term representing the body force. A detailed description of this equation can be found in Ref. [<xref ref-type="bibr" rid="B33">33</xref>]. The distribution <italic>f</italic> throughout the flow field is obtained by solving this equation subject to well-defined boundary conditions, such as the no-slip velocity condition on the boundary of the swimmer model. Once <italic>f</italic> is known, macroscopic quantities such as the fluid density, pressure and velocity can be computed from<disp-formula id="e5">
<mml:math id="m11">
<mml:mi>&#x3c1;</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mo>&#x2211;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:mspace width="1em"/>
<mml:mi>p</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c1;</mml:mi>
<mml:msubsup>
<mml:mrow>
<mml:mi>c</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mspace width="1em"/>
<mml:mi mathvariant="bold-italic">u</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>&#x3c1;</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi mathvariant="bold-italic">c</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mi mathvariant="normal">&#x394;</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi mathvariant="bold-italic">g</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:math>
<label>(5)</label>
</disp-formula>where <italic>c</italic>
<sub>
<italic>s</italic>
</sub> is the lattice speed of sound in the fluid, and <bold>
<italic>g</italic>
</bold> is the body force. The force and torque on the swimmer model can then be computed from these macroscopic quantities.</p>
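<p>To illustrate Eq. (5), the sketch below recovers the density, pressure and velocity at a single lattice node from its nine distribution values. It assumes the standard D2Q9 velocity set, a particular ordering of the velocities, and lattice units in which the squared speed of sound is 1/3; none of these details are stated in the text, and the function name <monospace>macroscopic</monospace> is hypothetical.</p>
<preformat>
```python
# Standard D2Q9 lattice velocities c_i and weights w_i in lattice units
# (assumed ordering: rest, four axis directions, four diagonals).
C = ((0, 0), (1, 0), (0, 1), (-1, 0), (0, -1),
     (1, 1), (-1, 1), (-1, -1), (1, -1))
W = (4 / 9, 1 / 9, 1 / 9, 1 / 9, 1 / 9, 1 / 36, 1 / 36, 1 / 36, 1 / 36)
CS2 = 1.0 / 3.0  # squared lattice speed of sound for D2Q9


def macroscopic(f, g=(0.0, 0.0), dt=1.0):
    """Density, pressure and velocity at one node from the nine f_i (Eq. 5)."""
    rho = sum(f)
    p = rho * CS2
    # Momentum: first moment of f plus the half-step body-force correction.
    mx = sum(fi * ci[0] for fi, ci in zip(f, C)) + 0.5 * dt * g[0]
    my = sum(fi * ci[1] for fi, ci in zip(f, C)) + 0.5 * dt * g[1]
    return rho, p, (mx / rho, my / rho)


if __name__ == "__main__":
    # A fluid at rest with unit density has f_i = w_i.
    print(macroscopic(W))
```
</preformat>
<p>For a fluid at rest with <italic>f</italic><sub><italic>i</italic></sub> equal to the lattice weights, the function returns unit density and zero velocity, which is a convenient sanity check before coupling to the immersed boundary forcing.</p>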
<p>In addition, a diffusion immersed boundary method (IBM) [<xref ref-type="bibr" rid="B32">32</xref>,<xref ref-type="bibr" rid="B34">34</xref>&#x2013;<xref ref-type="bibr" rid="B36">36</xref>] is used to handle the boundary condition at the fluid-structure interface. In this method, the influence of the boundary on the fluid is represented by a distribution of body force on the background Eulerian mesh nodes. Compared with body-conformal methods [<xref ref-type="bibr" rid="B37">37</xref>&#x2013;<xref ref-type="bibr" rid="B39">39</xref>], grid generation in the IBM is much easier for complicated shapes [<xref ref-type="bibr" rid="B32">32</xref>,<xref ref-type="bibr" rid="B40">40</xref>,<xref ref-type="bibr" rid="B41">41</xref>]. A multi-block geometry-adaptive Cartesian grid is coupled with the IB&#x2013;LBM to accelerate the computation. A detailed description of this numerical scheme and its validation can be found in Refs. [<xref ref-type="bibr" rid="B27">27</xref>,<xref ref-type="bibr" rid="B31">31</xref>,<xref ref-type="bibr" rid="B34">34</xref>,<xref ref-type="bibr" rid="B42">42</xref>&#x2013;<xref ref-type="bibr" rid="B44">44</xref>]. The current method is first-order accurate.</p>
</sec>
<sec id="s2-3">
<title>2.3 Deep Reinforcement Learning</title>
<p>DRL is a machine learning method that combines reinforcement learning with an artificial neural network. DRL has gained extensive attention due to its success in complex real-world problems [<xref ref-type="bibr" rid="B45">45</xref>]. In this study, a specific DRL method called the deep recurrent Q-network (DRQN) [<xref ref-type="bibr" rid="B46">46</xref>] is adopted, in which a long short-term memory recurrent neural network (LSTM-RNN) is used to process time-sequential data. The method includes two basic elements: a learning agent and its environment [<xref ref-type="bibr" rid="B3">3</xref>]. The agent interacts with the environment in a trial-and-error fashion to collect observations of the environment state (denoted by <italic>s</italic>), control actions (denoted by <italic>a</italic>), and rewards (denoted by <italic>rd</italic>) [<xref ref-type="bibr" rid="B47">47</xref>]. The goal of the agent is to learn a control policy (denoted by <italic>&#x3c0;</italic>(<italic>s</italic>, <italic>a</italic>)) that enables it to collect the highest reward in a single attempt.</p>
<p>The interaction procedure between the environment (IB-LBM) and the agent (DRL) is shown in <xref ref-type="fig" rid="F2">Figure 2</xref>. The interaction is divided into a sequence of discrete steps <italic>n</italic> &#x3d; 0, 1, 2, 3, &#x2026;. At step <italic>n</italic>, the agent detects state <italic>s</italic>
<sub>
<italic>n</italic>
</sub> and selects action <italic>a</italic>
<sub>
<italic>n</italic>
</sub> based on policy <inline-formula id="inf7">
<mml:math id="m12">
<mml:mi>&#x3c0;</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mo>,</mml:mo>
<mml:mi>a</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:math>
</inline-formula>. The environment then changes under the influence of the action. At step <italic>n</italic> &#x2b; 1, in response to the change of the environment, the agent receives reward <italic>rd</italic>
<sub>
<italic>n</italic>&#x2b;1</sub> and finds itself in a new state <italic>s</italic>
<sub>
<italic>n</italic>&#x2b;1</sub>. A detailed explanation of the procedure can be found in Refs. [<xref ref-type="bibr" rid="B27">27</xref>,<xref ref-type="bibr" rid="B48">48</xref>]. Validation of the hybrid DRL and IB-LBM solver can be found in Ref. [<xref ref-type="bibr" rid="B27">27</xref>].</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>The interaction procedure between IB-LBM and DRL (Adapted from Ref. [<xref ref-type="bibr" rid="B29">29</xref>]).</p>
</caption>
<graphic xlink:href="fphy-10-870273-g002.tif"/>
</fig>
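<p>The step-by-step interaction described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the class ToyEnvironment is a hypothetical one-dimensional stand-in for the IB-LBM solver, and random_policy is a placeholder for the DRQN policy. Only the ordering of state, action, reward and next state follows the text.</p>
<preformat preformat-type="code">
```python
import random


class ToyEnvironment:
    """Hypothetical stand-in for the IB-LBM solver (not the authors' code)."""

    def __init__(self):
        self.position = 5.0  # toy 1-D state

    def observe(self):
        return self.position

    def advance(self, action):
        # The environment changes under the influence of the action.
        self.position -= 0.1 * action
        reward = -abs(self.position)  # toy reward: closer to 0 is better
        return reward, self.position


def random_policy(state, actions, rng):
    """Placeholder for pi(s, a); the paper uses a DRQN instead."""
    return rng.choice(actions)


def run_interaction(num_steps, seed=0):
    """Run the agent-environment loop and log every transition."""
    rng = random.Random(seed)
    env = ToyEnvironment()
    actions = [0, 1, 2]
    history = []
    state = env.observe()                             # s_n
    for n in range(num_steps):
        action = random_policy(state, actions, rng)   # a_n
        reward, next_state = env.advance(action)      # rd_{n+1}, s_{n+1}
        history.append((n, state, action, reward, next_state))
        state = next_state
    return history
```
</preformat>
<p>Each logged tuple holds the step index, the state before the action, the action, and the reward and state received at the following step, which is the information a learning agent would store for training.</p>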
</sec>
</sec>
<sec sec-type="results|discussion" id="s3">
<title>3 Results and Discussion</title>
<sec id="s3-1">
<title>3.1 The Hydrodynamics of a Uniform Flow Over Four Stationary Cylinders</title>
<p>A uniform flow over four stationary cylinders is simulated to produce a large-scale vortical flow environment that serves as the initial flow for the fish to swim in. The diameter of the cylinders is <italic>D</italic> &#x3d; 0.8<italic>L</italic>, which is slightly smaller than the body length of the fish. The centers of the cylinders are placed at (&#x2212;3<italic>L</italic>, 0.7<italic>L</italic>), (&#x2212;3<italic>L</italic>, &#x2212;2.1<italic>L</italic>), (0<italic>L</italic>, &#x2212;0.7<italic>L</italic>) and (0<italic>L</italic>, 2.1<italic>L</italic>), respectively, as shown in <xref ref-type="fig" rid="F3">Figure 3</xref>. This arrangement is used to generate a complex vortical flow through the interaction of the vortices shed from the two leading cylinders with the trailing cylinders.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>The confined domain in which the fish swims.</p>
</caption>
<graphic xlink:href="fphy-10-870273-g003.tif"/>
</fig>
<p>The simulation is performed for a Reynolds number of Re &#x3d; <italic>&#x3c1;UL</italic>/<italic>&#x3bc;</italic> &#x3d; 400 or <italic>Re</italic>
<sub>
<italic>cylinder</italic>
</sub> &#x3d; <italic>&#x3c1;UD</italic>/<italic>&#x3bc;</italic> &#x3d; 320, where <italic>&#x3c1;</italic> is the density of the fluid, <italic>U</italic> is the incoming fluid velocity, and <italic>&#x3bc;</italic> is the dynamic viscosity of the fluid. This Reynolds number is used because it generates sufficiently complex flows at a reasonably low computational cost. The computational domain of 50<italic>L</italic> &#xd7; 50<italic>L</italic> is divided into seven blocks with 98,373 grids. The minimum nondimensional grid spacing is &#x394;<italic>x</italic>/<italic>L</italic> &#x3d; &#x394;<italic>y</italic>/<italic>L</italic> &#x3d; 0.01 near the inner boundaries and the nondimensional time step size is &#x394;<italic>tU</italic>/<italic>L</italic> &#x3d; 0.0004. Validation has been performed to ensure that the numerical results are independent of mesh size, domain size and time step size.</p>
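<p>The two Reynolds numbers and the time resolution quoted above are related by simple arithmetic, checked in the short Python sketch below. The function names are ours, and the swimming period <italic>TU</italic>/<italic>L</italic> = 0.4 is taken from Section 3.2. Since both Reynolds numbers share the same density, velocity and viscosity, they differ only by the factor <italic>D</italic>/<italic>L</italic> = 0.8.</p>
<preformat preformat-type="code">
```python
def cylinder_reynolds(re_body, diameter_over_length):
    """Re_cylinder = rho*U*D/mu = Re * (D/L) for the same rho, U and mu."""
    return re_body * diameter_over_length


def steps_per_interval(interval, dt):
    """Number of solver time steps in a nondimensional time interval."""
    return round(interval / dt)


# Re = 400 based on body length L, and D = 0.8 L.
re_cylinder = cylinder_reynolds(400.0, 0.8)

# Swimming period T U / L = 0.4 (Section 3.2), time step dt U / L = 0.0004.
steps_per_period = steps_per_interval(0.4, 0.0004)
```
</preformat>
<p>Under these values the cylinder Reynolds number is 320, as stated, and one swimming period spans 1,000 solver time steps.</p>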
<p>
<xref ref-type="fig" rid="F4">Figure 4</xref> shows the vorticity contour and flow velocity distribution behind the cylinders at four different instants (the animation of the movement of the vortices can be found in the <xref ref-type="sec" rid="s10">Supplementary Materials</xref>). Abundant vortices are generated in the wake of the cylinders, and their strengths and moving velocities vary widely. The vortices interact with each other and with the trailing cylinders, forming a highly dynamic and unpredictable flow field. Two basic types of vortices are identified: clockwise vortices (blue) and counter-clockwise vortices (red). A clockwise vortex accelerates the flow above it and decelerates the flow below it, and induces upward flow on its left side and downward flow on its right side. Conversely, a counter-clockwise vortex accelerates the flow below it and decelerates the flow above it, and induces upward flow on its right side and downward flow on its left side. As a result, the flow velocity in the field is vastly altered. In the next section, the flow field at <italic>tU</italic>/<italic>L</italic> &#x3d; 50 is used as the initial flow field for the swimming training.</p>
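<p>The directions of the vortex-induced flow described above can be checked with an idealized model. The Python sketch below uses an inviscid point vortex, which is our simplification and not the viscous vortices resolved by the simulation, and assumes the free stream points in the positive <italic>x</italic> direction. The function name and sign convention (positive circulation is counter-clockwise) are also assumptions made for illustration.</p>
<preformat preformat-type="code">
```python
import math


def induced_velocity(gamma, x0, y0, x, y):
    """Velocity induced at (x, y) by an ideal point vortex at (x0, y0).

    gamma > 0 is counter-clockwise, gamma is negative for clockwise.
    Returns the pair (u, v) of x- and y-velocity components.
    """
    dx, dy = x - x0, y - y0
    r2 = dx * dx + dy * dy
    if r2 == 0.0:
        raise ValueError("velocity is singular at the vortex centre")
    factor = gamma / (2.0 * math.pi * r2)
    return -factor * dy, factor * dx


# Clockwise vortex at the origin, probed one unit away on each side.
clockwise = -1.0
u_above, _ = induced_velocity(clockwise, 0.0, 0.0, 0.0, 1.0)
u_below, _ = induced_velocity(clockwise, 0.0, 0.0, 0.0, -1.0)
_, v_left = induced_velocity(clockwise, 0.0, 0.0, -1.0, 0.0)
_, v_right = induced_velocity(clockwise, 0.0, 0.0, 1.0, 0.0)
```
</preformat>
<p>For the clockwise vortex, the induced streamwise velocity is positive above and negative below, and the induced vertical velocity is positive on the left and negative on the right, in agreement with the description in the text. Reversing the sign of the circulation reverses all four directions.</p>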
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Vorticity contour and flow velocity distribution behind the cylinders at four different instants: <bold>(A)</bold> <italic>tU</italic>/<italic>L</italic> &#x3d; 22.7, <bold>(B)</bold> <italic>tU</italic>/<italic>L</italic> &#x3d; 24.7, <bold>(C)</bold> <italic>tU</italic>/<italic>L</italic> &#x3d; 26.7, and <bold>(D)</bold> <italic>tU</italic>/<italic>L</italic> &#x3d; 28.7.</p>
</caption>
<graphic xlink:href="fphy-10-870273-g004.tif"/>
</fig>
</sec>
<sec id="s3-2">
<title>3.2 Learning to Navigate in the Vortical Flow</title>
<p>In this section, a fish is trained to navigate in the flow field described in the previous section. The cases are run with OpenMP on four computational cores of a workstation with an Intel Xeon E5-2678 CPU. The computational domain of 50<italic>L</italic> &#xd7; 50<italic>L</italic> is divided into seven blocks with about 120,000 grids. The simulation requires about 21.0&#xa0;s of CPU time per nondimensional time unit <italic>t</italic>/<italic>T</italic> &#x3d; 1.0. For simplicity, the fish is restricted to swimming in a rectangular area of 12<italic>L</italic> &#xd7; 6<italic>L</italic>, as shown in <xref ref-type="fig" rid="F3">Figure 3</xref>. The goal of the fish is to swim towards a given destination at (1<italic>L</italic>, 0.7<italic>L</italic>) from different initial positions. This goal is encoded in the reward, defined as<disp-formula id="e6">
<mml:math id="m13">
<mml:mi>r</mml:mi>
<mml:mi>d</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:msqrt>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:mi>L</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2b;</mml:mo>
<mml:msup>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>/</mml:mo>
<mml:mi>L</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>0.7</mml:mn>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msup>
</mml:mrow>
</mml:msqrt>
<mml:mo>,</mml:mo>
</mml:math>
<label>(6)</label>
</disp-formula>where <italic>x</italic>
<sub>
<italic>tip</italic>
</sub> and <italic>y</italic>
<sub>
<italic>tip</italic>
</sub> are the spatial coordinates of the head tip of the fish. In addition, if the fish swims out of the confined area, it receives a strong penalty of <italic>rd</italic> &#x3d; &#x2212;100.</p>
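<p>A minimal Python sketch of the reward in Eq. 6, together with the out-of-bounds penalty, is given below. The destination and the penalty value come from the text. The corner coordinates of the confined area are hypothetical placeholders, since the text gives only its size (12<italic>L</italic> by 6<italic>L</italic>) and refers to Figure 3 for its location. All coordinates are assumed to be already divided by <italic>L</italic>.</p>
<preformat preformat-type="code">
```python
import math

DESTINATION = (1.0, 0.7)        # destination in units of L (from the text)
OUT_OF_BOUNDS_REWARD = -100.0   # penalty from the text

# ASSUMPTION: only the size (12L x 6L) of the confined area is given in
# the text; the corner coordinates below are hypothetical placeholders.
X_MIN, X_MAX = -4.0, 8.0
Y_MIN, Y_MAX = -3.0, 3.0


def inside_confined_area(x, y):
    """True if the point lies inside the (assumed) confined rectangle."""
    return X_MAX >= x >= X_MIN and Y_MAX >= y >= Y_MIN


def reward(x_tip, y_tip):
    """Eq. 6: negative distance from the head tip to the destination."""
    if not inside_confined_area(x_tip, y_tip):
        return OUT_OF_BOUNDS_REWARD
    return -math.hypot(x_tip - DESTINATION[0], y_tip - DESTINATION[1])
```
</preformat>
<p>The reward is zero at the destination and becomes more negative with distance, so maximizing it drives the head tip towards the destination, while the penalty is far larger in magnitude than any distance attainable inside the area.</p>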
<p>The swimmer propels itself by generating a travelling wave propagating from head to tail, as defined by <xref ref-type="disp-formula" rid="e2">Eq. 2</xref>. To achieve high maneuverability, the swimmer can change the wave amplitude every half swimming cycle. Each selected set of parameters is treated as an action. In this case, the period is fixed at <italic>TU</italic>/<italic>L</italic> &#x3d; 0.4; the amplitude action base is defined as <italic>&#x3b8;</italic>
<sub>
<italic>lmax</italic>
</sub> &#x3d; 0&#xb0;, 10&#xb0;, 20&#xb0;, 30&#xb0;, 40&#xb0;, 50&#xb0;, 60&#xb0;, 70&#xb0; and 80&#xb0;; and the wavelength is fixed at <italic>&#x3bb;</italic> &#x3d; <italic>L</italic>. Since only the amplitude varies, this parameter set forms an action base of nine components.</p>
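<p>The discrete action base described above can be written down directly. The Python sketch below is illustrative only; the constant and function names are ours. It enumerates the nine amplitudes and derives the time between two consecutive decisions from the fixed period and the half-cycle update rule.</p>
<preformat preformat-type="code">
```python
AMPLITUDES_DEG = list(range(0, 90, 10))  # 0, 10, ..., 80 degrees
PERIOD = 0.4             # T U / L, fixed
WAVELENGTH = 1.0         # lambda / L, fixed
ACTIONS_PER_PERIOD = 2   # the amplitude may change every half cycle


def action_parameters(index):
    """Map a discrete action index to (amplitude, period, wavelength)."""
    return AMPLITUDES_DEG[index], PERIOD, WAVELENGTH


def decision_interval():
    """Nondimensional time between two consecutive decisions."""
    return PERIOD / ACTIONS_PER_PERIOD
```
</preformat>
<p>With these values the agent chooses among nine actions every 0.2 nondimensional time units.</p>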
<p>A comprehensive representation of the environment state is very important for accurate motion control. Specifically, the historical evolution of the sensory information should be considered thoroughly. Zhu et al. [<xref ref-type="bibr" rid="B27">27</xref>] conducted tests with different environment information and found that considering only the actions and body kinematics in the last four periods provides sufficiently accurate environmental information for motion control. Therefore, a similar approach is adopted here, in which the state is defined by the tuple<disp-formula id="e7">
<mml:math id="m14">
<mml:msub>
<mml:mrow>
<mml:mi>s</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:mtable class="matrix">
<mml:mtr>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>u</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>u</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center"/>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>u</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>u</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:mspace width="1em"/>
<mml:msub>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="center">
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>8</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>8</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>&#x3b8;</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>8</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>u</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>8</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>u</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>8</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>8</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
</mml:mtd>
<mml:mtd columnalign="center">
<mml:mspace width="1em"/>
<mml:msub>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>8</mml:mn>
</mml:mrow>
</mml:msub>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:math>
<label>(7)</label>
</disp-formula>where <italic>x</italic>, <italic>y</italic> and <italic>&#x3b8;</italic> are respectively the spatial coordinates and orientation angle of the fish, and <inline-formula id="inf8">
<mml:math id="m15">
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>u</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula>, <inline-formula id="inf9">
<mml:math id="m16">
<mml:msub>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>u</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> and <inline-formula id="inf10">
<mml:math id="m17">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>&#x3c9;</mml:mi>
</mml:mrow>
<mml:mo>&#x304;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:math>
</inline-formula> are respectively the average swimming speeds in the <italic>x</italic>- and <italic>y</italic>-directions and the average angular speed in each half period.</p>
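<p>The layout of the state in Eq. 7 can be sketched as a rolling history buffer. The Python code below is a hypothetical illustration, not the authors' data structure: the class name StateBuffer and its methods are ours, and how the state is assembled before eight half periods have elapsed is not specified in the text, so the sketch simply refuses to build a state until the history is full.</p>
<preformat preformat-type="code">
```python
from collections import deque

HISTORY_LENGTH = 8  # eight past half periods = four swimming periods


class StateBuffer:
    """Keeps the rows of Eq. 7: current kinematics plus eight past rows."""

    def __init__(self):
        self.past = deque(maxlen=HISTORY_LENGTH)

    def record(self, kinematics, action):
        """Store a finished half period: (x, y, theta, ux, uy, omega), action."""
        if len(kinematics) != 6:
            raise ValueError("expected x, y, theta, ux, uy, omega")
        # Newest row goes first; the oldest row is dropped automatically.
        self.past.appendleft(tuple(kinematics) + (action,))

    def state(self, current_kinematics):
        """Return s_n: row n has no action yet, rows n-1 ... n-8 include it."""
        if len(self.past) != HISTORY_LENGTH:
            raise ValueError("history not yet filled")
        return [tuple(current_kinematics)] + list(self.past)
```
</preformat>
<p>The returned list has nine rows. The first holds the six kinematic quantities at step <italic>n</italic>, for which no action has been selected yet, and the remaining eight rows each hold the six quantities plus the action taken at that earlier step.</p>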
<p>The learning process is divided into a series of episodes. In each episode, the initial <italic>x</italic> coordinate <italic>x</italic>
<sub>0</sub> is randomly chosen between 3<italic>L</italic> and 7<italic>L</italic>, the initial <italic>y</italic> coordinate <italic>y</italic>
<sub>0</sub> is randomly chosen between &#x2212;1.5<italic>L</italic> and 1.5<italic>L</italic>, and the initial orientation angle <italic>&#x3b8;</italic>
<sub>0</sub> randomly varies between &#x2212;30&#xb0; and 30&#xb0;. The subsequent positions and orientations of the swimmer are then determined by the fluid-structure interaction under the selected actions. Once the swimmer leaves the confined area or reaches a small circular area of radius 0.3<italic>L</italic> around the destination, the episode ends and another starts. The fish is trained for 3,000 episodes, comprising 126,893 periods. <xref ref-type="fig" rid="F5">Figure 5</xref> shows the traces of the head tip during different learning stages. In episode 99, the fish is not able to stay in the vortical flow area for long and quickly swims out of the confined area. Nevertheless, after a trial-and-error exploration period (episode 565), it learns to hold its position in the area for a longer time instead of being washed away. It then learns to swim directly towards its destination. After learning for 990 episodes, it finds a path that leads it close to the destination, but ends up colliding with one of the cylinders. It then struggles and learns to reach the destination without hitting the cylinders (episode 1,604). Finally, after learning for about 3,000 episodes, it can accurately reach the destination.</p>
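<p>The episode setup and termination rule described above can be sketched in Python as follows. The ranges, destination and arrival radius come from the text. Two points are assumptions: the text says only that the initial values are chosen randomly, so a uniform distribution is assumed, and the arrival test is applied to the head tip, as in the reward. The function names are ours, and the out-of-area test is passed in as a flag because the location of the confined area is given only in Figure 3.</p>
<preformat preformat-type="code">
```python
import math
import random

DESTINATION = (1.0, 0.7)   # in units of L
ARRIVAL_RADIUS = 0.3       # in units of L


def sample_initial_condition(rng):
    """Draw (x0, y0, theta0); uniform sampling is an assumption."""
    x0 = rng.uniform(3.0, 7.0)          # units of L
    y0 = rng.uniform(-1.5, 1.5)         # units of L
    theta0 = rng.uniform(-30.0, 30.0)   # degrees
    return x0, y0, theta0


def has_arrived(x_tip, y_tip):
    """True once the head tip is within the arrival circle."""
    dist = math.hypot(x_tip - DESTINATION[0], y_tip - DESTINATION[1])
    return ARRIVAL_RADIUS >= dist


def episode_finished(x_tip, y_tip, inside_area):
    """An episode ends on arrival or when the swimmer leaves the area."""
    return has_arrived(x_tip, y_tip) or not inside_area
```
</preformat>
<p>A training loop would call sample_initial_condition at the start of each episode and episode_finished after every action to decide whether to start the next episode.</p>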
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>The traces of the head during different learning stages.</p>
</caption>
<graphic xlink:href="fphy-10-870273-g005.tif"/>
</fig>
<p>To test the robustness of the control strategy, we investigate 100 cases with different initial positions and orientation angles, using the control strategy obtained after learning for 3,000 episodes. In 9 of the 100 tests, the fish loses its balance and eventually swims out of the confined area. In those cases, the relative angle of the fish with respect to the incoming flow grows so large that the fish cannot restore its orientation in time. In 15 of the 100 tests, the fish ends up colliding with the cylinders. In those cases, the fish cannot resist the strong suction force behind the cylinders. In the other 76 cases, the fish successfully reaches the destination. <xref ref-type="fig" rid="F6">Figure 6</xref> presents the traces when the fish swims to its destination from different initial positions. Five cases are studied, in which the initial orientation angle is fixed at 0&#xb0; while the initial position of the head tip <inline-formula id="inf11">
<mml:math id="m18">
<mml:msub>
<mml:mrow>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>r</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mn>0</mml:mn>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> takes on the values (6<italic>L</italic>, &#x2212;1.5<italic>L</italic>), (6<italic>L</italic>, 1.5<italic>L</italic>), (6<italic>L</italic>, 0<italic>L</italic>), (3<italic>L</italic>, &#x2212;1.5<italic>L</italic>) and (4.5<italic>L</italic>, &#x2212;1.5<italic>L</italic>). <xref ref-type="fig" rid="F7">Figure 7</xref> presents the traces when the fish swims to its destination with different initial orientation angles. Five cases are studied, in which the initial position is fixed at (6<italic>L</italic>, &#x2212;1.5<italic>L</italic>) while the initial orientation angle <italic>&#x3b8;</italic>
<sub>0</sub> takes on the values 0&#xb0;, 30&#xb0;, 15&#xb0;, &#x2212;15&#xb0; and &#x2212;30&#xb0;. In all of these cases, the fish reaches its destination successfully, but the paths vary considerably. Nevertheless, two main paths can be identified: one approaches the destination from above and the other from below.</p>
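<p>The outcome counts of the 100 robustness tests reported above can be tallied as in the short Python sketch below. The counts are taken from the text, while the dictionary keys and the function name are ours.</p>
<preformat preformat-type="code">
```python
# Outcome counts of the 100 robustness tests, as reported in the text.
OUTCOMES = {"swam_out": 9, "collision": 15, "reached": 76}


def outcome_fractions(outcomes):
    """Convert outcome counts into fractions of the total number of tests."""
    total = sum(outcomes.values())
    return {name: count / total for name, count in outcomes.items()}


fractions = outcome_fractions(OUTCOMES)
```
</preformat>
<p>The three outcomes account for all 100 tests, and the fraction of tests in which the fish reaches the destination is 0.76.</p>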
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>The traces of the head for different initial positions.</p>
</caption>
<graphic xlink:href="fphy-10-870273-g006.tif"/>
</fig>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>The traces of the head for different initial orientation angles.</p>
</caption>
<graphic xlink:href="fphy-10-870273-g007.tif"/>
</fig>
<p>To understand the hydrodynamics underlying these behaviors, we investigate a typical case in detail, in which the initial orientation angle is 0&#xb0; and the initial position is (6<italic>L</italic>, &#x2212;1.5<italic>L</italic>). The time history of the lateral tail tip movement is shown in <xref ref-type="fig" rid="F8">Figure 8</xref>. The vorticity contour and flow velocity distribution at several typical instants are shown in <xref ref-type="fig" rid="F9">Figure 9</xref> (the animation of the fish swimming can be found in the <xref ref-type="sec" rid="s10">Supplementary Materials</xref>). Note that the fish is held still in the flow field for 50 periods until the vortex street is fully developed, and is then allowed to swim freely in the flow. Its goal is to swim upstream and reach its destination (green circle in <xref ref-type="fig" rid="F9">Figure 9</xref>). <xref ref-type="fig" rid="F9">Figure 9A</xref> shows the body posture of the fish and the ambient flow field at instant <italic>t</italic>/<italic>T</italic> &#x3d; 50. An area of reduced streamwise flow (denoted as RSF in the figure) is formed on the right side of the fish. Moving upstream is easier if the fish can take advantage of this area. However, the surrounding flow tends to push the fish leftwards into the high flow velocity area. Without active control, the fish would be washed downstream quickly. Therefore, the fish adopts a large-amplitude right flapping to turn right towards the reduced flow area (<xref ref-type="fig" rid="F9">Figure 9B</xref>). At instants <italic>t</italic>/<italic>T</italic> &#x3d; 53 and <italic>t</italic>/<italic>T</italic> &#x3d; 54 (<xref ref-type="fig" rid="F9">Figures 9C,D</xref>), the fish is oriented towards the reduced streamwise flow area. Meanwhile, the clockwise flow induced by Vortex 1 (denoted by V1 in the figure) tends to turn it right (rotating clockwise) and draw it backwards to the downstream area. Large-amplitude right flapping would accelerate this process. Therefore, the fish adopts a large-amplitude left flapping to resist this tendency and restore its swimming orientation. In the following several periods, a similar strategy is adopted by the swimmer to take advantage of the reduced streamwise flow area and keep its balance (see details in the <xref ref-type="sec" rid="s10">Supplementary Video S6</xref>). From instant <italic>t</italic>/<italic>T</italic> &#x3d; 61.5 to <italic>t</italic>/<italic>T</italic> &#x3d; 65.0 (<xref ref-type="fig" rid="F9">Figures 9E&#x2013;H</xref>), a strong counter-clockwise vortex (V2) is on the right side of the fish, inducing strong rightward flow and reduced streamwise flow on the right side of the fish. Therefore, the fish adopts two large-amplitude right flapping motions to swim rightwards and three compensating left flapping motions to maintain stability. These motions are crucial for the fish to make the most of the flow to swim upstream while keeping its balance. From instant <italic>t</italic>/<italic>T</italic> &#x3d; 72.5 to <italic>t</italic>/<italic>T</italic> &#x3d; 75.9 (<xref ref-type="fig" rid="F9">Figures 9I&#x2013;L</xref>), the fish is very close to the destination and located in a strong streamwise flow that could wash it away from the destination. Therefore, the fish adopts a sequence of high-amplitude right flapping motions to reach the destination quickly. It is noted that the fish chooses to approach the destination from the counterflow direction instead of the downstream direction, since the high flow velocity makes it extremely hard to swim upstream.</p>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption>
<p>The time history of the lateral tail tip movement in the local coordinate system.</p>
</caption>
<graphic xlink:href="fphy-10-870273-g008.tif"/>
</fig>
<fig id="F9" position="float">
<label>FIGURE 9</label>
<caption>
<p>Vorticity contour and flow velocity distribution at 12 different instants: <bold>(A)</bold> <italic>t</italic>/<italic>T</italic> &#x3d; 50, <bold>(B)</bold> <italic>t</italic>/<italic>T</italic> &#x3d; 51.5, <bold>(C)</bold> <italic>t</italic>/<italic>T</italic> &#x3d; 53, <bold>(D)</bold> <italic>t</italic>/<italic>T</italic> &#x3d; 54, <bold>(E)</bold> <italic>t</italic>/<italic>T</italic> &#x3d; 61.5, <bold>(F)</bold> <italic>t</italic>/<italic>T</italic> &#x3d; 62.5, <bold>(G)</bold> <italic>t</italic>/<italic>T</italic> &#x3d; 63, <bold>(H)</bold> <italic>t</italic>/<italic>T</italic> &#x3d; 65, <bold>(I)</bold> <italic>t</italic>/<italic>T</italic> &#x3d; 72.5, <bold>(J)</bold> <italic>t</italic>/<italic>T</italic> &#x3d; 73.5, <bold>(K)</bold> <italic>t</italic>/<italic>T</italic> &#x3d; 74.5 and <bold>(L)</bold> <italic>t</italic>/<italic>T</italic> &#x3d; 75.9.</p>
</caption>
<graphic xlink:href="fphy-10-870273-g009.tif"/>
</fig>
</sec>
</sec>
<sec id="s4">
<title>4 Conclusion</title>
<p>The point-to-point navigation of a fish-like swimmer in a vortical flow is numerically studied with a hybrid method of deep reinforcement learning and the immersed boundary&#x2013;lattice Boltzmann method. The goal of the swimmer is to swim upstream through the vortical area to its destination. The vortical area is generated by placing four stationary cylinders in a uniform flow. The effect of the vortices is twofold: they induce reduced streamwise flow that makes swimming upstream easier, but they also induce strong streamwise and lateral flows that deflect the swimmer from its desired path. The swimmer utilizes only the time-sequential information of position, orientation, velocity and angular velocity to learn to navigate to its destination. By considering the time-sequential information, the swimmer learns to reach its destination from different initial positions and orientations, demonstrating the effectiveness and robustness of the method. A detailed analysis shows that the fish utilizes highly subtle tail flapping motions to control its swimming orientation and take advantage of the reduced streamwise flow area to reach its destination, while avoiding the high flow velocity area.</p>
</sec>
</body>
<back>
<sec id="s5">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/<xref ref-type="sec" rid="s10">Supplementary Material</xref>; further inquiries can be directed to the corresponding authors.</p>
</sec>
<sec id="s6">
<title>Author Contributions</title>
<p>YZ has made contributions to methodology, software development, data analysis and interpretation, and writing of the work. F-BT has made contributions to the conception of the work, methodology, and revising of the work. J-HP has made contributions to the conception and revising of the work.</p>
</sec>
<sec id="s7">
<title>Funding</title>
<p>This work was partially supported by the Australian Research Council (project number DE160101098).</p>
</sec>
<sec sec-type="COI-statement" id="s8">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ack>
<p>YZ acknowledges Shenzhen Institute of Guangdong Ocean University and Dalian Maritime University during the pursuit of this study.</p>
</ack>
<sec id="s10">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fphy.2022.870273/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fphy.2022.870273/full&#x23;supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="Video9.MP4" id="SM1" mimetype="application/MP4" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Video3.MP4" id="SM2" mimetype="application/MP4" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Video8.MP4" id="SM3" mimetype="application/MP4" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Video4.MP4" id="SM4" mimetype="application/MP4" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Video7.MP4" id="SM5" mimetype="application/MP4" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Video10.MP4" id="SM6" mimetype="application/MP4" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Video2.MP4" id="SM7" mimetype="application/MP4" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Video5.MP4" id="SM8" mimetype="application/MP4" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Video1.MP4" id="SM9" mimetype="application/MP4" xmlns:xlink="http://www.w3.org/1999/xlink"/>
<supplementary-material xlink:href="Video6.MP4" id="SM10" mimetype="application/MP4" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<label>1.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zermelo</surname>
<given-names>E</given-names>
</name>
</person-group>. <article-title>&#xdc;ber das Navigationsproblem bei ruhender oder ver&#xe4;nderlicher Windverteilung</article-title>. <source>Z Angew Math Mech</source> (<year>1931</year>) <volume>11</volume>:<fpage>114</fpage>&#x2013;<lpage>24</lpage>. <pub-id pub-id-type="doi">10.1002/zamm.19310110205</pub-id> </citation>
</ref>
<ref id="B2">
<label>2.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bechinger</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Di Leonardo</surname>
<given-names>R</given-names>
</name>
<name>
<surname>L&#xf6;wen</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Reichhardt</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Volpe</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Volpe</surname>
<given-names>G</given-names>
</name>
</person-group>. <article-title>Active Particles in Complex and Crowded Environments</article-title>. <source>Rev Mod Phys</source> (<year>2016</year>) <volume>88</volume>:<fpage>045006</fpage>. <pub-id pub-id-type="doi">10.1103/revmodphys.88.045006</pub-id> </citation>
</ref>
<ref id="B3">
<label>3.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Colabrese</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Gustavsson</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Celani</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Biferale</surname>
<given-names>L</given-names>
</name>
</person-group>. <article-title>Flow Navigation by Smart Microswimmers via Reinforcement Learning</article-title>. <source>Phys Rev Lett</source> (<year>2017</year>) <volume>118</volume>:<fpage>158004</fpage>. <pub-id pub-id-type="doi">10.1103/physrevlett.118.158004</pub-id> </citation>
</ref>
<ref id="B4">
<label>4.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Dong</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Wu</surname>
<given-names>Z</given-names>
</name>
</person-group>. <article-title>Motion Control and Motion Coordination of Bionic Robotic Fish: A Review</article-title>. <source>J Bionic Eng</source> (<year>2018</year>) <volume>15</volume>:<fpage>579</fpage>&#x2013;<lpage>98</lpage>. <pub-id pub-id-type="doi">10.1007/s42235-018-0048-2</pub-id> </citation>
</ref>
<ref id="B5">
<label>5.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guerrero</surname>
<given-names>JA</given-names>
</name>
<name>
<surname>Bestaoui</surname>
<given-names>Y</given-names>
</name>
</person-group>. <article-title>UAV Path Planning for Structure Inspection in Windy Environments</article-title>. <source>J Intell Robot Syst</source> (<year>2013</year>) <volume>69</volume>:<fpage>297</fpage>&#x2013;<lpage>311</lpage>. <pub-id pub-id-type="doi">10.1007/s10846-012-9778-2</pub-id> </citation>
</ref>
<ref id="B6">
<label>6.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bellemare</surname>
<given-names>MG</given-names>
</name>
<name>
<surname>Candido</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Castro</surname>
<given-names>PS</given-names>
</name>
<name>
<surname>Gong</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Machado</surname>
<given-names>MC</given-names>
</name>
<name>
<surname>Moitra</surname>
<given-names>S</given-names>
</name>
<etal/>
</person-group> <article-title>Autonomous Navigation of Stratospheric Balloons Using Reinforcement Learning</article-title>. <source>Nature</source> (<year>2020</year>) <volume>588</volume>:<fpage>77</fpage>&#x2013;<lpage>82</lpage>. <pub-id pub-id-type="doi">10.1038/s41586-020-2939-8</pub-id> </citation>
</ref>
<ref id="B7">
<label>7.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Buzzicotti</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Biferale</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Bonaccorso</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Di Leoni</surname>
<given-names>PC</given-names>
</name>
<name>
<surname>Gustavsson</surname>
<given-names>K</given-names>
</name>
</person-group>. <article-title>Optimal Control of Point-to-Point Navigation in Turbulent Time Dependent Flows Using Reinforcement Learning</article-title>. In: <conf-name>International Conference of the Italian Association for Artificial Intelligence</conf-name>. <publisher-loc>Berlin, Germany</publisher-loc>: <publisher-name>Springer</publisher-name> (<year>2020</year>). p. <fpage>223</fpage>&#x2013;<lpage>34</lpage>. </citation>
</ref>
<ref id="B8">
<label>8.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Inanc</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Ober-Blobaum</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Marsden</surname>
<given-names>JE</given-names>
</name>
</person-group>. <article-title>Optimal Trajectory Generation for a Glider in Time-Varying 2D Ocean Flows B-Spline Model</article-title>. In: <conf-name>2008 IEEE International Conference on Robotics and Automation</conf-name>. <publisher-loc>Pasadena, CA, USA</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2008</year>). p. <fpage>1083</fpage>&#x2013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1109/robot.2008.4543348</pub-id> </citation>
</ref>
<ref id="B9">
<label>9.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Insaurralde</surname>
<given-names>CC</given-names>
</name>
<name>
<surname>Cartwright</surname>
<given-names>JJ</given-names>
</name>
<name>
<surname>Petillot</surname>
<given-names>YR</given-names>
</name>
</person-group>. <article-title>Cognitive Control Architecture for Autonomous Marine Vehicles</article-title>. In: <conf-name>2012 IEEE International Systems Conference SysCon</conf-name>. <publisher-loc>Vancouver, BC, Canada</publisher-loc>: <publisher-name>IEEE</publisher-name> (<year>2012</year>). p. <fpage>1</fpage>&#x2013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1109/syscon.2012.6189542</pub-id> </citation>
</ref>
<ref id="B10">
<label>10.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Colabrese</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Gustavsson</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Celani</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Biferale</surname>
<given-names>L</given-names>
</name>
</person-group>. <article-title>Smart Inertial Particles</article-title>. <source>Phys Rev Fluids</source> (<year>2018</year>) <volume>3</volume>:<fpage>084301</fpage>. <pub-id pub-id-type="doi">10.1103/physrevfluids.3.084301</pub-id> </citation>
</ref>
<ref id="B11">
<label>11.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Salum&#xe4;e</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Kruusmaa</surname>
<given-names>M</given-names>
</name>
</person-group>. <article-title>Flow-relative Control of an Underwater Robot</article-title>. <source>Proc R Soc A: Math Phys Eng Sci</source> (<year>2013</year>) <volume>469</volume>:<fpage>20120671</fpage>. </citation>
</ref>
<ref id="B12">
<label>12.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Techy</surname>
<given-names>L</given-names>
</name>
</person-group>. <article-title>Optimal Navigation in Planar Time-Varying Flow: Zermelo&#x27;s Problem Revisited</article-title>. <source>Intel Serv Robotics</source> (<year>2011</year>) <volume>4</volume>:<fpage>271</fpage>&#x2013;<lpage>83</lpage>. <pub-id pub-id-type="doi">10.1007/s11370-011-0092-9</pub-id> </citation>
</ref>
<ref id="B13">
<label>13.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Kularatne</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Bhattacharya</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Hsieh</surname>
<given-names>MA</given-names>
</name>
</person-group>. <article-title>Going with the Flow: a Graph Based Approach to Optimal Path Planning in General Flows</article-title>. <source>Auton Robot</source> (<year>2018</year>) <volume>42</volume>:<fpage>1369</fpage>&#x2013;<lpage>87</lpage>. <pub-id pub-id-type="doi">10.1007/s10514-018-9741-6</pub-id> </citation>
</ref>
<ref id="B14">
<label>14.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Panda</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Das</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Subudhi</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Pati</surname>
<given-names>BB</given-names>
</name>
</person-group>. <article-title>A Comprehensive Review of Path Planning Algorithms for Autonomous Underwater Vehicles</article-title>. <source>Int J Autom Comput</source> (<year>2020</year>) <volume>17</volume>:<fpage>321</fpage>&#x2013;<lpage>52</lpage>. <pub-id pub-id-type="doi">10.1007/s11633-019-1204-9</pub-id> </citation>
</ref>
<ref id="B15">
<label>15.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gunnarson</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Mandralis</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Novati</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Koumoutsakos</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Dabiri</surname>
<given-names>JO</given-names>
</name>
</person-group>. <article-title>Learning Efficient Navigation in Vortical Flow Fields</article-title>. <source>arXiv preprint arXiv:2102.10536</source> (<year>2021</year>). <pub-id pub-id-type="doi">10.1038/s41467-021-27015-y</pub-id> </citation>
</ref>
<ref id="B16">
<label>16.</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Sutton</surname>
<given-names>RS</given-names>
</name>
<name>
<surname>Barto</surname>
<given-names>AG</given-names>
</name>
</person-group>. <source>Reinforcement Learning: An Introduction</source>. <publisher-loc>Cambridge, MA, USA</publisher-loc>: <publisher-name>MIT Press</publisher-name> (<year>2018</year>). </citation>
</ref>
<ref id="B17">
<label>17.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Verma</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Novati</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Koumoutsakos</surname>
<given-names>P</given-names>
</name>
</person-group>. <article-title>Efficient Collective Swimming by Harnessing Vortices through Deep Reinforcement Learning</article-title>. <source>Proc Natl Acad Sci U.S.A</source> (<year>2018</year>) <volume>115</volume>:<fpage>5849</fpage>&#x2013;<lpage>54</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1800923115</pub-id> </citation>
</ref>
<ref id="B18">
<label>18.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Gustavsson</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Biferale</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Celani</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Colabrese</surname>
<given-names>S</given-names>
</name>
</person-group>. <article-title>Finding Efficient Swimming Strategies in a Three-Dimensional Chaotic Flow by Reinforcement Learning</article-title>. <source>Eur Phys J E Soft Matter</source> (<year>2017</year>) <volume>40</volume>:<fpage>110</fpage>&#x2013;<lpage>6</lpage>. <pub-id pub-id-type="doi">10.1140/epje/i2017-11602-9</pub-id> </citation>
</ref>
<ref id="B19">
<label>19.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Biferale</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Bonaccorso</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Buzzicotti</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Clark Di Leoni</surname>
<given-names>P</given-names>
</name>
<name>
<surname>Gustavsson</surname>
<given-names>K</given-names>
</name>
</person-group>. <article-title>Zermelo&#x27;s Problem: Optimal Point-to-Point Navigation in 2D Turbulent Flows Using Reinforcement Learning</article-title>. <source>Chaos</source> (<year>2019</year>) <volume>29</volume>:<fpage>103138</fpage>. <pub-id pub-id-type="doi">10.1063/1.5120370</pub-id> </citation>
</ref>
<ref id="B20">
<label>20.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Alageshan</surname>
<given-names>JK</given-names>
</name>
<name>
<surname>Verma</surname>
<given-names>AK</given-names>
</name>
<name>
<surname>Bec</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Pandit</surname>
<given-names>R</given-names>
</name>
</person-group>. <article-title>Machine Learning Strategies for Path-Planning Microswimmers in Turbulent Flows</article-title>. <source>Phys Rev E</source> (<year>2020</year>) <volume>101</volume>:<fpage>043110</fpage>. <pub-id pub-id-type="doi">10.1103/PhysRevE.101.043110</pub-id> </citation>
</ref>
<ref id="B21">
<label>21.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Qiu</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>W</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>L</given-names>
</name>
</person-group>. <article-title>Swimming Strategy of Settling Elongated Micro-swimmers by Reinforcement Learning</article-title>. <source>Sci China Phys Mech Astron</source> (<year>2020</year>) <volume>63</volume>:<fpage>1</fpage>&#x2013;<lpage>9</lpage>. <pub-id pub-id-type="doi">10.1007/s11433-019-1502-2</pub-id> </citation>
</ref>
<ref id="B22">
<label>22.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Daddi-Moussa-Ider</surname>
<given-names>A</given-names>
</name>
<name>
<surname>L&#xf6;wen</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Liebchen</surname>
<given-names>B</given-names>
</name>
</person-group>. <article-title>Hydrodynamics Can Determine the Optimal Route for Microswimmer Navigation</article-title>. <source>Commun Phys</source> (<year>2021</year>) <volume>4</volume>:<fpage>1</fpage>&#x2013;<lpage>11</lpage>. <pub-id pub-id-type="doi">10.1038/s42005-021-00522-6</pub-id> </citation>
</ref>
<ref id="B23">
<label>23.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Qiu</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Mousavi</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Gustavsson</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>C</given-names>
</name>
<name>
<surname>Mehlig</surname>
<given-names>B</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>L</given-names>
</name>
</person-group>. <article-title>Navigation of Micro-swimmers in Steady Flow: the Importance of Symmetries</article-title>. <source>J Fluid Mech</source> (<year>2022</year>) <volume>932</volume>. <pub-id pub-id-type="doi">10.1017/jfm.2021.978</pub-id> </citation>
</ref>
<ref id="B24">
<label>24.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yan</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Chang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>W</given-names>
</name>
</person-group>. <article-title>A Numerical Simulation Method for Bionic Fish Self-Propelled Swimming under Control Based on Deep Reinforcement Learning</article-title>. <source>Proc Inst Mech Eng Part C: J Mech Eng Sci</source> (<year>2020</year>) <volume>234</volume>:<fpage>3397</fpage>&#x2013;<lpage>415</lpage>. <pub-id pub-id-type="doi">10.1177/0954406220915216</pub-id> </citation>
</ref>
<ref id="B25">
<label>25.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yan</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Chang</surname>
<given-names>X-h.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>N-h.</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>R-y.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>L-p.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>W</given-names>
</name>
</person-group>. <article-title>Computational Analysis of Fluid-Structure Interaction in Case of Fish Swimming in the Vortex Street</article-title>. <source>J Hydrodyn</source> (<year>2021</year>) <volume>33</volume>:<fpage>747</fpage>&#x2013;<lpage>62</lpage>. <pub-id pub-id-type="doi">10.1007/s42241-021-0070-4</pub-id> </citation>
</ref>
<ref id="B26">
<label>26.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yan</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Chang</surname>
<given-names>X</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>W</given-names>
</name>
</person-group>. <article-title>Learning How to Avoid Obstacles: A Numerical Investigation for Maneuvering of Self&#x2010;propelled Fish Based on Deep Reinforcement Learning</article-title>. <source>Int J Numer Meth Fluids</source> (<year>2021</year>) <volume>93</volume>:<fpage>3073</fpage>&#x2013;<lpage>91</lpage>. <pub-id pub-id-type="doi">10.1002/fld.5025</pub-id> </citation>
</ref>
<ref id="B27">
<label>27.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhu</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>F-B</given-names>
</name>
<name>
<surname>Young</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Liao</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Lai</surname>
<given-names>JC</given-names>
</name>
</person-group>. <article-title>A Numerical Study of Fish Adaption Behaviors in Complex Environments with a Deep Reinforcement Learning and Immersed Boundary&#x2013;Lattice Boltzmann Method</article-title>. <source>Sci Rep</source> (<year>2021</year>) <volume>11</volume>:<fpage>1</fpage>&#x2013;<lpage>20</lpage>. <pub-id pub-id-type="doi">10.1038/s41598-021-81124-8</pub-id> </citation>
</ref>
<ref id="B28">
<label>28.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tian</surname>
<given-names>F-B</given-names>
</name>
</person-group>. <article-title>A Numerical Study of Linear and Nonlinear Kinematic Models in Fish Swimming with the DSD/SST Method</article-title>. <source>Comput Mech</source> (<year>2015</year>) <volume>55</volume>:<fpage>469</fpage>&#x2013;<lpage>77</lpage>. <pub-id pub-id-type="doi">10.1007/s00466-014-1116-z</pub-id> </citation>
</ref>
<ref id="B29">
<label>29.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhu</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Pang</surname>
<given-names>J-H</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>F-B</given-names>
</name>
</person-group>. <article-title>Stable Schooling Formations Emerge from the Combined Effect of the Active Control and Passive Self-Organization</article-title>. <source>Fluids</source> (<year>2022</year>) <volume>7</volume>:<fpage>41</fpage>. <pub-id pub-id-type="doi">10.3390/fluids7010041</pub-id> </citation>
</ref>
<ref id="B30">
<label>30.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhou</surname>
<given-names>CH</given-names>
</name>
<name>
<surname>Shu</surname>
<given-names>C</given-names>
</name>
</person-group>. <article-title>Simulation of Self-Propelled Anguilliform Swimming by Local Domain-free Discretization Method</article-title>. <source>Int J Numer Meth Fluids</source> (<year>2012</year>) <volume>69</volume>:<fpage>1891</fpage>&#x2013;<lpage>906</lpage>. <pub-id pub-id-type="doi">10.1002/fld.2670</pub-id> </citation>
</ref>
<ref id="B31">
<label>31.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>F-B</given-names>
</name>
<name>
<surname>Young</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lai</surname>
<given-names>JCS</given-names>
</name>
</person-group>. <article-title>A Novel Geometry-Adaptive Cartesian Grid Based Immersed Boundary-Lattice Boltzmann Method for Fluid-Structure Interactions at Moderate and High Reynolds Numbers</article-title>. <source>J Comput Phys</source> (<year>2018</year>) <volume>375</volume>:<fpage>22</fpage>&#x2013;<lpage>56</lpage>. <pub-id pub-id-type="doi">10.1016/j.jcp.2018.08.024</pub-id> </citation>
</ref>
<ref id="B32">
<label>32.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>W-X</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>F-B</given-names>
</name>
</person-group>. <article-title>Recent Trends and Progress in the Immersed Boundary Method</article-title>. <source>Proc Inst Mech Eng Part C: J Mech Eng Sci</source> (<year>2019</year>) <volume>233</volume>:<fpage>7617</fpage>&#x2013;<lpage>36</lpage>. <pub-id pub-id-type="doi">10.1177/0954406219842606</pub-id> </citation>
</ref>
<ref id="B33">
<label>33.</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Kr&#xfc;ger</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Kusumaatmaja</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Kuzmin</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Shardt</surname>
<given-names>O</given-names>
</name>
<name>
<surname>Silva</surname>
<given-names>G</given-names>
</name>
<name>
<surname>Viggen</surname>
<given-names>EM</given-names>
</name>
</person-group>. <source>The Lattice Boltzmann Method</source>. <publisher-loc>Berlin, Germany</publisher-loc>: <publisher-name>Springer</publisher-name> (<year>2017</year>). </citation>
</ref>
<ref id="B34">
<label>34.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ma</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Young</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lai</surname>
<given-names>JCS</given-names>
</name>
<name>
<surname>Sui</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>F-B</given-names>
</name>
</person-group>. <article-title>An Immersed Boundary-Lattice Boltzmann Method for Fluid-Structure Interaction Problems Involving Viscoelastic Fluids and Complex Geometries</article-title>. <source>J Comput Phys</source> (<year>2020</year>) <volume>415</volume>:<fpage>109487</fpage>. <pub-id pub-id-type="doi">10.1016/j.jcp.2020.109487</pub-id> </citation>
</ref>
<ref id="B35">
<label>35.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Xu</surname>
<given-names>Y-Q</given-names>
</name>
<name>
<surname>Tang</surname>
<given-names>X-Y</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>F-B</given-names>
</name>
<name>
<surname>Peng</surname>
<given-names>Y-H</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Zeng</surname>
<given-names>Y-J</given-names>
</name>
</person-group>. <article-title>IB&#x2013;LBM Simulation of the Haemocyte Dynamics in a Stenotic Capillary</article-title>. <source>Comput Methods Biomech Biomed Eng</source> (<year>2014</year>) <volume>17</volume>:<fpage>978</fpage>&#x2013;<lpage>85</lpage>. </citation>
</ref>
<ref id="B36">
<label>36.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>Q</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>F-B</given-names>
</name>
<name>
<surname>Young</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lai</surname>
<given-names>JC</given-names>
</name>
</person-group>. <article-title>Transition to Chaos in a Two-Sided Collapsible Channel Flow</article-title>. <source>J Fluid Mech</source> (<year>2021</year>) <volume>926</volume>. <pub-id pub-id-type="doi">10.1017/jfm.2021.710</pub-id> </citation>
</ref>
<ref id="B37">
<label>37.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tian</surname>
<given-names>F-B</given-names>
</name>
<name>
<surname>Bharti</surname>
<given-names>RP</given-names>
</name>
<name>
<surname>Xu</surname>
<given-names>Y-Q</given-names>
</name>
</person-group>. <article-title>Deforming-Spatial-Domain/Stabilized Space-Time (DSD/SST) Method in Computation of Non-Newtonian Fluid Flow and Heat Transfer with Moving Boundaries</article-title>. <source>Comput Mech</source> (<year>2014</year>) <volume>53</volume>:<fpage>257</fpage>&#x2013;<lpage>71</lpage>. <pub-id pub-id-type="doi">10.1007/s00466-013-0905-0</pub-id> </citation>
</ref>
<ref id="B38">
<label>38.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tian</surname>
<given-names>F-B</given-names>
</name>
</person-group>. <article-title>FSI Modeling with the DSD/SST Method for the Fluid and Finite Difference Method for the Structure</article-title>. <source>Comput Mech</source> (<year>2014</year>) <volume>54</volume>:<fpage>581</fpage>&#x2013;<lpage>9</lpage>. <pub-id pub-id-type="doi">10.1007/s00466-014-1007-3</pub-id> </citation>
</ref>
<ref id="B39">
<label>39.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tian</surname>
<given-names>F-B</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Young</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lai</surname>
<given-names>JCS</given-names>
</name>
</person-group>. <article-title>An FSI Solution Technique Based on the DSD/SST Method and its Applications</article-title>. <source>Math Models Methods Appl Sci</source> (<year>2015</year>) <volume>25</volume>:<fpage>2257</fpage>&#x2013;<lpage>85</lpage>. <pub-id pub-id-type="doi">10.1142/s0218202515400084</pub-id> </citation>
</ref>
<ref id="B40">
<label>40.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mittal</surname>
<given-names>R</given-names>
</name>
<name>
<surname>Iaccarino</surname>
<given-names>G</given-names>
</name>
</person-group>. <article-title>Immersed Boundary Methods</article-title>. <source>Annu Rev Fluid Mech</source> (<year>2005</year>) <volume>37</volume>:<fpage>239</fpage>&#x2013;<lpage>61</lpage>. <pub-id pub-id-type="doi">10.1146/annurev.fluid.37.061903.175743</pub-id> </citation>
</ref>
<ref id="B41">
<label>41.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Sotiropoulos</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Yang</surname>
<given-names>X</given-names>
</name>
</person-group>. <article-title>Immersed Boundary Methods for Simulating Fluid-Structure Interaction</article-title>. <source>Prog Aerospace Sci</source> (<year>2014</year>) <volume>65</volume>:<fpage>1</fpage>&#x2013;<lpage>21</lpage>. <pub-id pub-id-type="doi">10.1016/j.paerosci.2013.09.003</pub-id> </citation>
</ref>
<ref id="B42">
<label>42.</label>
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Xu</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>F-B</given-names>
</name>
<name>
<surname>Young</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Lai</surname>
<given-names>JCS</given-names>
</name>
</person-group>. <article-title>A Geometry-Adaptive Immersed Boundary-Lattice Boltzmann Method for Modelling Fluid-Structure Interaction Problems</article-title>. In: <source>IUTAM Symposium on Recent Advances in Moving Boundary Problems in Mechanics</source>. <publisher-loc>Berlin, Germany</publisher-loc>: <publisher-name>Springer</publisher-name> (<year>2019</year>). p. <fpage>161</fpage>&#x2013;<lpage>71</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-030-13720-5_14</pub-id> </citation>
</ref>
<ref id="B43">
<label>43.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Young</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>F-B</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Z</given-names>
</name>
<name>
<surname>Lai</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Nadim</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Lucey</surname>
<given-names>AD</given-names>
</name>
</person-group>. <article-title>Analysis of Unsteady Flow Effects on the Betz Limit for Flapping Foil Power Generation</article-title>. <source>J Fluid Mech</source> (<year>2020</year>) <volume>902</volume>. <pub-id pub-id-type="doi">10.1017/jfm.2020.612</pub-id> </citation>
</ref>
<ref id="B44">
<label>44.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tian</surname>
<given-names>F-B</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>H</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>L</given-names>
</name>
<name>
<surname>Liao</surname>
<given-names>JC</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>X-Y</given-names>
</name>
</person-group>. <article-title>An Efficient Immersed Boundary-Lattice Boltzmann Method for the Hydrodynamic Interaction of Elastic Filaments</article-title>. <source>J Comput Phys</source> (<year>2011</year>) <volume>230</volume>:<fpage>7266</fpage>&#x2013;<lpage>83</lpage>. <pub-id pub-id-type="doi">10.1016/j.jcp.2011.05.028</pub-id> </citation>
</ref>
<ref id="B45">
<label>45.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mnih</surname>
<given-names>V</given-names>
</name>
<name>
<surname>Kavukcuoglu</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Silver</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Rusu</surname>
<given-names>AA</given-names>
</name>
<name>
<surname>Veness</surname>
<given-names>J</given-names>
</name>
<name>
<surname>Bellemare</surname>
<given-names>MG</given-names>
</name>
<etal/>
</person-group> <article-title>Human-level Control through Deep Reinforcement Learning</article-title>. <source>Nature</source> (<year>2015</year>) <volume>518</volume>:<fpage>529</fpage>&#x2013;<lpage>33</lpage>. <pub-id pub-id-type="doi">10.1038/nature14236</pub-id> </citation>
</ref>
<ref id="B46">
<label>46.</label>
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Hausknecht</surname>
<given-names>M</given-names>
</name>
<name>
<surname>Stone</surname>
<given-names>P</given-names>
</name>
</person-group>. <article-title>Deep Recurrent Q-Learning for Partially Observable MDPs</article-title>. In: <conf-name>2015 AAAI Fall Symposium Series</conf-name> (<year>2015</year>). </citation>
</ref>
<ref id="B47">
<label>47.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Jiao</surname>
<given-names>Y</given-names>
</name>
<name>
<surname>Ling</surname>
<given-names>F</given-names>
</name>
<name>
<surname>Heydari</surname>
<given-names>S</given-names>
</name>
<name>
<surname>Kanso</surname>
<given-names>E</given-names>
</name>
<name>
<surname>Heess</surname>
<given-names>N</given-names>
</name>
<name>
<surname>Merel</surname>
<given-names>J</given-names>
</name>
</person-group>. <article-title>Learning to Swim in Potential Flow</article-title>. <source>Phys Rev Fluids</source> (<year>2021</year>) <volume>6</volume>:<fpage>050505</fpage>. <pub-id pub-id-type="doi">10.1103/physrevfluids.6.050505</pub-id> </citation>
</ref>
<ref id="B48">
<label>48.</label>
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tampuu</surname>
<given-names>A</given-names>
</name>
<name>
<surname>Matiisen</surname>
<given-names>T</given-names>
</name>
<name>
<surname>Kodelja</surname>
<given-names>D</given-names>
</name>
<name>
<surname>Kuzovkin</surname>
<given-names>I</given-names>
</name>
<name>
<surname>Korjus</surname>
<given-names>K</given-names>
</name>
<name>
<surname>Aru</surname>
<given-names>J</given-names>
</name>
<etal/>
</person-group> <article-title>Multiagent Cooperation and Competition with Deep Reinforcement Learning</article-title>. <source>PLoS One</source> (<year>2017</year>) <volume>12</volume>:<fpage>e0172395</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0172395</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>