<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Neurorobot.</journal-id>
<journal-title>Frontiers in Neurorobotics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Neurorobot.</abbrev-journal-title>
<issn pub-type="epub">1662-5218</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fnbot.2022.1068706</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Continuous mode adaptation for cable-driven rehabilitation robot using reinforcement learning</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Yang</surname> <given-names>Renyu</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2138963/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Zheng</surname> <given-names>Jianlin</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Song</surname> <given-names>Rong</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/359091/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Key Laboratory of Sensing Technology and Biomedical Instrument of Guangdong Province, School of Biomedical Engineering, Sun Yat-sen University</institution>, <addr-line>Guangzhou</addr-line>, <country>China</country></aff>
<aff id="aff2"><sup>2</sup><institution>School of Biomedical Engineering, Shenzhen Campus of Sun Yat-sen University</institution>, <addr-line>Shenzhen</addr-line>, <country>China</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Hang Su, Fondazione Politecnico di Milano, Italy</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Wen Qi, Politecnico di Milano, Italy; Hui Zhou, Nanjing University of Science and Technology, China</p></fn>
<corresp id="c001">&#x002A;Correspondence: Rong Song, <email>songrong@mail.sysu.edu.cn</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>22</day>
<month>12</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>16</volume>
<elocation-id>1068706</elocation-id>
<history>
<date date-type="received">
<day>13</day>
<month>10</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>28</day>
<month>11</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2022 Yang, Zheng and Song.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Yang, Zheng and Song</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Continuous mode adaptation is important for satisfying different user rehabilitation needs and improving human&#x2013;robot interaction (HRI) performance in rehabilitation robots. Hence, we propose a reinforcement-learning-based optimal admittance control (RLOAC) strategy for a cable-driven rehabilitation robot (CDRR), which can realize continuous mode adaptation between the passive and active working modes. First, to avoid requiring knowledge of the human and robot dynamics models, a reinforcement learning algorithm was employed to obtain the optimal admittance parameters by minimizing a cost function composed of trajectory error and human voluntary force. Second, the contribution weights of the cost function were modulated according to the human voluntary force, which enabled the CDRR to achieve continuous mode adaptation between the passive and active working modes. Finally, simulation and experiments were conducted with 10 subjects to investigate the feasibility and effectiveness of the RLOAC strategy. The experimental results indicated that the desired performances could be obtained; further, the tracking error and energy per unit distance of the RLOAC strategy were notably lower than those of the traditional admittance control method. The RLOAC strategy is effective in improving the tracking accuracy and robot compliance. Based on its performance, we believe that the proposed RLOAC strategy has potential for use in rehabilitation robots.</p>
</abstract>
<kwd-group>
<kwd>admittance control</kwd>
<kwd>cable-driven rehabilitation robot</kwd>
<kwd>human-robot cooperation</kwd>
<kwd>human-robot interaction</kwd>
<kwd>optimal control</kwd>
<kwd>robot compliance</kwd>
</kwd-group>
<counts>
<fig-count count="9"/>
<table-count count="2"/>
<equation-count count="56"/>
<ref-count count="43"/>
<page-count count="14"/>
<word-count count="10266"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1" sec-type="intro">
<title>1 Introduction</title>
<p>Stroke is one of the leading causes of neurological and functional disability. In China, two million people suffer from stroke each year. Rehabilitation robots have attracted tremendous interest among researchers globally, as they can provide high-intensity, repetitive, and interactive rehabilitation training for post-stroke patients and overcome the labor-intensiveness of traditional manual rehabilitation training (<xref ref-type="bibr" rid="B17">Kwakkel et al., 2008</xref>). Rehabilitation robots, including exoskeleton-type robots such as ARMin (<xref ref-type="bibr" rid="B29">Nef et al., 2007</xref>), RUPERT (<xref ref-type="bibr" rid="B10">Huang et al., 2016</xref>), and UL-EXO7 (<xref ref-type="bibr" rid="B14">Kim et al., 2013</xref>), mimic the role of therapists by providing assistive forces to each joint of the human arm during rehabilitation training. However, these exoskeletons, with bulky rigid links and motors attached to the human arm, significantly increase the movement inertia, which changes the human arm dynamics and reduces the transparency of human&#x2013;robot interaction (HRI) (<xref ref-type="bibr" rid="B22">Mao et al., 2015</xref>). To reduce the moving mass of the robot, the cable-driven rehabilitation robot (CDRR), in which the end-effectors are driven by cables instead of bulky rigid links, was developed; it improves HRI performance owing to its low inertia, compliant structure, safety, and transparency (<xref ref-type="bibr" rid="B13">Jin et al., 2018</xref>). <xref ref-type="bibr" rid="B22">Mao et al. (2015)</xref> developed a cable-driven exoskeleton (CAREX) for upper arm rehabilitation, which uses a multi-stage cable-driven parallel mechanism to reduce the movement inertia, and verified its feasibility in patients. 
<xref ref-type="bibr" rid="B1">Alamdari and Krovi (2015)</xref> designed a home-based cable-driven parallel platform robot driven by five cables for upper-limb neuro-rehabilitation in three-dimensional space. <xref ref-type="bibr" rid="B3">Cui et al. (2017)</xref> designed a 7-degree-of-freedom (DOF) cable-driven arm exoskeleton that can assist the upper limbs in performing complex training tasks involving rotation, translation, and their combination. <xref ref-type="bibr" rid="B2">Chen et al. (2019)</xref> designed a cable-driven parallel waist rehabilitation robot and proposed a two-level control algorithm to assist patients with waist injuries in performing rehabilitation training.</p>
<p>The control strategies applied in rehabilitation robots play a critical role in the rehabilitation effectiveness (<xref ref-type="bibr" rid="B9">Guanziroli et al., 2019</xref>). According to the different recovery stages of post-stroke patients, the control strategies mainly include passive and active control (<xref ref-type="bibr" rid="B32">Proietti et al., 2016</xref>). Passive control is generally used to drive the patient to move repetitively along predefined trajectories to improve the movement ability and reduce muscle atrophy; it is commonly adopted in the early recovery stages for patients with severe impairment (<xref ref-type="bibr" rid="B11">Jamwal et al., 2014</xref>). In active control, the rehabilitation robot assists the patient by complying with human motion intentions; it is mainly applied to patients with mild impairment. <xref ref-type="bibr" rid="B15">Koenig and Riener (2016)</xref> pointed out that passive control ignores the patient&#x2019;s voluntary engagement, which is one of the essential factors in facilitating neuroplasticity and motor function recovery in post-stroke patients (<xref ref-type="bibr" rid="B38">Warraich and Kleim, 2010</xref>); its ability to stimulate neuroplasticity is therefore limited. Performance-adaptive control strategies for patients with different levels of motor disability are necessary to match user rehabilitation needs and recovery stages (<xref ref-type="bibr" rid="B34">Sainburg and Mutha, 2016</xref>). <xref ref-type="bibr" rid="B24">Meuleman et al. (2016)</xref> developed a variable admittance control for LOPES II, which can implement both active and passive control. <xref ref-type="bibr" rid="B40">Wolbrecht et al. (2008)</xref> proposed an assist-as-needed (AAN) control strategy that allows robots to provide only essential assistance according to the patient&#x2019;s movement performance.</p>
<p>Obtaining suitable impedance/admittance parameters for the control strategy is essential to improve HRI performance for rehabilitation robots. Bio-inspired methods that assume fixed impedance, such as the musculoskeletal model (<xref ref-type="bibr" rid="B31">Pfeifer et al., 2012</xref>) or measurements of biological joint impedance (<xref ref-type="bibr" rid="B7">Erden and Billard, 2015</xref>), have been used to estimate the impedance parameters through offline identification. The linear quadratic regulator (LQR) was adopted to obtain the desired admittance parameters through a cost function (<xref ref-type="bibr" rid="B23">Matinfar and Hashtrudi-Zaad, 2016</xref>). These methods are good candidates when accurate models are available and their parameters can be well estimated. However, they are not practically applicable in rehabilitation training scenarios, because the human dynamics model is difficult to build owing to its nonlinearity, complexity, and variability (<xref ref-type="bibr" rid="B6">Driggs-Campbell et al., 2018</xref>). In addition, modeling and measurement errors are inevitable. To deal with this problem, the reinforcement learning (RL) algorithm was used to solve the given LQR problem, minimizing a cost function for optimizing the overall human&#x2013;robot system performance (<xref ref-type="bibr" rid="B28">Modares et al., 2016</xref>; <xref ref-type="bibr" rid="B20">Li et al., 2017</xref>). RL algorithms, including deep RL approaches such as policy search methods and the deep Q-network (DQN), have shown unprecedented success in solving optimal control problems (<xref ref-type="bibr" rid="B25">Mnih et al., 2015</xref>; <xref ref-type="bibr" rid="B35">Silver et al., 2016</xref>). <xref ref-type="bibr" rid="B5">Doya (2000)</xref> used knowledge of the system model to learn the optimal control policy and extended RL to continuous-time systems. 
To handle unknown dynamics, adaptive dynamic programming (ADP) with a special critic&#x2013;actor structure has been extensively studied (<xref ref-type="bibr" rid="B37">Vrabie et al., 2009</xref>; <xref ref-type="bibr" rid="B12">Jiang and Jiang, 2012</xref>; <xref ref-type="bibr" rid="B27">Modares et al., 2015</xref>) and has become a promising tool for learning impedance/admittance parameters for the human&#x2013;robot system. The ADP-based RL (ADPRL) approach was employed to automatically tune 12 impedance parameters and configure a robotic knee with human-in-the-loop (<xref ref-type="bibr" rid="B39">Wen et al., 2019</xref>; <xref ref-type="bibr" rid="B8">Gao et al., 2021</xref>). To achieve a compliant physical robot&#x2013;environment interaction, <xref ref-type="bibr" rid="B30">Peng et al. (2022)</xref> used the ADPRL approach to obtain the desired admittance parameters based on a cost function composed of interaction force and trajectory tracking without knowledge of the environmental dynamics. However, previous studies adopted fixed contribution weights in the cost function, which cannot achieve continuous mode adaptation between the passive and active working modes.</p>
<p>Continuous mode adaptation is important for satisfying different user rehabilitation needs and improving human&#x2013;robot interaction (HRI) performance in rehabilitation robots. In this study, we present a novel reinforcement-learning-based optimal admittance control (RLOAC) strategy, which can achieve on-the-fly transitions between the passive and active working modes according to the human voluntary force. Firstly, we employed an RL algorithm to calculate the optimal admittance parameters for adapting to the different needs of patients without prior knowledge of the human dynamics model, and formulated a new control strategy that applies the optimal admittance parameters in real time by minimizing the cost function to realize the desired HRI performance. Secondly, to promote patients&#x2019; voluntary engagement, the contribution weights of the cost function were adjusted according to the human voluntary force.</p>
</sec>
<sec id="S2">
<title>2 Control strategy design</title>
<sec id="S2.SS1">
<title>2.1 RLOAC framework</title>
<p>The RLOAC framework consists of two control loops, an inner loop and an outer loop, as illustrated in <xref ref-type="fig" rid="F1">Figure 1</xref>. The inner loop is intended for position control, which compensates for the robot&#x2019;s nonlinear dynamics and guarantees trajectory tracking accuracy and stability. This module was implemented and reported in our previous work (<xref ref-type="bibr" rid="B41">Yang et al., 2022</xref>). The outer loop includes three modules: (1) a virtual training environment module provides visual feedback of the trajectory tracking and obstacle avoidance (TTOA) movement task to the subject and outputs the predefined trajectory <italic>P</italic><sub><italic>t</italic></sub>, detailed in Section &#x201C;4.2 Adaptation to human dynamics&#x201D;; (2) an optimal admittance control method is employed to yield the desired trajectory <italic>P</italic><sub><italic>d</italic></sub> to obtain the optimal HRI performance according to the human voluntary force <italic>F</italic><sub><italic>h</italic></sub>; and (3) an RL algorithm is designed to calculate the optimal parameters <bold><italic>K</italic></bold> online, considering that the human and robot dynamics parameters are difficult to identify in practice. The details of the outer-loop designs are presented below.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>The proposed control framework.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnbot-16-1068706-g001.tif"/>
</fig>
</sec>
<sec id="S2.SS2">
<title>2.2 Optimal admittance control</title>
<p>The predefined trajectory <italic>P</italic><sub><italic>t</italic></sub> &#x2208; &#x211D;<sup><italic>n</italic></sup> set by the therapist, which is output directly to the CDRR, can be expressed in the form of a state equation in the Cartesian space.</p>
<disp-formula id="S2.E1">
<label>(1)</label>
<mml:math display="block" id="M1"><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:msub><mml:mover accent="true"><mml:mover accent="true"><mml:mtext mathvariant="bold">P</mml:mtext><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mo>.</mml:mo></mml:mover><mml:mi mathvariant="bold">t</mml:mi></mml:msub></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mover accent="true"><mml:mtext mathvariant="bold">P</mml:mtext><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi mathvariant="bold">t</mml:mi></mml:msub></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mtext mathvariant="bold">B</mml:mtext><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mover accent="true"><mml:mover accent="true"><mml:mtext mathvariant="bold">P</mml:mtext><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mo>&#x00A8;</mml:mo></mml:mover><mml:mi mathvariant="bold">t</mml:mi></mml:msub></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S2.E2">
<label>(2)</label>
<mml:math display="block" id="M2"><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">A</mml:mtext><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnspacing="5pt" displaystyle="true" rowspacing="0pt"><mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:msub><mml:mi>I</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">B</mml:mtext><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mtable displaystyle="true" rowspacing="0pt"><mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:msub><mml:mi>I</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mpadded width="+3.3pt"><mml:msub><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>t</mml:mi></mml:msub></mml:mpadded><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mrow><mml:mpadded width="+1.7pt"><mml:msub><mml:mi>P</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mpadded><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>.</mml:mo></mml:mover><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup></mml:mrow></mml:mrow></mml:mrow></mml:math>
</disp-formula>
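<p>As a minimal sketch of Equations (1) and (2), the Python snippet below builds <bold>A</bold> and <bold>B</bold> for an arbitrary dimension <italic>n</italic> and checks numerically that stacking position and velocity reproduces the state equation. The function name, the circular test trajectory, and all numerical values are illustrative assumptions and are not taken from the experimental setup.</p>
<preformat>
```python
import numpy as np


def trajectory_state_matrices(n):
    """Return A (2n x 2n) and B (2n x n) as defined in Equation (2)."""
    zero = np.zeros((n, n))
    eye = np.eye(n)
    A = np.block([[zero, eye], [zero, zero]])
    B = np.vstack([zero, eye])
    return A, B


# Hypothetical planar (n = 2) circular trajectory, used only as a test signal.
radius, omega, t = 0.1, 0.5, 1.3
P_t = radius * np.array([np.cos(omega * t), np.sin(omega * t)])
dP_t = radius * omega * np.array([-np.sin(omega * t), np.cos(omega * t)])
ddP_t = -omega**2 * P_t

A, B = trajectory_state_matrices(2)
state = np.concatenate([P_t, dP_t])   # stacked state [P_t, dP_t]
lhs = np.concatenate([dP_t, ddP_t])   # time derivative of the stacked state
rhs = A @ state + B @ ddP_t           # right-hand side of Equation (1)
residual = float(np.max(np.abs(lhs - rhs)))
```
</preformat>
<p>In this sketch the acceleration enters through <bold>B</bold> as an <italic>n</italic>-dimensional input, so that the dimensions of both sides of Equation (1) agree.</p>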
<p>The relationship between the human voluntary force <italic>F</italic><sub><italic>h</italic></sub> and the movement of the end-effector <italic>P</italic><sub><italic>d</italic></sub> can be described by the following admittance model (<xref ref-type="bibr" rid="B28">Modares et al., 2016</xref>; <xref ref-type="bibr" rid="B20">Li et al., 2017</xref>):</p>
<disp-formula id="S2.E3">
<label>(3)</label>
<mml:math display="block" id="M3"><mml:mrow><mml:mrow><mml:mrow><mml:msub><mml:mi>M</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00A8;</mml:mo></mml:mover><mml:mi>d</mml:mi></mml:msub></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:msub><mml:mi>B</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>.</mml:mo></mml:mover><mml:mi>d</mml:mi></mml:msub></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mpadded width="+3.3pt"><mml:msub><mml:mi>P</mml:mi><mml:mi>d</mml:mi></mml:msub></mml:mpadded></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mi>h</mml:mi></mml:msub></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:math>
</disp-formula>
<p>where <italic>M</italic><sub><italic>d</italic></sub>, <italic>B</italic><sub><italic>d</italic></sub>, <italic>D</italic><sub><italic>d</italic></sub>, and <italic>K</italic><sub><italic>h</italic></sub> are the inertia, damping, and stiffness matrices and the proportional gain matrix of the human voluntary force, respectively; <italic>l</italic>(<italic>P</italic><sub><italic>t</italic></sub>) is an auxiliary input term, which will be designed later. By defining the augmented state <bold><inline-formula><mml:math id="INEQ8"><mml:mrow><mml:mpadded width="+3.3pt"><mml:msub><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>d</mml:mi></mml:msub></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:msup><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>.</mml:mo></mml:mover><mml:mi>d</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula></bold>, (3) is expressed in the form of a state equation as</p>
<disp-formula id="S2.E4">
<label>(4)</label>
<mml:math display="block" id="M4"><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:msub><mml:mover accent="true"><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mo>.</mml:mo></mml:mover><mml:mi>d</mml:mi></mml:msub></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mrow><mml:mi>A</mml:mi><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>d</mml:mi></mml:msub></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mi>B</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>u</mml:mi></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>where A and B are defined as in (2), and <italic>u</italic> &#x2208; &#x211D;<sup><italic>m</italic></sup>. Combining (3) and (4), <italic>u</italic> is expressed as</p>
<disp-formula id="S2.E5">
<label>(5)</label>
<mml:math display="block" id="M5"><mml:mrow><mml:mpadded width="+3.3pt"><mml:mi>u</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:msubsup><mml:mi>M</mml:mi><mml:mi>d</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msub><mml:mi>B</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>.</mml:mo></mml:mover><mml:mi>d</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>d</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mi>h</mml:mi></mml:msub></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math>
</disp-formula>
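<p>To illustrate how Equations (3)&#x2013;(5) generate a desired trajectory from a measured force, the following Python sketch integrates the state equation (4) with a forward Euler scheme for a constant human force. It is only a sketch under stated assumptions: the function names, the one-dimensional parameter values, the step size, and the choice of setting the auxiliary term <italic>l</italic>(<italic>P</italic><sub><italic>t</italic></sub>) to zero are hypothetical and do not reproduce the controller used in this study.</p>
<preformat>
```python
import numpy as np


def admittance_input(P_d, dP_d, F_h, l_term, M_d, B_d, D_d, K_h):
    """Equation (5): u = M_d^-1 (-B_d dP_d - D_d P_d + K_h F_h + l(P_t))."""
    return np.linalg.solve(M_d, -B_d @ dP_d - D_d @ P_d + K_h @ F_h + l_term)


def simulate_admittance(F_h, M_d, B_d, D_d, K_h, dt=0.001, steps=5000):
    """Integrate Equation (4) with forward Euler under a constant force F_h.

    The auxiliary term l(P_t) is set to zero here, which is an assumption
    made only to keep the sketch self-contained.
    """
    n = F_h.shape[0]
    P_d = np.zeros(n)
    dP_d = np.zeros(n)
    l_term = np.zeros(n)
    for _ in range(steps):
        u = admittance_input(P_d, dP_d, F_h, l_term, M_d, B_d, D_d, K_h)
        P_d = P_d + dt * dP_d   # first block row of Equation (4)
        dP_d = dP_d + dt * u    # second block row of Equation (4)
    return P_d, dP_d


# Hypothetical one-dimensional parameters (not the values used in the study).
M_d = np.array([[1.0]])
B_d = np.array([[20.0]])
D_d = np.array([[100.0]])
K_h = np.array([[1.0]])
F_h = np.array([5.0])

P_final, dP_final = simulate_admittance(F_h, M_d, B_d, D_d, K_h)
P_static = np.linalg.solve(D_d, K_h @ F_h)   # static balance D_d P_d = K_h F_h
```
</preformat>
<p>With these illustrative values the simulated position settles at the static balance of Equation (3), where the stiffness term equals the scaled human force and the velocity vanishes.</p>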
<p>The trajectory deformations are defined as <italic>e</italic><sub><italic>d</italic></sub>=<italic>P</italic><sub><italic>t</italic></sub>&#x2212;<italic>P</italic><sub><italic>d</italic></sub> and <inline-formula><mml:math id="INEQ12"><mml:mrow><mml:mpadded width="+3.3pt"><mml:msub><mml:mover accent="true"><mml:mi>e</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>d</mml:mi></mml:msub></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:msup><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>e</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>e</mml:mi><mml:mo>.</mml:mo></mml:mover><mml:mi>d</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula>. Combining (1) and (4), the trajectory deformation dynamics is expressed as</p>
<disp-formula id="S2.E6">
<label>(6)</label>
<mml:math display="block" id="M6"><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:mover accent="true"><mml:msub><mml:mover accent="true"><mml:mi>e</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>d</mml:mi></mml:msub><mml:mo>.</mml:mo></mml:mover></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mrow><mml:mi>A</mml:mi><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>e</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>d</mml:mi></mml:msub></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mi>B</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mover accent="true"><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mo>&#x00A8;</mml:mo></mml:mover><mml:mi>t</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:mi>u</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow><mml:mo>.</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>Similar to the approach in <xref ref-type="bibr" rid="B36">Suzuki and Furuta (2012)</xref>, human dynamics is expressed as</p>
<disp-formula id="S2.E7">
<label>(7)</label>
<mml:math display="block" id="M7"><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:mover accent="true"><mml:msub><mml:mi>F</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo>.</mml:mo></mml:mover></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msup><mml:mi>T</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mi>h</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:msup><mml:mi>T</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>K</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mover accent="true"><mml:msub><mml:mi>e</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>.</mml:mo></mml:mover></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:msup><mml:mi>T</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>K</mml:mi><mml:mi>p</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>e</mml:mi><mml:mi>d</mml:mi></mml:msub></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>where <italic>K</italic><sub><italic>p</italic></sub>, <italic>K</italic><sub><italic>d</italic></sub>, and <italic>T</italic> are the proportional coefficient of the human brain controller, the differential coefficient, and the time constant of the neuromuscular system, respectively. Defining the state variable as <inline-formula><mml:math id="INEQ16"><mml:mrow><mml:mpadded width="+3.3pt"><mml:mi>X</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:msup><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:msub><mml:mi>e</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>e</mml:mi><mml:mo>.</mml:mo></mml:mover><mml:mi>d</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup></mml:mrow></mml:math></inline-formula> and then combining (6) and (7), a state equation for the HRI system can be established as</p>
<disp-formula id="S2.E8">
<label>(8)</label>
<mml:math display="block" id="M8"><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:mover accent="true"><mml:mi>X</mml:mi><mml:mo>.</mml:mo></mml:mover></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mrow><mml:mover accent="true"><mml:mi>A</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mover accent="true"><mml:mi>B</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00A8;</mml:mo></mml:mover><mml:mi>t</mml:mi></mml:msub><mml:mo>-</mml:mo><mml:mi>u</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S2.E9">
<label>(9)</label>
<mml:math display="block" id="M9"><mml:mrow><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:mover accent="true"><mml:mi>A</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnspacing="5pt" displaystyle="true" rowspacing="0pt"><mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:msub><mml:mi>I</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:msup><mml:mi>T</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>K</mml:mi><mml:mi>p</mml:mi></mml:msub></mml:mrow></mml:mtd><mml:mtd columnalign="center"><mml:mrow><mml:msup><mml:mi>T</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>K</mml:mi><mml:mi>d</mml:mi></mml:msub></mml:mrow></mml:mtd><mml:mtd columnalign="center"><mml:mrow><mml:mo>-</mml:mo><mml:msup><mml:mi>T</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mpadded width="+3.3pt"><mml:mover accent="true"><mml:mi>B</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>I</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S2.Ex1">
<mml:math display="block" id="M10"><mml:mrow><mml:mpadded width="+3.3pt"><mml:mi>u</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:msubsup><mml:mi>M</mml:mi><mml:mi>d</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msub><mml:mi>B</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>.</mml:mo></mml:mover><mml:mi>d</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mpadded width="+3.3pt"><mml:msub><mml:mi>P</mml:mi><mml:mi>d</mml:mi></mml:msub></mml:mpadded></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">+</mml:mo><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mpadded width="+3.3pt"><mml:msub><mml:mi>F</mml:mi><mml:mi>h</mml:mi></mml:msub></mml:mpadded></mml:mrow><mml:mo rspace="5.8pt">+</mml:mo><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S2.Ex2">
<mml:math display="block" id="M11"><mml:mrow><mml:mi mathvariant="normal"></mml:mi><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:msubsup><mml:mi>M</mml:mi><mml:mi>d</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>B</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mpadded width="+3.3pt"><mml:msub><mml:mover accent="true"><mml:mi>e</mml:mi><mml:mo>.</mml:mo></mml:mover><mml:mi>d</mml:mi></mml:msub></mml:mpadded></mml:mrow><mml:mo rspace="5.8pt">+</mml:mo><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mpadded width="+3.3pt"><mml:msub><mml:mi>e</mml:mi><mml:mi>d</mml:mi></mml:msub></mml:mpadded></mml:mrow><mml:mo rspace="5.8pt">+</mml:mo><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mi>h</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S2.Ex3">
<mml:math display="block" id="M12"><mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:msubsup><mml:mi>M</mml:mi><mml:mi>d</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msub><mml:mi>M</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00A8;</mml:mo></mml:mover><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msub><mml:mi>B</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>.</mml:mo></mml:mover><mml:mi>t</mml:mi></mml:msub></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S2.E10">
<label>(10)</label>
<mml:math display="block" id="M13"><mml:mrow><mml:mi/><mml:mo>&#x2261;</mml:mo><mml:mrow><mml:msub><mml:mi>u</mml:mi><mml:mi>e</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>d</mml:mi></mml:msub></mml:mrow></mml:mrow></mml:math>
</disp-formula>
<p>The control input <italic>u</italic> can be divided into two components: the feedback control input <italic>u</italic><sub><italic>e</italic></sub> and the feedforward control input <italic>u</italic><sub><italic>d</italic></sub> (<xref ref-type="bibr" rid="B28">Modares et al., 2016</xref>; <xref ref-type="bibr" rid="B20">Li et al., 2017</xref>). We designed the auxiliary input term <italic>l</italic>(<italic>P</italic><sub><italic>t</italic></sub>) in (10) as</p>
<disp-formula id="S2.E11">
<label>(11)</label>
<mml:math display="block" id="M14"><mml:mrow><mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo rspace="5.8pt">)</mml:mo></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>M</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mpadded width="+3.3pt"><mml:msub><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00A8;</mml:mo></mml:mover><mml:mi>t</mml:mi></mml:msub></mml:mpadded></mml:mrow><mml:mo rspace="5.8pt">+</mml:mo><mml:mrow><mml:msub><mml:mi>B</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mpadded width="+3.3pt"><mml:msub><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>.</mml:mo></mml:mover><mml:mi>t</mml:mi></mml:msub></mml:mpadded></mml:mrow><mml:mo rspace="5.8pt">+</mml:mo><mml:mrow><mml:msub><mml:mi>D</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mrow></mml:mrow></mml:mrow><mml:mo>.</mml:mo></mml:mrow></mml:math>
</disp-formula>
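<p>With this choice of <italic>l</italic>(<italic>P</italic><sub><italic>t</italic></sub>), the feedforward term in (10) cancels (<italic>u</italic><sub><italic>d</italic></sub> = 0), so that <italic>u</italic> = <italic>u</italic><sub><italic>e</italic></sub> and the system can be rewritten as</p>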
<disp-formula id="S2.E12">
<label>(12)</label>
<mml:math display="block" id="M15"><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:mover accent="true"><mml:mi>X</mml:mi><mml:mo>.</mml:mo></mml:mover></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mrow><mml:mover accent="true"><mml:mi>A</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mo>&#x2062;</mml:mo><mml:mpadded width="+3.3pt"><mml:mi>X</mml:mi></mml:mpadded></mml:mrow><mml:mo rspace="5.8pt">+</mml:mo><mml:mrow><mml:mover accent="true"><mml:mi>B</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>e</mml:mi></mml:msub></mml:mrow></mml:mrow></mml:mrow><mml:mo>.</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>The feedback control input can be rewritten as</p>
<disp-formula id="S2.E13">
<label>(13)</label>
<mml:math display="block" id="M16"><mml:mrow><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:msub><mml:mi>u</mml:mi><mml:mi>e</mml:mi></mml:msub></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mi>K</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mpadded width="+3.3pt"><mml:mi>K</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:msubsup><mml:mi>M</mml:mi><mml:mi>d</mml:mi><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:msub><mml:mi>B</mml:mi><mml:mi>d</mml:mi></mml:msub></mml:mtd><mml:mtd columnalign="center"><mml:msub><mml:mi>D</mml:mi><mml:mi>d</mml:mi></mml:msub></mml:mtd><mml:mtd columnalign="center"><mml:msub><mml:mi>K</mml:mi><mml:mi>h</mml:mi></mml:msub></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>where <italic>K</italic> &#x2208; &#x211D;<italic><sup>n&#x00D7;3n</sup></italic> is the control gain, which contains the admittance parameters. To minimize <italic>e</italic><sub><italic>d</italic></sub>, <inline-formula><mml:math id="INEQ23"><mml:mover accent="true"><mml:msub><mml:mi>e</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>.</mml:mo></mml:mover></mml:math></inline-formula>, <italic>u</italic><sub><italic>e</italic></sub>, and <italic>F</italic><sub><italic>h</italic></sub>, a cost function is designed as follows:</p>
<disp-formula id="S2.E14">
<label>(14)</label>
<mml:math display="block" id="M17"><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:mi>J</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:msubsup><mml:mo largeop="true" symmetric="true">&#x222B;</mml:mo><mml:mn>0</mml:mn><mml:mi mathvariant="normal">&#x221E;</mml:mi></mml:msubsup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mmultiscripts><mml:mi>e</mml:mi><mml:mi>d</mml:mi><mml:none/><mml:none/><mml:mi>T</mml:mi></mml:mmultiscripts><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>Q</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mpadded width="+3.3pt"><mml:msub><mml:mi>e</mml:mi><mml:mi>d</mml:mi></mml:msub></mml:mpadded></mml:mrow><mml:mo rspace="5.8pt">+</mml:mo><mml:mrow><mml:msup><mml:mover accent="true"><mml:msub><mml:mi>e</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>.</mml:mo></mml:mover><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>Q</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mpadded width="+3.3pt"><mml:mover accent="true"><mml:msub><mml:mi>e</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>.</mml:mo></mml:mover></mml:mpadded></mml:mrow><mml:mo rspace="5.8pt">+</mml:mo><mml:mrow><mml:msubsup><mml:mi>u</mml:mi><mml:mi>e</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mpadded width="+3.3pt"><mml:msub><mml:mi>u</mml:mi><mml:mi>e</mml:mi></mml:msub></mml:mpadded></mml:mrow><mml:mo rspace="5.8pt">+</mml:mo><mml:mrow><mml:msubsup><mml:mi>F</mml:mi><mml:mi>h</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mi>h</mml:mi></mml:msub></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo mathvariant="italic" 
rspace="0pt">d</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>where <italic>Q</italic><sub>1</sub>, <italic>Q</italic><sub>2</sub>, <italic>R</italic><sub>1</sub>, <italic>R</italic><sub>2</sub> &#x2208; &#x211D;<italic><sup>n&#x00D7;n</sup></italic> are the weighting factors of <inline-formula><mml:math id="INEQ27"><mml:mrow><mml:mrow><mml:msub><mml:mi>e</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mover accent="true"><mml:msub><mml:mi>e</mml:mi><mml:mi>d</mml:mi></mml:msub><mml:mo>.</mml:mo></mml:mover><mml:mo>,</mml:mo><mml:msub><mml:mi>u</mml:mi><mml:mi>e</mml:mi></mml:msub></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math></inline-formula> and <italic>F</italic><sub><italic>h</italic></sub>, which allow a trade-off between the tracking error and human voluntary force. <italic>Q</italic> and <italic>R</italic><sub>2</sub> are defined as follows:</p>
<disp-formula id="S2.E15">
<label>(15)</label>
<mml:math display="block" id="M18"><mml:mrow><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:mi>Q</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnspacing="5pt" displaystyle="true" rowspacing="0pt"><mml:mtr><mml:mtd columnalign="center"><mml:msub><mml:mi>Q</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:msub><mml:mi>Q</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:msub><mml:mi>R</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mpadded width="+3.3pt"><mml:msub><mml:mi>R</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mi>d</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>i</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>a</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>g</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>r</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="normal">&#x22EF;</mml:mi><mml:mo>,</mml:mo><mml:msub><mml:mi>r</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>where <italic>r</italic><sub>2</sub> is the diagonal element of <italic>R</italic><sub>2</sub>. <italic>R</italic><sub>1</sub> and <italic>R</italic><sub>2</sub> weight the robot control input and the human voluntary force, respectively, and thus determine the relative contributions of robot and human to the cost <italic>J</italic> in shared control. A robotic system can adapt its autonomy level by dynamically adjusting <italic>R</italic><sub>2</sub>. A smaller <italic>R</italic><sub>2</sub> indicates a higher propensity for the robot to lead the shared control task, and vice versa. Since the motion capability and intention of the subject can be estimated from her/his voluntary force, <italic>R</italic><sub>2</sub> should be adjusted according to the human voluntary force to improve HRI performance in terms of robot compliance. A larger human voluntary force indicates a stronger capability and intention to deviate from the predefined trajectory. In this case, the human should be assigned the dominant role and the robot should show greater compliance with the human voluntary actions, which can be achieved by increasing <italic>R</italic><sub>2</sub>. The reverse is true for a smaller human voluntary force. Thus, by modulating <italic>R</italic><sub>2</sub> according to the human voluntary force, the robot can realize continuous mode adaptation between the passive and active working modes. The weighting element <italic>r</italic><sub>2</sub> can be adjusted as follows:</p>
<disp-formula id="S2.E16">
<label>(16)</label>
<mml:math display="block" id="M19"><mml:mrow><mml:mpadded width="+3.3pt"><mml:msub><mml:mi>r</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable displaystyle="true" rowspacing="0pt"><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:mrow><mml:mrow><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>i</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo rspace="8.6pt">+</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="normal">&#x03B1;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>a</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>x</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>i</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo rspace="7.5pt">,</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>f</mml:mi><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mrow><mml:mo fence="true">||</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo fence="true">||</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:mrow><mml:mo>&gt;</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mi>c</mml:mi></mml:msub></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x03B1;</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mfrac><mml:mi mathvariant="normal">&#x03C0;</mml:mi><mml:mn>2</mml:mn></mml:mfrac></mml:mrow><mml:mo>,</mml:mo><mml:mfrac><mml:mi 
mathvariant="normal">&#x03C0;</mml:mi><mml:mn>2</mml:mn></mml:mfrac><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:msub><mml:mi>r</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>i</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo rspace="7.5pt">,</mml:mo><mml:mrow><mml:mi>o</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>h</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>e</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>r</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>w</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>i</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>s</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>e</mml:mi></mml:mrow></mml:mrow></mml:mtd></mml:mtr></mml:mtable><mml:mi/></mml:mrow></mml:mrow></mml:math>
</disp-formula>
<p>where <italic>r</italic><sub><italic>min</italic></sub> and <italic>r</italic><sub><italic>max</italic></sub> are the minimum and maximum values of <italic>r</italic><sub>2</sub>, respectively. <italic>F</italic><sub><italic>c</italic></sub> is the threshold value of the human voluntary force. ||&#x22C5;||<sub>2</sub> denotes the 2-norm of a vector. ||<italic>F</italic><sub><italic>h</italic></sub>||<sub>2</sub> &#x2264; <italic>F</italic><sub><italic>c</italic></sub> implies that <italic>F</italic><sub><italic>h</italic></sub> contains only sensor noise or involuntary force, which means that the user cannot exert a voluntary force and therefore the CDRR should operate in the passive working mode. &#x03B1; is the angle between the direction of <italic>F</italic><sub><italic>h</italic></sub> and that of the optimal control input <inline-formula><mml:math id="INEQ44"><mml:msubsup><mml:mi>u</mml:mi><mml:mi>e</mml:mi><mml:mmultiscripts><mml:mrow/><mml:mprescripts/><mml:none/><mml:mo>&#x002A;</mml:mo></mml:mmultiscripts></mml:msubsup></mml:math></inline-formula>. The condition <inline-formula><mml:math id="INEQ45"><mml:mrow><mml:mi mathvariant="normal">&#x03B1;</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mfrac><mml:mi>&#x03C0;</mml:mi><mml:mn>2</mml:mn></mml:mfrac></mml:mrow><mml:mo>,</mml:mo><mml:mfrac><mml:mi>&#x03C0;</mml:mi><mml:mn>2</mml:mn></mml:mfrac><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> indicates that the direction of <italic>F</italic><sub><italic>h</italic></sub> agrees with that of <inline-formula><mml:math id="INEQ46"><mml:msubsup><mml:mi>u</mml:mi><mml:mi>e</mml:mi><mml:mmultiscripts><mml:mrow/><mml:mprescripts/><mml:none/><mml:mo>&#x002A;</mml:mo></mml:mmultiscripts></mml:msubsup></mml:math></inline-formula>. 
The conditions <inline-formula><mml:math id="INEQ47"><mml:mrow><mml:mrow><mml:mo fence="true">||</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo fence="true">||</mml:mo></mml:mrow><mml:mo rspace="5.8pt">&gt;</mml:mo><mml:mrow><mml:mpadded width="+5pt"><mml:msub><mml:mi>F</mml:mi><mml:mi>c</mml:mi></mml:msub></mml:mpadded><mml:mo>&#x2062;</mml:mo><mml:mi>and</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi mathvariant="normal">&#x03B1;</mml:mi></mml:mrow><mml:mo>&#x2208;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mfrac><mml:mi>&#x03C0;</mml:mi><mml:mn>2</mml:mn></mml:mfrac></mml:mrow><mml:mo>,</mml:mo><mml:mfrac><mml:mi>&#x03C0;</mml:mi><mml:mn>2</mml:mn></mml:mfrac><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math></inline-formula> indicate that the user has some capability to correctly perform the cooperative control tasks; hence, the CDRR should operate in the active working mode. &#x03B3;(<italic>F</italic><sub><italic>h</italic></sub>,&#x03B1;) &#x2208; [0,1] is a weight factor, which is used to transition <italic>r</italic><sub>2</sub> smoothly between <italic>r</italic><sub><italic>min</italic></sub> and <italic>r</italic><sub><italic>max</italic></sub> and is defined as</p>
<disp-formula id="S2.E17">
<label>(17)</label>
<mml:math display="block" id="M20"><mml:mrow><mml:mrow><mml:mrow><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:mi mathvariant="normal">&#x03B1;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>a</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>n</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>h</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:mrow><mml:mi>&#x03BC;</mml:mi><mml:mo>&#x22C5;</mml:mo><mml:mi>m</mml:mi></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mi>a</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>x</mml:mi><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mrow><mml:mo stretchy="false">{</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mo fence="true">||</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo fence="true">||</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mo fence="true">||</mml:mo><mml:msub><mml:mi>F</mml:mi><mml:mi>c</mml:mi></mml:msub><mml:mo fence="true">||</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msub></mml:mrow><mml:mo stretchy="false">}</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mo>&#x22C5;</mml:mo><mml:mi>m</mml:mi></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mi>a</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>x</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mrow><mml:mi>c</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>o</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>s</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi mathvariant="normal">&#x03B1;</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>where &#x03BC; is a scale factor, which determines the ramping rate of &#x03B3;. The weight factor &#x03B3;(<italic>F</italic><sub><italic>h</italic></sub>,&#x03B1;) for &#x03BC; = 0.5 and <italic>F</italic><sub><italic>c</italic></sub> = 1.5 N is illustrated in <xref ref-type="fig" rid="F2">Figure 2</xref>.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>Smooth transition of weight factor &#x03B3;(<italic>F</italic><sub><italic>h</italic></sub>,&#x03B1;) between 0 and 1. &#x03B3;(<italic>F</italic><sub><italic>h</italic></sub>,&#x03B1;) = 0 and &#x03B3;(<italic>F</italic><sub><italic>h</italic></sub>,&#x03B1;) = 1 correspond to the passive and active working modes, respectively.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnbot-16-1068706-g002.tif"/>
</fig>
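<p>As a minimal sketch of the weight adaptation in (16) and (17), the following Python functions compute &#x03B3;(<italic>F</italic><sub><italic>h</italic></sub>,&#x03B1;) and <italic>r</italic><sub>2</sub> from a force vector and an optimal control input. The function names, argument names, and the handling of zero-length vectors are illustrative assumptions and not part of the original controller; cos &#x03B1; is assumed to be obtained from the normalized inner product of the two vectors, and <italic>F</italic><sub><italic>c</italic></sub> is treated as a non-negative scalar.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def weight_factor(f_h, u_e_opt, f_c, mu):
    """Weight factor gamma(F_h, alpha) of Eq. (17), a value in [0, 1]."""
    norm_f = math.sqrt(sum(v * v for v in f_h))
    norm_u = math.sqrt(sum(v * v for v in u_e_opt))
    if norm_f == 0.0 or norm_u == 0.0:
        # Assumption: with an undefined direction, report no voluntary effort.
        return 0.0
    # cos(alpha), alpha being the angle between F_h and the optimal input.
    dot = sum(a * b for a, b in zip(f_h, u_e_opt))
    cos_alpha = dot / (norm_f * norm_u)
    # Force magnitude above the threshold F_c (zero below the threshold).
    excess = max(0.0, norm_f - f_c)
    return math.tanh(mu * excess ** 2 * max(0.0, cos_alpha))


def adapt_r2(f_h, u_e_opt, f_c, mu, r_min, r_max):
    """Diagonal weight r_2 of Eq. (16).

    gamma is already zero when ||F_h|| does not exceed F_c or when
    cos(alpha) is not positive, so the "otherwise" branch of Eq. (16)
    (r_2 = r_min) is covered without an explicit case distinction.
    """
    gamma = weight_factor(f_h, u_e_opt, f_c, mu)
    return r_min + gamma * (r_max - r_min)
```
]]></preformat>
<p>With &#x03BC; = 0.5 and <italic>F</italic><sub><italic>c</italic></sub> = 1.5 N, a force below the threshold or opposed to the optimal input returns <italic>r</italic><sub><italic>min</italic></sub>, whereas an aligned force of increasing magnitude moves <italic>r</italic><sub>2</sub> smoothly toward <italic>r</italic><sub><italic>max</italic></sub>.</p>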
<p>Based on optimal control theory (<xref ref-type="bibr" rid="B16">Kwakernaak and Sivan, 1972</xref>), the optimal admittance parameters can be obtained using the LQR algorithm, provided that the exact model parameters of the human control and robot system dynamics are known. The optimal parameters that minimize the cost function (14) are given by</p>
<disp-formula id="S2.E18">
<label>(18)</label>
<mml:math display="block" id="M21"><mml:mrow><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:msup><mml:mi>K</mml:mi><mml:msup><mml:mi/><mml:mo>&#x002A;</mml:mo></mml:msup></mml:msup></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msubsup><mml:mi>R</mml:mi><mml:mn>1</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mover accent="true"><mml:mi>B</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>P</mml:mi><mml:msup><mml:mi/><mml:mo>&#x002A;</mml:mo></mml:msup></mml:msup></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mpadded width="+3.3pt"><mml:msubsup><mml:mi>u</mml:mi><mml:mi>e</mml:mi><mml:msup><mml:mi/><mml:mo>&#x002A;</mml:mo></mml:msup></mml:msubsup></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:msup><mml:mi>K</mml:mi><mml:msup><mml:mi/><mml:mo>&#x002A;</mml:mo></mml:msup></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>where <italic>P</italic>&#x002A; is the solution to the following algebraic Riccati equation (ARE):</p>
<disp-formula id="S2.E19">
<label>(19)</label>
<mml:math display="block" id="M22"><mml:mrow><mml:mrow><mml:mrow><mml:mrow><mml:mrow><mml:mrow><mml:msup><mml:mover accent="true"><mml:mi>A</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mi>P</mml:mi></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mi>P</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mover accent="true"><mml:mi>A</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:mrow></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:mi>P</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mover accent="true"><mml:mi>B</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mo>&#x2062;</mml:mo><mml:msubsup><mml:mi>R</mml:mi><mml:mn>1</mml:mn><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mover accent="true"><mml:mi>B</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mi>P</mml:mi></mml:mrow></mml:mrow><mml:mo>+</mml:mo><mml:mpadded width="+3.3pt"><mml:mi>Q</mml:mi></mml:mpadded></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mn>0</mml:mn></mml:mrow><mml:mo>.</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>Thus, the optimal admittance parameters and proportional gain of the human voluntary force (<italic>M</italic><sub><italic>d</italic></sub>, <italic>B</italic><sub><italic>d</italic></sub>, <italic>D</italic><sub><italic>d</italic></sub>, and <italic>K</italic><sub><italic>h</italic></sub>) are determined.</p>
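<p>For illustration only, the following Python sketch solves the ARE (19) for a single-axis case (<italic>n</italic> = 1) when the model is assumed to be known. It uses Kleinman's model-based policy iteration (repeated Lyapunov solves), which is a swapped-in technique chosen here to stay within NumPy and is not necessarily the solver used by the authors. The human-model values <code>T</code>, <code>KP</code>, <code>KD</code>, the weights, and the initial stabilizing gain <code>K0</code> are hypothetical. The sign convention follows (13) and (18), i.e., <italic>u</italic><sub><italic>e</italic></sub> = <italic>KX</italic>.</p>
<preformat preformat-type="code"><![CDATA[
```python
import numpy as np


def solve_lyapunov(a_cl, m):
    """Solve a_cl.T @ P + P @ a_cl + m = 0 for P by Kronecker vectorization."""
    n = a_cl.shape[0]
    eye = np.eye(n)
    # Row-major vec: vec(a_cl.T P) = kron(a_cl.T, I) vec(P),
    #                vec(P a_cl)   = kron(I, a_cl.T) vec(P).
    lhs = np.kron(a_cl.T, eye) + np.kron(eye, a_cl.T)
    p = np.linalg.solve(lhs, -m.reshape(-1)).reshape(n, n)
    return 0.5 * (p + p.T)  # remove round-off asymmetry


def kleinman_lqr(a, b, q, r1, k0, iters=30):
    """Model-based policy iteration; k0 must make a + b @ k0 stable."""
    k = k0
    for _ in range(iters):
        a_cl = a + b @ k                        # closed loop under u_e = K X
        p = solve_lyapunov(a_cl, q + k.T @ r1 @ k)   # policy evaluation
        k = -np.linalg.solve(r1, b.T @ p)            # policy improvement, Eq. (18)
    return k, p


# Hypothetical single-axis example (all numbers are assumptions).
T, KP, KD = 0.2, 5.0, 1.0
A_BAR = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 0.0],
                  [KP / T, KD / T, -1.0 / T]])
B_BAR = np.array([[0.0], [1.0], [0.0]])
Q = np.diag([10.0, 1.0, 1.0])   # blkdiag(Q1, Q2, R2) as in Eq. (15), r_2 = 1
R1 = np.array([[1.0]])
K0 = np.array([[-1.0, -2.0, 0.0]])   # stabilizing initial gain

K_OPT, P_OPT = kleinman_lqr(A_BAR, B_BAR, Q, R1, K0)
```
]]></preformat>
<p>Each iteration evaluates the current gain through a Lyapunov equation and then improves it, which is the model-based counterpart of the data-driven iteration discussed in the next section; changing the last diagonal entry of <code>Q</code> corresponds to adjusting <italic>r</italic><sub>2</sub>.</p>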
</sec>
<sec id="S2.SS3">
<title>2.3 RL algorithm</title>
<p>The disadvantage of solving the ARE (19) by using the LQR algorithm is that it requires the exact parameters of the human&#x2013;robot system dynamics, which are difficult to obtain in practice. Several RL algorithms have been designed to overcome this limitation (<xref ref-type="bibr" rid="B37">Vrabie et al., 2009</xref>; <xref ref-type="bibr" rid="B12">Jiang and Jiang, 2012</xref>; <xref ref-type="bibr" rid="B26">Modares and Lewis, 2014</xref>). In this study, the RL algorithm of <xref ref-type="bibr" rid="B12">Jiang and Jiang (2012)</xref> was employed to calculate the optimal admittance parameters online, adapting to the needs of different patients while the parameters of the human&#x2013;robot system dynamics remain completely unknown. Based on Theorem 2 in <xref ref-type="bibr" rid="B12">Jiang and Jiang (2012)</xref>, the numerical approximation form of the Bellman equation for the aforementioned LQR problem of the system in (12), which is used to solve the ARE (19), is given below.</p>
<disp-formula id="S2.E20">
<label>(20)</label>
<mml:math display="block" id="M23"><mml:mtable displaystyle="true" rowspacing="0pt"><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi>&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>P</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi>&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>P</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:mi/><mml:mo>=</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msubsup><mml:mo largeop="true" 
symmetric="true">&#x222B;</mml:mo><mml:mi>t</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi>&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow></mml:msubsup><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>Q</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mi>k</mml:mi></mml:msup></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo mathvariant="italic" rspace="0pt">d</mml:mo><mml:mi>&#x03C4;</mml:mi></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:msubsup><mml:mo largeop="true" 
symmetric="true">&#x222B;</mml:mo><mml:mi>t</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi>&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow></mml:msubsup><mml:mrow><mml:mmultiscripts><mml:mi>u</mml:mi><mml:mi>e</mml:mi><mml:none/><mml:none/><mml:mi>T</mml:mi></mml:mmultiscripts><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo mathvariant="italic" rspace="0pt">d</mml:mo><mml:mi>&#x03C4;</mml:mi></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:msubsup><mml:mo largeop="true" symmetric="true">&#x222B;</mml:mo><mml:mi>t</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi>&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow></mml:msubsup><mml:msup><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msup><mml:mi>K</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup></mml:mrow></mml:mrow></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd 
columnalign="center"><mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mi>d</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>&#x03C4;</mml:mi></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math>
</disp-formula>
<p>It is clear that (20) does not rely on the dynamic parameters <inline-formula><mml:math id="INEQ54"><mml:mover accent="true"><mml:mi>A</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> or <inline-formula><mml:math id="INEQ55"><mml:mover accent="true"><mml:mi>B</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> in (12). Then, the Kronecker product is used to express (20) as (<xref ref-type="bibr" rid="B12">Jiang and Jiang, 2012</xref>)</p>
<disp-formula id="S2.E21">
<label>(21)</label>
<mml:math display="block" id="M24"><mml:mrow><mml:mrow><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>P</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">)</mml:mo></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:msup><mml:mover accent="true"><mml:mi>X</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>k</mml:mi></mml:msup></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S2.E22">
<label>(22)</label>
<mml:math display="block" id="M25"><mml:mrow><mml:mrow><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>P</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo rspace="5.8pt">)</mml:mo></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:msup><mml:mover accent="true"><mml:mi>X</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>k</mml:mi></mml:msup></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S2.Ex4">
<mml:math display="block" id="M26"><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>Q</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mi>k</mml:mi></mml:msup></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S2.E23">
<label>(23)</label>
<mml:math display="block" id="M27"><mml:mrow><mml:mrow><mml:mi/><mml:mo>=</mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>&#x2297;</mml:mo><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mi>v</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>e</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>c</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>Q</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mi>k</mml:mi></mml:msup></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S2.Ex5">
<mml:math display="block" id="M28"><mml:mrow><mml:msubsup><mml:mi>u</mml:mi><mml:mi>e</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S2.E24">
<label>(24)</label>
<mml:math display="block" id="M29"><mml:mrow><mml:mrow><mml:mi mathvariant="normal"></mml:mi><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:msubsup><mml:mi>u</mml:mi><mml:mi>e</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>&#x2297;</mml:mo><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2297;</mml:mo><mml:msub><mml:mi>I</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mi>v</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>e</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>c</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S2.Ex6">
<mml:math display="block" id="M30"><mml:mrow><mml:msup><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msup><mml:mi>K</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S2.Ex7">
<mml:math display="block" id="M31"><mml:mrow><mml:mrow><mml:mi mathvariant="normal"></mml:mi><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>&#x2297;</mml:mo><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03C4;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>I</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>&#x2297;</mml:mo><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mi>v</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>e</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>c</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>where</p>
<disp-formula id="S2.Ex8">
<mml:math display="block" id="M32"><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:mi>X</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mi mathvariant="normal">&#x22EF;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S2.Ex9">
<mml:math display="block" id="M33"><mml:mrow><mml:mpadded width="+3.3pt"><mml:mover accent="true"><mml:mi>X</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:msubsup><mml:mi>X</mml:mi><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:msubsup><mml:mo>,</mml:mo><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mrow><mml:mo>,</mml:mo><mml:mi mathvariant="normal">&#x22EF;</mml:mi><mml:mo>,</mml:mo><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow><mml:mo>,</mml:mo><mml:msubsup><mml:mi>X</mml:mi><mml:mn>2</mml:mn><mml:mn>2</mml:mn></mml:msubsup><mml:mo>,</mml:mo><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mn>3</mml:mn></mml:msub></mml:mrow><mml:mo>,</mml:mo><mml:mi mathvariant="normal">&#x22EF;</mml:mi><mml:mo>,</mml:mo><mml:mrow><mml:msub><mml:mi>X</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>X</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow><mml:mo>,</mml:mo><mml:msubsup><mml:mi>X</mml:mi><mml:mi>n</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S2.E26">
<label>(26)</label>
<mml:math display="block" id="M34"><mml:mrow><mml:mpadded width="+3.3pt"><mml:msup><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>k</mml:mi></mml:msup></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:msubsup><mml:mi>P</mml:mi><mml:mn>11</mml:mn><mml:mi>k</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x2062;</mml:mo><mml:msubsup><mml:mi>P</mml:mi><mml:mn>12</mml:mn><mml:mi>k</mml:mi></mml:msubsup></mml:mrow><mml:mo>,</mml:mo><mml:mi mathvariant="normal">&#x22EF;</mml:mi><mml:mo>,</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x2062;</mml:mo><mml:msubsup><mml:mi>P</mml:mi><mml:mrow><mml:mn>1</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mi>k</mml:mi></mml:msubsup></mml:mrow><mml:mo>,</mml:mo><mml:msubsup><mml:mi>P</mml:mi><mml:mn>22</mml:mn><mml:mi>k</mml:mi></mml:msubsup><mml:mo>,</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x2062;</mml:mo><mml:msubsup><mml:mi>P</mml:mi><mml:mn>23</mml:mn><mml:mi>k</mml:mi></mml:msubsup></mml:mrow><mml:mo>,</mml:mo><mml:mi mathvariant="normal">&#x22EF;</mml:mi><mml:mo>,</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x2062;</mml:mo><mml:msubsup><mml:mi>P</mml:mi><mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>,</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mi>k</mml:mi></mml:msubsup></mml:mrow><mml:mo>,</mml:mo><mml:msubsup><mml:mi>P</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mi>k</mml:mi></mml:msubsup><mml:mo>]</mml:mo></mml:mrow></mml:mrow></mml:math>
</disp-formula>
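<p>As an illustrative check (not part of the original derivation), the identities in (21)&#x2013;(24) can be verified numerically. The following Python sketch uses NumPy; the helper names <monospace>quad_basis</monospace> and <monospace>sym_to_vec</monospace>, the dimensions, and the random test data are assumptions made for illustration. The vec operator is taken here to stack rows (NumPy's default ordering), a convention under which (24) holds as written.</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import numpy as np

def quad_basis(x):
    """X-bar: the n(n+1)/2 products x_i * x_j for i <= j, ordered as in the text."""
    n = len(x)
    return np.array([x[i] * x[j] for i in range(n) for j in range(i, n)])

def sym_to_vec(P):
    """P-bar: diagonal entries once, off-diagonal entries (i < j) doubled."""
    n = P.shape[0]
    return np.array([P[i, j] if i == j else 2.0 * P[i, j]
                     for i in range(n) for j in range(i, n)])

# Hypothetical sizes and random data, used only to exercise the identities.
rng = np.random.default_rng(0)
n, m = 3, 2
x = rng.standard_normal(n)
u = rng.standard_normal(m)
S = rng.standard_normal((n, n))
P = S + S.T                                  # arbitrary symmetric matrix
R1 = np.array([[2.0, 0.5], [0.5, 1.0]])      # arbitrary symmetric weight
K = rng.standard_normal((m, n))

# (21)/(22): x^T P x equals the inner product of X-bar and P-bar.
lhs_quad = x @ P @ x
rhs_quad = quad_basis(x) @ sym_to_vec(P)

# (23): x^T M x = (x^T kron x^T) vec(M).
M = np.eye(n) + K.T @ R1 @ K
lhs_23 = x @ M @ x
rhs_23 = np.kron(x, x) @ M.ravel()

# (24): u^T R1 K x = (u^T kron x^T)(R1 kron I_n) vec(K), with row-stacking vec.
lhs_24 = u @ R1 @ K @ x
rhs_24 = np.kron(u, x) @ np.kron(R1, np.eye(n)) @ K.ravel()
```
]]></preformat>
<p>Each pair of left- and right-hand sides agrees to floating-point precision. The reduced vectors have length <italic>n</italic>(<italic>n</italic> + 1)/2 rather than <italic>n</italic><sup>2</sup>, which removes the redundant unknowns of the symmetric matrix from the least-squares problem.</p>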
<p>Combining (21) and (22), the left-hand side of (20) can be written as</p>
<disp-formula id="S2.Ex10">
<mml:math display="block" id="M35"><mml:mrow><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>P</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>P</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S2.E27">
<label>(27)</label>
<mml:math display="block" id="M36"><mml:mrow><mml:mi mathvariant="normal"></mml:mi><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mrow><mml:msup><mml:mover accent="true"><mml:mi>X</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msup><mml:mover accent="true"><mml:mi>X</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>k</mml:mi></mml:msup></mml:mrow></mml:mrow></mml:math>
</disp-formula>
<p>By combining (21)&#x2013;(25), (20) can be rewritten as</p>
<disp-formula id="S2.E28">
<label>(28)</label>
<mml:math display="block" id="M37"><mml:mtable displaystyle="true" rowspacing="0pt"><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mrow><mml:msup><mml:mover accent="true"><mml:mi>X</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msup><mml:mover accent="true"><mml:mi>X</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>k</mml:mi></mml:msup></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:mi/><mml:mo>=</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:mi>v</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>e</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>c</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>Q</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mi>k</mml:mi></mml:msup></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:msubsup><mml:mo largeop="true" symmetric="true">&#x222B;</mml:mo><mml:mi>t</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi 
mathvariant="normal">&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow></mml:msubsup><mml:mrow><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2297;</mml:mo><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo mathvariant="italic" rspace="0pt">d</mml:mo><mml:mi mathvariant="normal">&#x03C4;</mml:mi></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2297;</mml:mo><mml:msub><mml:mi>I</mml:mi><mml:mi>n</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mi>v</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>e</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>c</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:msubsup><mml:mo largeop="true" symmetric="true">&#x222B;</mml:mo><mml:mi>t</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow></mml:msubsup><mml:mrow><mml:mrow><mml:msubsup><mml:mi>u</mml:mi><mml:mi>e</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:mo>&#x2297;</mml:mo><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo mathvariant="italic" rspace="0pt">d</mml:mo><mml:mi mathvariant="normal">&#x03C4;</mml:mi></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd 
columnalign="center"><mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>I</mml:mi><mml:mi>n</mml:mi></mml:msub><mml:mo>&#x2297;</mml:mo><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mi>v</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>e</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>c</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:msubsup><mml:mo largeop="true" symmetric="true">&#x222B;</mml:mo><mml:mi>t</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow></mml:msubsup><mml:mrow><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2297;</mml:mo><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo mathvariant="italic" rspace="0pt">d</mml:mo><mml:mi mathvariant="normal">&#x03C4;</mml:mi></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math>
</disp-formula>
<p>We introduce the following definitions to reduce (28) to a simpler form:</p>
<disp-formula id="S2.E29">
<label>(29)</label>
<mml:math display="block" id="M38"><mml:mtable columnspacing="5pt" displaystyle="true" rowspacing="0pt"><mml:mtr><mml:mtd columnalign="center"><mml:msub><mml:mi mathvariant="normal">&#x03B4;</mml:mi><mml:mrow><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi></mml:mrow></mml:msub></mml:mtd><mml:mtd columnalign="center"><mml:mrow><mml:mrow><mml:mi/><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mrow><mml:msup><mml:mover accent="true"><mml:mi>X</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msup><mml:mover accent="true"><mml:mi>X</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi></mml:mrow></mml:msub></mml:mtd><mml:mtd columnalign="center"><mml:mrow><mml:mrow><mml:mi/><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:msubsup><mml:mo largeop="true" symmetric="true">&#x222B;</mml:mo><mml:mi>t</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow></mml:msubsup><mml:mrow><mml:mrow><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2297;</mml:mo><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo mathvariant="italic" rspace="0pt">d</mml:mo><mml:mi 
mathvariant="normal">&#x03C4;</mml:mi></mml:mrow></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>u</mml:mi></mml:mrow></mml:msub></mml:mtd><mml:mtd columnalign="center"><mml:mrow><mml:mrow><mml:mi/><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:msubsup><mml:mo largeop="true" symmetric="true">&#x222B;</mml:mo><mml:mi>t</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x03B4;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow></mml:mrow></mml:msubsup><mml:mrow><mml:mrow><mml:msubsup><mml:mi>u</mml:mi><mml:mi>e</mml:mi><mml:mi>T</mml:mi></mml:msubsup><mml:mo>&#x2297;</mml:mo><mml:msup><mml:mi>X</mml:mi><mml:mi>T</mml:mi></mml:msup></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo mathvariant="italic" rspace="0pt">d</mml:mo><mml:mi mathvariant="normal">&#x03C4;</mml:mi></mml:mrow></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:msup><mml:mi>b</mml:mi><mml:mi>k</mml:mi></mml:msup></mml:mtd><mml:mtd 
columnalign="center"><mml:mrow><mml:mrow><mml:mi/><mml:mo>=</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msub><mml:mtext>I</mml:mtext><mml:mrow><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mi>v</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>e</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>c</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>Q</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mi>k</mml:mi></mml:msup></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:msup><mml:mi mathvariant="normal">&#x0393;</mml:mi><mml:mi>k</mml:mi></mml:msup></mml:mtd><mml:mtd columnalign="center"><mml:mrow><mml:mrow><mml:mi/><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:msub><mml:mi mathvariant="normal">&#x03B4;</mml:mi><mml:mrow><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi></mml:mrow></mml:msub><mml:mo 
rspace="5.3pt">,</mml:mo><mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mtext>I</mml:mtext><mml:mrow><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mtext>I</mml:mtext><mml:mi>n</mml:mi></mml:msub><mml:mo>&#x2297;</mml:mo><mml:msup><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>)</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mi>R</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mtext>I</mml:mtext><mml:mrow><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>u</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>R</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2297;</mml:mo><mml:msub><mml:mtext>I</mml:mtext><mml:mi>n</mml:mi></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo>.</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math>
</disp-formula>
<p>Then, (28) can be simplified as</p>
<disp-formula id="S2.E30">
<label>(30)</label>
<mml:math display="block" id="M39"><mml:mtable displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:mrow><mml:msup><mml:mi mathvariant="normal">&#x0393;</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mtable displaystyle="true" rowspacing="0pt"><mml:mtr><mml:mtd columnalign="center"><mml:msup><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>k</mml:mi></mml:msup></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:mi>v</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>e</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>c</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:msup><mml:mi>b</mml:mi><mml:mi>k</mml:mi></mml:msup></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math>
</disp-formula>
<p>Following <xref ref-type="bibr" rid="B12">Jiang and Jiang (2012)</xref>, a least-squares (LS) method is implemented online to obtain the optimal solution <inline-formula><mml:math id="INEQ56"><mml:msup><mml:mi>P</mml:mi><mml:mmultiscripts><mml:mrow/><mml:mprescripts/><mml:none/><mml:mo>&#x002A;</mml:mo></mml:mmultiscripts></mml:msup></mml:math></inline-formula>. First, set <italic>u</italic><sub><italic>e</italic></sub> = <italic>K</italic><sup>0</sup><italic>X</italic> + &#x03C6; as the initial input, where <italic>K</italic><sup>0</sup> is the initial value of the control gain and &#x03C6; is a probing noise. Then, online data are collected and &#x03B4;<sub><italic>XX</italic></sub>, <italic>I</italic><sub><italic>XX</italic></sub>, and <italic>I</italic><sub><italic>Xu</italic></sub> are calculated until the following rank condition is satisfied:</p>
<disp-formula id="S2.E31">
<label>(31)</label>
<mml:math display="block" id="M40"><mml:mrow><mml:mrow><mml:mi>r</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>a</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>n</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>k</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>X</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>I</mml:mi><mml:mrow><mml:mi>X</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>u</mml:mi></mml:mrow></mml:msub><mml:mo>]</mml:mo></mml:mrow><mml:mo rspace="5.8pt">)</mml:mo></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mn>3</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mi>n</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:mn>3</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mi>n</mml:mi></mml:mrow><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mn>2</mml:mn></mml:mfrac><mml:mo>+</mml:mo><mml:mrow><mml:mn>3</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mi>m</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>n</mml:mi></mml:mrow></mml:mrow></mml:mrow></mml:math>
</disp-formula>
<p>After the rank condition is satisfied, the LS solution is obtained as</p>
<disp-formula id="S2.E32">
<label>(32)</label>
<mml:math display="block" id="M41"><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mtable displaystyle="true" rowspacing="0pt"><mml:mtr><mml:mtd columnalign="center"><mml:msup><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>k</mml:mi></mml:msup></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mrow><mml:mi>v</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>e</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>c</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msup><mml:mi>K</mml:mi><mml:mrow><mml:mi>k</mml:mi><mml:mo>+</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msup><mml:mi mathvariant="normal">&#x0393;</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi mathvariant="normal">&#x0393;</mml:mi><mml:mi>k</mml:mi></mml:msup></mml:mrow><mml:mo stretchy="false">]</mml:mo></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msup><mml:mi mathvariant="normal">&#x0393;</mml:mi><mml:mi>k</mml:mi></mml:msup><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup><mml:mo>&#x2062;</mml:mo><mml:msup><mml:mi>b</mml:mi><mml:mi>k</mml:mi></mml:msup></mml:mrow></mml:mrow></mml:math>
</disp-formula>
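<p>The LS step in (32) can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name <monospace>solve_ls</monospace> and the synthetic data are hypothetical, and the column-rank check on &#x0393;<sup><italic>k</italic></sup> is used here as a stand-in for the rank condition (31).</p>
<preformat preformat-type="code" xml:space="preserve"><![CDATA[
```python
import numpy as np

def solve_ls(Gamma, b):
    """Solve Gamma @ theta = b in the least-squares sense.

    theta stacks P-bar and vec(K^{k+1}). A column-rank check (assumed here
    as a proxy for rank condition (31)) guards against too little data.
    """
    n_unknowns = Gamma.shape[1]
    if np.linalg.matrix_rank(Gamma) < n_unknowns:
        raise ValueError("Gamma is rank deficient; collect more data")
    theta, *_ = np.linalg.lstsq(Gamma, b, rcond=None)
    return theta

# Synthetic stand-in for the stacked data: 40 intervals, 7 unknowns.
rng = np.random.default_rng(1)
n_rows, n_unknowns = 40, 7
Gamma = rng.standard_normal((n_rows, n_unknowns))
theta_true = rng.standard_normal(n_unknowns)
b = Gamma @ theta_true

theta_lstsq = solve_ls(Gamma, b)
# Literal normal-equation form of (32), shown for comparison.
theta_normal = np.linalg.solve(Gamma.T @ Gamma, Gamma.T @ b)
```
]]></preformat>
<p>Both routes recover the same solution on well-conditioned data. A factorization-based solver such as <monospace>numpy.linalg.lstsq</monospace> is usually preferable in practice because explicitly forming (&#x0393;<sup><italic>k</italic></sup>)<sup><italic>T</italic></sup>&#x0393;<sup><italic>k</italic></sup> squares the condition number.</p>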
<p>Then, the policy is improved as <italic>u</italic><sub><italic>e</italic></sub> = <italic>K<sup>k + 1</sup>X</italic>, and the LS procedure above is repeated until ||<italic>K<sup>k + 1</sup></italic> &#x2212; <italic>K<sup>k</sup></italic>|| &#x003C; &#x03B5; for a prescribed threshold &#x03B5;. Finally, the optimal <inline-formula><mml:math id="INEQ65"><mml:msup><mml:mi>K</mml:mi><mml:mmultiscripts><mml:mrow/><mml:mprescripts/><mml:none/><mml:mo>&#x002A;</mml:mo></mml:mmultiscripts></mml:msup></mml:math></inline-formula> is obtained. The RL algorithm is summarized in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
<table-wrap position="float" id="T1">
<label>TABLE 1</label>
<caption><p>Reinforcement-learning algorithm.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left" colspan="2">RL Algorithm</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="left">Select an admissible policy <italic>u</italic><sub><italic>e</italic></sub> = <italic>K</italic><sup>0</sup><italic>X</italic> + &#x03C6;;</td>
</tr>
<tr>
<td valign="top" align="left">2</td>
<td valign="top" align="left">For <italic>k</italic> = 0, 1, 2, &#x22EF;, given <italic>K<sup>k</sup></italic>, collect online data and calculate &#x03B4;<sub><italic>XX</italic></sub>, <italic>I</italic><sub><italic>XX</italic></sub>, and <italic>I</italic><sub><italic>Xu</italic></sub> until the rank condition (31) is satisfied, then solve (32) for <inline-formula><mml:math id="INEQ78"><mml:msup><mml:mover accent="true"><mml:mi>P</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mi>k</mml:mi></mml:msup></mml:math></inline-formula> and <italic>K<sup>k + 1</sup></italic>;</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="left">Improve the control policy as <italic>u</italic><sub><italic>e</italic></sub> = <italic>K<sup>k + 1</sup>X</italic> and go to step 2 until ||<italic>K<sup>k + 1</sup></italic> &#x2212; <italic>K<sup>k</sup></italic>|| &#x003C; &#x03B5;;</td>
</tr>
<tr>
<td valign="top" align="left">4</td>
<td valign="top" align="left">Apply <italic>u</italic><sub><italic>e</italic></sub> = <italic>K<sup>k + 1</sup>X</italic> to the system as the approximate optimal policy.</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>Remark 1: To satisfy the persistent excitation condition, the probing noise &#x03C6; is added to the control input signal; this is necessary to guarantee that the matrix inverted in the LS solution is nonsingular.</p>
<p>Remark 2: In <xref ref-type="bibr" rid="B28">Modares et al. (2016)</xref> and <xref ref-type="bibr" rid="B20">Li et al. (2017)</xref>, an RL algorithm is employed to solve the ARE with partial knowledge of the system dynamics. Specifically, <inline-formula><mml:math id="INEQ66"><mml:mover accent="true"><mml:mi>A</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> is not needed in the solving process, but <inline-formula><mml:math id="INEQ67"><mml:mover accent="true"><mml:mi>B</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> is still required for policy improvement. In contrast, in this study we followed <xref ref-type="bibr" rid="B12">Jiang and Jiang (2012)</xref> and employed an RL algorithm that uses only the online information of the input and system states to solve ARE (19), relying on neither <inline-formula><mml:math id="INEQ68"><mml:mover accent="true"><mml:mi>A</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> nor <inline-formula><mml:math id="INEQ69"><mml:mover accent="true"><mml:mi>B</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula>. From the definitions of <inline-formula><mml:math id="INEQ70"><mml:mover accent="true"><mml:mi>A</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> and <inline-formula><mml:math id="INEQ71"><mml:mover accent="true"><mml:mi>B</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> in (8), it follows that neither the human control dynamic parameters in (7) nor the robotic impedance parameters in (3) are required in our method. The convergence of the RL algorithm was proved by Theorem 7 in <xref ref-type="bibr" rid="B12">Jiang and Jiang (2012)</xref>.
Although both this study and the previous study (<xref ref-type="bibr" rid="B12">Jiang and Jiang, 2012</xref>) employed the RL algorithm with completely unknown dynamics parameters, the previous study did not address the HRI problem for robots, whereas here the algorithm is used to obtain the optimal admittance parameters for improving the HRI performance.</p>
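<p>The policy evaluation and policy improvement loop of Table 1 can be illustrated on a toy problem. The following is a minimal sketch, not the implementation used in this study: it runs the model-based (Kleinman) form of the iteration, in which the policy is evaluated by solving a Lyapunov equation from known dynamics, whereas the RL algorithm above replaces that step with an LS fit to measured data. The double-integrator plant, the unit weights, the sign convention u = &#x2212;Kx, the initial gain, and all function names are assumptions chosen only so that the result can be checked by hand.</p>
<preformat>
```python
import math


def evaluate_policy(k1, k2, q1=1.0, q2=1.0, r=1.0):
    """Policy evaluation for x1' = x2, x2' = u with u = -(k1*x1 + k2*x2).

    Solves the 2x2 Lyapunov equation Ac^T P + P Ac = -(Q + K^T R K)
    in closed form and returns the entries (p1, p2, p3) of
    P = [[p1, p2], [p2, p3]]. Requires a stabilizing gain (k1, k2 > 0).
    """
    m11 = q1 + r * k1 * k1
    m12 = r * k1 * k2
    m22 = q2 + r * k2 * k2
    p2 = m11 / (2.0 * k1)
    p3 = (2.0 * p2 + m22) / (2.0 * k2)
    p1 = k2 * p2 + k1 * p3 - m12
    return p1, p2, p3


def policy_iteration(k1, k2, r=1.0, eps=1e-9, max_iter=50):
    """Repeat evaluation and improvement until the gain change drops below eps."""
    iteration = 0
    for iteration in range(1, max_iter + 1):
        _, p2, p3 = evaluate_policy(k1, k2, r=r)
        # Policy improvement: K = R^-1 B^T P with B = [0, 1]^T.
        new_k1, new_k2 = p2 / r, p3 / r
        change = math.hypot(new_k1 - k1, new_k2 - k2)
        k1, k2 = new_k1, new_k2
        if eps > change:
            break
    return (k1, k2), iteration


# Hypothetical admissible (stabilizing) initial gain.
gain, iterations = policy_iteration(1.0, 1.0)
print(gain, iterations)
```
</preformat>
<p>With these assumed values, the gain converges to approximately (1, &#x221A;3), which is the solution of the corresponding Riccati equation; the stopping rule mirrors step 3 of Table 1.</p>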
</sec>
</sec>
<sec id="S3">
<title>3 System description</title>
<p>To perform our research on an upper-limb rehabilitation robot, a 3-DOF CDRR prototype was developed in our laboratory. As shown in <xref ref-type="fig" rid="F3">Figure 3A</xref>, the CDRR constructed to demonstrate and test the proposed control strategy consisted of a cubic mechanical framework, a cable transmission mechanism, an actuator module, sensors, a controller (MicroLabBox, dSPACE, Germany), and a personal computer [Intel i7-8700 CPU at 3.2 GHz with 32 GB of random access memory (RAM), China] running ControlDesk (dSPACE, Germany) and MATLAB R2019b. The cable transmission mechanism and actuator module consisted of four cables, pulleys, four winches, an end-effector, and four motors (DM1B-045G, Yokogawa, Japan) with servo drivers (UB1DG3, Yokogawa, Japan). The four cables were pre-stretched, high-stiffness, lightweight steel wires. One end of each cable was fastened to the end-effector and the other end to a winch. The winches were driven by the motors to control the lengths of the cables (<xref ref-type="fig" rid="F3">Figure 3B</xref>). The sensors on the CDRR included S-shaped tensile/force sensors (HSTL-BLSM, Beijing Huakong Xingye Technology Company, China) mounted on the mechanical framework to measure the cable tension, a 6-axis F/T sensor (SRI-V-210105-G, Sunrise Instruments, China) attached to the end-effector to measure the human voluntary force between the CDRR and the human, and a motion capture system (OptiTrack, NaturalPoint, USA) with four cameras (Flex3, NaturalPoint, USA) to measure the position of a marker placed on the end-effector. The control strategy, data acquisition, and data recording were implemented on the controller at a sampling frequency of 1 kHz. A guidance monitor with a virtual training environment was used to display the exercise game.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p><bold>(A)</bold> Prototype of 3-degrees of freedom (DOFs) cable-driven rehabilitation robot (CDRR), <bold>(B)</bold> one cable transmission mechanism and actuator module, <bold>(C)</bold> the graphical guidance interface.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnbot-16-1068706-g003.tif"/>
</fig>
</sec>
<sec id="S4">
<title>4 Simulation studies</title>
<sec id="S4.SS1">
<title>4.1 Optimization of admittance parameters through RL algorithm</title>
<p>Simulation studies were conducted to investigate the convergence speed and accuracy of the RL algorithm. For comparison, the optimal admittance parameters were obtained using both the RL algorithm and the LQR algorithm (<xref ref-type="bibr" rid="B23">Matinfar and Hashtrudi-Zaad, 2016</xref>). Following <xref ref-type="bibr" rid="B36">Suzuki and Furuta (2012)</xref>, and to simplify the simulations, we assumed when applying the LQR method that the human dynamics can be modeled as (7) with <italic>K</italic><sub><italic>p</italic></sub> = 779, <italic>K</italic><sub><italic>d</italic></sub> = 288, and <italic>T</italic> = 0.18. The matrices <inline-formula><mml:math id="INEQ85"><mml:mover accent="true"><mml:mi>A</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> and <inline-formula><mml:math id="INEQ86"><mml:mover accent="true"><mml:mtext>B</mml:mtext><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:math></inline-formula> in (9) then become</p>
<disp-formula id="S4.Ex11">
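<mml:math display="block" id="M42"><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:mover accent="true"><mml:mi>A</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnspacing="5pt" displaystyle="true" rowspacing="0pt">
<mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>1</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr>
<mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>1</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr>
<mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr>
<mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr>
<mml:mtr><mml:mtd columnalign="center"><mml:mn>4327.8</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>1600</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mrow><mml:mo>-</mml:mo><mml:mn>5.556</mml:mn></mml:mrow></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr>
<mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>4327.8</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>1600</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mrow><mml:mo>-</mml:mo><mml:mn>5.556</mml:mn></mml:mrow></mml:mtd></mml:mtr>
</mml:mtable><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>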
</disp-formula>
<disp-formula id="S4.E33">
<label>(33)</label>
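<mml:math display="block" id="M43"><mml:mrow><mml:mpadded width="+3.3pt"><mml:mover accent="true"><mml:mi>B</mml:mi><mml:mo>&#x00AF;</mml:mo></mml:mover></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:msup><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnspacing="5pt" displaystyle="true" rowspacing="0pt">
<mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>1</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr>
<mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>1</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr>
</mml:mtable><mml:mo>]</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup></mml:mrow></mml:math>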
</disp-formula>
<p>The matrices <italic>R</italic><sub><italic>1</italic></sub> and <italic>R</italic><sub><italic>2</italic></sub> in the cost function (14) were set as</p>
<disp-formula id="S4.E34">
<label>(34)</label>
<mml:math display="block" id="M44"><mml:mrow><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:mi>Q</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mi>d</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>i</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>a</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>g</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mn>5000</mml:mn><mml:mo>,</mml:mo><mml:mn>5000</mml:mn><mml:mo>,</mml:mo><mml:mn>500</mml:mn><mml:mo>,</mml:mo><mml:mn>500</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>,</mml:mo><mml:mn>1</mml:mn><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mpadded width="+3.3pt"><mml:msub><mml:mi>R</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:msub><mml:mi>I</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:mrow><mml:mo>.</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>Similar to <xref ref-type="bibr" rid="B23">Matinfar and Hashtrudi-Zaad (2016)</xref> and <xref ref-type="bibr" rid="B42">Yang et al. (2021)</xref>, the optimal admittance parameters obtained directly by the LQR algorithm using the exact parameters of the human&#x2013;robot system model (12) were</p>
<disp-formula id="S4.E35">
<label>(35)</label>
<mml:math display="block" id="M45"><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:msup><mml:mi>K</mml:mi><mml:msup><mml:mi/><mml:mo>&#x002A;</mml:mo></mml:msup></mml:msup></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnspacing="5pt" displaystyle="true" rowspacing="0pt"><mml:mtr><mml:mtd columnalign="center"><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mn>151.459</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>58.257</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mtd><mml:mtd columnalign="center"><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0.810</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>151.459</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mtd><mml:mtd columnalign="center"><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mn>58.257</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0.810</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo>.</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>In practice, it is nearly impossible to obtain the actual parameters of the human&#x2013;robot system model (12). To avoid requiring these parameters, the RL algorithm presented in Section &#x201C;2 Control strategy design&#x201D; was used to calculate the optimal admittance parameters online. The initial values of the system parameters were set as</p>
<disp-formula id="S4.Ex12">
<mml:math display="block" id="M46"><mml:mrow><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:msub><mml:mi>K</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnspacing="5pt" displaystyle="true" rowspacing="0pt"><mml:mtr><mml:mtd columnalign="center"><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mn>1200</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>1400</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>1400</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mtd><mml:mtd columnalign="center"><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mn>1500</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>60</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>4</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mn>1500</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>1400</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>1500</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mtd><mml:mtd columnalign="center"><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mn>2000</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>70</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>10</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mpadded width="+3.3pt"><mml:msub><mml:mi>P</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mn>10</mml:mn><mml:mo>&#x2062;</mml:mo><mml:msub><mml:mtext mathvariant="bold">I</mml:mtext><mml:mn>6</mml:mn></mml:msub></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<disp-formula id="S4.E36">
<label>(36)</label>
<mml:math display="block" id="M47"><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:msub><mml:mi>X</mml:mi><mml:mn>0</mml:mn></mml:msub></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:msup><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mn>0.1</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0.1</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mtd><mml:mtd columnalign="center"><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mtd><mml:mtd columnalign="center"><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow><mml:mi>T</mml:mi></mml:msup></mml:mrow><mml:mo>.</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>To satisfy the requirement of persistent excitation, we chose a probing noise given by</p>
<disp-formula id="S4.E37">
<label>(37)</label>
<mml:math display="block" id="M48"><mml:mrow><mml:mrow><mml:mi>&#x03C6;</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:munderover><mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo><mml:mi>&#x03C9;</mml:mi><mml:mn>100</mml:mn></mml:munderover><mml:mrow><mml:mn>0.001</mml:mn><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>rand</mml:mi><mml:mo>-</mml:mo><mml:mn>0.5</mml:mn></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mi>sin</mml:mi><mml:mo>&#x2061;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>&#x03C9;</mml:mi><mml:mo>&#x00D7;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>rand</mml:mi><mml:mo>-</mml:mo><mml:mn>0.5</mml:mn></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>/</mml:mo><mml:mi>&#x03C9;</mml:mi></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>where <italic>rand</italic> is a random number between 0 and 1. The sampling time was selected as <italic>T</italic> = 0.001 s, and 100 samples were collected in each iteration. After 18 iterations, the optimal admittance parameters obtained by the RL algorithm were</p>
<disp-formula id="S4.E38">
<label>(38)</label>
<mml:math display="block" id="M49"><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:mi>K</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mtable columnspacing="5pt" displaystyle="true" rowspacing="0pt"><mml:mtr><mml:mtd columnalign="center"><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mn>151.462</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0.003</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>58.257</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mtd><mml:mtd columnalign="center"><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0.810</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr><mml:mtr><mml:mtd columnalign="center"><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mn>0.006</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>151.464</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mtd><mml:mtd columnalign="center"><mml:mtable columnspacing="5pt" displaystyle="true"><mml:mtr><mml:mtd columnalign="center"><mml:mn>58.256</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0</mml:mn></mml:mtd><mml:mtd columnalign="center"><mml:mn>0.810</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo>.</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p><xref ref-type="fig" rid="F4">Figure 4A</xref> illustrates the evolution of the admittance parameters, and <xref ref-type="fig" rid="F4">Figure 4B</xref> shows that of the error ||<italic>K</italic> &#x2212; <italic>K</italic>&#x002A;||<sub>2</sub> between the RL algorithm and the LQR method. After five iterations (0.5 s), the convergence errors of the optimal admittance parameters were lower than 0.01. Thus, the RL algorithm achieves accuracy similar to that of the LQR algorithm with an acceptable convergence speed.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption><p><bold>(A)</bold> Evolution of the admittance parameters <italic>K</italic> over the duration of the simulation. Blue lines are the optimal admittance parameters <italic>K</italic>&#x002A; obtained with the linear quadratic regulator (LQR) algorithm for a specific human control dynamic model. Black lines are the results obtained by the reinforcement learning (RL) method under unknown human dynamics; <italic>K</italic>(<italic>i</italic>, <italic>j</italic>) is the element of <italic>K</italic>, where <italic>i</italic> and <italic>j</italic> are the row and column indices, respectively. <bold>(B)</bold> Convergence of ||<italic>K</italic> &#x2212; <italic>K</italic>&#x002A;||<sub>2</sub>.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnbot-16-1068706-g004.tif"/>
</fig>
</sec>
<sec id="S4.SS2">
<title>4.2 Adaptation to human dynamics</title>
<p>The TTOA movement task was designed to span the passive and active working modes for different subjects and was displayed in a graphical guidance interface. As shown in <xref ref-type="fig" rid="F3">Figure 3C</xref>, a predefined trajectory <italic>P</italic><sub><italic>t</italic></sub> (black dotted line) represented a basic movement task suitable for patients without voluntary movement ability, and was defined as</p>
<disp-formula id="S4.E39">
<label>(39)</label>
<mml:math display="block" id="M50"><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:msub><mml:mi>P</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mrow><mml:mn>0.1</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mi>c</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>o</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>s</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:mn>0.1</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mi>&#x03C0;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mn>0.5</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mi>&#x03C0;</mml:mi></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>,</mml:mo><mml:mrow><mml:mn>0.65</mml:mn><mml:mo>+</mml:mo><mml:mrow><mml:mn>0.1</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mi>s</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>i</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>n</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mrow><mml:mn>0.1</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mi>&#x03C0;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mn>0.5</mml:mn><mml:mo>&#x2062;</mml:mo><mml:mi>&#x03C0;</mml:mi></mml:mrow></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo stretchy="false">]</mml:mo></mml:mrow></mml:mrow><mml:mo>.</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p><italic>P</italic><sub><italic>a</italic></sub>, the actual position of the end-effector, was displayed in real time with a green slider. <italic>P</italic><sub><italic>h</italic></sub> (blue line) was the human target path, which remained unknown to the CDRR but was displayed in real time with an orange slider in the graphical interface so that the subject could see it. During this task, the subject was instructed to watch the green and orange sliders in the graphical interface and to control the end-effector by hand so that the green slider tracked the orange slider as closely as possible. To engage and challenge patients with less severe impairments, an obstacle with a diameter of 0.05 m centered at O<sub>2</sub> (0, 0.5500) may appear on the path of <italic>P</italic><sub><italic>t</italic></sub> when the orange slider reaches point A (&#x2212;0.0707, 0.5793) and disappear when the orange slider arrives at point C (0.0707, 0.5793) (shown as Case II in the bottom row of <xref ref-type="fig" rid="F3">Figure 3C</xref>). The human target path was described as follows</p>
<disp-formula id="S4.E40">
<label>(40)</label>
<mml:math display="block" id="M51"><mml:mrow><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mtable columnspacing="20pt" displaystyle="true" rowspacing="0pt">
<mml:mtr><mml:mtd columnalign="left"><mml:msub><mml:mi>P</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mtd><mml:mtd columnalign="left"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mn>0</mml:mn></mml:msub><mml:mo>&#x2264;</mml:mo><mml:mi>t</mml:mi><mml:mo>&lt;</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow></mml:mtd></mml:mtr>
<mml:mtr><mml:mtd columnalign="left"><mml:mrow><mml:mi>A</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>B</mml:mi><mml:mo>-</mml:mo><mml:mi>A</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>/</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mn>1</mml:mn></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mtd><mml:mtd columnalign="left"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mn>1</mml:mn></mml:msub><mml:mo>&#x2264;</mml:mo><mml:mi>t</mml:mi><mml:mo>&lt;</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:mtd></mml:mtr>
<mml:mtr><mml:mtd columnalign="left"><mml:mrow><mml:mi>B</mml:mi><mml:mo>+</mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>-</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:mi>C</mml:mi><mml:mo>-</mml:mo><mml:mi>B</mml:mi></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>/</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mn>3</mml:mn></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mn>2</mml:mn></mml:msub></mml:mrow><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mtd><mml:mtd columnalign="left"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo>&#x2264;</mml:mo><mml:mi>t</mml:mi><mml:mo>&lt;</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mn>3</mml:mn></mml:msub></mml:mrow></mml:mtd></mml:mtr>
<mml:mtr><mml:mtd columnalign="left"><mml:msub><mml:mi>P</mml:mi><mml:mi>t</mml:mi></mml:msub></mml:mtd><mml:mtd columnalign="left"><mml:mrow><mml:msub><mml:mi>t</mml:mi><mml:mn>3</mml:mn></mml:msub><mml:mo>&#x2264;</mml:mo><mml:mi>t</mml:mi><mml:mo>&lt;</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mn>4</mml:mn></mml:msub></mml:mrow></mml:mtd></mml:mtr>
</mml:mtable><mml:mi/></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>where A (&#x2212;0.0707, 0.5793), B (0, 0.4500), and C (0.0707, 0.5793) were the connecting points; <italic>t</italic><sub><italic>0</italic></sub> = 0 s, <italic>t</italic><sub><italic>1</italic></sub> = 7.5 s, <italic>t</italic><sub><italic>2</italic></sub> = 10.0 s, <italic>t</italic><sub><italic>3</italic></sub> = 12.5 s, and <italic>t</italic><sub><italic>4</italic></sub> = 20.0 s. When the target slider reached point A at <italic>t</italic><sub><italic>1</italic></sub> = 7.5 s, the subject needed to adjust the path and plan a detour. In <italic>P</italic><sub><italic>h</italic></sub>, the arc AC of <italic>P</italic><sub><italic>t</italic></sub> was therefore replaced by the line segments AB and BC to bypass the obstacle.</p>
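<p>The two paths can be written compactly in code. The following is a minimal sketch of equations (39) and (40) under stated assumptions: the function names are hypothetical, coordinates are taken to be in meters, and the behavior for t outside the interval from t0 to t4 (where the sketch simply returns the predefined path) is an assumption rather than part of the task definition.</p>
<preformat>
```python
import math

# Connecting points and switching times taken from the task description.
A = (-0.0707, 0.5793)
B = (0.0, 0.45)
C = (0.0707, 0.5793)
T1, T2, T3 = 7.5, 10.0, 12.5


def predefined_path(t):
    """Predefined circular trajectory P_t of equation (39)."""
    phase = 0.1 * math.pi * t + 0.5 * math.pi
    return (0.1 * math.cos(phase), 0.65 + 0.1 * math.sin(phase))


def lerp(p, q, s):
    """Point at fraction s of the straight segment from p to q."""
    return (p[0] + s * (q[0] - p[0]), p[1] + s * (q[1] - p[1]))


def human_target_path(t):
    """Human target path P_h of equation (40): detour A-B-C around the obstacle."""
    if T1 > t or t >= T3:
        # Outside the detour window the target follows P_t (assumed also
        # for t outside the task duration).
        return predefined_path(t)
    if T2 > t:
        return lerp(A, B, (t - T1) / (T2 - T1))
    return lerp(B, C, (t - T2) / (T3 - T2))


print(human_target_path(8.75))
```
</preformat>
<p>In this sketch, evaluating the predefined path at t = 7.5 s, 10.0 s, and 12.5 s reproduces point A, the obstacle center, and point C to four decimal places, which is why the detour through B is needed.</p>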
<p>The feasibility of the proposed RLOAC strategy was verified through simulation of the TTOA movement task. The RLOAC strategy was implemented by the method presented in Section &#x201C;2 Control strategy design,&#x201D; and the initial parameters were set as in the above simulation example. The parameters &#x03BC; = 0.5, <italic>F</italic><sub><italic>c</italic></sub> = 1.5 N, <italic>r</italic><sub>min</sub> = 1, and <italic>r</italic><sub>max</sub> = 300 were adopted. The human dynamics model (7) was used to simulate the human voluntary force. To verify whether the proposed method can adapt itself to patients with different capabilities, three types of disturbance forces were added to the human voluntary force to simulate the movements of three types of patients with high, moderate, and low levels of capabilities (<xref ref-type="bibr" rid="B36">Suzuki and Furuta, 2012</xref>). Similar to <xref ref-type="bibr" rid="B36">Suzuki and Furuta (2012)</xref>, the disturbance forces were designed as shown in <xref ref-type="table" rid="T2">Table 2</xref>. The simulation results are presented in <xref ref-type="fig" rid="F5">Figure 5</xref>. <xref ref-type="fig" rid="F5">Figure 5A</xref> illustrates the simulated human voluntary forces exerted by patients with high, moderate, and low levels of capabilities. <xref ref-type="fig" rid="F5">Figure 5B</xref> shows the trajectory tracking results under these three simulation conditions. The trajectory tracking errors were small under all three simulation conditions. Thus, the RLOAC strategy is suitable for patients with different capabilities.</p>
<table-wrap position="float" id="T2">
<label>TABLE 2</label>
<caption><p>Design of the three types of disturbance force.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Levels of capabilities</td>
<td valign="top" align="center">Amplitude of disturbance force</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">High</td>
<td valign="top" align="center">0 N</td>
</tr>
<tr>
<td valign="top" align="left">Moderate</td>
<td valign="top" align="center">Random (&#x2212;5 N, 5 N)</td>
</tr>
<tr>
<td valign="top" align="left">Low</td>
<td valign="top" align="center">Random (&#x2212;10 N, 10 N)</td>
</tr>
</tbody>
</table></table-wrap>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption><p>Simulation results for trajectory tracking performance under the voluntary force generated by the human dynamics models with three different levels of capabilities. <bold>(A)</bold> Human voluntary forces for high (violet), moderate (green), and low (blue) levels of capabilities. <bold>(B)</bold> Trajectory tracking performance.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnbot-16-1068706-g005.tif"/>
</fig>
<p>To compare the performance of the proposed RLOAC strategy with those of other methods without online optimization of the admittance parameters, further simulation was conducted by utilizing the following traditional admittance control (TAC) to perform the aforementioned task (<xref ref-type="bibr" rid="B4">Culmer et al., 2010</xref>).</p>
<disp-formula id="S4.E41">
<label>(41)</label>
<mml:math display="block" id="M52"><mml:mrow><mml:mrow><mml:mpadded width="+3.3pt"><mml:msub><mml:mi>P</mml:mi><mml:mi>d</mml:mi></mml:msub></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mi>t</mml:mi></mml:msub><mml:mo>+</mml:mo><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo>/</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mi>K</mml:mi><mml:mrow><mml:mi>T</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>A</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>C</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mrow><mml:msub><mml:mi>C</mml:mi><mml:mrow><mml:mi>T</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>A</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>C</mml:mi></mml:mrow></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mi>S</mml:mi></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>where the stiffness matrix is <italic>K</italic><sub><italic>TAC</italic></sub> = <italic>diag</italic>(125, 125) and the damping matrix is <italic>C</italic><sub><italic>TAC</italic></sub> = <italic>diag</italic>(49.4, 49.4). Both the RLOAC and TAC methods used the same inner-loop controller; for the detailed design, see <xref ref-type="bibr" rid="B41">Yang et al. (2022)</xref>. We evaluated the HRI performance in terms of the tracking accuracy and robot compliance. The absolute tracking error was used to evaluate the tracking accuracy, which was defined as follows:</p>
<disp-formula id="S4.E42">
<label>(42)</label>
<mml:math display="block" id="M53"><mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mo fence="true">||</mml:mo><mml:mrow><mml:mi>Error</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo fence="true">||</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mo fence="true">||</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mi>a</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>t</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo fence="true">||</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msub></mml:mrow><mml:mo>.</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>The energy per unit distance (EPUD) was adopted to evaluate the robot compliance (<xref ref-type="bibr" rid="B18">Lee et al., 2018</xref>; <xref ref-type="bibr" rid="B43">Zhou et al., 2021</xref>), which was defined as follows:</p>
<disp-formula id="S4.E43">
<label>(43)</label>
<mml:math display="block" id="M54"><mml:mrow><mml:mrow><mml:mi>E</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>P</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>U</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>D</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>&#x22C5;</mml:mo><mml:mi mathvariant="normal">&#x25B3;</mml:mi></mml:mrow><mml:mo>&#x2062;</mml:mo><mml:mi>d</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>|</mml:mo></mml:mrow><mml:mo>/</mml:mo><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x25B3;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>d</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:mi>t</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo rspace="7.5pt">|</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math>
</disp-formula>
<p>where &#x25B3;<italic>d</italic> = <italic>P</italic><sub><italic>a</italic></sub>&#x2212;<italic>P</italic><sub><italic>t</italic></sub> is the trajectory deviation made by the subject from <italic>P</italic><sub><italic>t</italic></sub> to <italic>P</italic><sub><italic>a</italic></sub>. A smaller EPUD(<italic>t</italic>) value indicates higher robot compliance with the human motion intentions (<xref ref-type="bibr" rid="B18">Lee et al., 2018</xref>; <xref ref-type="bibr" rid="B43">Zhou et al., 2021</xref>).</p>
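<p>As an illustration only, the two instantaneous measures in (42) and (43) can be sketched in Python as below. The sketch assumes planar (two-component) position and force vectors, reads the dot in (43) as a scalar product, and reads |&#x25B3;<italic>d</italic>| as the Euclidean norm; the function names and the small threshold <italic>eps</italic> are hypothetical and are not taken from the study.</p>
<preformat>
```python
import math


def tracking_error(p_a, p_h):
    """Eq. (42): Euclidean norm of P_a(t) - P_h(t) for planar points."""
    return math.hypot(p_a[0] - p_h[0], p_a[1] - p_h[1])


def epud_t(f_h, p_a, p_t, eps=1e-9):
    """Eq. (43): |F_h(t) . delta_d(t)| / |delta_d(t)|, with delta_d = P_a - P_t.

    Returns None when the deviation is (numerically) zero, because the
    ratio is undefined there. The threshold eps is an assumption.
    """
    d = (p_a[0] - p_t[0], p_a[1] - p_t[1])
    d_norm = math.hypot(d[0], d[1])
    if eps >= d_norm:
        return None
    work = abs(f_h[0] * d[0] + f_h[1] * d[1])  # scalar product F_h . delta_d
    return work / d_norm
```
</preformat>
<p>For example, a deviation of (0.03, 0.04) from the task trajectory under a force of (5, 0) gives a deviation length of 0.05 and an EPUD(<italic>t</italic>) of about 3.</p>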
<p>The simulation results are presented in <xref ref-type="fig" rid="F6">Figure 6</xref>. The tracking accuracy of the proposed RLOAC is higher than that of the TAC, especially when bypassing an unpredictable obstacle, as shown in <xref ref-type="fig" rid="F6">Figure 6A</xref>. Moreover, the value of <italic>EPUD</italic>(<italic>t</italic>) needed to bypass the obstacle was notably smaller with the proposed RLOAC when compared with the TAC, as shown in <xref ref-type="fig" rid="F6">Figure 6B</xref>. This comparison of the simulation results indicates that the CDRR with the proposed RLOAC achieved higher accuracy and compliance with the human motion intention.</p>
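<p>For readers who wish to reproduce the fixed-parameter baseline, the TAC law in (41) can be approximated in discrete time as sketched below. This is a minimal sketch, not the simulation code used in the study: it assumes that <italic>S</italic> in (41) denotes the Laplace variable, so that the offset <italic>x</italic> = <italic>P</italic><sub><italic>d</italic></sub>&#x2212;<italic>P</italic><sub><italic>t</italic></sub> on each axis obeys <italic>C</italic><sub><italic>TAC</italic></sub>d<italic>x</italic>/d<italic>t</italic> + <italic>K</italic><sub><italic>TAC</italic></sub><italic>x</italic> = <italic>F</italic><sub><italic>h</italic></sub>, and it uses a backward-Euler step with a hypothetical step size. Because both matrices are diagonal with equal entries, one axis is shown.</p>
<preformat>
```python
def tac_step(x_prev, f_h, dt, k=125.0, c=49.4):
    """One backward-Euler step of c*dx/dt + k*x = f_h on a single axis.

    x is the offset P_d - P_t; k and c are the diagonal entries of
    K_TAC and C_TAC quoted in the text. dt is an assumed step size.
    """
    return (c * x_prev + dt * f_h) / (c + dt * k)


def simulate_offset(f_h, duration, dt=0.001):
    """Offset reached after holding a constant force f_h for `duration`."""
    x = 0.0
    for _ in range(round(duration / dt)):
        x = tac_step(x, f_h, dt)
    return x
```
</preformat>
<p>Under these assumptions, a constant force settles at an offset of <italic>F</italic><sub><italic>h</italic></sub>/<italic>K</italic><sub><italic>TAC</italic></sub>, which shows why fixed parameters yield the same compliance regardless of the subject.</p>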
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption><p>Comparison of simulation results of reinforcement-learning-based optimal admittance control (RLOAC) (orange line) and traditional admittance control (TAC) (green line). <bold>(A)</bold> Tracking accuracy. <bold>(B)</bold> Robot compliance.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnbot-16-1068706-g006.tif"/>
</fig>
</sec>
</sec>
<sec id="S5">
<title>5 Experimental studies</title>
<p>Because human control behavior and motor learning are complex and have variable characteristics, which cannot be described in the above simulations, we further investigated the validity of the proposed method through experimentation with human subjects on the 3-DOF CDRR constructed by us and illustrated in <xref ref-type="fig" rid="F3">Figure 3</xref>.</p>
<sec id="S5.SS1">
<title>5.1 Experimental setup</title>
<p>Ten healthy subjects (four males and six females; age: mean (M) 28 years, standard deviation (SD) 4.92; height: M 1.67 m, SD 4.77; weight: M 57.05 kg, SD 7.45) with no history of neurological impairment were recruited for the experiment. All subjects provided informed consent before participating in the experiment. They were instructed to grasp the end-effector of the CDRR and perform the TTOA movement task (detailed in Section &#x201C;4.2 Adaptation to human dynamics&#x201D;), as shown in <xref ref-type="fig" rid="F3">Figure 3C</xref>. To better show the switching between the passive and active working modes, the TTOA movement task was set as follows:</p>
<p>(1) The obstacle appears randomly with a probability of 50%. Depending on the non-appearance or appearance of the obstacle, the task scenario is called Case I, as shown in the top row of <xref ref-type="fig" rid="F3">Figure 3C</xref>, or Case II, as shown in the bottom row of <xref ref-type="fig" rid="F3">Figure 3C</xref>, respectively. The equation of <italic>P</italic><sub><italic>h</italic></sub> is different in these two cases. Specifically, <italic>P</italic><sub><italic>t</italic></sub> and <italic>P</italic><sub><italic>h</italic></sub> overlap in Case I, and both are expressed as (39). <italic>P</italic><sub><italic>t</italic></sub> and <italic>P</italic><sub><italic>h</italic></sub> only partially overlap in Case II, where <italic>P</italic><sub><italic>t</italic></sub> is expressed as (39) and <italic>P</italic><sub><italic>h</italic></sub> is expressed as (40).</p>
<p>(2) The tasks of Case I and Case II were conducted periodically, with one cycle performed per period.</p>
<p>Initially, the experimenter demonstrated the TTOA movement task to the subjects and ensured that each subject understood the task. Then, the subjects were allowed to practice 20 unrecorded cycles/period. After this preliminary experiment, to ensure a fair comparison, the same experimental protocol was conducted for two trials by each subject: once with TAC and once using the proposed RLOAC in a random order unknown to the subject. For each control strategy (TAC or RLOAC), each subject executed 10 cycles/period, including five cycles/period for Case I and five for Case II, and the data were recorded for analysis. Further, Case I and Case II appeared in a random order unknown to the subject.</p>
</sec>
<sec id="S5.SS2">
<title>5.2 Data analysis</title>
<p>We performed a quantitative evaluation of the HRI performance based on the following measures:</p>
<p>(1) Mean absolute tracking error (MATE), defined as</p>
<disp-formula id="S5.E44">
<label>(44)</label>
<mml:math display="block" id="M55"><mml:mrow><mml:mrow><mml:mrow><mml:mi>M</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>A</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>T</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mpadded width="+3.3pt"><mml:mi>E</mml:mi></mml:mpadded></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:mfrac><mml:mn>1</mml:mn><mml:msub><mml:mi>s</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:mfrac><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:munderover><mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo><mml:mrow><mml:mpadded width="+3.3pt"><mml:mi>i</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:msub><mml:mi>s</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:munderover><mml:msub><mml:mrow><mml:mo fence="true">||</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mi>a</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>-</mml:mo><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo fence="true">||</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msub></mml:mrow></mml:mrow></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>where <italic>P</italic><sub><italic>a</italic></sub>(<italic>t</italic><sub><italic>i</italic></sub>) and <italic>P</italic><sub><italic>h</italic></sub>(<italic>t</italic><sub><italic>i</italic></sub>) represented the actual position and the human target position of the end-effector at the <italic>i</italic>th sampling instant, respectively, and <italic>s</italic><sub><italic>j</italic></sub> (<italic>j</italic> = 1,2,3) were the total numbers of samples for each subject during the entire experiment, the active working mode, and the passive working mode, respectively.</p>
<p>(2) The EPUD for each subject, defined as</p>
<disp-formula id="S5.E45">
<label>(45)</label>
<mml:math display="block" id="M56"><mml:mrow><mml:mrow><mml:mrow><mml:mi>E</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>P</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>U</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>D</mml:mi></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:munderover><mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:msub><mml:mi>s</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:munderover><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:mrow><mml:msub><mml:mi>F</mml:mi><mml:mi>h</mml:mi></mml:msub><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>&#x22C5;</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x25B3;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>d</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow><mml:mo>|</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:munderover><mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:msub><mml:mi>s</mml:mi><mml:mi>j</mml:mi></mml:msub></mml:munderover><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:mi mathvariant="normal">&#x25B3;</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mi>d</mml:mi><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>|</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow><mml:mo>,</mml:mo></mml:mrow></mml:math>
</disp-formula>
<p>where <italic>F</italic><sub><italic>h</italic></sub>(<italic>t</italic><sub><italic>i</italic></sub>) and &#x25B3;<italic>d</italic>(<italic>t</italic><sub><italic>i</italic></sub>) represented the human voluntary force and the trajectory deviation at the <italic>ith</italic> sampling instant, respectively.</p>
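<p>The two aggregate measures can be computed from recorded samples along the following lines. This is a hedged sketch rather than the analysis code of the study: it reads (45) as the ratio of two sums over the same <italic>s</italic><sub><italic>j</italic></sub> samples, treats the dot as a scalar product and |&#x25B3;<italic>d</italic>| as the Euclidean norm, and uses hypothetical function and argument names. Each argument is a sequence of per-sample vectors restricted to the condition of interest (entire experiment, active mode, or passive mode).</p>
<preformat>
```python
import math


def mate(p_a, p_h):
    """Eq. (44): mean of ||P_a(t_i) - P_h(t_i)||_2 over the given samples."""
    errors = [math.dist(a, h) for a, h in zip(p_a, p_h)]
    return sum(errors) / len(errors)


def epud(f_h, delta_d):
    """Eq. (45), read as sum_i |F_h(t_i) . delta_d(t_i)| / sum_i |delta_d(t_i)|.

    Raises ZeroDivisionError if every deviation is exactly zero.
    """
    num = sum(
        abs(sum(f * d for f, d in zip(fv, dv)))  # |F_h . delta_d| per sample
        for fv, dv in zip(f_h, delta_d)
    )
    den = sum(math.hypot(*dv) for dv in delta_d)  # total deviation length
    return num / den
```
</preformat>
<p>Summing before dividing keeps samples with near-zero deviation from dominating the result, which a mean of the per-sample EPUD(<italic>t</italic>) values would not do.</p>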
<p>We employed paired-samples <italic>t</italic>-tests with a significance level of <bold>&#x03B1;</bold> = 0.05 to test differences in MATE and EPUD of the 10 subjects between the two control strategies (TAC and RLOAC) (<xref ref-type="bibr" rid="B21">Losey and O&#x2019;Malley, 2018</xref>). All statistical analyses were performed using SPSS 19 (SPSS, Inc., Chicago, IL, USA).</p>
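<p>The statistic behind the paired-samples <italic>t</italic>-test can be sketched as follows. The study itself used SPSS; this standard-library sketch only illustrates the computation and is not the authors' procedure. The function names are hypothetical, and the constant is the tabulated two-sided critical value of Student's <italic>t</italic> for a significance level of 0.05 with 9 degrees of freedom (10 paired subjects).</p>
<preformat>
```python
import math

# Tabulated two-sided critical value for alpha = 0.05 and df = 9.
T_CRIT_DF9 = 2.262


def paired_t_statistic(x, y):
    """Return (t, df) for paired samples x and y of equal length."""
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    t = mean / math.sqrt(var / n)
    return t, n - 1


def significant_for_ten_pairs(t):
    """True when |t| exceeds the df = 9 critical value (10 pairs only)."""
    return abs(t) > T_CRIT_DF9
```
</preformat>
<p>Here <italic>x</italic> and <italic>y</italic> would hold one MATE (or EPUD) value per subject under RLOAC and TAC, respectively, in the same subject order.</p>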
</sec>
</sec>
<sec id="S6">
<title>6 Experimental results</title>
<p>Our findings are presented in <xref ref-type="fig" rid="F7">Figures 7</xref>&#x2013;<xref ref-type="fig" rid="F9">9</xref>. <xref ref-type="fig" rid="F7">Figure 7</xref> depicts the results of subject 2 when using the TAC strategy, whereas <xref ref-type="fig" rid="F8">Figure 8</xref> depicts the results of subject 2 when using the proposed RLOAC strategy. Specifically, <xref ref-type="fig" rid="F7">Figures 7A</xref>, <xref ref-type="fig" rid="F8">8A</xref> illustrate the tracking trajectories of 10 cycles for the TAC and the proposed RLOAC strategy, respectively. The first and second rows of <xref ref-type="fig" rid="F7">Figures 7B</xref>, <xref ref-type="fig" rid="F8">8B</xref> illustrate the tracking errors and the EPUD(<italic>t</italic>), respectively. The deep blue scatterplot shows the results for the passive working mode, and the red scatterplot shows the results for the active working mode. The shadows in each subplot indicate the time durations for which the obstacle appeared.</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption><p>Experimental results of the traditional admittance control (TAC) strategy measured from subject 2. <bold>(A)</bold> Tracking trajectories, <bold>(B)</bold> tracking error and EPUD(t). The shadows in each subplot indicate the time duration of appearance of the unpredicted obstacle and the intervention required from the participant to bypass it. The deep blue scatterplot shows the results of the passive working mode, and the red scatterplot shows the results of the active working mode.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnbot-16-1068706-g007.tif"/>
</fig>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption><p>Experimental results of subject 2 when using the proposed RLOAC. <bold>(A)</bold> Tracking trajectories, <bold>(B)</bold> tracking error and EPUD(t).</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnbot-16-1068706-g008.tif"/>
</fig>
<fig id="F9" position="float">
<label>FIGURE 9</label>
<caption><p>Comparison of human&#x2013;robot interaction (HRI) performance in terms of the mean absolute tracking error (MATE) <bold>(A)</bold> and energy per unit distance (EPUD) <bold>(B)</bold> between the proposed reinforcement-learning-based optimal admittance control (RLOAC) and traditional admittance control (TAC) in the entire experiment, the active working mode, and the passive working mode, respectively. Each plot shows the mean value for the subjects when TAC was used (grid) and when the proposed RLOAC was used (black). The error bars indicate the standard deviation, and the asterisk &#x201C;&#x002A;&#x201D; indicates <italic>p</italic> &#x003C; 0.05.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnbot-16-1068706-g009.tif"/>
</fig>
<p>It is clear from <xref ref-type="fig" rid="F8">Figure 8</xref> that with the proposed RLOAC strategy, the CDRR can assist the subject cooperatively by complying with her/his motion intention to complete the movement task with desirable performance in terms of accuracy and compliance. As shown in the first row of <xref ref-type="fig" rid="F7">Figures 7B</xref>, <xref ref-type="fig" rid="F8">8B</xref>, the tracking error, ||<italic>Error</italic>(<italic>t</italic>)||<sub>2</sub>, when using the RLOAC strategy was small and, for most of the task, notably lower than that of the TAC strategy. As seen in the second row of <xref ref-type="fig" rid="F7">Figures 7B</xref>, <xref ref-type="fig" rid="F8">8B</xref>, the compliance indicated by <italic>EPUD</italic>(<italic>t</italic>) for the RLOAC strategy was below 13 in each case, which was notably better than that of the TAC strategy. The red sections in <xref ref-type="fig" rid="F7">Figures 7B</xref>, <xref ref-type="fig" rid="F8">8B</xref> show that the active working mode time for the proposed RLOAC strategy was notably longer than that for the TAC strategy when performing the same movement task. As seen in the triangular part of the tracking trajectories, by using the RLOAC strategy, the CDRR can comply with the subjects&#x2019; voluntary actions and bypass unpredictable obstacles with greater accuracy and compliance than is possible with the TAC strategy. When using the RLOAC strategy, the tracking error and vibrations of the end-effector can be decreased through voluntary control by the subject, as it ensures better compliance and smooth switching between the passive and active working modes. Even without obstacles in the path, the active working mode can be adopted to decrease the tracking error and vibrations. In contrast, with the TAC strategy, the active working mode is rarely adopted in the absence of obstacles. Thus, the results of the representative subject show that the proposed RLOAC strategy has better accuracy and compliance and can promote the active working mode in comparison with the TAC strategy.</p>
<p><xref ref-type="fig" rid="F9">Figure 9</xref> depicts the results for all 10 subjects in the form of mean &#x00B1; SD to statistically detect the differences between the two control strategies. As shown in <xref ref-type="fig" rid="F9">Figure 9</xref>, the MATEs for the entire experiment of 10 cycles/period, the active working mode, and the passive working mode were (0.0160 &#x00B1; 0.0021, 0.0168 &#x00B1; 0.0043, and 0.0152 &#x00B1; 0.0039) m with the RLOAC strategy and (0.0189 &#x00B1; 0.0027, 0.0228 &#x00B1; 0.0044, and 0.0170 &#x00B1; 0.0091) m with the TAC strategy. Compared with the TAC trials, the RLOAC trials showed statistically significant decreases in the MATE for the entire experiment (<italic>p</italic> = 0.015) and during the active working mode (<italic>p</italic> = 0.001). The difference was not statistically significant during the passive working mode (<italic>p</italic> = 0.693). A similar pattern was found for the EPUDs. The EPUDs of the RLOAC trials for the entire experiment, the active working mode, and the passive working mode were (2.5931 &#x00B1; 0.5740, 4.4704 &#x00B1; 0.7217, and 0.5805 &#x00B1; 0.1470), whereas in the TAC trials, the corresponding EPUDs were (4.0754 &#x00B1; 0.4845, 6.4994 &#x00B1; 1.4368, and 0.5569 &#x00B1; 0.1137). The EPUDs of the RLOAC trials were significantly smaller than those of the TAC trials for the entire experiment (<italic>p</italic> = 0.001) and during the active working mode (<italic>p</italic> = 0.006), whereas the EPUDs of the two control strategies during the passive working mode showed no significant difference (<italic>p</italic> = 0.560). Thus, the statistical analysis indicated that the proposed RLOAC strategy achieved desirable accuracy and compliance, which were significantly better than those of the TAC strategy in the comparison experiment.</p>
</sec>
<sec id="S7" sec-type="discussion|conclusion">
<title>7 Discussion and conclusion</title>
<p>In this study, an RLOAC strategy is proposed for a CDRR that can achieve continuous mode adaptation between the passive and active working modes. Experiments with 10 subjects were conducted on a self-designed CDRR, and the results demonstrated the effectiveness of the proposed control strategy. These results suggest that the proposed approach can potentially be applied to CDRRs.</p>
<p>The RLOAC strategy improved the HRI performance in terms of tracking accuracy and robot compliance. The tracking error (<xref ref-type="bibr" rid="B28">Modares et al., 2016</xref>; <xref ref-type="bibr" rid="B20">Li et al., 2017</xref>, <xref ref-type="bibr" rid="B19">2018</xref>) and EPUD (<xref ref-type="bibr" rid="B18">Lee et al., 2018</xref>; <xref ref-type="bibr" rid="B43">Zhou et al., 2021</xref>) are common performance indexes of the HRI. A smaller tracking error indicates that the subjects can control the CDRR&#x2019;s motion more accurately (<xref ref-type="bibr" rid="B28">Modares et al., 2016</xref>; <xref ref-type="bibr" rid="B19">Li et al., 2018</xref>), and a smaller EPUD indicates higher robot compliance (<xref ref-type="bibr" rid="B18">Lee et al., 2018</xref>; <xref ref-type="bibr" rid="B43">Zhou et al., 2021</xref>). The mean absolute tracking error and the EPUD decreased during exercise because the CDRR with RLOAC can obtain suitable admittance parameters to optimize HRI performance. Significant differences between the RLOAC and TAC strategies were found in the performance metrics both during the active working mode and in the overall experiment, because the contributions of the subject and the robot to the control task can be adjusted as necessary for rapid adaptation to changes in human voluntary actions and task requirements. Thus, the CDRR with RLOAC exhibited high levels of compliance with human motion intentions and self-adaptive optimization to human dynamics. The controller type did not have a statistically significant effect in the passive working mode, because the tracking error was mainly determined by the inner-loop position controller in this working mode. The increase in time spent in the active working mode indicates that the RLOAC strategy can promote voluntary engagement during exercise, because the subjects participated in the control loop and their voluntary forces were utilized to perceive their motion intentions (<xref ref-type="bibr" rid="B19">Li et al., 2018</xref>). Continuous mode adaptation according to the subjects&#x2019; voluntary force allowed them to drive the robot at will and made them feel in control during exercise, which may increase their motivation and confidence to use the affected limb (<xref ref-type="bibr" rid="B32">Proietti et al., 2016</xref>).</p>
<p>Comparing the RLOAC strategy with traditional control strategies highlights its advantages. The well-recognized TAC strategy, widely applied in rehabilitation robots, was chosen as the comparison method because both the RLOAC and TAC yield the desired trajectories based on the human input forces using an admittance model to obtain robot compliance. Fixed admittance parameters were adopted in the TAC strategy, which meant that it could not adapt to the variability of human dynamics. In contrast, using reinforcement learning, the RLOAC can obtain suitable admittance parameters to optimize HRI performance. Unlike most optimization algorithms, the RLOAC strategy can adjust admittance parameters online without knowledge of the human and robot dynamics models. Although adaptive impedance control has been applied to optimize interaction performance, as pointed out in <xref ref-type="bibr" rid="B33">Riener et al. (2005)</xref>, <xref ref-type="bibr" rid="B4">Culmer et al. (2010)</xref>, <xref ref-type="bibr" rid="B32">Proietti et al. (2016)</xref>, and <xref ref-type="bibr" rid="B43">Zhou et al. (2021)</xref>, admittance control is more stable than impedance control. Therefore, the RLOAC strategy, which uses admittance control, is more suitable for rehabilitation robots. The use of an RL algorithm to obtain the optimal admittance parameters and optimal HRI performance by minimizing a cost function has been suggested in previous studies (<xref ref-type="bibr" rid="B28">Modares et al., 2016</xref>; <xref ref-type="bibr" rid="B20">Li et al., 2017</xref>). However, in those studies, partial knowledge of the system dynamics was still required. In contrast, the RL algorithm in this study was improved and used to address the HRI issue with completely unknown human and robot dynamics parameters. Moreover, continuous and real-time mode adaptation was realized by dynamically adjusting the contribution weight of the cost function according to the human voluntary force.</p>
<p>This study has the following limitations. First, we assumed that the human voluntary force can be directly measured by a 6-axis F/T sensor. In fact, the measured force was the interaction force between the human and the end-effector, which is composed of both voluntary and involuntary components. Second, the applicability and clinical effectiveness of the proposed control strategy were not verified in post-stroke patients.</p>
</sec>
<sec id="S8" sec-type="data-availability">
<title>Data availability statement</title>
<p>The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.</p>
</sec>
<sec id="S9" sec-type="ethics-statement">
<title>Ethics statement</title>
<p>Ethical review and approval was not required for the study on human participants in accordance with the local legislation and institutional requirements. The patients/participants provided their written informed consent to participate in this study.</p>
</sec>
<sec id="S10" sec-type="author-contributions">
<title>Author contributions</title>
<p>RY and RS contributed to the conception of the control algorithm. RY designed and performed the simulations and experiments and wrote the first draft of the manuscript. All authors contributed to the manuscript revision and approved the submitted version.</p>
</sec>
</body>
<back>
<sec id="S11" sec-type="funding-information">
<title>Funding</title>
<p>This research was supported by the National Key Research and Development Program of China (Grant No. 2022YFE0201900), the Shenzhen Science and Technology Research Program (No. SGDX20210823103405040), the Guangdong Science and Technology Plan Project (No. 2020B1212060077), and the Natural Science Foundation of Guangdong Province (No. 2020A1515010735).</p>
</sec>
<sec id="S12" sec-type="COI-statement">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="S13" sec-type="disclaimer">
<title>Publisher&#x2019;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alamdari</surname> <given-names>A.</given-names></name> <name><surname>Krovi</surname> <given-names>V.</given-names></name></person-group> (<year>2015</year>). &#x201C;<article-title>Modeling and control of a novel home-based cable-driven parallel platform robot: PACER</article-title>,&#x201D; in <source><italic>Proceedings of the 2015 IEEE/RSJ international conference on intelligent robots and systems (IROS)</italic></source>, <publisher-loc>Hamburg</publisher-loc>.</citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>Q.</given-names></name> <name><surname>Zi</surname> <given-names>B.</given-names></name> <name><surname>Sun</surname> <given-names>Z.</given-names></name> <name><surname>Li</surname> <given-names>Y.</given-names></name> <name><surname>Xu</surname> <given-names>Q.</given-names></name></person-group> (<year>2019</year>). <article-title>Design and development of a new cable-driven parallel robot for waist rehabilitation.</article-title> <source><italic>IEEE ASME Trans. Mechatron.</italic></source> <volume>24</volume> <fpage>1497</fpage>&#x2013;<lpage>1507</lpage>.</citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cui</surname> <given-names>X.</given-names></name> <name><surname>Chen</surname> <given-names>W. H.</given-names></name> <name><surname>Jin</surname> <given-names>X.</given-names></name> <name><surname>Agrawal</surname> <given-names>S. K.</given-names></name></person-group> (<year>2017</year>). <article-title>Design of a 7-DOF cable-driven arm exoskeleton (CAREX-7) and a Controller for dexterous motion training or assistance.</article-title> <source><italic>IEEE ASME Trans. Mechatron.</italic></source> <volume>22</volume> <fpage>161</fpage>&#x2013;<lpage>172</lpage>. <pub-id pub-id-type="doi">10.1109/tmech.2016.2618888</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Culmer</surname> <given-names>P. R.</given-names></name> <name><surname>Jackson</surname> <given-names>A. E.</given-names></name> <name><surname>Makower</surname> <given-names>S.</given-names></name> <name><surname>Richardson</surname> <given-names>R.</given-names></name> <name><surname>Cozens</surname> <given-names>J. A.</given-names></name> <name><surname>Levesley</surname> <given-names>M. C.</given-names></name><etal/></person-group> (<year>2010</year>). <article-title>A control strategy for upper limb robotic rehabilitation with a dual robot system.</article-title> <source><italic>IEEE ASME Trans. Mechatron.</italic></source> <volume>15</volume> <fpage>575</fpage>&#x2013;<lpage>585</lpage>. <pub-id pub-id-type="doi">10.1109/tmech.2009.2030796</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Doya</surname> <given-names>K.</given-names></name></person-group> (<year>2000</year>). <article-title>Reinforcement learning in continuous time and space.</article-title> <source><italic>Neural Comput.</italic></source> <volume>12</volume> <fpage>219</fpage>&#x2013;<lpage>245</lpage>. <pub-id pub-id-type="doi">10.1162/089976600300015961</pub-id> <pub-id pub-id-type="pmid">10636940</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Driggs-Campbell</surname> <given-names>K.</given-names></name> <name><surname>Dong</surname> <given-names>R.</given-names></name> <name><surname>Bajcsy</surname> <given-names>R.</given-names></name></person-group> (<year>2018</year>). <article-title>Robust, informative human-in-the-loop predictions via empirical reachable sets.</article-title> <source><italic>IEEE Trans. Intell. Veh.</italic></source> <volume>3</volume> <fpage>300</fpage>&#x2013;<lpage>309</lpage>. <pub-id pub-id-type="doi">10.1109/TIV.2018.2843125</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Erden</surname> <given-names>M. S.</given-names></name> <name><surname>Billard</surname> <given-names>A.</given-names></name></person-group> (<year>2015</year>). <article-title>End-point impedance measurements across dominant and nondominant hands and robotic assistance with directional damping.</article-title> <source><italic>IEEE Trans. Cybern.</italic></source> <volume>45</volume> <fpage>1146</fpage>&#x2013;<lpage>1157</lpage>. <pub-id pub-id-type="doi">10.1109/tcyb.2014.2346021</pub-id> <pub-id pub-id-type="pmid">25148680</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gao</surname> <given-names>X.</given-names></name> <name><surname>Si</surname> <given-names>J.</given-names></name> <name><surname>Wen</surname> <given-names>Y.</given-names></name> <name><surname>Li</surname> <given-names>M.</given-names></name> <name><surname>Huang</surname> <given-names>H.</given-names></name></person-group> (<year>2021</year>). <article-title>Reinforcement learning control of robotic knee with human-in-the-loop by flexible policy iteration.</article-title> <source><italic>IEEE Trans. Neural Netw. Learn. Syst.</italic></source> <volume>33</volume> <fpage>5873</fpage>&#x2013;<lpage>5887</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2021.3071727</pub-id> <pub-id pub-id-type="pmid">33956634</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Guanziroli</surname> <given-names>E.</given-names></name> <name><surname>Cazzaniga</surname> <given-names>M.</given-names></name> <name><surname>Colombo</surname> <given-names>L.</given-names></name> <name><surname>Basilico</surname> <given-names>S.</given-names></name> <name><surname>Legnani</surname> <given-names>G.</given-names></name> <name><surname>Molteni</surname> <given-names>F.</given-names></name></person-group> (<year>2019</year>). <article-title>Assistive powered exoskeleton for complete spinal cord injury: Correlations between walking ability and exoskeleton control.</article-title> <source><italic>Eur. J. Phys. Rehabil. Med.</italic></source> <volume>55</volume> <fpage>209</fpage>&#x2013;<lpage>216</lpage>. <pub-id pub-id-type="doi">10.23736/s1973-9087.18.05308-x</pub-id> <pub-id pub-id-type="pmid">30156088</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>J.</given-names></name> <name><surname>Tu</surname> <given-names>X. K.</given-names></name> <name><surname>He</surname> <given-names>J. P.</given-names></name></person-group> (<year>2016</year>). <article-title>Design and evaluation of the RUPERT wearable upper extremity exoskeleton robot for clinical and in-home therapies.</article-title> <source><italic>IEEE Trans. Syst. Man Cybern. Syst.</italic></source> <volume>46</volume> <fpage>926</fpage>&#x2013;<lpage>935</lpage>. <pub-id pub-id-type="doi">10.1109/tsmc.2015.2497205</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jamwal</surname> <given-names>P. K.</given-names></name> <name><surname>Xie</surname> <given-names>S. Q.</given-names></name> <name><surname>Hussain</surname> <given-names>S.</given-names></name> <name><surname>Parsons</surname> <given-names>J. G.</given-names></name></person-group> (<year>2014</year>). <article-title>An adaptive wearable parallel robot for the treatment of ankle injuries.</article-title> <source><italic>IEEE ASME Trans. Mechatron.</italic></source> <volume>19</volume> <fpage>64</fpage>&#x2013;<lpage>75</lpage>. <pub-id pub-id-type="doi">10.1109/tmech.2012.2219065</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jiang</surname> <given-names>Y.</given-names></name> <name><surname>Jiang</surname> <given-names>Z. P.</given-names></name></person-group> (<year>2012</year>). <article-title>Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics.</article-title> <source><italic>Automatica</italic></source> <volume>48</volume> <fpage>2699</fpage>&#x2013;<lpage>2704</lpage>. <pub-id pub-id-type="doi">10.1016/j.automatica.2012.06.096</pub-id></citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jin</surname> <given-names>X.</given-names></name> <name><surname>Prado</surname> <given-names>A.</given-names></name> <name><surname>Agrawal</surname> <given-names>S. K.</given-names></name></person-group> (<year>2018</year>). <article-title>Retraining of human gait &#x2013; are lightweight cable-driven leg exoskeleton designs effective?</article-title> <source><italic>IEEE Trans. Neural Syst. Rehabil. Eng.</italic></source> <volume>26</volume> <fpage>847</fpage>&#x2013;<lpage>855</lpage>. <pub-id pub-id-type="doi">10.1109/tnsre.2018.2815656</pub-id> <pub-id pub-id-type="pmid">29641389</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kim</surname> <given-names>H.</given-names></name> <name><surname>Miller</surname> <given-names>L. M.</given-names></name> <name><surname>Fedulow</surname> <given-names>I.</given-names></name> <name><surname>Simkins</surname> <given-names>M.</given-names></name> <name><surname>Abrams</surname> <given-names>G. M.</given-names></name> <name><surname>Byl</surname> <given-names>N.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>Kinematic data analysis for post-stroke patients following bilateral versus unilateral rehabilitation with an upper limb wearable robotic system.</article-title> <source><italic>IEEE Trans. Neural Syst. Rehabil. Eng.</italic></source> <volume>21</volume> <fpage>153</fpage>&#x2013;<lpage>164</lpage>. <pub-id pub-id-type="doi">10.1109/tnsre.2012.2207462</pub-id> <pub-id pub-id-type="pmid">22855233</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Koenig</surname> <given-names>A. C.</given-names></name> <name><surname>Riener</surname> <given-names>R.</given-names></name></person-group> (<year>2016</year>). &#x201C;<article-title>The human in the loop</article-title>,&#x201D; in <source><italic>Neurorehabilitation technology</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Reinkensmeyer</surname> <given-names>D. J.</given-names></name> <name><surname>Dietz</surname> <given-names>V.</given-names></name></person-group> (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>161</fpage>&#x2013;<lpage>181</lpage>.</citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kwakernaak</surname> <given-names>H.</given-names></name> <name><surname>Sivan</surname> <given-names>R.</given-names></name></person-group> (<year>1972</year>). <source><italic>Linear optimal control systems.</italic></source> <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Wiley-Interscience</publisher-name>.</citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kwakkel</surname> <given-names>G.</given-names></name> <name><surname>Kollen</surname> <given-names>B. J.</given-names></name> <name><surname>Krebs</surname> <given-names>H. I.</given-names></name></person-group> (<year>2008</year>). <article-title>Effects of robot-assisted therapy on upper limb recovery after stroke: A systematic review.</article-title> <source><italic>Neurorehabil. Neural Repair</italic></source> <volume>22</volume> <fpage>111</fpage>&#x2013;<lpage>121</lpage>. <pub-id pub-id-type="doi">10.1177/1545968307305457</pub-id> <pub-id pub-id-type="pmid">17876068</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lee</surname> <given-names>K. H.</given-names></name> <name><surname>Baek</surname> <given-names>S. G.</given-names></name> <name><surname>Lee</surname> <given-names>H. J.</given-names></name> <name><surname>Choi</surname> <given-names>H. R.</given-names></name> <name><surname>Moon</surname> <given-names>H.</given-names></name> <name><surname>Koo</surname> <given-names>J. C.</given-names></name></person-group> (<year>2018</year>). <article-title>Enhanced transparency for physical human-robot interaction using human hand impedance compensation.</article-title> <source><italic>IEEE ASME Trans. Mechatron.</italic></source> <volume>23</volume> <fpage>2662</fpage>&#x2013;<lpage>2670</lpage>. <pub-id pub-id-type="doi">10.1109/tmech.2018.2875690</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>Z. J.</given-names></name> <name><surname>Huang</surname> <given-names>B.</given-names></name> <name><surname>Ye</surname> <given-names>Z. F.</given-names></name> <name><surname>Deng</surname> <given-names>M. D.</given-names></name> <name><surname>Yang</surname> <given-names>C. G.</given-names></name></person-group> (<year>2018</year>). <article-title>Physical human-robot interaction of a robotic exoskeleton by admittance control.</article-title> <source><italic>IEEE Trans. Industr. Electron.</italic></source> <volume>65</volume> <fpage>9614</fpage>&#x2013;<lpage>9624</lpage>. <pub-id pub-id-type="doi">10.1109/tie.2018.2821649</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>Z. J.</given-names></name> <name><surname>Liu</surname> <given-names>J. Q.</given-names></name> <name><surname>Huang</surname> <given-names>Z. C.</given-names></name> <name><surname>Peng</surname> <given-names>Y.</given-names></name> <name><surname>Pu</surname> <given-names>H. Y.</given-names></name> <name><surname>Ding</surname> <given-names>L.</given-names></name></person-group> (<year>2017</year>). <article-title>Adaptive impedance control of human-robot cooperation using reinforcement learning.</article-title> <source><italic>IEEE Trans. Industr. Electron.</italic></source> <volume>64</volume> <fpage>8013</fpage>&#x2013;<lpage>8022</lpage>. <pub-id pub-id-type="doi">10.1109/tie.2017.2694391</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Losey</surname> <given-names>D. P.</given-names></name> <name><surname>O&#x2019;Malley</surname> <given-names>M. K.</given-names></name></person-group> (<year>2018</year>). <article-title>Trajectory deformations from physical human&#x2013;robot interaction.</article-title> <source><italic>IEEE Trans. Robot.</italic></source> <volume>34</volume> <fpage>126</fpage>&#x2013;<lpage>138</lpage>. <pub-id pub-id-type="doi">10.1109/tro.2017.2765335</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mao</surname> <given-names>Y.</given-names></name> <name><surname>Jin</surname> <given-names>X.</given-names></name> <name><surname>Dutta</surname> <given-names>G. G.</given-names></name> <name><surname>Scholz</surname> <given-names>J. P.</given-names></name> <name><surname>Agrawal</surname> <given-names>S. K.</given-names></name></person-group> (<year>2015</year>). <article-title>Human movement training with a cable driven ARm EXoskeleton (CAREX).</article-title> <source><italic>IEEE Trans. Neural Syst. Rehabil. Eng.</italic></source> <volume>23</volume> <fpage>84</fpage>&#x2013;<lpage>92</lpage>. <pub-id pub-id-type="doi">10.1109/tnsre.2014.2329018</pub-id> <pub-id pub-id-type="pmid">24919202</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Matinfar</surname> <given-names>M.</given-names></name> <name><surname>Hashtrudi-Zaad</surname> <given-names>K.</given-names></name></person-group> (<year>2016</year>). <article-title>Optimization-based robot compliance control: Geometric and linear quadratic approaches.</article-title> <source><italic>Int. J. Robot. Res.</italic></source> <volume>24</volume> <fpage>645</fpage>&#x2013;<lpage>656</lpage>. <pub-id pub-id-type="doi">10.1177/0278364905056347</pub-id></citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meuleman</surname> <given-names>J.</given-names></name> <name><surname>van Asseldonk</surname> <given-names>E.</given-names></name> <name><surname>van Oort</surname> <given-names>G.</given-names></name> <name><surname>Rietman</surname> <given-names>H.</given-names></name> <name><surname>van der Kooij</surname> <given-names>H.</given-names></name></person-group> (<year>2016</year>). <article-title>LOPES II-design and evaluation of an admittance controlled gait training robot with shadow-leg approach.</article-title> <source><italic>IEEE Trans. Neural Syst. Rehabil. Eng.</italic></source> <volume>24</volume> <fpage>352</fpage>&#x2013;<lpage>363</lpage>. <pub-id pub-id-type="doi">10.1109/tnsre.2015.2511448</pub-id> <pub-id pub-id-type="pmid">26731771</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mnih</surname> <given-names>V.</given-names></name> <name><surname>Kavukcuoglu</surname> <given-names>K.</given-names></name> <name><surname>Silver</surname> <given-names>D.</given-names></name> <name><surname>Rusu</surname> <given-names>A. A.</given-names></name> <name><surname>Veness</surname> <given-names>J.</given-names></name> <name><surname>Bellemare</surname> <given-names>M. G.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Human-level control through deep reinforcement learning.</article-title> <source><italic>Nature</italic></source> <volume>518</volume> <fpage>529</fpage>&#x2013;<lpage>533</lpage>. <pub-id pub-id-type="doi">10.1038/nature14236</pub-id> <pub-id pub-id-type="pmid">25719670</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Modares</surname> <given-names>H.</given-names></name> <name><surname>Lewis</surname> <given-names>F. L.</given-names></name></person-group> (<year>2014</year>). <article-title>Linear quadratic tracking control of partially-unknown continuous-time systems using reinforcement learning.</article-title> <source><italic>IEEE Trans. Automat. Contr.</italic></source> <volume>59</volume> <fpage>3051</fpage>&#x2013;<lpage>3056</lpage>. <pub-id pub-id-type="doi">10.1109/tac.2014.2317301</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Modares</surname> <given-names>H.</given-names></name> <name><surname>Lewis</surname> <given-names>F. L.</given-names></name> <name><surname>Jiang</surname> <given-names>Z. P.</given-names></name></person-group> (<year>2015</year>). <article-title>H-infinity tracking control of completely unknown continuous-time systems via off-policy reinforcement learning.</article-title> <source><italic>IEEE Trans. Neural Netw. Learn. Syst.</italic></source> <volume>26</volume> <fpage>2550</fpage>&#x2013;<lpage>2562</lpage>. <pub-id pub-id-type="doi">10.1109/tnnls.2015.2441749</pub-id> <pub-id pub-id-type="pmid">26111401</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Modares</surname> <given-names>H.</given-names></name> <name><surname>Ranatunga</surname> <given-names>I.</given-names></name> <name><surname>Lewis</surname> <given-names>F. L.</given-names></name> <name><surname>Popa</surname> <given-names>D. O.</given-names></name></person-group> (<year>2016</year>). <article-title>Optimized assistive human-robot interaction using reinforcement learning.</article-title> <source><italic>IEEE Trans. Cybern.</italic></source> <volume>46</volume> <fpage>655</fpage>&#x2013;<lpage>667</lpage>. <pub-id pub-id-type="doi">10.1109/tcyb.2015.2412554</pub-id> <pub-id pub-id-type="pmid">25823055</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nef</surname> <given-names>T.</given-names></name> <name><surname>Mihelj</surname> <given-names>M.</given-names></name> <name><surname>Riener</surname> <given-names>R.</given-names></name></person-group> (<year>2007</year>). <article-title>ARMin: A robot for patient-cooperative arm therapy.</article-title> <source><italic>Med. Biol. Eng. Comput.</italic></source> <volume>45</volume> <fpage>887</fpage>&#x2013;<lpage>900</lpage>. <pub-id pub-id-type="doi">10.1007/s11517-007-0226-6</pub-id> <pub-id pub-id-type="pmid">17674069</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Peng</surname> <given-names>G.</given-names></name> <name><surname>Chen</surname> <given-names>C. L. P.</given-names></name> <name><surname>Yang</surname> <given-names>C.</given-names></name></person-group> (<year>2022</year>). <article-title>Neural networks enhanced optimal admittance control of robot-environment interaction using reinforcement learning.</article-title> <source><italic>IEEE Trans. Neural Netw. Learn. Syst.</italic></source> <volume>33</volume> <fpage>4551</fpage>&#x2013;<lpage>4561</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2021.3057958</pub-id> <pub-id pub-id-type="pmid">33651696</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pfeifer</surname> <given-names>S.</given-names></name> <name><surname>Vallery</surname> <given-names>H.</given-names></name> <name><surname>Hardegger</surname> <given-names>M.</given-names></name> <name><surname>Riener</surname> <given-names>R.</given-names></name> <name><surname>Perreault</surname> <given-names>E. J.</given-names></name></person-group> (<year>2012</year>). <article-title>Model-based estimation of knee stiffness.</article-title> <source><italic>IEEE Trans. Biomed. Eng.</italic></source> <volume>59</volume> <fpage>2604</fpage>&#x2013;<lpage>2612</lpage>. <pub-id pub-id-type="doi">10.1109/TBME.2012.2207895</pub-id> <pub-id pub-id-type="pmid">22801482</pub-id></citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Proietti</surname> <given-names>T.</given-names></name> <name><surname>Crocher</surname> <given-names>V.</given-names></name> <name><surname>Roby-Brami</surname> <given-names>A.</given-names></name> <name><surname>Jarrasse</surname> <given-names>N.</given-names></name></person-group> (<year>2016</year>). <article-title>Upper-limb robotic exoskeletons for neurorehabilitation: A review on control strategies.</article-title> <source><italic>IEEE Rev. Biomed. Eng.</italic></source> <volume>9</volume> <fpage>4</fpage>&#x2013;<lpage>14</lpage>. <pub-id pub-id-type="doi">10.1109/rbme.2016.2552201</pub-id> <pub-id pub-id-type="pmid">27071194</pub-id></citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Riener</surname> <given-names>R.</given-names></name> <name><surname>Lunenburger</surname> <given-names>L.</given-names></name> <name><surname>Jezernik</surname> <given-names>S.</given-names></name> <name><surname>Anderschitz</surname> <given-names>M.</given-names></name> <name><surname>Colombo</surname> <given-names>G.</given-names></name> <name><surname>Dietz</surname> <given-names>V.</given-names></name></person-group> (<year>2005</year>). <article-title>Patient-cooperative strategies for robot-aided treadmill training: First experimental results.</article-title> <source><italic>IEEE Trans. Neural Syst. Rehabil. Eng.</italic></source> <volume>13</volume> <fpage>380</fpage>&#x2013;<lpage>394</lpage>. <pub-id pub-id-type="doi">10.1109/tnsre.2005.848628</pub-id> <pub-id pub-id-type="pmid">16200761</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sainburg</surname> <given-names>R. L.</given-names></name> <name><surname>Mutha</surname> <given-names>P. K.</given-names></name></person-group> (<year>2016</year>). &#x201C;<article-title>Movement neuroscience foundations of neurorehabilitation</article-title>,&#x201D; in <source><italic>Neurorehabilitation technology</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Reinkensmeyer</surname> <given-names>D. J.</given-names></name> <name><surname>Dietz</surname> <given-names>V.</given-names></name></person-group> (<publisher-loc>Cham</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>19</fpage>&#x2013;<lpage>38</lpage>.</citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Silver</surname> <given-names>D.</given-names></name> <name><surname>Huang</surname> <given-names>A.</given-names></name> <name><surname>Maddison</surname> <given-names>C. J.</given-names></name> <name><surname>Guez</surname> <given-names>A.</given-names></name> <name><surname>Sifre</surname> <given-names>L.</given-names></name> <name><surname>van den Driessche</surname> <given-names>G.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Mastering the game of Go with deep neural networks and tree search.</article-title> <source><italic>Nature</italic></source> <volume>529</volume> <fpage>484</fpage>&#x2013;<lpage>489</lpage>. <pub-id pub-id-type="doi">10.1038/nature16961</pub-id> <pub-id pub-id-type="pmid">26819042</pub-id></citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Suzuki</surname> <given-names>S.</given-names></name> <name><surname>Furuta</surname> <given-names>K.</given-names></name></person-group> (<year>2012</year>). <article-title>Adaptive impedance control to enhance human skill on a haptic interface system.</article-title> <source><italic>J. Contr. Sci. Eng.</italic></source> <volume>2012</volume> <fpage>365067</fpage>&#x2013;<lpage>365077</lpage>. <pub-id pub-id-type="doi">10.1155/2012/365067</pub-id></citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vrabie</surname> <given-names>D.</given-names></name> <name><surname>Pastravanu</surname> <given-names>O.</given-names></name> <name><surname>Abu-Khalaf</surname> <given-names>M.</given-names></name> <name><surname>Lewis</surname> <given-names>F. L.</given-names></name></person-group> (<year>2009</year>). <article-title>Adaptive optimal control for continuous-time linear systems based on policy iteration.</article-title> <source><italic>Automatica</italic></source> <volume>45</volume> <fpage>477</fpage>&#x2013;<lpage>484</lpage>. <pub-id pub-id-type="doi">10.1016/j.automatica.2008.08.017</pub-id></citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Warraich</surname> <given-names>Z.</given-names></name> <name><surname>Kleim</surname> <given-names>J. A.</given-names></name></person-group> (<year>2010</year>). <article-title>Neural plasticity: The biological substrate for neurorehabilitation.</article-title> <source><italic>PM R</italic></source> <volume>2</volume> <fpage>S208</fpage>&#x2013;<lpage>S219</lpage>. <pub-id pub-id-type="doi">10.1016/j.pmrj.2010.10.016</pub-id> <pub-id pub-id-type="pmid">21172683</pub-id></citation></ref>
<ref id="B39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wen</surname> <given-names>Y.</given-names></name> <name><surname>Si</surname> <given-names>J.</given-names></name> <name><surname>Brandt</surname> <given-names>A.</given-names></name> <name><surname>Gao</surname> <given-names>X.</given-names></name> <name><surname>Huang</surname> <given-names>H.</given-names></name></person-group> (<year>2019</year>). <article-title>Online reinforcement learning control for the personalization of a robotic knee prosthesis.</article-title> <source><italic>IEEE Trans. Cybern.</italic></source> <volume>50</volume> <fpage>2346</fpage>&#x2013;<lpage>2356</lpage>. <pub-id pub-id-type="doi">10.1109/TCYB.2019.2890974</pub-id> <pub-id pub-id-type="pmid">30668514</pub-id></citation></ref>
<ref id="B40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wolbrecht</surname> <given-names>E. T.</given-names></name> <name><surname>Chan</surname> <given-names>V.</given-names></name> <name><surname>Reinkensmeyer</surname> <given-names>D. J.</given-names></name> <name><surname>Bobrow</surname> <given-names>J. E.</given-names></name></person-group> (<year>2008</year>). <article-title>Optimizing compliant, model-based robotic assistance to promote neurorehabilitation.</article-title> <source><italic>IEEE Trans. Neural Syst. Rehabil. Eng.</italic></source> <volume>16</volume> <fpage>286</fpage>&#x2013;<lpage>297</lpage>. <pub-id pub-id-type="doi">10.1109/TNSRE.2008.918389</pub-id> <pub-id pub-id-type="pmid">18586608</pub-id></citation></ref>
<ref id="B41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>R.</given-names></name> <name><surname>Li</surname> <given-names>Z.</given-names></name> <name><surname>Lyu</surname> <given-names>Y.</given-names></name> <name><surname>Song</surname> <given-names>R.</given-names></name></person-group> (<year>2022</year>). <article-title>Fast finite-time tracking control for a 3-DOF cable-driven parallel robot by adding a power integrator.</article-title> <source><italic>Mechatronics</italic></source> <volume>84</volume>:<issue>102782</issue>. <pub-id pub-id-type="doi">10.1016/j.mechatronics.2022.102782</pub-id></citation></ref>
<ref id="B42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yang</surname> <given-names>R.</given-names></name> <name><surname>Zhou</surname> <given-names>J.</given-names></name> <name><surname>Song</surname> <given-names>R.</given-names></name></person-group> (<year>2021</year>). &#x201C;<article-title>Adaptive admittance control based on linear quadratic regulation optimization technique for a lower limb rehabilitation robot</article-title>,&#x201D; in <source><italic>Proceedings of the 6th IEEE international conference on advanced robotics and mechatronics (ICARM)</italic></source>, <publisher-loc>Chongqing</publisher-loc>.</citation></ref>
<ref id="B43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>J.</given-names></name> <name><surname>Li</surname> <given-names>Z. J.</given-names></name> <name><surname>Li</surname> <given-names>X. M.</given-names></name> <name><surname>Wang</surname> <given-names>X. Y.</given-names></name> <name><surname>Song</surname> <given-names>R.</given-names></name></person-group> (<year>2021</year>). <article-title>Human-robot cooperation control based on trajectory deformation algorithm for a lower limb rehabilitation robot.</article-title> <source><italic>IEEE ASME Trans. Mechatron.</italic></source> <volume>26</volume> <fpage>3128</fpage>&#x2013;<lpage>3138</lpage>. <pub-id pub-id-type="doi">10.1109/tmech.2021.3053562</pub-id></citation></ref>
</ref-list>
</back>
</article>