<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Bioeng. Biotechnol.</journal-id>
<journal-title>Frontiers in Bioengineering and Biotechnology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Bioeng. Biotechnol.</abbrev-journal-title>
<issn pub-type="epub">2296-4185</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">839586</article-id>
<article-id pub-id-type="doi">10.3389/fbioe.2022.839586</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Bioengineering and Biotechnology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>MGMSN: Multi-Granularity Matching Model Based on Siamese Neural Network</article-title>
<alt-title alt-title-type="left-running-head">Wang and Yang</alt-title>
<alt-title alt-title-type="right-running-head">Multi-Granularity Matching Model</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name>
<surname>Wang</surname>
<given-names>Xin</given-names>
</name>
<xref ref-type="aff" rid="aff1">
<sup>1</sup>
</xref>
<uri xlink:href="https://loop.frontiersin.org/people/1601816/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Yang</surname>
<given-names>Huimin</given-names>
</name>
<xref ref-type="aff" rid="aff2">
<sup>2</sup>
</xref>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
</contrib>
</contrib-group>
<aff id="aff1">
<sup>1</sup>
<institution>Huafeng Meteorological Media Group</institution>, <addr-line>Beijing</addr-line>, <country>China</country>
</aff>
<aff id="aff2">
<sup>2</sup>
<institution>College of Computer and Software</institution>, <institution>Nanjing University of Information Science and Technology</institution>, <addr-line>Nanjing</addr-line>, <country>China</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1254880/overview">Tinggui Chen</ext-link>, Zhejiang Gongshang University, China</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1612506/overview">Yongli Xu</ext-link>, Beijing University of Chemical Technology, China</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/686697/overview">Hong Chen</ext-link>, Huazhong Agricultural University, China</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Huimin Yang, <email>78711730@qq.com</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Bionics and Biomimetics, a section of the journal Frontiers in Bioengineering and Biotechnology</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>28</day>
<month>03</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>10</volume>
<elocation-id>839586</elocation-id>
<history>
<date date-type="received">
<day>20</day>
<month>12</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>15</day>
<month>02</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2022 Wang and Yang.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Wang and Yang</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<abstract>
<p>To overcome the shortcomings of existing text matching algorithms, we studied the related technologies of sentence matching and dialogue retrieval and propose a multi-granularity matching model based on Siamese neural networks. The method considers both the deep and the shallow semantic similarity of the input sentences so as to fully mine the similarity information between them. Moreover, to alleviate the out-of-vocabulary problem, it combines word and character granularity in the deep semantic matching to learn further information. Finally, comparative experiments were carried out on the Chinese data set LCQMC. The results confirm the effectiveness and generalization ability of the method, and several ablation experiments show the importance of each part of the model.</p>
</abstract>
<abstract abstract-type="graphical">
<title>Graphical Abstract</title>
<p>
<graphic xlink:href="fbioe-10-839586-fx1.tif" position="anchor"/>
</p>
</abstract>
<kwd-group>
<kwd>conversation system</kwd>
<kwd>retrieval model</kwd>
<kwd>semantic matching</kwd>
<kwd>Siamese neural network</kwd>
<kwd>multi-granularity</kwd>
</kwd-group>
<contract-num rid="cn001">2018YFC1507805</contract-num>
<contract-sponsor id="cn001">National Key Research and Development Program of China<named-content content-type="fundref-id">10.13039/501100012166</named-content>
</contract-sponsor>
</article-meta>
</front>
<body>
<sec id="s1">
<title>1 Introduction</title>
<p>Natural language processing is a hot research topic and a technological frontier in the fields of artificial intelligence and information processing. Understanding language and producing appropriate responses are primary tasks in realizing machine intelligence (<xref ref-type="bibr" rid="B23">Shang et al. (2015)</xref>; <xref ref-type="bibr" rid="B17">Ma et al. (2021b)</xref>). With the popularity of smartphones and the development of wireless technology, we are now in the age of social media, and conversational models have gradually become a mode of social interaction (<xref ref-type="bibr" rid="B16">Ma et al. (2021a)</xref>).</p>
<p>Early interaction systems, such as ELIZA (<xref ref-type="bibr" rid="B29">Weizenbaum (1983)</xref>), PARRY (<xref ref-type="bibr" rid="B4">Colby (1981)</xref>), and UC (<xref ref-type="bibr" rid="B31">Wilensky (1987)</xref>), were conversation models designed to imitate human behavior and pass the Turing test. Despite their impressive success, these conversation models are mainly based on manually crafted rules and therefore have limited performance (<xref ref-type="bibr" rid="B13">Lu et al. (2021)</xref>). Nowadays, retrieval-based methods are among the mainstream techniques for constructing conversation models. Generally, a retrieval-based model selects an appropriate response from a predefined corpus based on the input question: given a question, the model calculates the similarity between it and each context in the corpus, sorts the resulting matching scores, and returns the response associated with the highest-scoring context as the answer. The final response quality of a retrieval model is affected not only by the size of the corpus but also by the accuracy of the sentence similarity calculation. For the latter, it is necessary to analyze and extract features both within and between sentences.</p>
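As an illustrative sketch only (not taken from this article), the retrieval procedure just described, scoring every corpus context against the input question and returning the response of the best match, can be written as follows; the cosine similarity function and the toy three-dimensional corpus vectors are assumptions for demonstration:

```python
import numpy as np

def similarity(q_vec, c_vec):
    # Cosine similarity between a question vector and a context vector.
    return float(np.dot(q_vec, c_vec) /
                 (np.linalg.norm(q_vec) * np.linalg.norm(c_vec)))

def retrieve(question_vec, corpus):
    # corpus: list of (context_vector, response) pairs.
    # Score every context, then return the response of the best match.
    scores = [similarity(question_vec, ctx) for ctx, _ in corpus]
    best = int(np.argmax(scores))
    return corpus[best][1], scores[best]

# Toy example with 3-dimensional "sentence vectors".
corpus = [
    (np.array([1.0, 0.0, 0.0]), "response A"),
    (np.array([0.0, 1.0, 0.0]), "response B"),
]
response, score = retrieve(np.array([0.9, 0.1, 0.0]), corpus)
```

In the actual model, the placeholder cosine similarity would be replaced by the learned semantic similarity described in the following sections.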
<p>Traditional sentence matching methods are mainly based on statistical characteristics of sentences (<xref ref-type="bibr" rid="B37">Zhang et al. (2021)</xref>) or on word embedding (<xref ref-type="bibr" rid="B24">Shen et al. (2018)</xref>) to directly calculate the similarity between sentences. However, they often ignore the semantic features of sentences and are therefore ineffective in complex situations. With the development of deep learning and its successful application in various fields, using it to mine deep sentence representations has attracted increasing attention in sentence matching. Generally, a neural network is used to encode the two statements into sentence vectors, and the relationship between the sentences is then determined according to the similarity of the two vectors (<xref ref-type="bibr" rid="B2">Bowman et al. (2015)</xref>; <xref ref-type="bibr" rid="B32">Yang et al. (2015)</xref>; <xref ref-type="bibr" rid="B15">Lu et al. (2020b)</xref>). However, this kind of framework ignores the lower-level interactions between the two sentences. The matching&#x2013;aggregation framework was therefore proposed to match two sentences at the word level and then aggregate the matching information based on the attention mechanism for the final decision. <xref ref-type="bibr" rid="B22">Rockt&#xe4;schel et al. (2016)</xref> employed word-by-word attention to obtain a sentence pair encoding from fine-grained reasoning via soft alignment of words and phrases in the premise and hypothesis, which achieved very promising results on the SNLI data set. <xref ref-type="bibr" rid="B27">Wang and Jiang (2017)</xref> proposed match LSTM for natural language inference, which tries to match the current word in the hypothesis with an attention-weighted representation of the premise calculated by word-by-word attention. However, these methods consider only word-granularity information. <xref ref-type="bibr" rid="B14">Lu et al.
(2020a)</xref> proposed a hierarchical encoding model (HEM) for sentence representation, further enhancing sentence interaction through a hierarchical matching mechanism. <xref ref-type="bibr" rid="B35">Yu R. et al. (2021)</xref> observed that existing neural networks were usually limited to 1D sequential models, which hampered further performance improvement; they therefore proposed a novel architecture for sentence pair modeling that uses 1D sentences to construct multidimensional feature maps, similar to images with multiple color channels. However, retrieval models are usually applied to task-oriented dialogue generation and are trained only on domain-specific data sets. In such settings, the generalization ability of the aforementioned models is poor, and they cannot respond to common input questions.</p>
<p>Based on the previous discussion, this study proposes a multi-granularity matching model based on Siamese networks (MGMSN). The method not only uses deep learning with character-granularity and word-granularity feature extraction to improve the accuracy of similarity calculation but also adds shallow semantic matching to increase the generalization of the model, so that it can still respond well to statements outside the corpus.</p>
<p>The rest of this article is organized as follows. Related work is introduced in <xref ref-type="sec" rid="s2">Section 2</xref>. The architecture of the proposed MGMSN is detailed in <xref ref-type="sec" rid="s3">Section 3</xref>. The experimental results of seven algorithms on the Chinese semantic similarity data set LCQMC by <xref ref-type="bibr" rid="B12">Liu et al. (2018)</xref> are compared in <xref ref-type="sec" rid="s4">Section 4</xref>, where we also detail the ablation experiments that show the effectiveness of each part of the model. Finally, we summarize this study in <xref ref-type="sec" rid="s5">Section 5</xref>.</p>
</sec>
<sec id="s2">
<title>2 Related Work</title>
<p>In this section, we briefly introduce some related theories and concepts. Specifically, bidirectional LSTM (BiLSTM) will be used to extract the character granularity and word granularity features. Siamese networks will be the core components of the proposed model.</p>
<sec id="s2-1">
<title>2.1 BiLSTM</title>
<p>The most important part of text analysis is the analysis of sentence sequences. Recurrent neural networks (RNNs) are widely applied to sequence problems, and their network structure differs significantly from that of traditional neural networks (<xref ref-type="bibr" rid="B36">Yu et al. (2020)</xref>; <xref ref-type="bibr" rid="B26">Wang et al. (2022)</xref>). However, RNNs suffer from the long-term dependency problem: as a sequence grows, gradients propagated through many time steps tend to vanish or explode, so earlier text information is gradually forgotten.</p>
<p>The long short-term memory network (LSTM) can alleviate this problem. It uses a gating mechanism to control the flow of information and memory cells to store long-term historical information. Adding gates is effectively a multilevel feature selection method (<xref ref-type="bibr" rid="B19">Na et al. (2021)</xref>). The LSTM model mainly includes input gates <italic>i</italic>
<sub>
<italic>t</italic>
</sub>, forgetting gates <italic>f</italic>
<sub>
<italic>t</italic>
</sub>, output gates <italic>O</italic>
<sub>
<italic>t</italic>
</sub> and memory units <italic>C</italic>
<sub>
<italic>t</italic>
</sub>. The specific structure is shown in <xref ref-type="fig" rid="F1">Figure 1</xref>.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>LSTM cell.</p>
</caption>
<graphic xlink:href="fbioe-10-839586-g001.tif"/>
</fig>
<p>First, the LSTM passes the forget gate to decide which information in the previous cell state should be forgotten. This is done by the sigmoid function, which maps the weighted sum of the output at the previous time step (time <italic>t</italic> &#x2212; 1) and the input at the current time step (time <italic>t</italic>) to a number between 0 and 1, where 0 means completely discard and 1 means fully retain. Its calculation is shown in <xref ref-type="disp-formula" rid="e1">Eq. 1</xref>:<disp-formula id="e1">
<mml:math id="m1">
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x22c5;</mml:mo>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>b</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
<label>(1)</label>
</disp-formula>
</p>
<p>Next, the input gate controls how much of the current information enters the cell state, and the memory cell is updated accordingly:<disp-formula id="e2">
<mml:math id="m2">
<mml:msub>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x22c5;</mml:mo>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>b</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
<label>(2)</label>
</disp-formula>
<disp-formula id="e3">
<mml:math id="m3">
<mml:msub>
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#xd7;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#xd7;</mml:mo>
<mml:mi mathvariant="italic">tanh</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x22c5;</mml:mo>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>b</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>f</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
<label>(3)</label>
</disp-formula>
</p>
<p>Finally, the information controlled by the output gate is used for the output at the current time step, computed as follows:<disp-formula id="e4">
<mml:math id="m4">
<mml:msub>
<mml:mrow>
<mml:mi>O</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x22c5;</mml:mo>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>b</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>o</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:math>
<label>(4)</label>
</disp-formula>
<disp-formula id="e5">
<mml:math id="m5">
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>O</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x22c5;</mml:mo>
<mml:mi mathvariant="italic">tanh</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>C</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
<label>(5)</label>
</disp-formula>
</p>
<p>Among them, <italic>w</italic>
<sub>
<italic>i</italic>
</sub>, <italic>w</italic>
<sub>
<italic>f</italic>
</sub>, and <italic>w</italic>
<sub>
<italic>o</italic>
</sub> are the weight matrices of the input gate, forget gate, and output gate, respectively; <italic>b</italic>
<sub>
<italic>i</italic>
</sub>, <italic>b</italic>
<sub>
<italic>f</italic>
</sub>, and <italic>b</italic>
<sub>
<italic>o</italic>
</sub> are the bias vectors of the input gate, forget gate, and output gate, respectively; <italic>&#x3c3;</italic> is the sigmoid activation function; <italic>h</italic>
<sub>
<italic>t</italic>&#x2212;1</sub> and <italic>h</italic>
<sub>
<italic>t</italic>
</sub> represent the state of the previous hidden layer and the current hidden layer, respectively; and <italic>x</italic>
<sub>
<italic>t</italic>
</sub> represents the input of the current cell.</p>
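As a hedged illustration (not code from this article), Eqs 1&#x2013;5 can be implemented directly as a single LSTM time step; the random weights stand in for trained parameters, and the candidate weights are named <monospace>w_c</monospace>/<monospace>b_c</monospace> here, corresponding to the symbols written inside the tanh of Eq. 3:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, params):
    # One LSTM time step following Eqs 1-5.
    # params holds weight matrices w_* and biases b_* for the
    # forget, input, candidate, and output gates.
    z = np.concatenate([h_prev, x_t])                   # [h_{t-1}, x_t]
    f_t = sigmoid(params["w_f"] @ z + params["b_f"])    # Eq. 1, forget gate
    i_t = sigmoid(params["w_i"] @ z + params["b_i"])    # Eq. 2, input gate
    c_tilde = np.tanh(params["w_c"] @ z + params["b_c"])
    c_t = f_t * c_prev + i_t * c_tilde                  # Eq. 3, cell state
    o_t = sigmoid(params["w_o"] @ z + params["b_o"])    # Eq. 4, output gate
    h_t = o_t * np.tanh(c_t)                            # Eq. 5, hidden state
    return h_t, c_t

rng = np.random.default_rng(0)
d_in, d_hid = 4, 3
params = {k: rng.standard_normal((d_hid, d_hid + d_in)) * 0.1
          for k in ("w_f", "w_i", "w_c", "w_o")}
params.update({k: np.zeros(d_hid) for k in ("b_f", "b_i", "b_c", "b_o")})
h, c = np.zeros(d_hid), np.zeros(d_hid)
h, c = lstm_step(rng.standard_normal(d_in), h, c, params)
```

Because the hidden state is an output-gated tanh of the cell state, each of its components always lies strictly between &#x2212;1 and 1.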
<p>However, the LSTM still has a limitation: it can only use the information preceding the current word and cannot exploit the context that follows it. In fact, a word's semantics depend not only on the preceding information but also on the information that follows. Therefore, the text sequence is also processed in reverse, so that the model becomes a bidirectional long short-term memory network (BiLSTM) composed of a forward and a backward pass. The BiLSTM network takes word vectors as input and obtains hidden state vectors from the forward and backward units of the hidden layer, respectively. Considering <inline-formula id="inf1">
<mml:math id="m6">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
<mml:mo>&#x20d7;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>.</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> and <inline-formula id="inf2">
<mml:math id="m7">
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
<mml:mo>&#x20d6;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x3d;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>,</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>.</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>h</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:math>
</inline-formula> as the forward and backward outputs of the hidden layer, the output of the BiLSTM hidden layer is obtained as follows:<disp-formula id="e6">
<mml:math id="m8">
<mml:mi>H</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
<mml:mo>&#x20d7;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>,</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mrow>
<mml:mi>H</mml:mi>
</mml:mrow>
<mml:mo>&#x20d6;</mml:mo>
</mml:mover>
</mml:mrow>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
<label>(6)</label>
</disp-formula>
</p>
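A minimal sketch of Eq. 6, under the simplifying assumption that a basic recurrent update stands in for the full LSTM cell of Eqs 1&#x2013;5: the sequence is processed forward and backward, and the two hidden states are concatenated at each time step:

```python
import numpy as np

def run_direction(xs, w, u, h0):
    # Simplified recurrent update h_t = tanh(W x_t + U h_{t-1}),
    # standing in for the full LSTM cell of Eqs 1-5.
    h, out = h0, []
    for x in xs:
        h = np.tanh(w @ x + u @ h)
        out.append(h)
    return out

def bilstm(xs, w, u, d_hid):
    h0 = np.zeros(d_hid)
    fwd = run_direction(xs, w, u, h0)               # forward pass
    bwd = run_direction(xs[::-1], w, u, h0)[::-1]   # backward pass, realigned
    # Eq. 6: concatenate forward and backward states per time step.
    return [np.concatenate([f, b]) for f, b in zip(fwd, bwd)]

rng = np.random.default_rng(1)
d_in, d_hid = 4, 3
xs = [rng.standard_normal(d_in) for _ in range(5)]
H = bilstm(xs, rng.standard_normal((d_hid, d_in)) * 0.1,
           rng.standard_normal((d_hid, d_hid)) * 0.1, d_hid)
```

Each element of `H` therefore has twice the hidden dimension, carrying both left-to-right and right-to-left context for the corresponding word.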
</sec>
<sec id="s2-2">
<title>2.2 Siamese Networks</title>
<p>A Siamese network (<xref ref-type="bibr" rid="B3">Bromley et al. (1993)</xref>) is an architecture for non-linear metric learning from similarity information. It naturally learns representations that embed the desired invariance and selectivity from explicit information about the similarity between pairs of objects. In contrast, an auto-encoder (<xref ref-type="bibr" rid="B28">Wang et al. (2016)</xref>) learns invariance through added noise and dimensionality reduction in the bottleneck layer, and selectivity through the requirement that the input be reproduced by the decoding part of the network. A Siamese network learns an invariant and selective representation directly from similarity and dissimilarity information. In natural language processing, Siamese networks are usually used to calculate the semantic similarity between sentences (<xref ref-type="bibr" rid="B10">Kenter et al. (2016)</xref>; <xref ref-type="bibr" rid="B18">Mueller and Thyagarajan (2016)</xref>; <xref ref-type="bibr" rid="B20">Neculoiu et al. (2016)</xref>). The structure of the Siamese network is shown in <xref ref-type="fig" rid="F2">Figure 2</xref>.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>Siamese network frame diagram.</p>
</caption>
<graphic xlink:href="fbioe-10-839586-g002.tif"/>
</fig>
<p>Generally, to calculate semantic similarity, sentences are formed into pairs and then fed into a Siamese network (<xref ref-type="bibr" rid="B5">Fan et al. (2020)</xref>).</p>
</sec>
</sec>
<sec id="s3">
<title>3 Proposed MGMSN Model</title>
<p>In this section, we will introduce the proposed MGMSN model in detail. It includes two basic blocks, a deep semantic matching block and a shallow semantic matching block.</p>
<sec id="s3-1">
<title>3.1 Deep Semantic Matching Block</title>
<p>For convenience, in this study, matching the input question against the contexts in the retrieval model is described as calculating the similarity between the input question <italic>x</italic>
<sub>1</sub> and each context in the corpus <italic>x</italic>
<sub>2</sub>. First, we use the Jieba tool (<xref ref-type="bibr" rid="B30">Wieting et al. (2016)</xref>) for word segmentation and character segmentation, where character segmentation splits the sentence into single characters. For the sentences <italic>x</italic>
<sub>1</sub> and <italic>x</italic>
<sub>2</sub> whose similarity is to be calculated, word and character segmentation yield two representations each, recorded as word sequence 1, character sequence 1, word sequence 2, and character sequence 2. After segmentation, each word and character sequence is converted into vector representations through the embedding layer, which finally forms the embedding matrix of the sentence. The embedding layer maps each token to a vector by loading the weights of a pretrained Word2vec model.</p>
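As an illustration of the segmentation and embedding steps, the sketch below splits a sentence into characters directly and hard-codes a Jieba-style word segmentation; the word split and the random embedding table are assumptions for demonstration only (the article loads pretrained Word2vec weights):

```python
import numpy as np

# Character-granularity segmentation is simply a split into single
# characters; word segmentation in the article uses the Jieba tool.
sentence = "今天天气很好"

char_seq = list(sentence)                       # character sequence
word_seq = ["今天", "天气", "很", "好"]           # illustrative Jieba-style output

# Each token is then mapped to a vector by an embedding lookup;
# here a random toy table replaces the pretrained Word2vec weights.
rng = np.random.default_rng(3)
vocab = {tok: rng.standard_normal(8) for tok in set(char_seq + word_seq)}
char_matrix = np.stack([vocab[c] for c in char_seq])
word_matrix = np.stack([vocab[w] for w in word_seq])
```

The two matrices are the character-granularity and word-granularity inputs that the deep semantic matching block processes in parallel.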
<p>The network structure of the deep semantic matching algorithm consists of four layers: the embedding layer, coding layer, comparison layer, and aggregation layer. <xref ref-type="fig" rid="F3">Figure 3</xref> shows the structure of the deep semantic matching algorithm designed in this study. The algorithm comprises two parts: word-granularity feature extraction and character-granularity feature extraction.</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>Structure diagram of the deep semantic matching algorithm.</p>
</caption>
<graphic xlink:href="fbioe-10-839586-g003.tif"/>
</fig>
<sec id="s3-1-1">
<title>3.1.1 Word Granularity Feature Extraction</title>
<p>In our deep semantic matching block, Siamese networks are used to learn the similarity between input sentences. To capture richer information, BiLSTM and attention mechanisms are used to analyze and extract sentence semantics, and their results are combined to obtain the word-granularity deep semantic matching feature.</p>
<p>After tokenization, the input statements <italic>x</italic>
<sub>1</sub> and <italic>x</italic>
<sub>2</sub> are converted into word sequences <italic>Q</italic> and <italic>P</italic>, respectively. The embedding matrix of a word sequence is further obtained by the embedding layer, which is denoted as <italic>Q</italic> &#x2208; <bold>R</bold>
<sup>
<italic>d</italic>&#xd7;<italic>q</italic>
</sup> and <italic>P</italic> &#x2208; <bold>R</bold>
<sup>
<italic>d</italic>&#xd7;<italic>p</italic>
</sup>, where <italic>d</italic> is the dimension of the word vectors and each column of <italic>Q</italic> and <italic>P</italic> is the vector of one word. After the embedding matrix of the input statement is obtained, the coding layer performs further feature extraction on it to capture the implicit semantics of the sentence. The coding layer is the core part of the deep semantic matching algorithm. To obtain more information at word granularity, we use two feature extractors: BiLSTM and attention.</p>
<p>In standard NLP tasks, such as text matching and named entity recognition, BiLSTM generally outperforms standard LSTM. The number of LSTM layers greatly affects training efficiency, and an LSTM with more than three layers becomes difficult to train; our model therefore adopts a two-layer BiLSTM. After the embedding layer, the embedding matrix of the input statement is fed into the BiLSTM, and the hidden state at the last time step is output as <italic>h</italic>
<sub>
<italic>bi</italic>&#x2212;<italic>lstm</italic>
</sub>. The word embedding matrix of the input sentence <italic>x</italic>
<sub>1</sub>, <italic>x</italic>
<sub>2</sub> is changed as the semantic vectors <inline-formula id="inf3">
<mml:math id="m9">
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>word&#x2009;</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>l</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>m</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> and <inline-formula id="inf4">
<mml:math id="m10">
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>word&#x2009;</mml:mtext>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>l</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>m</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> through the BiLSTM, corresponding to <italic>WH</italic>
<sub>1</sub> and <italic>WH</italic>
<sub>2</sub> in <xref ref-type="fig" rid="F3">Figure 3</xref>.</p>
<p>Deep neural networks can learn more information through training, but vanishing gradients prevent the network from being deepened indefinitely. Therefore, in this study, the network is widened horizontally. In addition to the feature extraction by BiLSTM, an attention layer is added to further learn the semantic information of the two input statements embedded in matrices <italic>Q</italic> and <italic>P</italic>. There are many ways to realize attention; here, the self-attention mechanism is used for feature extraction. Its essence is to align the text so as to obtain more relevant information in a targeted manner. By learning a set of weight parameters <italic>W</italic> &#x2208; <italic>R</italic>
<sup>
<italic>d</italic>
</sup>, the word embeddings of the input statement are aligned to obtain the attention weight vectors <inline-formula id="inf5">
<mml:math id="m11">
<mml:msub>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>Q</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:math>
</inline-formula> and <inline-formula id="inf6">
<mml:math id="m12">
<mml:msub>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>P</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
<mml:mo>&#x2026;</mml:mo>
<mml:mo>,</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>a</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:math>
</inline-formula>, corresponding to <italic>A</italic>
<sub>1</sub> and <italic>A</italic>
<sub>2</sub> in <xref ref-type="fig" rid="F3">Figure 3</xref>, where <italic>n</italic> represents the length of input statement <italic>x</italic>
<sub>1</sub>, <italic>m</italic> represents the length of input statement <italic>x</italic>
<sub>2</sub>, and the formulas of attention weight matrix are as follows,<disp-formula id="e7">
<mml:math id="m13">
<mml:msub>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>Q</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="normal">softmax</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>tanh</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>Q</mml:mi>
<mml:mi>W</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:math>
<label>(7)</label>
</disp-formula>
<disp-formula id="e8">
<mml:math id="m14">
<mml:msub>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>P</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi mathvariant="normal">softmax</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>tanh</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>P</mml:mi>
<mml:mi>W</mml:mi>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
<label>(8)</label>
</disp-formula>
</p>
<p>The vectors in the embedded matrix are further weighted and summed to obtain attention semantic vectors <inline-formula id="inf7">
<mml:math id="m15">
<mml:msubsup>
<mml:mrow>
<mml:mi>v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>attention&#x2009;</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>Q</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula> and <inline-formula id="inf8">
<mml:math id="m16">
<mml:msubsup>
<mml:mrow>
<mml:mi>v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>attention&#x2009;</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>P</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>A</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msub>
</mml:math>
</inline-formula>.</p>
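As a concrete illustration, the attention weighting of Eq. 7 and the subsequent weighted sum can be sketched in a few lines of NumPy; the matrix sizes and the random weight vector standing in for the learned <italic>W</italic> are illustrative assumptions, not the trained parameters:

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over the word positions.
    e = np.exp(x - x.max())
    return e / e.sum()

rng = np.random.default_rng(0)
n, d = 5, 8                      # n words, d-dimensional embeddings
Q = rng.normal(size=(n, d))      # embedded matrix of one input statement

# Eq. 7: one attention weight per word, A_Q = softmax(tanh(QW)).
W = rng.normal(size=(d,))        # illustrative stand-in for the learned W
A_Q = softmax(np.tanh(Q @ W))

# Attention semantic vector: the attention-weighted sum of word embeddings.
v_attention_q = Q.T @ A_Q
```

The softmax ensures the weights over the <italic>n</italic> word positions sum to one, so the attention semantic vector is a convex combination of the word embeddings.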
</sec>
<sec id="s3-1-2">
<title>3.1.2 Character Granularity Feature Extraction</title>
<p>At present, an important problem remains in text matching tasks: the out-of-vocabulary (OOV) problem. There are only about 2,500 Chinese characters in common daily use, but the number of words made up of these characters is huge. With current technology, we cannot obtain a word vector representation for every word. For a trained model, if an input sentence contains words not seen during training, the model cannot obtain word vector representations for them, which degrades the performance of the whole model. Because of the limited size of the corpus, OOV words appear very easily at word granularity, and a model based only on word granularity will have reduced discrimination ability. Therefore, in addition to calculating similarity at word granularity, this study further extends the granularity to the character level to obtain more text features and improve the flexibility of the model.</p>
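A toy vocabulary lookup makes the OOV argument concrete: a word unseen during training has no word vector, but every character in it may still be covered at character level. The vocabularies below are hypothetical, chosen only for illustration:

```python
# Hypothetical trained vocabularies: word-level coverage is sparse,
# character-level coverage is much denser.
word_vocab = {"天气", "怎么样"}
char_vocab = set("天气怎么样如何")

# "如何" never appeared as a word during training.
sentence = ["天气", "如何"]

oov_words = [w for w in sentence if w not in word_vocab]
oov_chars = [c for w in sentence for c in w if c not in char_vocab]
```

Here the word-granularity channel hits one OOV word while the character-granularity channel covers the whole sentence, which is exactly the gap the character Siamese network is meant to fill.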
<p>By adding a character granularity Siamese network, the characteristics of the text sequence can be analyzed and captured at a finer granularity, which further alleviates the OOV problem. However, finer segmentation of text sequences greatly increases the complexity of the model when capturing features. To reduce model complexity and avoid the overfitting caused by too many parameters, only a single-layer BiLSTM is used in the character granularity Siamese network. In the encoding layer, the weights of the word granularity and character granularity Siamese networks are shared.</p>
<p>After the tokenization, a character embedding layer is used to obtain the character embedding matrix of the input statement, and the semantic vectors <inline-formula id="inf9">
<mml:math id="m17">
<mml:msubsup>
<mml:mrow>
<mml:mi>v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>h</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>l</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>m</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> and <inline-formula id="inf10">
<mml:math id="m18">
<mml:msubsup>
<mml:mrow>
<mml:mi>v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>h</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
<mml:mo>&#x2212;</mml:mo>
<mml:mi>l</mml:mi>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>m</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> of each statement are obtained in the character granularity through the BiLSTM, corresponding to <italic>CH</italic>
<sub>1</sub> and <italic>CH</italic>
<sub>2</sub> in <xref ref-type="fig" rid="F3">Figure 3</xref>.</p>
<p>Through the coding layer, three kinds of semantic vectors are obtained for each input sentence: the BiLSTM semantic vector at word granularity, the attention semantic vector at word granularity, and the BiLSTM semantic vector at character granularity. These semantic vectors are passed to the comparison layer, where the features are combined according to a comparison function to depict the spatial difference between the input sentences. Through element-wise subtraction of the three kinds of semantic vectors of the two input statements, the difference information between their semantics is obtained. The formula for the output vector <italic>c</italic> of the comparison function is given as follows:<disp-formula id="e9">
<mml:math id="m19">
<mml:mi>c</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mfenced open="[" close="]">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi>v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>word</mml:mtext>
<mml:mo>-</mml:mo>
<mml:mtext>lstm&#x2009;</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>word</mml:mtext>
<mml:mo>-</mml:mo>
<mml:mtext>lstm&#x2009;</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>attention&#x2009;</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>attention&#x2009;</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>char&#x2009;</mml:mtext>
<mml:mo>-</mml:mo>
<mml:mtext>lstm&#x2009;</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x2212;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi>v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>char&#x2009;</mml:mtext>
<mml:mo>-</mml:mo>
<mml:mtext>lstm&#x2009;</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
<label>(9)</label>
</disp-formula>
</p>
<p>Finally, the result of the feature comparison is passed to the aggregation layer, which is composed of a multilayer perceptron (MLP). The final output of the deep neural network is computed as follows:<disp-formula id="e10">
<mml:math id="m20">
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mi>W</mml:mi>
<mml:mi>c</mml:mi>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>b</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
<label>(10)</label>
</disp-formula>
</p>
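Under the (illustrative) assumption of random stand-ins for the six semantic vectors, the comparison and aggregation steps of Eqs. 9 and 10 can be sketched as follows; the weight vector and bias are placeholders, not trained values:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

rng = np.random.default_rng(1)
d = 8                                 # illustrative vector dimension

# Stand-ins for the three semantic vector pairs of the two sentences.
v_word_q, v_word_p = rng.normal(size=(2, d))
v_att_q, v_att_p = rng.normal(size=(2, d))
v_char_q, v_char_p = rng.normal(size=(2, d))

# Eq. 9: concatenate the element-wise differences of the three pairs.
c = np.concatenate([v_word_q - v_word_p,
                    v_att_q - v_att_p,
                    v_char_q - v_char_p])

# Eq. 10: a perceptron layer with sigmoid yields the deep similarity y1.
W, b = rng.normal(size=(3 * d,)), 0.0
y1 = sigmoid(W @ c + b)
```

The sigmoid squashes the aggregated comparison into (0, 1), so <italic>y</italic><sub>1</sub> can be read directly as a similarity score.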
</sec>
</sec>
<sec id="s3-2">
<title>3.2 Shallow Semantic Matching Algorithms</title>
<p>The retrieval model is well suited to task-oriented conversation systems, so the collected corpus is often targeted at a specific field. With these corpora, the neural network model is trained to learn the semantic information of sentences in the corpus and judge the similarity between sentences. Because of the limitations of the corpus domain, similarity within that domain can be calculated well, but for sentences outside the corpus, the network often lacks sufficient training information and generalization ability to compute reliable similarities. Following Google&#x2019;s Word2vec, many companies have trained word vectors on large corpora and released the word vector weights publicly. After the sentence is pretrained by Word2vec to obtain word vectors, the word vectors of all the words in the sentence are combined into a sentence embedding, and the similarity of the two input sentences is obtained by directly comparing the two sentence vectors. Well-trained word vectors capture much semantic information, so when the neural model lacks special domain knowledge and has limited generalization ability, this simple method can even be more effective in text similarity tasks than the LSTM model. By calculating the shallow semantic matching of sentences, the model can give a reasonable response to out-of-domain sentences through multilevel analysis.</p>
<p>The shallow semantic matching algorithm also uses the embedding layer to obtain the embedding matrices <italic>Q</italic> and <italic>P</italic> of the input statements. Although there are many ways to convert a word embedding matrix into a sentence vector, after comprehensive consideration, we choose word vector summation and averaging. These two methods need no additional parameter training and directly yield the embedded representation of the sentence. They can be formulated as follows:<disp-formula id="e11">
<mml:math id="m21">
<mml:msub>
<mml:mrow>
<mml:mi>v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:mfrac>
<mml:munderover accentunder="false" accent="false">
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:msubsup>
<mml:mrow>
<mml:mi>W</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msubsup>
<mml:mo>,</mml:mo>
</mml:math>
<label>(11)</label>
</disp-formula>
<disp-formula id="e12">
<mml:math id="m22">
<mml:msub>
<mml:mrow>
<mml:mi>v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mi>u</mml:mi>
<mml:mi>m</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:munderover accentunder="false" accent="false">
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>n</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:msubsup>
<mml:mrow>
<mml:mi>W</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>x</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:msubsup>
<mml:mo>.</mml:mo>
</mml:math>
<label>(12)</label>
</disp-formula>
</p>
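For a small illustrative embedded matrix, Eqs. 11 and 12 reduce to NumPy reductions along the word axis:

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 6, 4                      # n words, d-dimensional word vectors
Q = rng.normal(size=(n, d))      # embedded matrix of one sentence

v_mean = Q.mean(axis=0)          # Eq. 11: average of the word vectors
v_sum = Q.sum(axis=0)            # Eq. 12: sum of the word vectors
```

The two representations differ only by the constant factor <italic>n</italic>, which is consistent with the observation below that neither clearly outperforms the other.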
<p>There is no significant difference in the performance of sentence representation using word vector summation or averaging, so this study uses the word vector averaging method for semantic representation. First, the embedded matrix is averaged by word, the cosine similarity of the two average word vectors is calculated, and the similarity is taken as the output of the shallow semantic matching module. The schematic diagram of the shallow semantic matching module is given in <xref ref-type="fig" rid="F4">Figure 4</xref>.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Structure diagram of shallow semantic matching algorithm.</p>
</caption>
<graphic xlink:href="fbioe-10-839586-g004.tif"/>
</fig>
<p>After obtaining the average word representations of sentences <inline-formula id="inf11">
<mml:math id="m23">
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>mean&#x2009;</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> and <inline-formula id="inf12">
<mml:math id="m24">
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>mean&#x2009;</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:math>
</inline-formula> respectively, the shallow semantic similarity <italic>y</italic>
<sub>2</sub> is obtained by calculating the cosine similarity of the two statements as follows:<disp-formula id="e13">
<mml:math id="m25">
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x3d;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>mean&#x2009;</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:msubsup>
<mml:mo>&#x22c5;</mml:mo>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>mean&#x2009;</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
<mml:mrow>
<mml:mfenced open="&#x2016;" close="&#x2016;">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>mean&#x2009;</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>q</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
<mml:mo>&#xd7;</mml:mo>
<mml:mfenced open="&#x2016;" close="&#x2016;">
<mml:mrow>
<mml:msubsup>
<mml:mrow>
<mml:mi mathvariant="bold-italic">v</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mtext>mean&#x2009;</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
</mml:msubsup>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfrac>
<mml:mo>.</mml:mo>
</mml:math>
<label>(13)</label>
</disp-formula>
</p>
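The cosine similarity of Eq. 13 is a one-line computation; the two mean vectors below are random stand-ins for the averaged sentence embeddings:

```python
import numpy as np

def cosine_similarity(a, b):
    # Eq. 13: dot product divided by the product of the vector norms.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

rng = np.random.default_rng(3)
d = 4
v_mean_q = rng.normal(size=(d,))   # averaged embedding of sentence x1
v_mean_p = rng.normal(size=(d,))   # averaged embedding of sentence x2

y2 = cosine_similarity(v_mean_q, v_mean_p)
```

By construction the score lies in [&#x2212;1, 1], with identical directions scoring exactly 1.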
<p>The shallow semantic matching algorithm itself has no trainable parameters, but the word embedding weights of the preceding embedding layer are trainable. The purpose of this design is to let gradients from the shallow semantic matching module also update the embedding layer during training.</p>
</sec>
<sec id="s3-3">
<title>3.3 Framework of MGMSN Model</title>
<p>The input of MGMSN is a sentence pair <italic>X</italic>&#x3d;(<italic>x</italic>
<sub>1</sub>, <italic>x</italic>
<sub>2</sub>), and its output <italic>y</italic> is the similarity score of the sentences <italic>x</italic>
<sub>1</sub> and <italic>x</italic>
<sub>2</sub>. After obtaining the deep semantic similarity <italic>y</italic>
<sub>1</sub> and shallow semantic similarity <italic>y</italic>
<sub>2</sub> of the two sentences, respectively, the final output neuron is achieved by combining these two different levels of similarity.<disp-formula id="e14">
<mml:math id="m26">
<mml:mi>y</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mi>&#x3c3;</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>w</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mn>2</mml:mn>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mi>b</mml:mi>
</mml:mrow>
</mml:mfenced>
<mml:mo>.</mml:mo>
</mml:math>
<label>(14)</label>
</disp-formula>
</p>
<p>Here, <italic>w</italic>
<sub>1</sub>, <italic>w</italic>
<sub>2</sub>, and <italic>b</italic> are the weight parameters of the neural network. The training goal is to minimize the cross-entropy between the predicted value and the real value, which is given by<disp-formula id="e15">
<mml:math id="m27">
<mml:mtext>&#x2009;loss&#x2009;</mml:mtext>
<mml:mo>&#x3d;</mml:mo>
<mml:mo>&#x2212;</mml:mo>
<mml:munderover accentunder="false" accent="false">
<mml:mrow>
<mml:mo>&#x2211;</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
<mml:mo>&#x3d;</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mrow>
<mml:mi>N</mml:mi>
</mml:mrow>
</mml:munderover>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2061;</mml:mo>
<mml:mi>log</mml:mi>
<mml:msub>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>&#x2b;</mml:mo>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>y</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
<mml:mi>log</mml:mi>
<mml:mfenced open="(" close=")">
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>&#x2212;</mml:mo>
<mml:msub>
<mml:mrow>
<mml:mi>p</mml:mi>
</mml:mrow>
<mml:mrow>
<mml:mi>i</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mfenced>
</mml:mrow>
</mml:mfenced>
<mml:mo>,</mml:mo>
</mml:math>
<label>(15)</label>
</disp-formula>where <italic>y</italic>
<sub>
<italic>i</italic>
</sub> is the real value and <italic>p</italic>
<sub>
<italic>i</italic>
</sub> is the predicted value.</p>
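The final combination of Eq. 14 and the cross-entropy objective of Eq. 15 can be sketched as follows; the similarity scores, weights, and labels are illustrative values, not results from the trained model:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def bce_loss(y_true, y_pred):
    # Eq. 15: binary cross-entropy summed over the N sentence pairs.
    y_pred = np.clip(y_pred, 1e-7, 1 - 1e-7)  # avoid log(0)
    return float(-np.sum(y_true * np.log(y_pred)
                         + (1 - y_true) * np.log(1 - y_pred)))

# Eq. 14: combine deep (y1) and shallow (y2) similarities.
y1, y2 = 0.8, 0.6                 # illustrative similarity scores
w1, w2, b = 1.0, 1.0, -0.7        # illustrative weights, not trained values
y = sigmoid(w1 * y1 + w2 * y2 + b)

labels = np.array([1.0, 0.0])     # made-up ground-truth labels
preds = np.array([y, 1 - y])      # made-up predictions for two pairs
loss = bce_loss(labels, preds)
```

Minimizing this loss pushes the combined score toward 1 for matching pairs and toward 0 for non-matching pairs.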
<p>Through the processing of the above two parts, we get the similarity of the two input sentences. When applied to the conversation model, it uses the text matching model to calculate the matching degree between the input question and each context in the corpus. The higher the matching degree between the input question and the context is, the more appropriate the response corresponding to the context will be. This method increases the accuracy of text similarity calculations to a certain extent through multilevel and multi-granularity calculations. Combining <xref ref-type="fig" rid="F3">Figure 3</xref> and <xref ref-type="fig" rid="F4">Figure 4</xref>, the structure diagram of MGMSN is shown in <xref ref-type="fig" rid="F5">Figure 5</xref>.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Structure of MGMSN.</p>
</caption>
<graphic xlink:href="fbioe-10-839586-g005.tif"/>
</fig>
</sec>
</sec>
<sec id="s4">
<title>4 Experiment Results and Analysis</title>
<p>To evaluate the effectiveness of the proposed model, we compare MGMSN with six current mainstream text matching algorithms: the Deep Structured Semantic Model (DSSM) (<xref ref-type="bibr" rid="B8">Huang et al. (2013)</xref>), the Siamese LSTM network (Siamese LSTM) (<xref ref-type="bibr" rid="B20">Neculoiu et al. (2016)</xref>), the Attention-Based Convolutional Neural Network for Modeling Sentence Pairs (ABCNN) (<xref ref-type="bibr" rid="B33">Yin et al. (2016)</xref>), the Enhanced Sequential Inference Model (ESIM) (<xref ref-type="bibr" rid="B6">Greff et al. (2016)</xref>), the Deep Interactive Text Matching (DITM) model (<xref ref-type="bibr" rid="B34">Yu C. et al. (2021)</xref>), and the Frame-based Multi-level Semantics Representation (FMSR) model (<xref ref-type="bibr" rid="B7">Guo et al. (2021)</xref>).</p>
<sec id="s4-1">
<title>4.1 Data Description</title>
<p>In this study, we use the open Chinese semantic similarity data set LCQMC (<xref ref-type="bibr" rid="B12">Liu et al. (2018)</xref>) for training. LCQMC is a semantic matching data set published by the Harbin Institute of Technology at COLING 2018; the task is to judge whether the semantics of two questions are similar. The LCQMC data set contains 260,086 annotated sentence pairs. The data were drawn from different domains of Baidu QA; after preliminary screening, the most relevant question sets were extracted and then manually filtered and annotated. In the LCQMC corpus, the maximum sentence length is 131 characters, the minimum is 2 characters, and the average is 10 characters, which places the corpus in the short-text category. The data set is predivided into a training set, a validation set, and a test set, as described in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Data set structure.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Data</th>
<th align="center">Positive sample size</th>
<th align="center">Negative sample size</th>
<th align="center">Total</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Training set</td>
<td align="center">138,574</td>
<td align="center">100,192</td>
<td align="center">238,766</td>
</tr>
<tr>
<td align="left">Validation set</td>
<td align="center">4,402</td>
<td align="center">4,400</td>
<td align="center">8,802</td>
</tr>
<tr>
<td align="left">Test set</td>
<td align="center">6,250</td>
<td align="center">6,250</td>
<td align="center">12,500</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4-2">
<title>4.2 Implementation Details and Parameter Settings of MGMSN</title>
<p>To prevent the model from overfitting, we apply techniques such as dropout (<xref ref-type="bibr" rid="B1">Baldi and Sadowski (2013)</xref>), early stopping (<xref ref-type="bibr" rid="B21">Prechelt (1998)</xref>), and batch normalization (<xref ref-type="bibr" rid="B9">Ioffe and Szegedy (2015)</xref>). A dropout layer is added between the LSTM recurrent layer and the aggregation layer of the deep semantic matching module. By randomly dropping a certain proportion of neurons during training, dropout acts to some extent like data augmentation and reduces the co-adaptation between neurons, thereby preventing overfitting. Early stopping is a common method to prevent overfitting: the data set is divided into training, validation, and test sets, and the training process is monitored on the validation set. During training, validation accuracy (or validation loss) is tracked, and when the stopping condition is satisfied, training is terminated early. In this study, the tolerance is set to 2; that is, if validation accuracy does not improve for two consecutive epochs, the training process is terminated early.</p>
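The early-stopping rule with tolerance 2 described above can be sketched as a minimal loop; the validation accuracies are made-up values for illustration:

```python
def train_with_early_stopping(val_accuracies, patience=2):
    # Stop once validation accuracy fails to improve `patience` epochs in a row.
    best, wait, stopped_epoch = -1.0, 0, None
    for epoch, acc in enumerate(val_accuracies):
        if acc > best:
            best, wait = acc, 0       # improvement: reset the counter
        else:
            wait += 1                 # no improvement this epoch
            if wait >= patience:      # tolerance of 2 reached
                stopped_epoch = epoch
                break
    return best, stopped_epoch

best, stopped = train_with_early_stopping([0.70, 0.78, 0.77, 0.76, 0.81])
```

Note that training halts at epoch 3 even though a later epoch would have scored higher; this is the usual trade-off of a small patience value.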
<p>During the training of a neural network, as the parameters of earlier layers change, the input distribution of the current layer also changes, so the current layer must continuously adapt to the new distribution (<xref ref-type="bibr" rid="B25">Tang et al. (2018)</xref>). Batch normalization normalizes the input of each layer to zero mean and unit variance, avoiding shifts in the variable distribution. The batch normalization layers are applied between the multilayer perceptron layers. The NAdam method (<xref ref-type="bibr" rid="B11">Kingma and Ba (2015)</xref>) is selected as the model optimization strategy. <xref ref-type="table" rid="T2">Table 2</xref> shows the parameter settings of the model in the experimental environment. All word vectors are pretrained word vectors and are updated during the training process.</p>
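The core normalization step of batch normalization can be sketched as follows (the learnable scale and shift parameters of a full batch-normalization layer are omitted for brevity):

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    # Normalize each feature over the batch to mean 0 and variance 1.
    mean = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mean) / np.sqrt(var + eps)

rng = np.random.default_rng(4)
batch = rng.normal(loc=3.0, scale=2.0, size=(32, 8))  # shifted, scaled input
out = batch_norm(batch)
```

After this transform, each feature column has approximately zero mean and unit variance regardless of the distribution produced by the previous layer.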
<table-wrap id="T2" position="float">
<label>TABLE 2</label>
<caption>
<p>Model parameter setting.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Parameter</th>
<th align="center">Value</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Word vector dimension</td>
<td align="char" char=".">300</td>
</tr>
<tr>
<td align="left">Perceptron dimension</td>
<td align="char" char=".">128</td>
</tr>
<tr>
<td align="left">LSTM hidden state dimension</td>
<td align="char" char=".">128</td>
</tr>
<tr>
<td align="left">LSTM Circulation layer Dropout</td>
<td align="char" char=".">0.5</td>
</tr>
<tr>
<td align="left">Dropout</td>
<td align="char" char=".">0.5</td>
</tr>
<tr>
<td align="left">Maximum length of word input</td>
<td align="char" char=".">20</td>
</tr>
<tr>
<td align="left">Maximum length of character input</td>
<td align="char" char=".">40</td>
</tr>
<tr>
<td align="left">Word Siamese LSTM hidden state dimension</td>
<td align="char" char=".">32</td>
</tr>
<tr>
<td align="left">Maximum number of iterations</td>
<td align="char" char=".">15</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec id="s4-3">
<title>4.3 Comparison Results of Sentence Matching Algorithms</title>
<p>The proposed MGMSN model is evaluated against the other six sentence matching models in terms of sentence matching accuracy. <xref ref-type="table" rid="T3">Table 3</xref> shows the results of these models.</p>
<table-wrap id="T3" position="float">
<label>TABLE 3</label>
<caption>
<p>Comparison of matching accuracy of different models.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">Model</th>
<th align="center">Embedded type</th>
<th align="center">Training set</th>
<th align="center">Validation set</th>
<th align="center">Test set</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">DSSM</td>
<td align="left">Char</td>
<td align="char" char=".">0.66</td>
<td align="char" char=".">0.59</td>
<td align="char" char=".">0.63</td>
</tr>
<tr>
<td align="left">Siamese LSTM</td>
<td align="left">Word</td>
<td align="char" char=".">0.91</td>
<td align="char" char=".">0.78</td>
<td align="char" char=".">0.77</td>
</tr>
<tr>
<td align="left">ABCNN</td>
<td align="left">Char</td>
<td align="char" char=".">0.90</td>
<td align="char" char=".">0.76</td>
<td align="char" char=".">0.79</td>
</tr>
<tr>
<td align="left">ESIM</td>
<td align="left">Word</td>
<td align="char" char=".">0.77</td>
<td align="char" char=".">0.67</td>
<td align="char" char=".">0.72</td>
</tr>
<tr>
<td align="left">DITM</td>
<td align="left">Word</td>
<td align="char" char=".">0.81</td>
<td align="char" char=".">0.74</td>
<td align="char" char=".">0.77</td>
</tr>
<tr>
<td align="left">FMSR</td>
<td align="left">Word</td>
<td align="char" char=".">0.92</td>
<td align="char" char=".">0.83</td>
<td align="char" char=".">0.79</td>
</tr>
<tr>
<td align="left">MGMSN</td>
<td align="left">Word/Char</td>
<td align="char" char=".">0.94</td>
<td align="char" char=".">0.84</td>
<td align="char" char=".">0.85</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>In the table, &#x201c;Char&#x201d; means embedding at character granularity, and &#x201c;Word&#x201d; means embedding at word granularity. Although DSSM uses a fine-grained character-based embedding method, its overall matching performance is poor because it only uses fully connected layers for feature extraction. The ESIM model reaches an accuracy of only 0.77 on the training set, which shows that it does not fit the data set well and that text interactive reasoning information has no obvious effect on this text matching task. This under-fitting on the training set causes ESIM to perform worse than the Siamese LSTM model on the validation and test sets. In contrast, although the Siamese LSTM and ABCNN models exceed 0.9 on the training set, their performance drops markedly on the validation and test sets. Since the DITM model performs multiple iterations of the interaction process, it can obtain deep interaction information and extract the relationship between text pairs through multi-view pooling; it therefore achieves 81% accuracy on the training set, about 4% higher than the ESIM model. However, the Siamese network has irreplaceable advantages, and the word-level Siamese LSTM model is about 10% more accurate than the DITM model on the training set. The FMSR model exploits frame knowledge to explicitly extract multilevel semantic information in sentences for text matching tasks. Its accuracy is higher than that of the aforementioned models on the training, validation, and test sets, but it still falls short of the MGMSN model: about 0.02 lower on the training set, about 0.01 lower on the validation set, and about 0.06 lower on the test set.</p>
</sec>
<sec id="s4-4">
<title>4.4 Ablation Experiments</title>
<p>In this subsection, we conduct a set of ablation experiments on the model to prove the effectiveness of each component of the MGMSN model. Specifically, we sequentially remove the character granularity Siamese network in the component, attention feature extraction, and shallow semantic matching block. The changes in matching accuracy of the ablation experiments are given in <xref ref-type="table" rid="T4">Table 4</xref>.</p>
<table-wrap id="T4" position="float">
<label>TABLE 4</label>
<caption>
<p>Results of ablation experiment.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left"/>
<th align="center">Training set</th>
<th align="center">Validation set</th>
<th align="center">Test set</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">Character granularity Siamese network removed</td>
<td align="char" char=".">&#x2212;0.2%</td>
<td align="char" char=".">&#x2212;0.5%</td>
<td align="char" char=".">&#x2212;0.9%</td>
</tr>
<tr>
<td align="left">Attention feature extraction removed</td>
<td align="char" char=".">&#x2212;0.1%</td>
<td align="char" char=".">&#x2212;1.5%</td>
<td align="char" char=".">&#xb1;0.5%</td>
</tr>
<tr>
<td align="left">Shallow semantic matching block removed</td>
<td align="char" char=".">&#x2b;0.5%</td>
<td align="char" char=".">&#x2b;0.6%</td>
<td align="char" char=".">&#x2212;1.6%</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>It can be found that the shallow semantic matching block has the greatest impact on the accuracy of the final test set. After deleting it, the performance of the entire model in the test set dropped by 1.6%. At the same time, the accuracy of the training set and the validation set is improved, which means that the data distribution of the training set and the validation set is relatively consistent. However, the accuracy of the test set dropped sharply, which indicates that the data distribution of the test set and the training set may be inconsistent, and the model could not be generalized well to the test set. This further shows that the shallow semantic matching block improves the generalization ability of the model to a certain extent.</p>
<p>After removing the character granularity Siamese network, it can be found that the accuracy of the MGMSN model decreases to varying degrees on the training set, validation set, and test set. In the test set, the performance of the model decreases by 0.9%, which shows that the character granularity Siamese network can not only solve the problem of OOV but also extract different granularities of text semantic information to a certain extent and improve the accuracy of model matching.</p>
<p>When the attention feature extraction part is removed, the training process is affected little: training set accuracy decreases by only 0.1%, which indicates that the Siamese network structure itself is well suited to text semantic matching tasks. However, validation set performance drops by 1.5%, which shows that the model overfits to a certain degree. On the final test set, performance becomes unstable, fluctuating within a range of &#x2212;0.5% to +0.5%, which indicates that attention-based feature extraction improves the robustness of the model.</p>
<p>Because conversation systems have real-time requirements, we tested the prediction time of the MGMSN model on an i5-8250U CPU with 16&#xa0;GB of memory. The experiments show that the MGMSN model produces a prediction in 2&#xa0;ms, meeting the demand for millisecond-level response. Even in a CPU-only environment, the MGMSN model proposed in this article fully satisfies the needs of real-time conversation.</p>
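A per-prediction latency figure of this kind is typically obtained by averaging over many runs after a warm-up; the sketch below shows the general pattern (the `mean_latency_ms` helper and the toy predictor are illustrative stand-ins, not the authors' benchmark code).

```python
# Minimal latency-measurement sketch: time a callable over repeated runs
# after warming it up. `toy_predict` stands in for a real model's
# forward pass; any callable can be timed the same way.
import time

def mean_latency_ms(predict, pairs, warmup=10, runs=100):
    for p in pairs[:warmup]:           # warm caches / lazy initialization
        predict(p)
    start = time.perf_counter()        # monotonic, high-resolution clock
    for i in range(runs):
        predict(pairs[i % len(pairs)])
    return (time.perf_counter() - start) * 1000.0 / runs

# Stand-in "model": character-overlap score between a sentence pair.
toy_predict = lambda pair: len(set(pair[0]) & set(pair[1]))
pairs = [("how tall is it", "what is its height")] * 8
print(f"{mean_latency_ms(toy_predict, pairs):.3f} ms per prediction")
```

Averaging and warm-up matter because single cold-start measurements on a CPU can be dominated by cache and allocation effects rather than the model itself.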
</sec>
</sec>
<sec id="s5">
<title>5 Conclusion and Future Work</title>
<p>Unlike models that rely solely on a deep neural network to improve sentence-matching similarity, our approach both optimizes the deep neural network and combines it with a traditional matching model in order to further improve accuracy and generalization. In this way, the model responds well to user input questions whether or not they appear in the training corpus.</p>
<p>The multi-granularity matching model based on the Siamese network proposed in this article improves the scalability, robustness, and accuracy of the model by combining deep and shallow semantic matching algorithms and by using attention and BiLSTM in parallel to extract features, obtaining matching information from different views. To address the OOV problem, a character-granularity Siamese structure is further added to the deep semantic matching, enriching the network structure and capturing fine-grained matching features. The ablation experiments show that the character-granularity Siamese network, attention feature extraction, and shallow semantic matching algorithm all contribute to the MGMSN model. Experiments show that the accuracy of the MGMSN model proposed in this article is higher than that of six current mainstream text matching algorithms.</p>
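The idea of combining a learned deep score with a training-free shallow score can be sketched in miniature. In the example below, the shallow score is a character-bigram Jaccard overlap, while the deep score and the fusion weight `alpha` are placeholder assumptions for illustration, not the paper's actual components or values.

```python
# Hedged sketch of deep/shallow score fusion for sentence matching.
# The shallow score is lexical (character-bigram Jaccard); the deep
# score would come from a learned model and is passed in here.

def char_bigrams(s):
    # All overlapping two-character substrings of s.
    return {s[i:i + 2] for i in range(len(s) - 1)}

def shallow_score(a, b):
    # Jaccard overlap of character bigrams: robust and training-free.
    x, y = char_bigrams(a), char_bigrams(b)
    return len(x & y) / len(x | y) if x | y else 0.0

def fused_score(a, b, deep_score, alpha=0.7):
    # alpha weights the learned (deep) score against the lexical one.
    return alpha * deep_score + (1 - alpha) * shallow_score(a, b)

s = fused_score("open the window", "please open a window", deep_score=0.9)
print(round(s, 3))  # prints 0.78
```

The shallow component keeps the fused score meaningful even for inputs far from the training corpus, where a purely learned score may be unreliable.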
<p>Although the multi-granularity compound conversation model based on the Siamese network proposed in this article performs well, there is still room for further improvement on practical problems.<list list-type="simple">
<list-item>
<p>1) Most conversations span more than a single round, so analyzing and responding to multi-round conversation is important. The model proposed in this article mainly targets single-round conversation; for multi-round conversation, it cannot perform contextual analysis across multiple sentences or maintain consistency of responses. In future work, a hierarchical structure could be added to capture sentence-level semantics and multi-round context simultaneously, improving the model&#x2019;s response accuracy and topic consistency.</p>
</list-item>
<list-item>
<p>2) Expressing emotion in language is natural for humans, but capturing emotions and expressing them in real time remains a challenge for conversation systems. In future work, we can perform sentiment analysis on input sentences and use an emotion dictionary to generate emotional responses, so that the response reads more like that of a real person and the continuity of the conversation is preserved.</p>
</list-item>
</list>
</p>
</sec>
</body>
<back>
<sec id="s6">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/<xref ref-type="sec" rid="s11">Supplementary Material</xref>; further inquiries can be directed to the corresponding author.</p>
</sec>
<sec id="s7">
<title>Author Contributions</title>
<p>XW: conceptualization and funding acquisition. HY: methodology.</p>
</sec>
<sec id="s8">
<title>Funding</title>
<p>This research was funded by the National Key Research and Development Program of China (2018YFC1507805).</p>
</sec>
<sec sec-type="COI-statement" id="s9">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s10">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors, and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<sec id="s11">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fbioe.2022.839586/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fbioe.2022.839586/full&#x23;supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="DataSheet1.ZIP" id="SM1" mimetype="application/ZIP" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Baldi</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Sadowski</surname>
<given-names>P. J.</given-names>
</name>
</person-group> (<year>2013</year>). <article-title>Understanding Dropout</article-title>. <source>Adv. Neural Inf. Process. Syst.</source> <volume>26</volume>, <fpage>2814</fpage>&#x2013;<lpage>2822</lpage>. </citation>
</ref>
<ref id="B2">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bowman</surname>
<given-names>S. R.</given-names>
</name>
<name>
<surname>Angeli</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Potts</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Manning</surname>
<given-names>C. D.</given-names>
</name>
</person-group> (<year>2015</year>). <article-title>A Large Annotated Corpus for Learning Natural Language Inference</article-title>. <source>arXiv preprint arXiv:1508.05326</source>. <pub-id pub-id-type="doi">10.18653/v1/d15-1075</pub-id> </citation>
</ref>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Bromley</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Bentz</surname>
<given-names>J. W.</given-names>
</name>
<name>
<surname>Bottou</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Guyon</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>LeCun</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Moore</surname>
<given-names>C.</given-names>
</name>
<etal/>
</person-group> (<year>1993</year>). <article-title>Signature Verification Using a &#x201C;Siamese&#x201D; Time Delay Neural Network</article-title>. <source>Int. J. Patt. Recogn. Artif. Intell.</source> <volume>07</volume>, <fpage>669</fpage>&#x2013;<lpage>688</lpage>. <pub-id pub-id-type="doi">10.1142/s0218001493000339</pub-id> </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Colby</surname>
<given-names>K. M.</given-names>
</name>
</person-group> (<year>1981</year>). <article-title>Modeling a Paranoid Mind</article-title>. <source>Behav. Brain Sci.</source> <volume>4</volume>, <fpage>515</fpage>&#x2013;<lpage>534</lpage>. <pub-id pub-id-type="doi">10.1017/s0140525x00000030</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Fan</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Song</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Shibasaki</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Tsubouchi</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Trajectory Fingerprint: One-Shot Human Trajectory Identification Using Siamese Network</article-title>. <source>CCF Trans. Pervasive Comp. Interact.</source> <volume>2</volume>, <fpage>113</fpage>&#x2013;<lpage>125</lpage>. <pub-id pub-id-type="doi">10.1007/s42486-020-00034-2</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Greff</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Srivastava</surname>
<given-names>R. K.</given-names>
</name>
<name>
<surname>Koutn&#xed;k</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Steunebrink</surname>
<given-names>B. R.</given-names>
</name>
<name>
<surname>Schmidhuber</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>LSTM: A Search Space Odyssey</article-title>. <source>IEEE Trans. Neural Netw. Learn. Syst.</source> <volume>28</volume>, <fpage>2222</fpage>&#x2013;<lpage>2232</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2016.2582924</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Guo</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Guan</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Tan</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Frame-based Multi-Level Semantics Representation for Text Matching</article-title>. <source>Knowledge-Based Syst.</source> <volume>232</volume>, <fpage>107454</fpage>. <pub-id pub-id-type="doi">10.1016/j.knosys.2021.107454</pub-id> </citation>
</ref>
<ref id="B8">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Huang</surname>
<given-names>P.-S.</given-names>
</name>
<name>
<surname>He</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Gao</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Deng</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Acero</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Heck</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2013</year>). &#x201c;<article-title>Learning Deep Structured Semantic Models for Web Search Using Clickthrough Data</article-title>,&#x201d; in <conf-name>Proceedings of the 22nd ACM international conference on Information &#x26; Knowledge Management</conf-name>, <fpage>2333</fpage>&#x2013;<lpage>2338</lpage>. <pub-id pub-id-type="doi">10.1145/2505515.2505665</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Ioffe</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Szegedy</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2015</year>). &#x201c;<article-title>Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift</article-title>,&#x201d; in <conf-name>International conference on machine learning (PMLR)</conf-name>, <fpage>448</fpage>&#x2013;<lpage>456</lpage>. </citation>
</ref>
<ref id="B10">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Kenter</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Borisov</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>de Rijke</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2016</year>). <source>Siamese CBOW: Optimizing Word Embeddings for Sentence Representations</source>. <publisher-loc>Stroudsburg, Pennsylvania, USA</publisher-loc>: <publisher-name>Association for Computational Linguistics</publisher-name>. </citation>
</ref>
<ref id="B11">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Kingma</surname>
<given-names>D. P.</given-names>
</name>
<name>
<surname>Ba</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2015</year>). &#x201c;<article-title>Adam: A Method for Stochastic Optimization</article-title>,&#x201d; in <conf-name>3rd International Conference on Learning Representations, ICLR 2015, San Diego, CA, USA, May 7-9, 2015, Conference Track Proceedings</conf-name>. Editors <person-group person-group-type="editor">
<name>
<surname>Bengio</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>LeCun</surname>
<given-names>Y.</given-names>
</name>
</person-group>. </citation>
</ref>
<ref id="B12">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Liu</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Deng</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Zeng</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Chen</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>D.</given-names>
</name>
<etal/>
</person-group> (<year>2018</year>). &#x201c;<article-title>LCQMC: A Large-Scale Chinese Question Matching Corpus</article-title>,&#x201d; in <conf-name>Proceedings of the 27th International Conference on Computational Linguistics</conf-name>, <fpage>1952</fpage>&#x2013;<lpage>1962</lpage>. </citation>
</ref>
<ref id="B13">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lu</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Yu</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Jian</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Sentence Semantic Matching Based on 3D CNN for Human-Robot Language Interaction</article-title>. <source>ACM Trans. Internet Technol.</source> <volume>21</volume>, <fpage>1</fpage>&#x2013;<lpage>24</lpage>. <pub-id pub-id-type="doi">10.1145/3450520</pub-id> </citation>
</ref>
<ref id="B14">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lu</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2020a</year>). <article-title>Deep Hierarchical Encoding Model for Sentence Semantic Matching</article-title>. <source>J. Vis. Commun. Image Representation</source> <volume>71</volume>, <fpage>102794</fpage>. <pub-id pub-id-type="doi">10.1016/j.jvcir.2020.102794</pub-id> </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lu</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Liu</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Luo</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2020b</year>). <article-title>Concept Representation by Learning Explicit and Implicit Concept Couplings</article-title>. <source>IEEE Intell. Syst.</source> <volume>36</volume>, <fpage>6</fpage>&#x2013;<lpage>15</lpage>. </citation>
</ref>
<ref id="B16">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ma</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Al-Nabhan</surname>
<given-names>N.</given-names>
</name>
</person-group> (<year>2021a</year>). <article-title>Graph Classification Based on Structural Features of Significant Nodes and Spatial Convolutional Neural Networks</article-title>. <source>Neurocomputing</source> <volume>423</volume>, <fpage>639</fpage>&#x2013;<lpage>650</lpage>. <pub-id pub-id-type="doi">10.1016/j.neucom.2020.10.060</pub-id> </citation>
</ref>
<ref id="B17">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ma</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Tian</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Al-Nabhan</surname>
<given-names>N.</given-names>
</name>
</person-group> (<year>2021b</year>). <article-title>A Novel Rumor Detection Algorithm Based on Entity Recognition, Sentence Reconfiguration, and Ordinary Differential Equation Network</article-title>. <source>Neurocomputing</source> <volume>447</volume>, <fpage>224</fpage>&#x2013;<lpage>234</lpage>. <pub-id pub-id-type="doi">10.1016/j.neucom.2021.03.055</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Mueller</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Thyagarajan</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2016</year>). <source>Siamese Recurrent Architectures for Learning Sentence Similarity</source>. <publisher-loc>Palo Alto, California, U.S.</publisher-loc>: <publisher-name>AAAI Press</publisher-name>, <fpage>2786</fpage>&#x2013;<lpage>2792</lpage>. </citation>
</ref>
<ref id="B19">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Na</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Shin</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Lee</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>LSTM-Based Throughput Prediction for LTE Networks</article-title>. <source>ICT Express</source>. <pub-id pub-id-type="doi">10.1016/j.icte.2021.12.001</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Neculoiu</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Versteegh</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Rotaru</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2016</year>). &#x201c;<article-title>Learning Text Similarity with Siamese Recurrent Networks</article-title>,&#x201d; in <conf-name>Proceedings of the 1st Workshop on Representation Learning for NLP</conf-name>, <fpage>148</fpage>&#x2013;<lpage>157</lpage>. <pub-id pub-id-type="doi">10.18653/v1/w16-1617</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Prechelt</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>1998</year>). <article-title>Automatic Early Stopping Using Cross Validation: Quantifying the Criteria</article-title>. <source>Neural Networks</source> <volume>11</volume>, <fpage>761</fpage>&#x2013;<lpage>767</lpage>. <pub-id pub-id-type="doi">10.1016/s0893-6080(98)00010-0</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rockt&#xe4;schel</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Grefenstette</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Hermann</surname>
<given-names>K. M.</given-names>
</name>
<name>
<surname>Kocisk&#xfd;</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Blunsom</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Reasoning about Entailment with Neural Attention</article-title>. <source>arXiv:1509.06664v4</source>. </citation>
</ref>
<ref id="B23">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Shang</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>Z.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2015</year>). <source>Neural Responding Machine for Short-Text Conversation</source>. <publisher-loc>Stroudsburg, Pennsylvania, USA</publisher-loc>: <publisher-name>Association for Computational Linguistics</publisher-name>, <fpage>1577</fpage>&#x2013;<lpage>1586</lpage>. </citation>
</ref>
<ref id="B24">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Shen</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Min</surname>
<given-names>M. R.</given-names>
</name>
<name>
<surname>Su</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>Y.</given-names>
</name>
<etal/>
</person-group> (<year>2018</year>). <article-title>Baseline Needs More Love: On Simple Word-Embedding-Based Models and Associated Pooling Mechanisms</article-title>. <source>arXiv preprint arXiv:1805.09843</source>. <pub-id pub-id-type="doi">10.18653/v1/p18-1041</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Tang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Shen</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>D.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Zhu</surname>
<given-names>Z.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Adaptive Deep Feature Learning Network with Nesterov Momentum and its Application to Rotating Machinery Fault Diagnosis</article-title>. <source>Neurocomputing</source> <volume>305</volume>, <fpage>1</fpage>&#x2013;<lpage>14</lpage>. <pub-id pub-id-type="doi">10.1016/j.neucom.2018.04.048</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Sun</surname>
<given-names>Q.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2022</year>). <article-title>NGCU: A New RNN Model for Time-Series Data Prediction</article-title>. <source>Big Data Res.</source> <volume>27</volume>, <fpage>100296</fpage>. <pub-id pub-id-type="doi">10.1016/j.bdr.2021.100296</pub-id> </citation>
</ref>
<ref id="B27">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2017</year>). &#x201c;<article-title>A Compare-Aggregate Model for Matching Text Sequences</article-title>,&#x201d; in <conf-name>5th International Conference on Learning Representations, ICLR 2017, Toulon, France, April 24-26, 2017, Conference Track Proceedings (OpenReview.net)</conf-name>. </citation>
</ref>
<ref id="B28">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yao</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Zhao</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Auto-encoder Based Dimensionality Reduction</article-title>. <source>Neurocomputing</source> <volume>184</volume>, <fpage>232</fpage>&#x2013;<lpage>242</lpage>. <pub-id pub-id-type="doi">10.1016/j.neucom.2015.08.104</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Weizenbaum</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>1983</year>). <article-title>ELIZA - a Computer Program for the Study of Natural Language Communication between Man and Machine</article-title>. <source>Commun. ACM</source> <volume>26</volume>, <fpage>23</fpage>&#x2013;<lpage>28</lpage>. <pub-id pub-id-type="doi">10.1145/357980.357991</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Wieting</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Bansal</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Gimpel</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Livescu</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>2016</year>). &#x201c;<article-title>Towards Universal Paraphrastic Sentence Embeddings</article-title>,&#x201d; in <conf-name>4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings</conf-name>. </citation>
</ref>
<ref id="B31">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Wilensky</surname>
<given-names>R.</given-names>
</name>
</person-group> (<year>1987</year>). &#x201c;<article-title>The Berkeley UNIX Consultant Project</article-title>,&#x201d; in <source>Wissensbasierte Systeme</source> (<publisher-loc>Berlin, Germany</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>286</fpage>&#x2013;<lpage>296</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-642-88719-2_25</pub-id> </citation>
</ref>
<ref id="B32">
<citation citation-type="confproc">
<person-group person-group-type="author">
<name>
<surname>Yang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Yih</surname>
<given-names>W.-t.</given-names>
</name>
<name>
<surname>Meek</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2015</year>). &#x201c;<article-title>WikiQA: A Challenge Dataset for Open-Domain Question Answering</article-title>,&#x201d; in <conf-name>Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing</conf-name>, <fpage>2013</fpage>&#x2013;<lpage>2018</lpage>. <pub-id pub-id-type="doi">10.18653/v1/d15-1237</pub-id> </citation>
</ref>
<ref id="B33">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yin</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Sch&#xfc;tze</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Xiang</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Zhou</surname>
<given-names>B.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>ABCNN: Attention-Based Convolutional Neural Network for Modeling Sentence Pairs</article-title>. <source>TACL</source> <volume>4</volume>, <fpage>259</fpage>&#x2013;<lpage>272</lpage>. <pub-id pub-id-type="doi">10.1162/tacl_a_00097</pub-id> </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Xue</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>An</surname>
<given-names>L.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>2021a</year>). <article-title>A Simple and Efficient Text Matching Model Based on Deep Interaction</article-title>. <source>Inf. Process. Manage.</source> <volume>58</volume>, <fpage>102738</fpage>. <pub-id pub-id-type="doi">10.1016/j.ipm.2021.102738</pub-id> </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>R.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Lu</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Li</surname>
<given-names>F.</given-names>
</name>
<name>
<surname>Zhang</surname>
<given-names>X.</given-names>
</name>
<etal/>
</person-group> (<year>2021b</year>). <article-title>Sentence Pair Modeling Based on Semantic Feature Map for Human Interaction with Iot Devices</article-title>. <source>Int. J. Machine Learn. Cybernetics</source>. <pub-id pub-id-type="doi">10.1007/s13042-021-01349-x</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Yu</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Kim</surname>
<given-names>I. Y.</given-names>
</name>
<name>
<surname>Mechefske</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>An Improved Similarity-Based Prognostic Algorithm for RUL Estimation Using an RNN Autoencoder Scheme</article-title>. <source>Reliability Eng. Syst. Saf.</source> <volume>199</volume>, <fpage>106926</fpage>. <pub-id pub-id-type="doi">10.1016/j.ress.2020.106926</pub-id> </citation>
</ref>
<ref id="B37">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Zhang</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Huang</surname>
<given-names>X.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>Y.</given-names>
</name>
<name>
<surname>Jiang</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>He</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Wang</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Semantic Similarity Computing Model Based on Multi Model fine-grained Nonlinear Fusion</article-title>. <source>IEEE Access</source> <volume>9</volume>, <fpage>8433</fpage>&#x2013;<lpage>8443</lpage>. <pub-id pub-id-type="doi">10.1109/ACCESS.2021.3049378</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>