<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Syst. Neurosci.</journal-id>
<journal-title>Frontiers in Systems Neuroscience</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Syst. Neurosci.</abbrev-journal-title>
<issn pub-type="epub">1662-5137</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fnsys.2022.845177</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Multisensory Concept Learning Framework Based on Spiking Neural Networks</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Wang</surname> <given-names>Yuwei</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1049737/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Zeng</surname> <given-names>Yi</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<xref ref-type="aff" rid="aff4"><sup>4</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/104116/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Research Center for Brain-inspired Intelligence, Institute of Automation, Chinese Academy of Sciences</institution>, <addr-line>Beijing</addr-line>, <country>China</country></aff>
<aff id="aff2"><sup>2</sup><institution>School of Artificial Intelligence, University of Chinese Academy of Sciences</institution>, <addr-line>Beijing</addr-line>, <country>China</country></aff>
<aff id="aff3"><sup>3</sup><institution>Center for Excellence in Brain Science and Intelligence Technology, Chinese Academy of Sciences</institution>, <addr-line>Shanghai</addr-line>, <country>China</country></aff>
<aff id="aff4"><sup>4</sup><institution>National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences</institution>, <addr-line>Beijing</addr-line>, <country>China</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Yan Mark Yufik, Virtual Structures Research Inc., United States</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Laxmi R. Iyer, Institute for Infocomm Research (A*STAR), Singapore; Sun Zhe, RIKEN, Japan; Shangbin Chen, Huazhong University of Science and Technology, China</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Yi Zeng <email>yi.zeng&#x00040;ia.ac.cn</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>12</day>
<month>05</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>16</volume>
<elocation-id>845177</elocation-id>
<history>
<date date-type="received">
<day>29</day>
<month>12</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>20</day>
<month>04</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2022 Wang and Zeng.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Wang and Zeng</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Concept learning depends heavily on multisensory integration. In this study, we propose a multisensory concept learning framework based on brain-inspired spiking neural networks that creates integrated vectors from a concept&#x00027;s perceptual strength in the auditory, gustatory, haptic, olfactory, and visual modalities. Based on different assumptions, two paradigms are designed in the framework: Independent Merge (IM) and Associate Merge (AM). For testing, we employed eight distinct neural models and three multisensory representation datasets. The experiments show that the integrated vectors are more human-like than the non-integrated ones. Furthermore, we systematically analyze the similarities and differences between the IM and AM paradigms and validate the generality of our framework.</p></abstract>
<kwd-group>
<kwd>concept learning</kwd>
<kwd>multisensory</kwd>
<kwd>spiking neural networks</kwd>
<kwd>brain-inspired</kwd>
<kwd>Independent Merge</kwd>
<kwd>Associate Merge</kwd>
</kwd-group>
<counts>
<fig-count count="6"/>
<table-count count="3"/>
<equation-count count="12"/>
<ref-count count="41"/>
<page-count count="12"/>
<word-count count="6794"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>Concept learning, or the ability to recognize commonalities and accentuate contrasts across a group of linked events in order to generate structured knowledge, is a crucial component of cognition (Roshan et al., <xref ref-type="bibr" rid="B27">2001</xref>). Multisensory integration benefits concept learning (Shams and Seitz, <xref ref-type="bibr" rid="B29">2008</xref>) and plays an important role in semantic processing (Xu et al., <xref ref-type="bibr" rid="B40">2017</xref>; Wang et al., <xref ref-type="bibr" rid="B39">2020</xref>). For example, when we learn the concept of &#x0201C;tea,&#x0201D; we hear the sounds of pouring water and brewing, of clinking porcelain, and of drinking; we taste that tea is somewhat bitter, astringent, or sweet; by touch, we feel that tea is a liquid and sense its temperature; we smell its faint scent; and visually, tea often appears together with a teapot or tea bowl, and the tea leaves have different colors. Combining information from multiple senses can produce enhanced perception and learning, faster response times, and improved detection, discrimination, and recognition capabilities (Calvert and Thesen, <xref ref-type="bibr" rid="B6">2004</xref>). In the brain, multisensory integration occurs mostly in the superior colliculus according to existing studies (Calvert and Thesen, <xref ref-type="bibr" rid="B6">2004</xref>; Cappe et al., <xref ref-type="bibr" rid="B7">2009</xref>). Multisensory integration is a field that has attracted the interest of cognitive psychologists, biologists, computational neuroscientists, and artificial intelligence researchers. The term &#x0201C;multisensory concept learning&#x0201D; is used in this work to describe the process of learning concepts using a model that mimics humans and combines information from multiple senses.</p>
<p>Among computational models of multisensory integration, those of cognitive psychologists usually focus on model design and validation based on the mechanisms of multisensory integration. These models are highly interpretable, taking neuroimaging and behavioral studies into consideration. The cue combination model based on Bayesian decision theory is a classical model for analyzing multisensory integration in cognitive psychology. It mainly models the stimuli of different modalities as likelihood functions with Gaussian (Ursino et al., <xref ref-type="bibr" rid="B35">2009</xref>, <xref ref-type="bibr" rid="B34">2014</xref>) or Poisson (Anastasio et al., <xref ref-type="bibr" rid="B2">2014</xref>) distributions with different parameters, and computes the combination of modalities that maximizes the posterior distribution under the assumption of conditional independence and Bayes&#x00027; rule. Anastasio et al. built a model of visual and auditory fusion that combines neuronal dynamic equations with feedback information, and this model verified that multimodal stimuli elicit shorter response times than unimodal stimuli (Anastasio et al., <xref ref-type="bibr" rid="B2">2014</xref>). Parise and Ernst proposed models based on multisensory correlation detectors to describe correlation, lag, and synchrony across the senses (Parise and Ernst, <xref ref-type="bibr" rid="B25">2016</xref>). Gao et al. (<xref ref-type="bibr" rid="B10">2016</xref>) presented a purely visual haptic prediction model built with CNNs and LSTMs, which enables robots to &#x0201C;feel&#x0201D; without physical interaction. Gepner et al. (<xref ref-type="bibr" rid="B11">2015</xref>) developed a linear-nonlinear-Poisson cascade model that incorporates information from olfaction and vision to mimic the navigation decisions of Drosophila larvae, and the model predicted the reactions of the larvae to new stimulus patterns well.</p>
<p>Artificial intelligence researchers have proposed different types of multisensory integration models based on the available data and machine learning methods, such as direct concatenation (Kiela and Bottou, <xref ref-type="bibr" rid="B18">2014</xref>; Collell et al., <xref ref-type="bibr" rid="B8">2017</xref>; Wang et al., <xref ref-type="bibr" rid="B38">2018b</xref>), canonical correlation analysis (Silberer et al., <xref ref-type="bibr" rid="B30">2013</xref>; Hill et al., <xref ref-type="bibr" rid="B14">2014</xref>), singular value decomposition of the integration matrix (Bruni et al., <xref ref-type="bibr" rid="B5">2014</xref>), multisensory context (Hill and Korhonen, <xref ref-type="bibr" rid="B13">2014</xref>), autoencoders (Silberer and Lapata, <xref ref-type="bibr" rid="B31">2014</xref>; Wang et al., <xref ref-type="bibr" rid="B37">2018a</xref>), and tensor fusion networks (Zadeh et al., <xref ref-type="bibr" rid="B41">2017</xref>; Liu et al., <xref ref-type="bibr" rid="B19">2018</xref>; Verma et al., <xref ref-type="bibr" rid="B36">2019</xref>). These works mostly focus on concept learning and sentiment analysis tasks and model speech, text, and image data, which are commonly used in AI.</p>
<p>To our knowledge, no existing work models the five senses of vision, hearing, touch, taste, and smell together. This might be because controlling the elements of an experimental design is challenging for cognitive psychologists, while for AI researchers, data for some modalities are difficult to obtain using perceptrons. Meanwhile, cognitive psychologists have published several multisensory datasets by asking volunteers how much they perceive a specific concept through their auditory, gustatory, tactile, olfactory, and visual senses in order to establish the strength of each modality. This provides a solid basis for the design of a multisensory integration model that includes these five modalities. In this article, we model multisensory integration using brain-like spiking neural networks and merge input from five different modalities to generate integrated representations.</p>
<p>This paper is organized as follows: Section 2 introduces studies relevant to our model, such as multisensory datasets and fundamental SNN models; Section 3 describes the multisensory concept learning framework based on SNNs, which includes the Independent Merge and Associate Merge paradigms; Section 4 presents the experiments; and the final section discusses future work.</p>
</sec>
<sec id="s2">
<title>2. Related Works</title>
<sec>
<title>2.1. Multisensory Concept Representation Datasets</title>
<p>Cognitive psychologists label multisensory datasets of concepts by asking volunteers how much each concept is acquired through a specific modality and then applying statistical methods to establish the representation vector for each concept. The pioneering work in this area is by Lynott and Connell (<xref ref-type="bibr" rid="B21">2013</xref>), who proposed modality exclusivity norms for 423 adjective concepts (Lynott and Connell, <xref ref-type="bibr" rid="B20">2009</xref>) and 400 nominal concepts, rating the strength of association with each of the five primary sensory modalities (auditory, gustatory, haptic, olfactory, visual). In this article, we combine these two datasets and denote the result as LC823. The Lancaster Sensorimotor Norms were published by Lynott et al. (<xref ref-type="bibr" rid="B22">2019</xref>) and include six perceptual modalities (auditory, gustatory, haptic, interoceptive, olfactory, visual) and five action effectors (foot/leg, hand/arm, head, mouth, torso). This dataset (which we denote as Lancaster40k) is the largest to date, with 39,707 psycholinguistic concepts (Lynott et al., <xref ref-type="bibr" rid="B22">2019</xref>). Binder et al. (<xref ref-type="bibr" rid="B4">2016</xref>) constructed a set of brain-based componential semantic representations (BBSR) with 65 experiential attributes, including sensory, motor, spatial, temporal, affective, social, and cognitive experiences, relying on more recent neurobiological findings. This dataset contains 535 concepts and does an excellent job of separating a priori conceptual categories and capturing semantic similarity (Binder et al., <xref ref-type="bibr" rid="B4">2016</xref>). <xref ref-type="fig" rid="F1">Figure 1</xref> shows the concept &#x0201C;honey&#x0201D; in the multisensory concept representation datasets mentioned above.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>The concept &#x0201C;honey&#x0201D; in multisensory datasets.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnsys-16-845177-g0001.tif"/>
</fig>
<p>This article concentrates on the effect of five senses: vision, touch, sound, smell, and taste. For BBSR, we use the average value of the sub-dimensions corresponding to these five senses, while for Lancaster40k we use the first five dimensions.</p>
</sec>
<sec>
<title>2.2. Basic Neuron and Synapse Models</title>
<p>Spiking neural networks (SNNs) are commonly referred to as the third generation of neural network models since they are inspired by current discoveries in neuroscience (Maass, <xref ref-type="bibr" rid="B23">1997</xref>). Neurons are the basic processing units of the brain. They communicate with each other <italic>via</italic> synapses. When the membrane potential reaches a threshold, a spike is produced. External stimuli are conveyed by the firing rate and the temporal pattern of spike trains (Rieke et al., <xref ref-type="bibr" rid="B26">1999</xref>; Gerstner and Kistler, <xref ref-type="bibr" rid="B12">2002</xref>). SNNs incorporate temporal information into the model and can accurately describe spike timing together with dynamic changes in synaptic weights, which makes them more biologically plausible. We use SNNs as the foundation of our model to build a human-like multisensory integration concept learning framework. Here, we briefly outline the neural and synaptic models used in this research.</p>
<sec>
<title>2.2.1. IF Neural Model</title>
<p>Integrate-and-fire (IF) models form a large family of models that assume that a membrane potential threshold controls the spiking of neurons. A spike is fired when the somatic membrane potential exceeds the threshold, after which the membrane potential is set to the reset potential (Gerstner and Kistler, <xref ref-type="bibr" rid="B12">2002</xref>). The model properly formalizes this neural processing. In this article, we follow a standard implementation (Troyer and Miller, <xref ref-type="bibr" rid="B33">1997</xref>), and the membrane potential <italic>v</italic>(<italic>t</italic>) obeys</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M1"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mi>I</mml:mi><mml:mi>F</mml:mi></mml:mrow></mml:msub><mml:mfrac><mml:mrow><mml:mi>d</mml:mi><mml:mi>v</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:mi>v</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:mi>v</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mtext>&#x000A0;&#x000A0;</mml:mtext><mml:mi>v</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0003E;</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext>&#x000A0;&#x000A0;</mml:mtext><mml:mi>v</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02190;</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>with the membrane time constant &#x003C4;<sub><italic>IF</italic></sub> &#x0003D; 20 <italic>ms</italic>, the resting potential <italic>v</italic><sub><italic>rest</italic></sub> &#x0003D; &#x02212;14 <italic>mV</italic>, the threshold for spike firing <italic>v</italic><sub><italic>th</italic></sub> &#x0003D; 6 <italic>mV</italic>, the reset potential <italic>v</italic><sub><italic>r</italic></sub> &#x0003D; 0 <italic>mV</italic>, and the excitatory reversal potential <italic>E</italic><sub><italic>e</italic></sub> &#x0003D; 0 <italic>mV</italic>. Synaptic inputs are modeled as changes in the conductance <italic>g</italic><sub><italic>e</italic></sub>, which decays according to <inline-formula><mml:math id="M2"><mml:msub><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mfrac><mml:mrow><mml:mi>d</mml:mi><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, where the synaptic time constant &#x003C4;<sub><italic>e</italic></sub> &#x0003D; 5 <italic>ms</italic>.</p>
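<p>As a minimal sketch of how Equation (1) can be integrated numerically (not necessarily the implementation used in this work), the following Python function applies the forward-Euler method to the membrane equation and to the conductance decay, using the parameter values quoted above. The function name <italic>simulate_if</italic>, the step size <italic>dt</italic>, the simulation length <italic>t_end</italic>, and the initial conductance kick <italic>g_kick</italic> are illustrative assumptions.</p>
<preformat><![CDATA[
```python
def simulate_if(g_kick=1.0, t_end=50.0, dt=0.01, tau_if=20.0, tau_e=5.0,
                v_rest=-14.0, v_th=6.0, v_r=0.0, e_e=0.0):
    """Forward-Euler integration of the conductance-based IF neuron of Eq. (1).

    Times are in ms and potentials in mV; g_e is treated as dimensionless.
    g_kick, t_end and dt are hypothetical values chosen for illustration.
    """
    v, g_e = v_rest, g_kick          # start at rest with one synaptic kick
    vs, gs, spikes = [v], [g_e], []
    for k in range(int(round(t_end / dt))):
        dv = (v_rest - v + g_e * (e_e - v)) / tau_if   # membrane equation
        dg = -g_e / tau_e                              # conductance decay
        v += dt * dv
        g_e += dt * dg
        if v > v_th:                 # threshold crossing: record and reset
            spikes.append((k + 1) * dt)
            v = v_r
        vs.append(v)
        gs.append(g_e)
    return vs, gs, spikes
```
]]></preformat>
<p>With the quoted values, <italic>E</italic><sub><italic>e</italic></sub> &#x0003D; 0 <italic>mV</italic> lies below <italic>v</italic><sub><italic>th</italic></sub> &#x0003D; 6 <italic>mV</italic>, so a single conductance kick from rest produces only a subthreshold depolarization in this sketch; the threshold-and-reset branch is included for completeness.</p>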
</sec>
<sec>
<title>2.2.2. LIF Neural Model</title>
<p>The leaky integrate-and-fire (LIF) neuron model is one of the most popular spiking neuron models because it is biologically realistic and computationally easy to analyze and simulate. The LIF neuron&#x00027;s subthreshold dynamics are described by the equation below:</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="M3"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mi>I</mml:mi><mml:mi>F</mml:mi></mml:mrow></mml:msub><mml:mfrac><mml:mrow><mml:mi>d</mml:mi><mml:mi>v</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:mi>v</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mrow><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mi>I</mml:mi></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mi>v</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0003E;</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>v</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02190;</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>I</italic> denotes the input current. In this paper, the membrane resistance constant <italic>R</italic><sub><italic>m</italic></sub> &#x0003D; 1, &#x003C4;<sub><italic>LIF</italic></sub> &#x0003D; 20, <italic>v</italic><sub><italic>rest</italic></sub> &#x0003D; 1.05, <italic>v</italic><sub><italic>th</italic></sub> &#x0003D; 1, and <italic>v</italic><sub><italic>r</italic></sub> &#x0003D; 0.</p>
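<p>The LIF dynamics of Equation (2) can be sketched in a few lines of Python. This is an illustrative forward-Euler integration under stated assumptions, not the code used in this work: the function name <italic>simulate_lif</italic>, the step size <italic>dt</italic>, the simulation length <italic>t_end</italic>, and the constant input <italic>current</italic> are hypothetical, and time is expressed in the same (unspecified) units as &#x003C4;<sub><italic>LIF</italic></sub>.</p>
<preformat><![CDATA[
```python
def simulate_lif(current=0.0, t_end=200.0, dt=0.1, tau_lif=20.0, r_m=1.0,
                 v_rest=1.05, v_th=1.0, v_r=0.0):
    """Forward-Euler integration of the LIF neuron of Eq. (2).

    Returns the list of spike times. dt, t_end and the constant input
    current are hypothetical choices made for illustration.
    """
    v = v_r                          # start from the reset potential
    spikes = []
    for k in range(int(round(t_end / dt))):
        v += dt * (v_rest - v + r_m * current) / tau_lif
        if v > v_th:                 # threshold crossing: record and reset
            spikes.append((k + 1) * dt)
            v = v_r
    return spikes
```
]]></preformat>
<p>Because <italic>v</italic><sub><italic>rest</italic></sub> exceeds <italic>v</italic><sub><italic>th</italic></sub> in the quoted parameters, the neuron in this sketch fires periodically even when <italic>I</italic> &#x0003D; 0, with an analytic inter-spike interval of &#x003C4;<sub><italic>LIF</italic></sub> ln[(<italic>v</italic><sub><italic>rest</italic></sub> &#x02212; <italic>v</italic><sub><italic>r</italic></sub>)/(<italic>v</italic><sub><italic>rest</italic></sub> &#x02212; <italic>v</italic><sub><italic>th</italic></sub>)], and a positive input current shortens that interval.</p>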
</sec>
<sec>
<title>2.2.3. Izhikevich Neural Model</title>
<p>The Izhikevich model was first proposed in 2003 to replicate the spiking and bursting behavior of known types of cortical neurons. The model combines the biological plausibility of Hodgkin and Huxley (<xref ref-type="bibr" rid="B15">1952</xref>) dynamics with the computing efficiency of integrate-and-fire neurons (Izhikevich, <xref ref-type="bibr" rid="B17">2003</xref>). Using bifurcation methods, biophysically accurate Hodgkin-Huxley neural models are reduced to a two-dimensional system of ordinary differential equations of the following form:</p>
<disp-formula id="E3"><label>(3)</label><mml:math id="M4"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mfrac><mml:mrow><mml:mi>d</mml:mi><mml:mi>v</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>04</mml:mn><mml:mi>v</mml:mi><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup><mml:mo>&#x0002B;</mml:mo><mml:mn>5</mml:mn><mml:mi>v</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mn>140</mml:mn><mml:mo>-</mml:mo><mml:mi>u</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mi>I</mml:mi><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mfrac><mml:mrow><mml:mi>d</mml:mi><mml:mi>u</mml:mi></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mi>a</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>b</mml:mi><mml:mi>v</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mi>u</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>i</mml:mi><mml:mi>f</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mi>v</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0003E;</mml:mo><mml:msub><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mrow><mml:mi>t</mml:mi><mml:mi>h</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>v</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02190;</mml:mo><mml:mi>c</mml:mi><mml:mtext>&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mi>a</mml:mi><mml:mi>n</mml:mi><mml:mi>d</mml:mi><mml:mtext>&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mi>u</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02190;</mml:mo><mml:mi>u</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mi>d</mml:mi></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where the parameter <italic>a</italic> describes the time scale of the recovery variable <italic>u</italic>; the parameter <italic>b</italic> describes the sensitivity of <italic>u</italic> to subthreshold changes of the membrane potential <italic>v</italic>; the parameter <italic>c</italic> defines the after-spike reset value of <italic>v</italic>, which is induced by fast high-threshold <italic>K</italic><sup>&#x0002B;</sup> conductances; and the parameter <italic>d</italic> describes the after-spike reset of <italic>u</italic>, which is induced by slow high-threshold <italic>Na</italic><sup>&#x0002B;</sup> and <italic>K</italic><sup>&#x0002B;</sup> conductances (Izhikevich, <xref ref-type="bibr" rid="B17">2003</xref>).</p>
<p>Based on these four parameters, the model simulates the spiking and bursting activity of known kinds of cortical or thalamic neurons, such as resonator (RZ), fast spiking (FS), intrinsically bursting (IB), low-threshold spiking (LTS), regular spiking (RS), chattering (CH), and thalamo-cortical (TC) neurons. These models are employed extensively in our work, and their parameter values are listed in <xref ref-type="table" rid="T1">Table 1</xref>.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Izhikevich models.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Neurons</bold></th>
<th valign="top" align="center" colspan="4" style="border-bottom: thin solid #000000;"><bold>Izhikevich parameters</bold></th>
</tr>
<tr>
<th/>
<th valign="top" align="center"><bold>a</bold></th>
<th valign="top" align="center"><bold>b</bold></th>
<th valign="top" align="center"><bold>c</bold></th>
<th valign="top" align="center"><bold>d</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">RZ (resonator)</td>
<td valign="top" align="center">0.10</td>
<td valign="top" align="center">0.25</td>
<td valign="top" align="center">&#x02212;65</td>
<td valign="top" align="center">2</td>
</tr>
<tr>
<td valign="top" align="left">FS (fast spiking)</td>
<td valign="top" align="center">0.10</td>
<td valign="top" align="center">0.20</td>
<td valign="top" align="center">&#x02212;65</td>
<td valign="top" align="center">2</td>
</tr>
<tr>
<td valign="top" align="left">IB (intrinsically bursting)</td>
<td valign="top" align="center">0.02</td>
<td valign="top" align="center">0.20</td>
<td valign="top" align="center">&#x02212;55</td>
<td valign="top" align="center">4</td>
</tr>
<tr>
<td valign="top" align="left">LTS (low-threshold spiking)</td>
<td valign="top" align="center">0.02</td>
<td valign="top" align="center">0.25</td>
<td valign="top" align="center">&#x02212;65</td>
<td valign="top" align="center">2</td>
</tr>
<tr>
<td valign="top" align="left">RS (regular spiking)</td>
<td valign="top" align="center">0.02</td>
<td valign="top" align="center">0.20</td>
<td valign="top" align="center">&#x02212;65</td>
<td valign="top" align="center">8</td>
</tr>
<tr>
<td valign="top" align="left">CH (chattering)</td>
<td valign="top" align="center">0.02</td>
<td valign="top" align="center">0.20</td>
<td valign="top" align="center">&#x02212;50</td>
<td valign="top" align="center">2</td>
</tr>
<tr>
<td valign="top" align="left">TC (thalamo-cortical)</td>
<td valign="top" align="center">0.02</td>
<td valign="top" align="center">0.25</td>
<td valign="top" align="center">&#x02212;65</td>
<td valign="top" align="center">0.05</td>
</tr>
</tbody>
</table>
</table-wrap>
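<p>To illustrate how the parameter sets in Table 1 plug into Equation (3), the following Python sketch integrates the model with the forward-Euler method. It is a hedged example rather than the implementation used in this work: the spike cutoff <italic>v</italic><sub><italic>th</italic></sub> &#x0003D; 30 <italic>mV</italic>, the constant input <italic>I</italic> &#x0003D; 10, the initial state, the step size, and the names <italic>IZHIKEVICH_PARAMS</italic> and <italic>simulate_izhikevich</italic> are assumptions that are not specified in the text above.</p>
<preformat><![CDATA[
```python
# (a, b, c, d) for each neuron type, copied from Table 1.
IZHIKEVICH_PARAMS = {
    "RZ": (0.10, 0.25, -65.0, 2.0),
    "FS": (0.10, 0.20, -65.0, 2.0),
    "IB": (0.02, 0.20, -55.0, 4.0),
    "LTS": (0.02, 0.25, -65.0, 2.0),
    "RS": (0.02, 0.20, -65.0, 8.0),
    "CH": (0.02, 0.20, -50.0, 2.0),
    "TC": (0.02, 0.25, -65.0, 0.05),
}


def simulate_izhikevich(kind, current=10.0, t_end=500.0, dt=0.1, v_th=30.0):
    """Forward-Euler integration of Eq. (3); returns spike times in ms.

    current, t_end, dt, v_th and the initial state are assumptions made
    for this illustration only.
    """
    a, b, c, d = IZHIKEVICH_PARAMS[kind]
    v = -65.0                        # assumed initial membrane potential
    u = b * v                        # assumed initial recovery variable
    spikes = []
    for k in range(int(round(t_end / dt))):
        dv = 0.04 * v * v + 5.0 * v + 140.0 - u + current
        du = a * (b * v - u)
        v += dt * dv
        u += dt * du
        if v > v_th:                 # after-spike reset of v and u
            spikes.append((k + 1) * dt)
            v = c
            u += d
    return spikes
```
]]></preformat>
<p>For example, comparing <italic>len(simulate_izhikevich("FS"))</italic> with <italic>len(simulate_izhikevich("RS"))</italic> shows how the larger <italic>a</italic> and smaller <italic>d</italic> of the fast-spiking set lead to more spikes for the same input in this sketch.</p>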
</sec>
<sec>
<title>2.2.4. STDP Synapse Models</title>
<p>Spike-timing-dependent plasticity (STDP) is a biological process that modifies the strength of neural connections in the brain. Learning and information storage in the brain, as well as the growth and refinement of neural circuits throughout brain development, are thought to be influenced by STDP (Bi and Poo, <xref ref-type="bibr" rid="B3">2001</xref>). This research uses the typical STDP model, in which the weight change &#x00394;<italic>w</italic> of a synapse depends on the relative timing of presynaptic and postsynaptic spike arrivals: &#x00394;<italic>w</italic> &#x0003D; &#x003A3;<sub><italic>t</italic><sub><italic>pre</italic></sub></sub>&#x003A3;<sub><italic>t</italic><sub><italic>post</italic></sub></sub><italic>W</italic>(<italic>t</italic><sub><italic>post</italic></sub> &#x02212; <italic>t</italic><sub><italic>pre</italic></sub>), where the function <italic>W</italic>(&#x000B7;) is defined as:</p>
<disp-formula id="E4"><label>(4)</label><mml:math id="M5"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr columnalign="left"><mml:mtd><mml:mi>W</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mo>&#x00394;</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mtable style="text-align:axis;" equalrows="false" equalcolumns="false" class="array"><mml:mtr columnalign="left"><mml:mtd><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mo>&#x0002B;</mml:mo></mml:mrow></mml:msub><mml:mi>e</mml:mi><mml:mi>x</mml:mi><mml:mi>p</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mfrac><mml:mrow><mml:mo>&#x00394;</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mo>&#x0002B;</mml:mo></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mtext>&#x000A0;</mml:mtext><mml:mo>&#x00394;</mml:mo><mml:mi>t</mml:mi><mml:mo>&#x0003E;</mml:mo><mml:mn>0</mml:mn></mml:mtd></mml:mtr><mml:mtr columnalign="left"><mml:mtd><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo></mml:mrow></mml:msub><mml:mi>e</mml:mi><mml:mi>x</mml:mi><mml:mi>p</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mo>&#x00394;</mml:mo><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mtext>&#x000A0;</mml:mtext><mml:mo>&#x00394;</mml:mo><mml:mi>t</mml:mi><mml:mo>&#x0003C;</mml:mo><mml:mn>0</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
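<p>The pairwise sum that defines &#x00394;<italic>w</italic> can be sketched directly in Python. The example below assumes the conventional form of the STDP window, in which the magnitude of <italic>W</italic> decays as |&#x00394;<italic>t</italic>| grows, that is, exp(&#x02212;&#x00394;<italic>t</italic>/&#x003C4;<sub>&#x0002B;</sub>) for &#x00394;<italic>t</italic> &#x0003E; 0 and exp(&#x00394;<italic>t</italic>/&#x003C4;<sub>&#x02212;</sub>) for &#x00394;<italic>t</italic> &#x0003C; 0. The amplitudes, the time constants, the value returned at &#x00394;<italic>t</italic> &#x0003D; 0, and the function names are hypothetical choices for illustration and are not taken from this work.</p>
<preformat><![CDATA[
```python
import math


def stdp_window(delta_t, a_plus=0.01, a_minus=0.0105,
                tau_plus=20.0, tau_minus=20.0):
    """STDP window W(delta_t) with delta_t = t_post - t_pre (in ms).

    Assumes the decaying form of the window; the default amplitudes and
    time constants are hypothetical illustration values.
    """
    if delta_t > 0:                  # pre before post: potentiation
        return a_plus * math.exp(-delta_t / tau_plus)
    if delta_t < 0:                  # post before pre: depression
        return -a_minus * math.exp(delta_t / tau_minus)
    return 0.0                       # delta_t == 0 is left unchanged here


def total_weight_change(pre_times, post_times, **window_kwargs):
    """Sum W(t_post - t_pre) over all pre/post spike pairs."""
    return sum(stdp_window(t_post - t_pre, **window_kwargs)
               for t_pre in pre_times
               for t_post in post_times)
```
]]></preformat>
<p>For instance, <italic>total_weight_change([0.0], [10.0])</italic> is positive because the presynaptic spike precedes the postsynaptic one, whereas swapping the two spike times yields a negative change.</p>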
<p>To implement STDP, we follow the approach of Brian2 (Stimberg et al., <xref ref-type="bibr" rid="B32">2019</xref>), which defines two variables <italic>a</italic><sub><italic>pre</italic></sub> and <italic>a</italic><sub><italic>post</italic></sub> as the &#x0201C;traces&#x0201D; of pre- and postsynaptic activity, governed by the following differential equations:</p>
<disp-formula id="E5"><label>(5)</label><mml:math id="M6"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:msub><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mfrac><mml:mrow><mml:mi>d</mml:mi><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>&#x003C4;</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mfrac><mml:mrow><mml:mi>d</mml:mi><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mi>d</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Once a presynaptic spike occurs, the presynaptic trace is updated and the weight is modified according to the rule</p>
<disp-formula id="E6"><label>(6)</label><mml:math id="M7"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02190;</mml:mo><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mi>w</mml:mi><mml:mo>&#x02190;</mml:mo><mml:mi>w</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Similarly, when a postsynaptic spike occurs, the postsynaptic trace is updated and the weight is modified according to the rule</p>
<disp-formula id="E7"><label>(7)</label><mml:math id="M8"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02190;</mml:mo><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mi>w</mml:mi><mml:mo>&#x02190;</mml:mo><mml:mi>w</mml:mi><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>The trace-based formulation can be shown to be equivalent to the pairwise formulation in Equation (4). In this article, &#x003C4;<sub><italic>pre</italic></sub> &#x0003D; &#x003C4;<sub><italic>post</italic></sub> &#x0003D; 1 ms.</p>
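<p>The relationship between the pairwise and the trace-based formulations can be checked numerically. The following is a minimal sketch, not the implementation used in this article: it assumes a kernel that decays exponentially with the magnitude of &#x00394;<italic>t</italic>, identifies <italic>A</italic><sub><italic>pre</italic></sub> with the potentiation amplitude and <italic>A</italic><sub><italic>post</italic></sub> (taken as negative) with the depression amplitude, and uses hypothetical spike times and amplitudes. The function names are illustrative only.</p>
<preformat>
```python
import math


def stdp_pairwise(pre_times, post_times, A_pre, A_post, tau_pre, tau_post):
    """Weight change from summing an exponential STDP kernel over all spike pairs."""
    dw = 0.0
    for t_pre in pre_times:
        for t_post in post_times:
            dt = t_post - t_pre
            if dt > 0:
                # Pre before post: contribution decays with the time gap.
                dw += A_pre * math.exp(-dt / tau_pre)
            elif dt != 0:
                # dt is negative here (post before pre).
                dw += A_post * math.exp(dt / tau_post)
    return dw


def stdp_traces(pre_times, post_times, A_pre, A_post, tau_pre, tau_post):
    """Same weight change, computed event by event with two decaying traces."""
    events = sorted([(t, "pre") for t in pre_times] + [(t, "post") for t in post_times])
    a_pre = a_post = 0.0
    dw = 0.0
    last_t = events[0][0] if events else 0.0
    for t, kind in events:
        # Between events both traces decay exponentially (Equation 5).
        a_pre *= math.exp(-(t - last_t) / tau_pre)
        a_post *= math.exp(-(t - last_t) / tau_post)
        last_t = t
        if kind == "pre":
            a_pre += A_pre    # trace update of Equation 6
            dw += a_post      # weight update of Equation 6
        else:
            a_post += A_post  # trace update of Equation 7
            dw += a_pre       # weight update of Equation 7
    return dw


# Hypothetical spike times (ms) and amplitudes, chosen only for illustration.
# The time constants follow the value of 1 ms stated in the text.
pre = [1.0, 5.0, 9.0]
post = [2.0, 6.5, 12.0]
params = dict(A_pre=0.01, A_post=-0.0105, tau_pre=1.0, tau_post=1.0)
dw_pairs = stdp_pairwise(pre, post, **params)
dw_traces = stdp_traces(pre, post, **params)
print(dw_pairs, dw_traces)
```
</preformat>
<p>For spike trains without coincident pre- and postsynaptic spikes, the two functions return the same value up to floating-point rounding, while the trace-based version needs only one pass over the events instead of a double sum.</p>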
</sec>
</sec>
</sec>
<sec id="s3">
<title>3. The Multisensory Concept Learning Framework Based on Spiking Neural Networks</title>
<p>In this section, we present a multisensory concept learning framework based on SNNs. The model&#x00027;s input is a multisensory vector labeled by cognitive psychologists, and its output is an integrated vector produced by SNN-based merging. Since no biological study has established whether the information from multiple senses is independent or associated before integration, our framework provides two paradigms: Independent Merge (IM) and Associate Merge (AM). The types of inputs and outputs are the same for both paradigms, but the SNN architectures differ. The two paradigms occupy the same phase of the framework, and only one of them is chosen for concept integration, depending on whether the multisensory inputs are assumed to be independent before integration.</p>
<p><xref ref-type="fig" rid="F2">Figure 2</xref> illustrates the workflow. First, for each modality of the concept, we employ a neuron model and transform the perceptual strength in the concept&#x00027;s multisensory vector into an external stimulus to the neuron (we work on five sensory modalities: auditory, gustatory, haptic, olfactory, and visual, so the multisensory vector has five dimensions). Second, the architecture of the SNN is designed according to the chosen assumption: we use the IM paradigm if we assume that the senses are independent of each other before fusion, and the AM paradigm if we assume that they are associated with each other. Third, we specify the neuron model in the SNN and sequentially feed concepts to the network, with STDP rules adjusting the network&#x00027;s connection weights. Over the running interval [0, <italic>T</italic>], we record the spike train of each neuron. Finally, we convert the spike trains of specific neurons into binary code as the final integrated representation. The framework is described in detail for the IM and AM paradigms separately in the following sections.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>The multisensory concept learning framework based on spiking neural networks. First, for each modality of the concept, we employ a neuron model and transform the perceptual strength in the concept&#x00027;s multisensory vector into an external stimulus to the neuron. Second, the architecture of the SNN is designed according to the chosen assumption: we use the IM paradigm if we assume that the senses are independent of each other before fusion, and the AM paradigm if we assume that they are associated with each other. Third, we specify the neuron model in the SNN and sequentially feed concepts to the network, with STDP rules adjusting the network&#x00027;s connection weights. Over the running interval [0, <italic>T</italic>], we record the spike train of each neuron. Finally, we convert the spike trains of specific neurons into binary code as the final integrated representation.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnsys-16-845177-g0002.tif"/>
</fig>
<sec>
<title>3.1. The Framework</title>
<sec>
<title>3.1.1. Independent Merge</title>
<p>The IM paradigm is founded on the commonly used cognitive psychology assumption that the information for each modality of the concept is independent before integration. It is a two-layer spiking neural network model, with five neurons in the first layer, each receiving the stimulus for one of the concept&#x00027;s five modalities, and a single neuron in the second layer that reflects the neural state after multisensory integration. We record the spike train of the postsynaptic neuron and transform it into the integrated vector for the concept.</p>
<p>For each concept, we obtain its representation from human-labeled vectors, <inline-formula><mml:math id="M9"><mml:mover accent="true"><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mo>&#x02192;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>G</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>H</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>O</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>V</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>. The subscripts denote the auditory, gustatory, haptic, olfactory, and visual modalities, and each component is the concept&#x00027;s perceptual strength in that modality. During data preparation, we min-max normalize the multisensory representations in the dataset so that each value of the input vector lies between 0 and 1. In LC823, for instance, the vector for the concept &#x0201C;honey&#x0201D; is [0.13, 0.95, 0.57, 0.75, 0.80]. For the generality of the framework, we employ either LIF or Izhikevich models for the presynaptic neurons and the IF model for the postsynaptic neuron. 
Initially, for each presynaptic neuron, we regard the current <italic>I</italic> &#x0003D; <italic>m</italic><sub><italic>i</italic></sub>&#x0002A;<italic>I</italic><sub><italic>boost</italic></sub> as the stimulus to the neuron, where <italic>i</italic> &#x02208; [<italic>A, G, H, O, V</italic>]. The conductance <italic>g</italic><sub><italic>e</italic></sub> of the postsynaptic neuron is updated whenever a presynaptic neuron fires as <italic>g</italic><sub><italic>e</italic></sub>&#x02190;<italic>g</italic><sub><italic>e</italic></sub>&#x0002B;&#x00394;<italic>W</italic><sub><italic>ij</italic></sub>, and the postsynaptic neuron generates spikes based on the IF model. Here, the weight &#x00394;<italic>W</italic><sub><italic>ij</italic></sub> denotes the synaptic strength between the presynaptic neuron and the postsynaptic neuron. The initial weights between the presynaptic and postsynaptic neurons are <inline-formula><mml:math id="M10"><mml:msubsup><mml:mrow><mml:mi>W</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:msubsup><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:math></inline-formula>, where <inline-formula><mml:math id="M11"><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:math></inline-formula> and <inline-formula><mml:math id="M12"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> represents the variance of each kind of multisensory data. These weights are calculated using the Bayesian formula under the assumption that each modality is independent prior to fusion (details in the Appendix). At the same time, the weights are dynamically adjusted by the spike trains of the presynaptic and postsynaptic neurons in accordance with the STDP rule. During [0, <italic>T</italic>], we record the spike train of the postsynaptic neuron <italic>S</italic><sup><italic>post</italic></sup>([0, <italic>T</italic>]) and transform it into the binary code <italic>B</italic><sup><italic>post</italic></sup>([0, <italic>T</italic>]), which serves as the final integrated representation of the concept, in the following manner:</p>
<disp-formula id="E8"><label>(8)</label><mml:math id="M13"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:msup><mml:mi>B</mml:mi><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msup><mml:mo stretchy="false">(</mml:mo><mml:mo stretchy="false">[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>T</mml:mi><mml:mo stretchy="false">]</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mo stretchy="false">[</mml:mo><mml:mi mathvariant="-tex-caligraphic">T</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msup><mml:mi>S</mml:mi><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msup><mml:mo stretchy="false">(</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mo stretchy="false">]</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="-tex-caligraphic">T</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msup><mml:mi>S</mml:mi><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msup><mml:mo stretchy="false">(</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mn>2</mml:mn><mml:mo>&#x0002A;</mml:mo><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mo stretchy="false">]</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo>,</mml:mo><mml:mo>&#x022EF;</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi mathvariant="-tex-caligraphic">T</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msup><mml:mi>S</mml:mi><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msup><mml:mo stretchy="false">(</mml:mo><mml:mo 
stretchy="false">(</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mi>k</mml:mi><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo><mml:mo>&#x0002A;</mml:mo><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi><mml:mo>&#x0002A;</mml:mo><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mo stretchy="false">]</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo>,</mml:mo><mml:mo>&#x022EF;</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="-tex-caligraphic">T</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msup><mml:mi>S</mml:mi><mml:mrow><mml:mi>p</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msup><mml:mo stretchy="false">(</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mo stretchy="false">&#x0230A;</mml:mo><mml:mrow><mml:mfrac><mml:mi>T</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:mfrac></mml:mrow><mml:mo stretchy="false">&#x0230B;</mml:mo></mml:mrow><mml:mo>&#x0002A;</mml:mo><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>T</mml:mi><mml:mo stretchy="false">]</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">]</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Here, the operation <inline-formula><mml:math id="M14"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">T</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mi>n</mml:mi><mml:mi>t</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>v</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> returns 1 if there is at least one spike in the interval and 0 otherwise.</p>
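<p>Two steps of the IM paradigm lend themselves to a short illustration: the inverse-variance initial weights and the conversion of a spike train into binary code in Equation (8). The sketch below is an assumption-laden illustration rather than the code used in this article. The function names, the example variances, and the example spike times are hypothetical, and the trailing window is dropped when <italic>tol</italic> divides <italic>T</italic> exactly, since that window would be empty.</p>
<preformat>
```python
import math


def inverse_variance_weights(variances):
    """Initial weights W_0^i = g_i / sum(g), with g_i = 1 / sigma_i^2."""
    g = [1.0 / v for v in variances]
    total = sum(g)
    return [g_i / total for g_i in g]


def spikes_to_binary(spike_times, T, tol):
    """Binary code in the spirit of Equation (8): one bit per window of length tol."""
    n_full = int(math.floor(T / tol))
    edges = [k * tol for k in range(n_full + 1)]
    if T > edges[-1]:
        # Trailing partial window (floor(T / tol) * tol, T].
        edges.append(T)
    bits = []
    for lo, hi in zip(edges[:-1], edges[1:]):
        # Each window is open on the left and closed on the right.
        fired = any(t > lo and hi >= t for t in spike_times)
        bits.append(1 if fired else 0)
    return bits


# Hypothetical per-modality variances and spike times (ms), for illustration only.
weights = inverse_variance_weights([0.5, 1.0, 1.0, 2.0, 2.0])
code = spikes_to_binary([3.0, 4.5, 27.0, 40.0], T=45, tol=10)
print(weights)
print(code)  # -> [1, 0, 1, 1, 0]
```
</preformat>
<p>In this example the spike at 40 ms falls into the window (30, 40] because the windows are closed on the right, and the two spikes in (0, 10] collapse into a single bit, which is the compression effect controlled by <italic>tol</italic>.</p>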
</sec>
<sec>
<title>3.1.2. Associate Merge</title>
<p>The AM paradigm assumes that the information for each modality of the concept is associated before integration. It is a spiking neural network model with five neurons, each receiving the stimulus for one of the concept&#x00027;s five modalities. The neurons are connected to one another, and there are no self-connections. We record the spike trains of all neurons and transform them into the integrated vector for the concept.</p>
<p>For the generality of the framework, we use either LIF or Izhikevich models for each neuron. For each concept, we obtain its normalized representation from human-labeled vectors, <inline-formula><mml:math id="M15"><mml:mover accent="true"><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mo>&#x02192;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>G</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>H</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>O</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>V</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:math></inline-formula>. Initially, for each neuron <italic>i</italic> &#x02208; [<italic>A, G, H, O, V</italic>], we consider <italic>I</italic> &#x0003D; <italic>m</italic><sub><italic>i</italic></sub>&#x0002A;<italic>I</italic><sub><italic>boost</italic></sub> as the stimulus. The current <italic>I</italic> of the postsynaptic neuron is updated whenever a presynaptic neuron fires as <italic>I</italic> &#x02190; <italic>I</italic> &#x0002B; &#x00394;<italic>W</italic><sub><italic>ij</italic></sub>, and the postsynaptic neuron generates spikes based on its own model. The weight <italic>W</italic><sub><italic>ij</italic></sub> is the synaptic strength between the presynaptic neuron and the postsynaptic neuron. 
The initial value of the weight is determined by the correlation between each pair of modalities over the whole representation dataset, i.e., <italic>W</italic><sub>0</sub> &#x0003D; <italic>Corr</italic>(<italic>i, j</italic>) where <italic>i, j</italic> &#x02208; [<italic>A, G, H, O, V</italic>], which differs from the IM paradigm. Simultaneously, the weights are dynamically adjusted by the spike trains of the presynaptic and postsynaptic neurons in accordance with the STDP rule. We denote by <italic>S</italic><sup><italic>i</italic></sup>([0, <italic>T</italic>]) the spike train of the <italic>i</italic>-th neuron during [0, <italic>T</italic>] and by <italic>B</italic><sup><italic>i</italic></sup>([0, <italic>T</italic>]) the corresponding binary vector. We record the spike trains of all neurons, transform them into the binary codes <italic>B</italic><sup><italic>i</italic></sup>([0, <italic>T</italic>]), and concatenate them into the final integrated vector <italic>B</italic>([0, <italic>T</italic>]) in the following way:</p>
<disp-formula id="E9"><label>(9)</label><mml:math id="M16"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msup><mml:mi>B</mml:mi><mml:mi>i</mml:mi></mml:msup><mml:mo stretchy="false">(</mml:mo><mml:mo stretchy="false">[</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>T</mml:mi><mml:mo stretchy="false">]</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo>=</mml:mo><mml:mo stretchy="false">[</mml:mo><mml:mi mathvariant="-tex-caligraphic">T</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msup><mml:mi>S</mml:mi><mml:mi>i</mml:mi></mml:msup><mml:mo stretchy="false">(</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mo stretchy="false">]</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="-tex-caligraphic">T</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msup><mml:mi>S</mml:mi><mml:mi>i</mml:mi></mml:msup><mml:mo stretchy="false">(</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mn>2</mml:mn><mml:mo>&#x0002A;</mml:mo><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mo stretchy="false">]</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo>,</mml:mo><mml:mo>&#x022EF;</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mi mathvariant="-tex-caligraphic">T</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msup><mml:mi>S</mml:mi><mml:mi>i</mml:mi></mml:msup><mml:mo stretchy="false">(</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mi>k</mml:mi><mml:mo>&#x02212;</mml:mo><mml:mn>1</mml:mn><mml:mo 
stretchy="false">)</mml:mo><mml:mo>&#x0002A;</mml:mo><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi><mml:mo>&#x0002A;</mml:mo><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mo stretchy="false">]</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo>,</mml:mo><mml:mo>&#x022EF;</mml:mo><mml:mo>,</mml:mo><mml:mi mathvariant="-tex-caligraphic">T</mml:mi><mml:mo stretchy="false">(</mml:mo><mml:msup><mml:mi>S</mml:mi><mml:mi>i</mml:mi></mml:msup><mml:mo stretchy="false">(</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mo stretchy="false">&#x0230A;</mml:mo><mml:mrow><mml:mfrac><mml:mi>T</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:mfrac></mml:mrow><mml:mo stretchy="false">&#x0230B;</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>&#x0002A;</mml:mo><mml:mi>t</mml:mi><mml:mi>o</mml:mi><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>T</mml:mi><mml:mo stretchy="false">]</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">)</mml:mo><mml:mo stretchy="false">]</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E10"><label>(10)</label><mml:math id="M17"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>B</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>T</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:msup><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msup><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>T</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mo>&#x02295;</mml:mo><mml:msup><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mrow><mml:mi>H</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>T</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02295;</mml:mo><mml:msup><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mrow><mml:mi>G</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>T</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02295;</mml:mo><mml:msup><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mrow><mml:mi>O</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>T</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>&#x02295;</mml:mo><mml:msup><mml:mrow><mml:mi>B</mml:mi></mml:mrow><mml:mrow><mml:mi>V</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:mi>T</mml:mi></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo><mml:mo>]</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
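<p>The two AM-specific steps, initializing the weights from pairwise modality correlations and concatenating the per-neuron binary codes as in Equation (10), can be sketched as follows. This is an illustrative sketch under stated assumptions: Pearson correlation is assumed for <italic>Corr</italic>(<italic>i, j</italic>), the diagonal is set to zero to reflect the absence of self-connections, and the function names, the three example concept vectors, and the example codes are hypothetical.</p>
<preformat>
```python
import math

# Concatenation order as printed in Equation (10).
EQ10_ORDER = ("A", "H", "G", "O", "V")


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def correlation_weights(vectors):
    """Initial weights W_0[i][j] = Corr(i, j), with a zero diagonal (no self-connections)."""
    columns = list(zip(*vectors))
    n = len(columns)
    return [[0.0 if i == j else pearson(columns[i], columns[j]) for j in range(n)]
            for i in range(n)]


def concatenate_codes(codes, order):
    """Concatenate per-neuron binary codes in the given modality order."""
    merged = []
    for name in order:
        merged.extend(codes[name])
    return merged


# Hypothetical normalized concept vectors in the order [A, G, H, O, V].
vectors = [
    [0.1, 0.9, 0.5, 0.7, 0.8],
    [0.2, 0.8, 0.4, 0.6, 0.9],
    [0.9, 0.1, 0.6, 0.2, 0.3],
]
W0 = correlation_weights(vectors)

# Hypothetical two-bit codes per neuron, standing in for the B^i of Equation (9).
codes = {"A": [1, 0], "G": [0, 0], "H": [1, 1], "O": [0, 1], "V": [1, 0]}
B = concatenate_codes(codes, EQ10_ORDER)
print(B)  # -> [1, 0, 1, 1, 0, 0, 0, 1, 1, 0]
```
</preformat>
<p>The integrated AM vector is five times as long as a single neuron&#x00027;s code, so for the same <italic>tol</italic> the AM representation has five times the dimensionality of the IM representation.</p>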
</sec>
</sec>
</sec>
<sec id="s4">
<title>4. Experiments</title>
<sec>
<title>4.1. Concept Similarity Test</title>
<p>The concept similarity test is commonly used in artificial intelligence to evaluate the quality of system-generated representations (Agirre et al., <xref ref-type="bibr" rid="B1">2009</xref>). Humans score the similarity of each concept pair, and a second similarity score is computed for the same pair from the system-generated representations. Both sets of scores are ranked over the evaluation dataset, and Spearman&#x00027;s correlation coefficient between the two rankings reflects how close the system-generated representations are to human judgments. In this article, we evaluate how closely the concepts&#x00027; original and multisensory-integrated representations match human judgments using WordSim353 (Agirre et al., <xref ref-type="bibr" rid="B1">2009</xref>) and SCWS1994 (Huang et al., <xref ref-type="bibr" rid="B16">2012</xref>).</p>
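<p>The ranking step of the concept similarity test can be sketched as below. The sketch takes the human scores and the model scores for the same concept pairs as given; how the model score is derived from two integrated vectors is not specified in this passage and is therefore left out. The function names and the example scores are hypothetical, and ties receive their average rank, which is one common convention.</p>
<preformat>
```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while len(order) > start:
        end = start
        # Extend the run while the next value ties with the current one.
        while len(order) > end + 1 and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2.0 + 1.0
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def spearman(human_scores, model_scores):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(human_scores), average_ranks(model_scores)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical scores for five concept pairs, for illustration only.
human = [1.0, 2.0, 3.0, 4.0, 5.0]
model = [0.1, 0.3, 0.2, 0.8, 0.9]
rho = spearman(human, model)
print(rho)
```
</preformat>
<p>In this example only the second and third pairs are ranked in a different order by the model, so the coefficient is close to, but below, 1.</p>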
<sec>
<title>4.1.1. The Experiment</title>
<p>To thoroughly test our framework, we conducted experiments for the IM and AM paradigms on three multisensory datasets (BBSR, LC823, Lancaster40k) and analyzed the difference in effectiveness between the representations after SNN integration and the original representations. In the experiments, both paradigms involve one parameter specific to the conversion from spike trains to binary code: the tolerance <italic>tol</italic>. It is the size of the reducing window used to convert the spike trains in the time interval into binary code, and thus reflects how strongly the spike sequence is compressed into an integrated binary code. In each dimension of the integrated vector, a larger <italic>tol</italic> means a bigger reducing window and a higher degree of information compression, and <italic>vice versa</italic>. If <italic>tol</italic> is too small, the dimensionality of the representation vector becomes too large; if <italic>tol</italic> is too big, the diversity of the representations is damaged. Therefore, we traverse <italic>tol</italic> across the range [0, 500] while restricting diversity to the range [0.05, 0.95], and report each model&#x00027;s best result together with the corresponding <italic>tol</italic>.</p>
<p>We used the evaluation datasets WordSim353 and SCWS1994 for testing. The model inputs came from three multisensory representation datasets (BBSR, LC823an, and Lancaster40k), and each was tested with both paradigms, IM and AM. For the AM paradigm, Izhikevich&#x00027;s seven models and the LIF model were used, while for the IM paradigm, the IF model was used for the postsynaptic neuron and Izhikevich&#x00027;s seven models and the LIF model were used for the presynaptic neurons. The running time of all tests is 100 ms and <italic>I</italic><sub><italic>boost</italic></sub> &#x0003D; 100.</p>
</sec>
<sec>
<title>4.1.2. Results and Analysis</title>
<p>Overall, for both the IM and AM paradigms, the integrated vectors produced by our models are closer to human judgments than the original vectors: the submodels achieved better results in 37 of the 48 tests across IM and AM, as <xref ref-type="table" rid="T2">Table 2</xref> shows. Broken down by dataset, 15/16 tests work better for the BBSR dataset, 14/16 tests work better for LC823an, and 8/16 tests work better for Lancaster40k.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Concept similarity test results.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Merge way</bold></th>
<th valign="top" align="left"><bold>Model</bold></th>
<th valign="top" align="center" colspan="4" style="border-bottom: thin solid #000000;"><bold>BBSR</bold></th>
<th valign="top" align="center" colspan="4" style="border-bottom: thin solid #000000;"><bold>LC823an</bold></th>
<th valign="top" align="center" colspan="4" style="border-bottom: thin solid #000000;"><bold>Lancaster40k</bold></th>
</tr>
<tr>
<th/>
<th/>
<th valign="top" align="center"><bold>Tol</bold></th>
<th valign="top" align="center"><bold>WordSim353</bold></th>
<th valign="top" align="center"><bold>SCWS1994</bold></th>
<th valign="top" align="center"><bold>Average</bold></th>
<th valign="top" align="center"><bold>Tol</bold></th>
<th valign="top" align="center"><bold>WordSim353</bold></th>
<th valign="top" align="center"><bold>SCWS1994</bold></th>
<th valign="top" align="center"><bold>Average</bold></th>
<th valign="top" align="center"><bold>Tol</bold></th>
<th valign="top" align="center"><bold>WordSim353</bold></th>
<th valign="top" align="center"><bold>SCWS1994</bold></th>
<th valign="top" align="center"><bold>Average</bold></th>
</tr>
</thead>
<tbody>
<tr style="border-bottom: thin solid #000000;">
<td valign="top" align="left">Origin</td>
<td valign="top" align="left">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">0.4182</td>
<td valign="top" align="center">0.5838</td>
<td valign="top" align="center">0.5010</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">0.1321</td>
<td valign="top" align="center">0.5525</td>
<td valign="top" align="center">0.3423</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">0.2640</td>
<td valign="top" align="center"><bold>0.3974</bold></td>
<td valign="top" align="center">0.3534</td>
</tr> <tr>
<td valign="top" align="left">AM</td>
<td valign="top" align="left">Izh-RZ</td>
<td valign="top" align="center">93</td>
<td valign="top" align="center">0.3455</td>
<td valign="top" align="center"><underline>0.6089</underline></td>
<td valign="top" align="center">0.4772</td>
<td valign="top" align="center">165</td>
<td valign="top" align="center"><underline>0.3804</underline></td>
<td valign="top" align="center">0.4260</td>
<td valign="top" align="center"><underline>0.4032</underline></td>
<td valign="top" align="center">9</td>
<td valign="top" align="center"><underline>0.3560</underline></td>
<td valign="top" align="center">0.3295</td>
<td valign="top" align="center">0.3427</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Izh-FS</td>
<td valign="top" align="center">95</td>
<td valign="top" align="center"><underline>0.4955</underline></td>
<td valign="top" align="center">0.5659</td>
<td valign="top" align="center"><underline>0.5307</underline></td>
<td valign="top" align="center">312</td>
<td valign="top" align="center"><underline>0.4223</underline></td>
<td valign="top" align="center">0.3788</td>
<td valign="top" align="center"><underline>0.4006</underline></td>
<td valign="top" align="center">9</td>
<td valign="top" align="center"><underline>0.3787</underline></td>
<td valign="top" align="center">0.3471</td>
<td valign="top" align="center"><underline>0.3629</underline></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Izh-IB</td>
<td valign="top" align="center">384</td>
<td valign="top" align="center"><underline>0.5455</underline></td>
<td valign="top" align="center"><underline>0.5870</underline></td>
<td valign="top" align="center"><underline>0.5662</underline></td>
<td valign="top" align="center">32</td>
<td valign="top" align="center"><underline>0.3696</underline></td>
<td valign="top" align="center">0.5277</td>
<td valign="top" align="center"><underline>0.4486</underline></td>
<td valign="top" align="center">25</td>
<td valign="top" align="center"><underline>0.3388</underline></td>
<td valign="top" align="center">0.3818</td>
<td valign="top" align="center"><underline>0.3603</underline></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Izh-LTS</td>
<td valign="top" align="center">174</td>
<td valign="top" align="center"><underline>0.5068</underline></td>
<td valign="top" align="center"><underline>0.6127</underline></td>
<td valign="top" align="center"><underline>0.5598</underline></td>
<td valign="top" align="center">17</td>
<td valign="top" align="center"><underline>0.3107</underline></td>
<td valign="top" align="center">0.5390</td>
<td valign="top" align="center"><underline>0.4249</underline></td>
<td valign="top" align="center">16</td>
<td valign="top" align="center"><underline>0.3557</underline></td>
<td valign="top" align="center">0.3629</td>
<td valign="top" align="center"><underline>0.3593</underline></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Izh-RS</td>
<td valign="top" align="center">366</td>
<td valign="top" align="center"><underline>0.4955</underline></td>
<td valign="top" align="center"><underline>0.5857</underline></td>
<td valign="top" align="center"><underline>0.5406</underline></td>
<td valign="top" align="center">84</td>
<td valign="top" align="center"><underline>0.5179</underline></td>
<td valign="top" align="center">0.5271</td>
<td valign="top" align="center"><underline>0.5225</underline></td>
<td valign="top" align="center">55</td>
<td valign="top" align="center"><underline>0.3206</underline></td>
<td valign="top" align="center">0.3708</td>
<td valign="top" align="center"><underline>0.3457</underline></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Izh-CH</td>
<td valign="top" align="center">170</td>
<td valign="top" align="center"><underline>0.4273</underline></td>
<td valign="top" align="center"><underline>0.5928</underline></td>
<td valign="top" align="center"><underline>0.5100</underline></td>
<td valign="top" align="center">7</td>
<td valign="top" align="center">0.1089</td>
<td valign="top" align="center">0.4884</td>
<td valign="top" align="center">0.2986</td>
<td valign="top" align="center">14</td>
<td valign="top" align="center"><underline>0.3150</underline></td>
<td valign="top" align="center">0.3349</td>
<td valign="top" align="center"><underline>0.3249</underline></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Izh-TC</td>
<td valign="top" align="center">148</td>
<td valign="top" align="center"><underline>0.5068</underline></td>
<td valign="top" align="center"><underline>0.6103</underline></td>
<td valign="top" align="center"><underline>0.5586</underline></td>
<td valign="top" align="center">6</td>
<td valign="top" align="center"><underline>0.2214</underline></td>
<td valign="top" align="center">0.5181</td>
<td valign="top" align="center"><underline>0.3698</underline></td>
<td valign="top" align="center">7</td>
<td valign="top" align="center"><underline><bold>0.3979</bold></underline></td>
<td valign="top" align="center">0.3364</td>
<td valign="top" align="center"><underline>0.3672</underline></td>
</tr>
<tr style="border-bottom: thin solid #000000;">
<td/>
<td valign="top" align="left">LIF</td>
<td valign="top" align="center">187</td>
<td valign="top" align="center"><underline>0.5727</underline></td>
<td valign="top" align="center"><underline>0.6927</underline></td>
<td valign="top" align="center"><underline>0.6327</underline></td>
<td valign="top" align="center">330</td>
<td valign="top" align="center"><underline>0.5036</underline></td>
<td valign="top" align="center"><underline><bold>0.6330</bold></underline></td>
<td valign="top" align="center"><underline>0.5683</underline></td>
<td valign="top" align="center">86</td>
<td valign="top" align="center"><underline>0.1788</underline></td>
<td valign="top" align="center">0.3500</td>
<td valign="top" align="center"><underline>0.2644</underline></td>
</tr>
<tr>
<td valign="top" align="left">IM</td>
<td valign="top" align="left">Izh-RZ</td>
<td valign="top" align="center">17</td>
<td valign="top" align="center"><underline>0.4636</underline></td>
<td valign="top" align="center"><underline>0.6340</underline></td>
<td valign="top" align="center"><underline>0.5488</underline></td>
<td valign="top" align="center">10</td>
<td valign="top" align="center"><underline>0.5545</underline></td>
<td valign="top" align="center"><underline>0.5618</underline></td>
<td valign="top" align="center"><underline>0.5581</underline></td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">0.2026</td>
<td valign="top" align="center">0.3139</td>
<td valign="top" align="center">0.2583</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Izh-FS</td>
<td valign="top" align="center">17</td>
<td valign="top" align="center"><underline>0.4636</underline></td>
<td valign="top" align="center"><underline>0.6388</underline></td>
<td valign="top" align="center"><underline>0.5512</underline></td>
<td valign="top" align="center">10</td>
<td valign="top" align="center"><underline>0.5545</underline></td>
<td valign="top" align="center"><underline>0.5617</underline></td>
<td valign="top" align="center"><underline>0.5581</underline></td>
<td valign="top" align="center">21</td>
<td valign="top" align="center"><underline>0.3371</underline></td>
<td valign="top" align="center">0.2910</td>
<td valign="top" align="center">0.3140</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Izh-IB</td>
<td valign="top" align="center">7</td>
<td valign="top" align="center"><underline><bold>0.5477</bold></underline></td>
<td valign="top" align="center"><underline>0.5988</underline></td>
<td valign="top" align="center"><underline><bold>0.5733</bold></underline></td>
<td valign="top" align="center">24</td>
<td valign="top" align="center"><underline>0.5509</underline></td>
<td valign="top" align="center">0.5491</td>
<td valign="top" align="center"><underline>0.5500</underline></td>
<td valign="top" align="center">31</td>
<td valign="top" align="center">0.1597</td>
<td valign="top" align="center">0.3040</td>
<td valign="top" align="center">0.2319</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Izh-LTS</td>
<td valign="top" align="center">83</td>
<td valign="top" align="center"><underline>0.5000</underline></td>
<td valign="top" align="center"><underline><bold>0.6417</bold></underline></td>
<td valign="top" align="center"><underline>0.5708</underline></td>
<td valign="top" align="center">18</td>
<td valign="top" align="center"><underline><bold>0.6080</bold></underline></td>
<td valign="top" align="center">0.5361</td>
<td valign="top" align="center"><underline><bold>0.5721</bold></underline></td>
<td valign="top" align="center">56</td>
<td valign="top" align="center"><underline>0.3610</underline></td>
<td valign="top" align="center">0.3448</td>
<td valign="top" align="center">0.3529</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Izh-RS</td>
<td valign="top" align="center">196</td>
<td valign="top" align="center"><underline>0.5023</underline></td>
<td valign="top" align="center">0.5530</td>
<td valign="top" align="center"><underline>0.5276</underline></td>
<td valign="top" align="center">163</td>
<td valign="top" align="center"><underline>0.4830</underline></td>
<td valign="top" align="center">0.4613</td>
<td valign="top" align="center"><underline>0.4722</underline></td>
<td valign="top" align="center">68</td>
<td valign="top" align="center">0.0757</td>
<td valign="top" align="center">0.2959</td>
<td valign="top" align="center">0.1858</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Izh-CH</td>
<td valign="top" align="center">94</td>
<td valign="top" align="center"><underline>0.4659</underline></td>
<td valign="top" align="center">0.5786</td>
<td valign="top" align="center"><underline>0.5222</underline></td>
<td valign="top" align="center">8</td>
<td valign="top" align="center"><underline>0.5696</underline></td>
<td valign="top" align="center">0.4746</td>
<td valign="top" align="center"><underline>0.5221</underline></td>
<td valign="top" align="center">50</td>
<td valign="top" align="center"><underline>0.3843</underline></td>
<td valign="top" align="center">0.3813</td>
<td valign="top" align="center"><underline><bold>0.3828</bold></underline></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Izh-TC</td>
<td valign="top" align="center">17</td>
<td valign="top" align="center"><underline>0.4636</underline></td>
<td valign="top" align="center"><underline>0.6125</underline></td>
<td valign="top" align="center"><underline>0.5381</underline></td>
<td valign="top" align="center">5</td>
<td valign="top" align="center"><underline>0.4509</underline></td>
<td valign="top" align="center">0.5310</td>
<td valign="top" align="center"><underline>0.4909</underline></td>
<td valign="top" align="center">20</td>
<td valign="top" align="center"><underline>0.3387</underline></td>
<td valign="top" align="center">0.3042</td>
<td valign="top" align="center">0.3215</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">LIF</td>
<td valign="top" align="center">143</td>
<td valign="top" align="center"><underline>0.4205</underline></td>
<td valign="top" align="center"><underline>0.6167</underline></td>
<td valign="top" align="center"><underline>0.5186</underline></td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">0.0643</td>
<td valign="top" align="center"><underline>0.5672</underline></td>
<td valign="top" align="center">0.3158</td>
<td valign="top" align="center">324</td>
<td valign="top" align="center">0.0018</td>
<td valign="top" align="center">0.1481</td>
<td valign="top" align="center">0.1965</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Bold values indicate the best result for each evaluation dataset, while underlined values indicate that the multisensory integrated representation is closer to human judgments than the original representation</italic>.</p>
</table-wrap-foot>
</table-wrap>
<p>In almost all experiments, the multisensory integrated representations produced by our framework outperform the unintegrated ones. The exception is Lancaster40k, for which both the IM and AM paradigms give unstable results. For each set of multisensory vectors, at least one integration configuration can be found that improves on the original representation.</p>
</sec>
</sec>
<sec>
<title>4.2. Comparisons Between IM and AM Paradigms</title>
<p>Unlike the macro-level analysis above, in this section we introduce concept feature norms to compare the IM and AM paradigms from the micro-level perspective of individual concepts. Concept feature norms represent concepts through standardized and systematic feature descriptions that mirror human comprehension. The similarities and differences between concepts are related to the intersection and difference of their feature norms. McRae&#x00027;s concept feature norms, introduced by McRae et al. (<xref ref-type="bibr" rid="B24">2005</xref>), are the most prominent work in this area. They not only supplied feature norms for 541 concepts, but also proposed a methodology for generating them. For example, the feature norms of the concept &#x0201C;basement&#x0201D; are &#x0201C;<italic>used for storage,&#x0201D; &#x0201C;found below ground,&#x0201D; &#x0201C;is cold,&#x0201D; &#x0201C;found on the bottom floor,&#x0201D; &#x0201C;is dark,&#x0201D; &#x0201C;is damp,&#x0201D; &#x0201C;made of cement,&#x0201D; &#x0201C;part of a house,&#x0201D; &#x0201C;has windows,&#x0201D; &#x0201C;has a furnace,&#x0201D; &#x0201C;has a foundation,&#x0201D; &#x0201C;has stairways,&#x0201D; &#x0201C;has walls,&#x0201D; &#x0201C;is musty,&#x0201D; &#x0201C;is scary,&#x0201D; and &#x0201C;is the lowest floor.&#x0201D;</italic> Another semantic feature norms dataset analogous to McRae is CSLB (Centre for Speech, Language, and the Brain), which covers 638 concepts and improves the feature normalization and feature filtering procedure (Devereux et al., <xref ref-type="bibr" rid="B9">2014</xref>). We use the McRae and CSLB norms as references for human conceptual cognition and measure how closely the representation of each concept matches them.</p>
<p>We compare and analyze the IM and AM paradigms from two perspectives. First, we use Modality Exclusivity, a metric related to perceptual strength, to explore how sensitive the two paradigms are to the distribution of a concept&#x00027;s perceptual strength across modalities. Then, to assess the generality of the IM and AM paradigms, we introduce nine psycholinguistic dimensions that derive from the nature of the concept and are unrelated to perceptual strength.</p>
<sec>
<title>4.2.1. Modality Exclusivity</title>
<p>Modality Exclusivity (ME) is a metric measuring how much of a concept is perceived through a single perceptual modality (Lynott and Connell, <xref ref-type="bibr" rid="B21">2013</xref>). For each concept, ME is calculated as the range of its perceptual strengths divided by their sum. It spans from 0% for completely multimodal perception to 100% for completely unimodal perception. <xref ref-type="fig" rid="F3">Figure 3</xref> shows some examples.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Modality exclusivity demonstration. Modality exclusivity (ME) is a metric measuring how much of a concept is perceived through a single perceptual modality. For each concept, ME is calculated as the range of its perceptual strengths divided by their sum. It spans from 0% for completely multimodal perception to 100% for completely unimodal perception.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnsys-16-845177-g0003.tif"/>
</fig>
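<p>As an illustration of the ME definition, the following minimal sketch computes the range-over-sum ratio for one concept. The function name <monospace>modality_exclusivity</monospace> and the example ratings are hypothetical and are not taken from the datasets used in this study.</p>
<preformat>
```python
def modality_exclusivity(strengths):
    """Return ME = (max - min) / sum for one concept's perceptual strengths."""
    total = sum(strengths)
    if total == 0:
        # ME is undefined when every modality has zero strength.
        raise ValueError("perceptual strengths sum to zero; ME is undefined")
    return (max(strengths) - min(strengths)) / total


# Hypothetical ratings in the order: visual, auditory, haptic, olfactory, gustatory.
unimodal = [5.0, 0.0, 0.0, 0.0, 0.0]
multimodal = [3.0, 3.0, 3.0, 3.0, 3.0]

print(modality_exclusivity(unimodal))    # 1.0, i.e., 100% (completely unimodal)
print(modality_exclusivity(multimodal))  # 0.0, i.e., 0% (completely multimodal)
```
</preformat>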
<p>In the concept feature norms dataset, we first obtain all similar concepts <italic>c</italic><sup><italic>similar</italic></sup> for each concept <italic>c</italic> based on the number of overlapping features and record their rank list <inline-formula><mml:math id="M19"><mml:msubsup><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mi>i</mml:mi><mml:mi>l</mml:mi><mml:mi>a</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> sorted by similarity. Then, for all concepts, the corresponding multisensory integrated binary representations <italic>B</italic><sup><italic>IM</italic></sup> and <italic>B</italic><sup><italic>AM</italic></sup> are produced using the IM and AM paradigms, respectively. Next, for concept <italic>c</italic>, its <italic>k</italic> most similar concepts <inline-formula><mml:math id="M20"><mml:msubsup><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mi>I</mml:mi><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mi>i</mml:mi><mml:mi>l</mml:mi><mml:mi>a</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> and <inline-formula><mml:math id="M21"><mml:msubsup><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mi>i</mml:mi><mml:mi>l</mml:mi><mml:mi>a</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> are computed from the integrated binary codes <italic>B</italic><sup><italic>IM</italic></sup> and <italic>B</italic><sup><italic>AM</italic></sup>, respectively, using the Hamming distance.
We then look up the ranks of these <italic>k</italic> similar concepts in the feature norms rank list <inline-formula><mml:math id="M22"><mml:msubsup><mml:mrow><mml:mi>R</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>m</mml:mi><mml:mi>i</mml:mi><mml:mi>l</mml:mi><mml:mi>a</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msubsup></mml:math></inline-formula> and take their average, denoted as <italic>kAR</italic><sub><italic>c</italic><sub><italic>IM</italic></sub></sub> and <italic>kAR</italic><sub><italic>c</italic><sub><italic>AM</italic></sub></sub>. These values reflect how close the multisensory representations produced by the two integration paradigms of our framework are to human cognition. Smaller values of <italic>kAR</italic> indicate representations that are closer to human cognition at the microscopic level. Finally, over all concepts in the representation dataset, we calculate the correlation coefficient between the <italic>kAR</italic><sub><italic>c</italic><sub><italic>IM</italic></sub></sub> or <italic>kAR</italic><sub><italic>c</italic><sub><italic>AM</italic></sub></sub> array obtained with the above approach and the ME array of the corresponding concepts. This coefficient reflects the correlation between each of the two multisensory concept integration paradigms and modality exclusivity. In this experiment, we test only the Izhikevich model and set <italic>k</italic> to 5.</p>
<p>The results in <xref ref-type="table" rid="T3">Table 3</xref> reveal a difference between the IM and AM paradigms. The IM paradigm shows a stronger negative correlation on both concept feature norms test sets, whereas the AM paradigm shows a slightly positive correlation. We investigate this discrepancy further by examining the FS model in detail, as shown in <xref ref-type="fig" rid="F4">Figure 4</xref>. The results reveal that for concepts with higher ME (such as &#x0201C;spring,&#x0201D; &#x0201C;thunder,&#x0201D; &#x0201C;yellow,&#x0201D; &#x0201C;debate,&#x0201D; &#x0201C;clang&#x0201D; in <xref ref-type="fig" rid="F3">Figure 3</xref>), the IM paradigm is better at multisensory integration. The AM paradigm, which is less biased toward the input of any single modality, instead benefits concepts with a uniform modality distribution (such as &#x0201C;theory,&#x0201D; &#x0201C;knowledge,&#x0201D; &#x0201C;pig,&#x0201D; &#x0201C;duck,&#x0201D; &#x0201C;lake&#x0201D; in <xref ref-type="fig" rid="F3">Figure 3</xref>).</p>
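<p>The <italic>kAR</italic> procedure can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used in our experiments: the function names, the toy feature sets, and the toy binary codes are hypothetical, and ties in overlap or distance are broken by insertion order, which the procedure above does not specify.</p>
<preformat>
```python
def hamming(a, b):
    """Number of positions at which two equal-length binary codes differ."""
    return sum(x != y for x, y in zip(a, b))


def feature_norm_ranks(concept, norms):
    """Rank every other concept by feature overlap with `concept` (1 = most similar)."""
    others = [c for c in norms if c != concept]
    # Larger overlap first; ties keep insertion order (an assumption).
    others.sort(key=lambda c: -len(norms[concept].intersection(norms[c])))
    return {c: rank for rank, c in enumerate(others, start=1)}


def k_average_rank(concept, codes, norms, k=5):
    """Mean feature-norm rank of the k Hamming-nearest neighbours of `concept`."""
    ranks = feature_norm_ranks(concept, norms)
    others = [c for c in codes if c != concept]
    nearest = sorted(others, key=lambda c: hamming(codes[concept], codes[c]))[:k]
    return sum(ranks[c] for c in nearest) / len(nearest)


# Hypothetical feature norms and binary codes for four concepts.
norms = {
    "cat": {"has_fur", "has_tail", "is_pet"},
    "dog": {"has_fur", "has_tail", "is_pet", "barks"},
    "fox": {"has_fur", "has_tail"},
    "car": {"has_wheels"},
}
good_codes = {"cat": (1, 1, 0, 0), "dog": (1, 1, 0, 1), "fox": (1, 0, 0, 0), "car": (0, 0, 1, 1)}
poor_codes = {"cat": (1, 1, 0, 0), "dog": (0, 0, 1, 0), "fox": (1, 0, 0, 0), "car": (1, 1, 0, 1)}

print(k_average_rank("cat", good_codes, norms, k=2))  # 1.5
print(k_average_rank("cat", poor_codes, norms, k=2))  # 2.5
```
</preformat>
<p>In this toy example, the codes whose nearest neighbours agree with the feature norms yield the smaller <italic>kAR</italic>, matching the interpretation that smaller values are closer to human cognition.</p>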
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>The sensitivity of IM and AM results to modality exclusivity.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Izhikevich model</bold></th>
<th valign="top" align="center" colspan="2" style="border-bottom: thin solid #000000;"><bold>AM</bold></th>
<th valign="top" align="center" colspan="2" style="border-bottom: thin solid #000000;"><bold>IM</bold></th>
</tr>
<tr>
<th/>
<th valign="top" align="center"><bold>McRae</bold></th>
<th valign="top" align="center"><bold>CSLB</bold></th>
<th valign="top" align="center"><bold>McRae</bold></th>
<th valign="top" align="center"><bold>CSLB</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">RZ</td>
<td valign="top" align="center">0.0149</td>
<td valign="top" align="center">&#x02212;0.0987</td>
<td valign="top" align="center">&#x02212;0.1524</td>
<td valign="top" align="center">&#x02212;0.4848</td>
</tr>
<tr>
<td valign="top" align="left">FS</td>
<td valign="top" align="center">0.2679</td>
<td valign="top" align="center">0.0901</td>
<td valign="top" align="center">&#x02212;0.134</td>
<td valign="top" align="center">&#x02212;0.4447</td>
</tr>
<tr>
<td valign="top" align="left">IB</td>
<td valign="top" align="center">&#x02212;0.0559</td>
<td valign="top" align="center">0.0191</td>
<td valign="top" align="center">&#x02212;0.2672</td>
<td valign="top" align="center">&#x02212;0.4986</td>
</tr>
<tr>
<td valign="top" align="left">LTS</td>
<td valign="top" align="center">0.2113</td>
<td valign="top" align="center">0.035</td>
<td valign="top" align="center">&#x02212;0.12</td>
<td valign="top" align="center">&#x02212;0.0453</td>
</tr>
<tr>
<td valign="top" align="left">RS</td>
<td valign="top" align="center">0.1943</td>
<td valign="top" align="center">&#x02212;0.0087</td>
<td valign="top" align="center">&#x02212;0.006</td>
<td valign="top" align="center">&#x02212;0.1997</td>
</tr>
<tr>
<td valign="top" align="left">CH</td>
<td valign="top" align="center">0.0988</td>
<td valign="top" align="center">0.0197</td>
<td valign="top" align="center">0.0294</td>
<td valign="top" align="center">0.0964</td>
</tr>
<tr>
<td valign="top" align="left">TC</td>
<td valign="top" align="center">0.2078</td>
<td valign="top" align="center">0.0398</td>
<td valign="top" align="center">&#x02212;0.2115</td>
<td valign="top" align="center">&#x02212;0.4761</td>
</tr>
</tbody>
</table>
</table-wrap>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>The correlation between ME and average of five similar concept rankings.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnsys-16-845177-g0004.tif"/>
</fig>
</sec>
<sec>
<title>4.2.2. Generality Analysis</title>
<p>The ME metric used in the previous experiment is an indicator of the concept representation that is related to perceptual strength. In this part, we test the framework with respect to the properties of the input concepts themselves. We introduce the Glasgow norms, a set of normative ratings on nine psycholinguistic dimensions: arousal (AROU), valence (VAL), dominance (DOM), concreteness (CNC), imageability (IMAG), familiarity (FAM), age of acquisition (AOA), semantic size (SIZE), and gender association (GEND) for 5,553 concepts (Scott et al., <xref ref-type="bibr" rid="B28">2019</xref>).</p>
<p>We proceed as in the previous experiment. In the concept feature norms, we first record all similar concepts for each concept, sort them by similarity, and rank them. Then, for the IM and AM paradigms, we use the same concept input, obtain the integrated vector for each concept, find its <italic>k</italic> most similar concepts, and take the mean of their rankings in the concept feature norms as <italic>kAR</italic><sub><italic>c</italic><sub><italic>IM</italic></sub></sub> and <italic>kAR</italic><sub><italic>c</italic><sub><italic>AM</italic></sub></sub>. Finally, we compute the correlation coefficient between each psycholinguistic dimension and the concept&#x00027;s average ranking value <italic>kAR</italic> for the two paradigms. We again test only the Izhikevich model in this experiment, and <italic>k</italic> is set to 5.</p>
<p>We use heatmaps (<xref ref-type="fig" rid="F5">Figure 5</xref>) to visualize the correlation coefficients between the <italic>kAR</italic> of the IM and AM paradigms and the nine psycholinguistic dimensions on the two concept feature norms sets, McRae and CSLB. Additionally, we pool the results across the Izhikevich submodels and present the correlation coefficients as a beeswarm plot (<xref ref-type="fig" rid="F6">Figure 6</xref>) to show their distribution more clearly.</p>
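<p>The final correlation step can be illustrated with the sketch below. It assumes a Pearson correlation coefficient, which is our assumption for illustration because the text only states &#x0201C;correlation coefficient&#x0201D;; the function names and the toy <italic>kAR</italic> values and ratings are hypothetical.</p>
<preformat>
```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (requires non-zero variance)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlate_dimensions(kar, ratings_by_dimension):
    """Correlate per-concept kAR with each dimension over the concepts both cover."""
    result = {}
    for dim, ratings in ratings_by_dimension.items():
        shared = sorted(set(kar).intersection(ratings))
        result[dim] = pearson([kar[c] for c in shared], [ratings[c] for c in shared])
    return result


# Hypothetical kAR values and ratings on two of the nine dimensions.
kar = {"cat": 1.0, "dog": 2.0, "fox": 3.0, "car": 4.0}
ratings = {
    "CNC": {"cat": 2.0, "dog": 4.0, "fox": 6.0, "car": 8.0},
    "AROU": {"cat": 4.0, "dog": 3.0, "fox": 2.0, "car": 1.0},
}
print(correlate_dimensions(kar, ratings))  # {'CNC': 1.0, 'AROU': -1.0}
```
</preformat>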
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>The heatmap of generality analysis results.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnsys-16-845177-g0005.tif"/>
</fig>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>The beeswarm of correlation distribution.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fnsys-16-845177-g0006.tif"/>
</fig>
<p>According to the experimental results, the absolute values of all correlation coefficients are &#x0003C;0.3. For several dimensions, including AOA, AROU, FAM, IMAG, and VAL, the quality of the vectors integrated with either the IM or the AM paradigm shows essentially no relationship with the nature of the concepts. This indicates that both paradigms generalize well and that the framework is largely unaffected by the properties of the concepts themselves.</p>
</sec>
</sec>
</sec>
<sec sec-type="discussion" id="s5">
<title>5. Discussion</title>
<p>In this study, we propose an SNN-based concept learning framework for multisensory integration that can generate integrated vectors from psychologist-labeled multimodal representations. Our research uses five modalities (vision, hearing, touch, smell, and taste) together with a brain-like SNN model. We intend to add more brain-like processes in the future, such as multisensory fusion plasticity. The multisensory data we currently use are labeled by cognitive psychologists, which makes them relatively expensive to obtain and small in scale. In the future, we will consider expanding the relevant datasets by mapping, so as to support larger-scale experiments. The current research focuses on the multisensory representation of concepts, which is a subset of pattern representation in AI. Future research can integrate it deeply with downstream tasks to create AI systems that incorporate multisensory integration, which in turn places more demands on multisensory perceptrons. Human understanding of concepts relies not only on multisensory perception but also on textual information that carries abstract knowledge, and how to combine these two parts to build human-like concept learning systems is also worth exploring in the future.</p>
</sec>
<sec sec-type="data-availability" id="s6">
<title>Data Availability Statement</title>
<p>Publicly available datasets were analyzed in this study. These data can be found at: <ext-link ext-link-type="uri" xlink:href="http://osf.io/7emr6/">http://osf.io/7emr6/</ext-link>; <ext-link ext-link-type="uri" xlink:href="http://www.neuro.mcw.edu/resources.html">http://www.neuro.mcw.edu/resources.html</ext-link>; <ext-link ext-link-type="uri" xlink:href="https://link.springer.com/article/10.3758/BRM.41.2.558">https://link.springer.com/article/10.3758/BRM.41.2.558</ext-link>; <ext-link ext-link-type="uri" xlink:href="https://link.springer.com/article/10.3758/s13428-012-0267-0">https://link.springer.com/article/10.3758/s13428-012-0267-0</ext-link>.</p>
</sec>
<sec id="s7">
<title>Author Contributions</title>
<p>YW and YZ designed the study, performed the experiments, and wrote the manuscript. Both authors contributed to the article and approved the submitted version.</p>
</sec>
<sec sec-type="funding-information" id="s8">
<title>Funding</title>
<p>This study was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDB32070100).</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x00027;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<ack><p>We thank Dr. Yanchao Bi and Dr. Xiaosha Wang for helpful discussions and for generously sharing psychology-related research.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Agirre</surname> <given-names>E.</given-names></name> <name><surname>Alfonseca</surname> <given-names>E.</given-names></name> <name><surname>Hall</surname> <given-names>K.</given-names></name> <name><surname>Kravalova</surname> <given-names>J.</given-names></name> <name><surname>Pa&#x0015F;ca</surname> <given-names>M.</given-names></name> <name><surname>Soroa</surname> <given-names>A.</given-names></name></person-group> (<year>2009</year>). <article-title>A study on similarity and relatedness using distributional and WordNet-based approaches</article-title>, in <source>Proceedings of Human Language Technologies: The 2009 Annual Conference of the North American Chapter of the Association for Computational Linguistics</source> (<publisher-loc>Boulder, CO</publisher-loc>: <publisher-name>Association for Computational Linguistics</publisher-name>), <fpage>19</fpage>&#x02013;<lpage>27</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://aclanthology.org/N09-1003">https://aclanthology.org/N09-1003</ext-link></citation>
</ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Anastasio</surname> <given-names>T. J.</given-names></name> <name><surname>Patton</surname> <given-names>P. E.</given-names></name> <name><surname>Belkacem-Boussaid</surname> <given-names>K.</given-names></name></person-group> (<year>2000</year>). <article-title>Using Bayes&#x00027; rule to model multisensory enhancement in the superior colliculus</article-title>. <source>Neural Comput</source>. <volume>12</volume>, <fpage>1165</fpage>&#x02013;<lpage>1187</lpage>. <pub-id pub-id-type="doi">10.1162/089976600300015547</pub-id><pub-id pub-id-type="pmid">10905812</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bi</surname> <given-names>G.-Q.</given-names></name> <name><surname>Poo</surname> <given-names>M.-M.</given-names></name></person-group> (<year>2001</year>). <article-title>Synaptic modification by correlated activity: Hebb&#x00027;s postulate revisited</article-title>. <source>Annu. Rev. Neurosci</source>. <volume>24</volume>, <fpage>139</fpage>&#x02013;<lpage>166</lpage>. <pub-id pub-id-type="doi">10.1146/annurev.neuro.24.1.139</pub-id><pub-id pub-id-type="pmid">11283308</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Binder</surname> <given-names>J. R.</given-names></name> <name><surname>Conant</surname> <given-names>L. L.</given-names></name> <name><surname>Humphries</surname> <given-names>C. J.</given-names></name> <name><surname>Fernandino</surname> <given-names>L.</given-names></name> <name><surname>Simons</surname> <given-names>S. B.</given-names></name> <name><surname>Aguilar</surname> <given-names>M.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>Toward a brain-based componential semantic representation</article-title>. <source>Cogn. Neuropsychol</source>. <volume>33</volume>, <fpage>130</fpage>&#x02013;<lpage>174</lpage>. <pub-id pub-id-type="doi">10.1080/02643294.2016.1147426</pub-id><pub-id pub-id-type="pmid">27310469</pub-id></citation></ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bruni</surname> <given-names>E.</given-names></name> <name><surname>Tran</surname> <given-names>N.-K.</given-names></name> <name><surname>Baroni</surname> <given-names>M.</given-names></name></person-group> (<year>2014</year>). <article-title>Multimodal distributional semantics</article-title>. <source>J. Artif. Intell. Res</source>. <volume>49</volume>, <fpage>1</fpage>&#x02013;<lpage>47</lpage>. <pub-id pub-id-type="doi">10.1613/jair.4135</pub-id></citation>
</ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Calvert</surname> <given-names>G. A.</given-names></name> <name><surname>Thesen</surname> <given-names>T.</given-names></name></person-group> (<year>2004</year>). <article-title>Multisensory integration: methodological approaches and emerging principles in the human brain</article-title>. <source>J. Physiol. Paris</source> <volume>98</volume>, <fpage>191</fpage>&#x02013;<lpage>205</lpage>. <pub-id pub-id-type="doi">10.1016/j.jphysparis.2004.03.018</pub-id><pub-id pub-id-type="pmid">15477032</pub-id></citation></ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cappe</surname> <given-names>C.</given-names></name> <name><surname>Rouiller</surname> <given-names>E. M.</given-names></name> <name><surname>Barone</surname> <given-names>P.</given-names></name></person-group> (<year>2009</year>). <article-title>Multisensory anatomical pathways</article-title>. <source>Hear. Res</source>. <volume>258</volume>, <fpage>28</fpage>&#x02013;<lpage>36</lpage>. <pub-id pub-id-type="doi">10.1016/j.heares.2009.04.017</pub-id><pub-id pub-id-type="pmid">19410641</pub-id></citation></ref>
<ref id="B8">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Collell</surname> <given-names>G.</given-names></name> <name><surname>Zhang</surname> <given-names>T.</given-names></name> <name><surname>Moens</surname> <given-names>M.-F.</given-names></name></person-group> (<year>2017</year>). <article-title>Imagined visual representations as multimodal embeddings</article-title>, in <source>Proceedings of the AAAI Conference on Artificial Intelligence</source>, Vol. <volume>31</volume> (<publisher-loc>San Francisco, CA</publisher-loc>: <publisher-name>AAAI</publisher-name>).</citation>
</ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Devereux</surname> <given-names>B. J.</given-names></name> <name><surname>Tyler</surname> <given-names>L. K.</given-names></name> <name><surname>Geertzen</surname> <given-names>J.</given-names></name> <name><surname>Randall</surname> <given-names>B.</given-names></name></person-group> (<year>2014</year>). <article-title>The centre for speech, language and the brain (CSLB) concept property norms</article-title>. <source>Behav. Res. Methods</source> <volume>46</volume>, <fpage>1119</fpage>&#x02013;<lpage>1127</lpage>. <pub-id pub-id-type="doi">10.3758/s13428-013-0420-4</pub-id><pub-id pub-id-type="pmid">24356992</pub-id></citation></ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gao</surname> <given-names>Y.</given-names></name> <name><surname>Hendricks</surname> <given-names>L. A.</given-names></name> <name><surname>Kuchenbecker</surname> <given-names>K. J.</given-names></name> <name><surname>Darrell</surname> <given-names>T.</given-names></name></person-group> (<year>2016</year>). <article-title>Deep learning for tactile understanding from visual and haptic data</article-title>. <source>arXiv:1511.06065</source>. <pub-id pub-id-type="doi">10.1109/ICRA.2016.7487176</pub-id></citation>
</ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gepner</surname> <given-names>R.</given-names></name> <name><surname>Skanata</surname> <given-names>M. M.</given-names></name> <name><surname>Bernat</surname> <given-names>N. M.</given-names></name> <name><surname>Kaplow</surname> <given-names>M.</given-names></name> <name><surname>Gershow</surname> <given-names>M.</given-names></name></person-group> (<year>2015</year>). <article-title>Computations underlying Drosophila photo-taxis, odor-taxis, and multi-sensory integration</article-title>. <source>eLife</source> <volume>4</volume>:<fpage>e06229</fpage>. <pub-id pub-id-type="doi">10.7554/eLife.06229</pub-id><pub-id pub-id-type="pmid">25945916</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Gerstner</surname> <given-names>W.</given-names></name> <name><surname>Kistler</surname> <given-names>W. M.</given-names></name></person-group> (<year>2002</year>). <source>Spiking Neuron Models: Single Neurons, Populations, Plasticity</source>. <publisher-name>Cambridge University Press</publisher-name>. <pub-id pub-id-type="doi">10.1017/CBO9780511815706</pub-id></citation>
</ref>
<ref id="B13">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hill</surname> <given-names>F.</given-names></name> <name><surname>Korhonen</surname> <given-names>A.</given-names></name></person-group> (<year>2014</year>). <article-title>Learning abstract concept embeddings from multi-modal data: since you probably can&#x00027;t see what I mean</article-title>, in <source>Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing</source> (<publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>EMNLP</publisher-name>), <fpage>255</fpage>&#x02013;<lpage>265</lpage>. <pub-id pub-id-type="doi">10.3115/v1/D14-1032</pub-id></citation>
</ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hill</surname> <given-names>F.</given-names></name> <name><surname>Reichart</surname> <given-names>R.</given-names></name> <name><surname>Korhonen</surname> <given-names>A.</given-names></name></person-group> (<year>2014</year>). <article-title>Multi-modal models for concrete and abstract concept meaning</article-title>. <source>Trans. Assoc. Comput. Linguist</source>. <volume>2</volume>, <fpage>285</fpage>&#x02013;<lpage>296</lpage>. <pub-id pub-id-type="doi">10.1162/tacl_a_00183</pub-id></citation>
</ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hodgkin</surname> <given-names>A. L.</given-names></name> <name><surname>Huxley</surname> <given-names>A. F.</given-names></name></person-group> (<year>1952</year>). <article-title>A quantitative description of membrane current and its application to conduction and excitation in nerve</article-title>. <source>J. Physiol</source>. <volume>117</volume>, <fpage>500</fpage>&#x02013;<lpage>544</lpage>. <pub-id pub-id-type="doi">10.1113/jphysiol.1952.sp004764</pub-id><pub-id pub-id-type="pmid">2185861</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Huang</surname> <given-names>E. H.</given-names></name> <name><surname>Socher</surname> <given-names>R.</given-names></name> <name><surname>Manning</surname> <given-names>C. D.</given-names></name> <name><surname>Ng</surname> <given-names>A. Y.</given-names></name></person-group> (<year>2012</year>). <article-title>Improving word representations via global context and multiple word prototypes</article-title>, in <source>Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics, Vol. 1</source> (<publisher-loc>Jeju Island</publisher-loc>: <publisher-name>Association for Computational Linguistics</publisher-name>), <fpage>873</fpage>&#x02013;<lpage>882</lpage>.</citation>
</ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Izhikevich</surname> <given-names>E. M.</given-names></name></person-group> (<year>2003</year>). <article-title>Simple model of spiking neurons</article-title>. <source>IEEE Trans. Neural Netw</source>. <volume>14</volume>, <fpage>1569</fpage>&#x02013;<lpage>1572</lpage>. <pub-id pub-id-type="doi">10.1109/TNN.2003.820440</pub-id><pub-id pub-id-type="pmid">18244602</pub-id></citation></ref>
<ref id="B18">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Kiela</surname> <given-names>D.</given-names></name> <name><surname>Bottou</surname> <given-names>L.</given-names></name></person-group> (<year>2014</year>). <article-title>Learning image embeddings using convolutional neural networks for improved multi-modal semantics</article-title>, in <source>Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing</source> (<publisher-loc>Doha</publisher-loc>: <publisher-name>EMNLP</publisher-name>). <pub-id pub-id-type="doi">10.3115/v1/D14-1005</pub-id></citation>
</ref>
<ref id="B19">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>Z.</given-names></name> <name><surname>Shen</surname> <given-names>Y.</given-names></name> <name><surname>Lakshminarasimhan</surname> <given-names>V. B.</given-names></name> <name><surname>Liang</surname> <given-names>P. P.</given-names></name> <name><surname>Zadeh</surname> <given-names>A.</given-names></name> <name><surname>Morency</surname> <given-names>L.-P.</given-names></name></person-group> (<year>2018</year>). <article-title>Efficient low-rank multimodal fusion with modality-specific factors</article-title>. <source>arXiv preprint arXiv:1806.00064</source>. <pub-id pub-id-type="doi">10.18653/v1/P18-1209</pub-id></citation>
</ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lynott</surname> <given-names>D.</given-names></name> <name><surname>Connell</surname> <given-names>L.</given-names></name></person-group> (<year>2009</year>). <article-title>Modality exclusivity norms for 423 object properties</article-title>. <source>Behav. Res. Methods</source> <volume>41</volume>, <fpage>558</fpage>&#x02013;<lpage>564</lpage>. <pub-id pub-id-type="doi">10.3758/BRM.41.2.558</pub-id><pub-id pub-id-type="pmid">19363198</pub-id></citation></ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lynott</surname> <given-names>D.</given-names></name> <name><surname>Connell</surname> <given-names>L.</given-names></name></person-group> (<year>2013</year>). <article-title>Modality exclusivity norms for 400 nouns: the relationship between perceptual experience and surface word form</article-title>. <source>Behav. Res. Methods</source> <volume>45</volume>, <fpage>516</fpage>&#x02013;<lpage>526</lpage>. <pub-id pub-id-type="doi">10.3758/s13428-012-0267-0</pub-id><pub-id pub-id-type="pmid">23055172</pub-id></citation></ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lynott</surname> <given-names>D.</given-names></name> <name><surname>Connell</surname> <given-names>L.</given-names></name> <name><surname>Brysbaert</surname> <given-names>M.</given-names></name> <name><surname>Brand</surname> <given-names>J.</given-names></name> <name><surname>Carney</surname> <given-names>J.</given-names></name></person-group> (<year>2019</year>). <article-title>The Lancaster sensorimotor norms: multidimensional measures of perceptual and action strength for 40,000 English words</article-title>. <source>Behav. Res. Methods</source> <fpage>1</fpage>&#x02013;<lpage>21</lpage>. <pub-id pub-id-type="doi">10.31234/osf.io/ktjwp</pub-id><pub-id pub-id-type="pmid">31832879</pub-id></citation></ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Maass</surname> <given-names>W.</given-names></name></person-group> (<year>1997</year>). <article-title>Networks of spiking neurons: the third generation of neural network models</article-title>. <source>Neural Netw</source>. <volume>10</volume>, <fpage>1659</fpage>&#x02013;<lpage>1671</lpage>. <pub-id pub-id-type="doi">10.1016/S0893-6080(97)00011-7</pub-id></citation>
</ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>McRae</surname> <given-names>K.</given-names></name> <name><surname>Cree</surname> <given-names>G. S.</given-names></name> <name><surname>Seidenberg</surname> <given-names>M. S.</given-names></name> <name><surname>McNorgan</surname> <given-names>C.</given-names></name></person-group> (<year>2005</year>). <article-title>Semantic feature production norms for a large set of living and nonliving things</article-title>. <source>Behav. Res. Methods</source> <volume>37</volume>, <fpage>547</fpage>&#x02013;<lpage>559</lpage>. <pub-id pub-id-type="doi">10.3758/BF03192726</pub-id><pub-id pub-id-type="pmid">16629288</pub-id></citation></ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Parise</surname> <given-names>C. V.</given-names></name> <name><surname>Ernst</surname> <given-names>M. O.</given-names></name></person-group> (<year>2016</year>). <article-title>Correlation detection as a general mechanism for multisensory integration</article-title>. <source>Nat. Commun</source>. <volume>7</volume>:<fpage>11543</fpage>. <pub-id pub-id-type="doi">10.1038/ncomms11543</pub-id><pub-id pub-id-type="pmid">27265526</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Rieke</surname> <given-names>F.</given-names></name> <name><surname>Warland</surname> <given-names>D.</given-names></name> <name><surname>Van Steveninck</surname> <given-names>R. d. R.</given-names></name> <name><surname>Bialek</surname> <given-names>W.</given-names></name></person-group> (<year>1999</year>). <source>Spikes: Exploring the Neural Code</source>. <publisher-name>MIT Press</publisher-name>.</citation>
</ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Roshan</surname> <given-names>C.</given-names></name> <name><surname>Barker</surname> <given-names>R. A.</given-names></name> <name><surname>Sahakian</surname> <given-names>B. J.</given-names></name> <name><surname>Robbins</surname> <given-names>T. W.</given-names></name></person-group> (<year>2001</year>). <article-title>Mechanisms of cognitive set flexibility in Parkinson&#x00027;s disease</article-title>. <source>Brain A J. Neurol</source>. <volume>124</volume>, <fpage>2503</fpage>&#x02013;<lpage>2512</lpage>. <pub-id pub-id-type="doi">10.1093/brain/124.12.2503</pub-id><pub-id pub-id-type="pmid">11701603</pub-id></citation></ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Scott</surname> <given-names>G. G.</given-names></name> <name><surname>Keitel</surname> <given-names>A.</given-names></name> <name><surname>Becirspahic</surname> <given-names>M.</given-names></name> <name><surname>Yao</surname> <given-names>B.</given-names></name> <name><surname>Sereno</surname> <given-names>S. C.</given-names></name></person-group> (<year>2019</year>). <article-title>The Glasgow norms: ratings of 5,500 words on nine scales</article-title>. <source>Behav. Res. Methods</source> <volume>51</volume>, <fpage>1258</fpage>&#x02013;<lpage>1270</lpage>. <pub-id pub-id-type="doi">10.3758/s13428-018-1099-3</pub-id><pub-id pub-id-type="pmid">30206797</pub-id></citation></ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shams</surname> <given-names>L.</given-names></name> <name><surname>Seitz</surname> <given-names>A. R.</given-names></name></person-group> (<year>2008</year>). <article-title>Benefits of multisensory learning</article-title>. <source>Trends Cogn. Sci</source>. <volume>12</volume>, <fpage>411</fpage>&#x02013;<lpage>417</lpage>. <pub-id pub-id-type="doi">10.1016/j.tics.2008.07.006</pub-id><pub-id pub-id-type="pmid">18805039</pub-id></citation></ref>
<ref id="B30">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Silberer</surname> <given-names>C.</given-names></name> <name><surname>Ferrari</surname> <given-names>V.</given-names></name> <name><surname>Lapata</surname> <given-names>M.</given-names></name></person-group> (<year>2013</year>). <article-title>Models of semantic representation with visual attributes</article-title>, in <source>Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Vol. 1</source> (<publisher-loc>Sofia</publisher-loc>: <publisher-name>Association for Computational Linguistics</publisher-name>), <fpage>572</fpage>&#x02013;<lpage>582</lpage>.</citation>
</ref>
<ref id="B31">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Silberer</surname> <given-names>C.</given-names></name> <name><surname>Lapata</surname> <given-names>M.</given-names></name></person-group> (<year>2014</year>). <article-title>Learning grounded meaning representations with autoencoders</article-title>, in <source>Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics, Vol. 1</source> (<publisher-loc>Baltimore</publisher-loc>: <publisher-name>Association for Computational Linguistics</publisher-name>), <fpage>721</fpage>&#x02013;<lpage>732</lpage>. <pub-id pub-id-type="doi">10.3115/v1/P14-1068</pub-id></citation>
</ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stimberg</surname> <given-names>M.</given-names></name> <name><surname>Brette</surname> <given-names>R.</given-names></name> <name><surname>Goodman</surname> <given-names>D. F.</given-names></name></person-group> (<year>2019</year>). <article-title>Brian 2, an intuitive and efficient neural simulator</article-title>. <source>eLife</source> <volume>8</volume>:<fpage>e47314</fpage>. <pub-id pub-id-type="doi">10.7554/eLife.47314</pub-id><pub-id pub-id-type="pmid">31429824</pub-id></citation></ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Troyer</surname> <given-names>T. W.</given-names></name> <name><surname>Miller</surname> <given-names>K. D.</given-names></name></person-group> (<year>1997</year>). <article-title>Physiological gain leads to high ISI variability in a simple model of a cortical regular spiking cell</article-title>. <source>Neural Comput</source>. <volume>9</volume>, <fpage>971</fpage>&#x02013;<lpage>983</lpage>. <pub-id pub-id-type="doi">10.1162/neco.1997.9.5.971</pub-id><pub-id pub-id-type="pmid">9188190</pub-id></citation></ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ursino</surname> <given-names>M.</given-names></name> <name><surname>Cuppini</surname> <given-names>C.</given-names></name> <name><surname>Magosso</surname> <given-names>E.</given-names></name></person-group> (<year>2014</year>). <article-title>Neurocomputational approaches to modelling multisensory integration in the brain: a review</article-title>. <source>Neural Netw</source>. <volume>60</volume>, <fpage>141</fpage>&#x02013;<lpage>165</lpage>. <pub-id pub-id-type="doi">10.1016/j.neunet.2014.08.003</pub-id><pub-id pub-id-type="pmid">25218929</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ursino</surname> <given-names>M.</given-names></name> <name><surname>Cuppini</surname> <given-names>C.</given-names></name> <name><surname>Magosso</surname> <given-names>E.</given-names></name> <name><surname>Serino</surname> <given-names>A.</given-names></name> <name><surname>Pellegrino</surname> <given-names>G. D.</given-names></name></person-group> (<year>2009</year>). <article-title>Multisensory integration in the superior colliculus: a neural network model</article-title>. <source>J. Comput. Neurosci</source>. <volume>26</volume>, <fpage>55</fpage>&#x02013;<lpage>73</lpage>. <pub-id pub-id-type="doi">10.1007/s10827-008-0096-4</pub-id><pub-id pub-id-type="pmid">18478323</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Verma</surname> <given-names>S.</given-names></name> <name><surname>Wang</surname> <given-names>C.</given-names></name> <name><surname>Zhu</surname> <given-names>L.</given-names></name> <name><surname>Liu</surname> <given-names>W.</given-names></name></person-group> (<year>2019</year>). <article-title>DeepCU: Integrating both common and unique latent information for multimodal sentiment analysis</article-title>, in <source>International Joint Conference on Artificial Intelligence</source> (<publisher-loc>Macao</publisher-loc>). <pub-id pub-id-type="doi">10.24963/ijcai.2019/503</pub-id></citation>
</ref>
<ref id="B37">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>S.</given-names></name> <name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Zong</surname> <given-names>C.</given-names></name></person-group> (<year>2018a</year>). <article-title>Associative multichannel autoencoder for multimodal word representation</article-title>, in <source>Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing</source> (<publisher-loc>Brussels</publisher-loc>), <fpage>115</fpage>&#x02013;<lpage>124</lpage>. <pub-id pub-id-type="doi">10.18653/v1/D18-1011</pub-id></citation>
</ref>
<ref id="B38">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>S.</given-names></name> <name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Zong</surname> <given-names>C.</given-names></name></person-group> (<year>2018b</year>). <article-title>Learning multimodal word representation via dynamic fusion methods</article-title>, in <source>Thirty-Second AAAI Conference on Artificial Intelligence</source> (<publisher-loc>New Orleans, LA</publisher-loc>).</citation>
</ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>X.</given-names></name> <name><surname>Men</surname> <given-names>W.</given-names></name> <name><surname>Gao</surname> <given-names>J.</given-names></name> <name><surname>Caramazza</surname> <given-names>A.</given-names></name> <name><surname>Bi</surname> <given-names>Y.</given-names></name></person-group> (<year>2020</year>). <article-title>Two forms of knowledge representations in the human brain</article-title>. <source>Neuron</source> <volume>107</volume>, <fpage>383</fpage>&#x02013;<lpage>393.e5</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2020.04.010</pub-id><pub-id pub-id-type="pmid">32386524</pub-id></citation></ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Xu</surname> <given-names>Y.</given-names></name> <name><surname>Yong</surname> <given-names>H.</given-names></name> <name><surname>Bi</surname> <given-names>Y.</given-names></name></person-group> (<year>2017</year>). <article-title>A tri-network model of human semantic processing</article-title>. <source>Front. Psychol</source>. <volume>8</volume>:<fpage>1538</fpage>. <pub-id pub-id-type="doi">10.3389/fpsyg.2017.01538</pub-id><pub-id pub-id-type="pmid">28955266</pub-id></citation></ref>
<ref id="B41">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zadeh</surname> <given-names>A.</given-names></name> <name><surname>Chen</surname> <given-names>M.</given-names></name> <name><surname>Poria</surname> <given-names>S.</given-names></name> <name><surname>Cambria</surname> <given-names>E.</given-names></name> <name><surname>Morency</surname> <given-names>L.-P.</given-names></name></person-group> (<year>2017</year>). <article-title>Tensor fusion network for multimodal sentiment analysis</article-title>. <source>arXiv preprint arXiv:1707.07250</source>. <pub-id pub-id-type="doi">10.18653/v1/D17-1115</pub-id></citation>
</ref>
</ref-list>
<app-group>
<app>
<title>Appendix</title>
<sec>
<title>The Initial Weights in IM</title>
<p>Following earlier work by cognitive psychologists (Ursino et al., <xref ref-type="bibr" rid="B34">2014</xref>), we assume that, for a concept <italic>s</italic>, its representation <italic>x</italic><sub><italic>i</italic></sub> in each modality <italic>i</italic> &#x02208; [<italic>A, G, H, O, V</italic>] satisfies <italic>p</italic>(<italic>x</italic><sub><italic>i</italic></sub>|<italic>s</italic>) &#x0007E; <italic>N</italic>(<italic>x</italic><sub><italic>i</italic></sub>; <italic>s</italic>, &#x003C3;<sub><italic>i</italic></sub>), where <italic>N</italic>(<italic>x</italic>; &#x003BC;, &#x003C3;) denotes the normal distribution over <italic>x</italic> with mean &#x003BC; and standard deviation &#x003C3;. The modality representations are conditionally independent given <italic>s</italic>, so, assuming a uniform prior over <italic>s</italic>, Bayes&#x00027; rule gives</p>
<disp-formula id="E12"><label>(11)</label><mml:math id="M23"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>p</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>s</mml:mi><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>G</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>H</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>O</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>V</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0221D;</mml:mo><mml:mi>p</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>G</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>H</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>O</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>V</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:mi>s</mml:mi></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>&#x0221D;</mml:mo><mml:mstyle displaystyle="true"><mml:munder><mml:mrow><mml:mo>&#x0220F;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:munder></mml:mstyle><mml:mi>p</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:mi>s</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mstyle displaystyle="true"><mml:msub><mml:mrow><mml:mo>&#x0220F;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mstyle><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msqrt><mml:mrow><mml:mn>2</mml:mn><mml:mi>&#x003C0;</mml:mi></mml:mrow></mml:msqrt><mml:msub><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:msup><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mfrac><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:mi>s</mml:mi></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:mrow></mml:msup></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>&#x0221D;</mml:mo><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mfrac><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:mi>s</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Maximizing this posterior over <italic>s</italic> is equivalent to minimizing &#x003A3;<sub><italic>i</italic></sub>(<italic>x</italic><sub><italic>i</italic></sub> &#x02212; <italic>s</italic>)<sup>2</sup>/(2&#x003C3;<sub><italic>i</italic></sub><sup>2</sup>); setting the derivative with respect to <italic>s</italic> to zero gives the maximum a posteriori estimate <inline-formula><mml:math id="M24"><mml:mover accent="false"><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mo>^</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mfrac><mml:mrow><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:mrow></mml:mfrac><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, where <inline-formula><mml:math id="M25"><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:math></inline-formula> reflects the reliability of each modality for the same concept <italic>s</italic>. In our IM schema, we use the normalized reliabilities as the initial weights between the pre-synaptic neurons (describing each modality) and the post-synaptic neuron (for integration), i.e.,</p>
<disp-formula id="E13"><label>(12)</label><mml:math id="M26"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mo>&#x003A3;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where each &#x003C3;<sub><italic>i</italic></sub> is estimated from psychologist-labeled multisensory datasets.</p>
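<p>As an illustrative sketch only (not the implementation used in this work), the normalized reliabilities of Equation 12 and the corresponding reliability-weighted estimate of <italic>s</italic> could be computed as follows. The standard deviations and modality values below are hypothetical placeholders, not values taken from any dataset, and the function names initial_weights and map_estimate are introduced here purely for illustration.</p>
<preformat>
```python
import numpy as np

# Modality order assumed for this sketch: A, G, H, O, V.
MODALITIES = ("A", "G", "H", "O", "V")


def initial_weights(sigma):
    """Return normalized reliabilities (1 / sigma_i^2) / sum_j (1 / sigma_j^2)."""
    sigma = np.asarray(sigma, dtype=float)
    if not np.all(sigma > 0):
        raise ValueError("every standard deviation must be positive")
    reliability = 1.0 / sigma**2
    return reliability / reliability.sum()


def map_estimate(x, sigma):
    """Reliability-weighted average of the per-modality values x_i."""
    x = np.asarray(x, dtype=float)
    return float(np.dot(initial_weights(sigma), x))


# Hypothetical per-modality standard deviations and observed values.
sigma = np.array([1.0, 2.0, 2.0, 4.0, 1.0])
x = np.array([0.9, 0.5, 0.7, 0.2, 1.1])

weights = initial_weights(sigma)
s_hat = map_estimate(x, sigma)

for name, w in zip(MODALITIES, weights):
    print(f"w0[{name}] = {w:.4f}")
print(f"s_hat = {s_hat:.4f}")
```
</preformat>
<p>In this sketch, a modality with a smaller standard deviation receives a proportionally larger initial weight, and the weights sum to one by construction.</p>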
</sec>
</app>
</app-group>
</back>
</article>