<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="article-commentary">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Psychol.</journal-id>
<journal-title>Frontiers in Psychology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Psychol.</abbrev-journal-title>
<issn pub-type="epub">1664-1078</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpsyg.2014.00257</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Psychology</subject>
<subj-group>
<subject>General Commentary Article</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Multisensory integration, learning, and the predictive coding hypothesis</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Altieri</surname> <given-names>Nicholas</given-names></name>
<xref ref-type="author-notes" rid="fn001"><sup>&#x0002A;</sup></xref>
</contrib>
</contrib-group>
<aff><institution>ISU Multimodal Language Processing Lab, Department of Communication Sciences and Disorders, Idaho State University</institution> <country>Pocatello, Idaho, USA</country></aff>
<author-notes>
<fn fn-type="corresp" id="fn001"><p>&#x0002A;Correspondence: <email>altinich&#x00040;isu.edu</email></p></fn>
<fn fn-type="other" id="fn002"><p>This article was submitted to Language Sciences, a section of the journal Frontiers in Psychology.</p></fn>
<fn fn-type="edited-by"><p>Edited by: Albert Costa, University Pompeu Fabra, Spain</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Ryan A. Stevenson, University of Toronto, Canada; Jordi Navarra, Fundaci&#x000F3; Sant Joan de D&#x000E9;u - Parc Sanitari Sant Joan de D&#x000E9;u - Hospital Sant Joan de D&#x000E9;u, Spain</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>24</day>
<month>03</month>
<year>2014</year>
</pub-date>
<pub-date pub-type="collection">
<year>2014</year>
</pub-date>
<volume>5</volume>
<elocation-id>257</elocation-id>
<history>
<date date-type="received">
<day>12</day>
<month>11</month>
<year>2013</year>
</date>
<date date-type="accepted">
<day>10</day>
<month>03</month>
<year>2014</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2014 Altieri.</copyright-statement>
<copyright-year>2014</copyright-year>
<license license-type="open-access" xlink:href="http://creativecommons.org/licenses/by/3.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p>
</license>
</permissions>
<related-article id="RA1" related-article-type="commentary-article" journal-id="Front Psychol" journal-id-type="nlm-ta" vol="4" page="388" xlink:href="23874309" ext-link-type="pubmed">A commentary on <article-title>Speech through ears and eyes: interfacing the senses with the supramodal brain</article-title> by van Wassenhove, V. (2013). Front. Psychol. 4:388. doi: 10.3389/fpsyg.2013.00388</related-article>
<kwd-group>
<kwd>predictive coding</kwd>
<kwd>Bayesian inference</kwd>
<kwd>audiovisual speech integration</kwd>
<kwd>EEG</kwd>
<kwd>parallel models</kwd>
</kwd-group>
<counts>
<fig-count count="1"/>
<table-count count="0"/>
<equation-count count="0"/>
<ref-count count="23"/>
<page-count count="3"/>
<word-count count="1814"/>
</counts>
</article-meta>
</front>
<body>
<p>The multimodal nature of perception has generated several important questions pertaining to the encoding, learning, and retrieval of linguistic representations (e.g., Summerfield, <xref ref-type="bibr" rid="B18">1987</xref>; Altieri et al., <xref ref-type="bibr" rid="B1">2011</xref>; van Wassenhove, <xref ref-type="bibr" rid="B19">2013</xref>). Historically, many theoretical accounts of speech perception have been driven by descriptions of auditory encoding; this makes sense because normal-hearing listeners rely predominantly on the auditory signal. However, from both evolutionary and empirical standpoints, comprehensive neurobiological accounts of speech perception must also capture interactions across the auditory, visual, and somatosensory modalities, as well as the interplay of cross-modal and articulatory representations.</p>
<p>In a recent review, van Wassenhove (<xref ref-type="bibr" rid="B19">2013</xref>) discussed key frameworks describing how visual cues interface with the auditory modality to improve auditory recognition (Sumby and Pollack, <xref ref-type="bibr" rid="B17">1954</xref>), or otherwise contribute to an illusory percept for mismatched auditory-visual syllables (McGurk and MacDonald, <xref ref-type="bibr" rid="B9">1976</xref>). These frameworks encompass multiple levels of analysis: higher-level cognitive models that posit parallel processing (Altieri and Townsend, <xref ref-type="bibr" rid="B3">2011</xref>) or the independent extraction of features from the auditory and visual modalities (Massaro, <xref ref-type="bibr" rid="B22">1987</xref>, Fuzzy Logical Model of Perception), accounts of early feature encoding (van Wassenhove et al., <xref ref-type="bibr" rid="B20">2005</xref>), and accounts of encoding/timing at the neural level (Poeppel et al., <xref ref-type="bibr" rid="B11">2008</xref>; Schroeder et al., <xref ref-type="bibr" rid="B14">2008</xref>).</p>
<p>This commentary on van Wassenhove (<xref ref-type="bibr" rid="B19">2013</xref>) examines the predictive coding hypothesis as one account of how visemes are matched with auditory cues. Crucially, I emphasize a hypothesized role for cross-modal neural plasticity and multisensory learning in reinforcing the sharing of cues across modalities into adulthood.</p>
<sec>
<title>Predictive coding and fixed priors</title>
<p>A critical question in speech research concerns how time-variable signals interface with internal representations to yield a stable percept. Although speech signals are highly variable (multiple talkers, dialects, etc.), our percepts appear stable due to dimensionality reduction. These questions become even more complex in multisensory speech perception, where we must also explain how visual speech gestures coalesce with the auditory signal as the respective signals unfold at different rates and reach cortical areas at different times. In fact, these signals must co-occur within an optimal spatio-temporal window to have a significant probability of undergoing integration (Conrey and Pisoni, <xref ref-type="bibr" rid="B4">2006</xref>; Stevenson et al., <xref ref-type="bibr" rid="B16">2012</xref>).</p>
<p>The <italic>predictive coding hypothesis</italic> incorporates these observations to describe integration as follows: (1) temporally congruent auditory and visual inputs are processed by cortical integration circuitry; (2) internal representations (&#x0201C;fixed Bayesian priors&#x0201D;) are compared and matched against the inputs; and (3) hypotheses about the intended utterance are actively generated. van Wassenhove et al.&#x00027;s (<xref ref-type="bibr" rid="B20">2005</xref>) EEG study exemplified key components of the visual predictive coding hypothesis. In normal conversational settings, the visual signal leads the auditory signal by tens or even hundreds of milliseconds; featural information in the visual signal therefore constrains predictions about the content of the auditory signal. The authors showed that early visual speech information speeds up auditory processing, as evidenced by temporal facilitation of the early auditory ERPs. This finding was interpreted as the visual signal reducing the residual prediction error in the auditory signal. One promising hypothesis is that visual information interacts with the auditory cortex in such a way that it modulates excitability in auditory regions via oscillatory phase resetting (Schroeder et al., <xref ref-type="bibr" rid="B14">2008</xref>). Predictive coding hypotheses may also be extended to account for broad classes of stimuli including speech and non-speech, and matched and mismatched signals&#x02014;all of which have been shown to evoke early ERPs associated with visual prediction (Stekelenburg and Vroomen, <xref ref-type="bibr" rid="B15">2007</xref>).</p>
</sec>
<sec>
<title>Fixed priors</title>
<p>Hypothetically, visual cues can provide predictive information so long as they precede the auditory stimulus and are reliable (see Nahorna et al., <xref ref-type="bibr" rid="B23">2012</xref>). A critical issue for visual predictive coding, then, concerns the &#x0201C;rigidity&#x0201D; of the internal rules (fixed priors). van Wassenhove (<xref ref-type="bibr" rid="B19">2013</xref>) discussed research suggesting the stability of priors/representations that are innate or that become firmly established during critical developmental periods (Rosenblum et al., <xref ref-type="bibr" rid="B13">1997</xref>; Lewkowicz, <xref ref-type="bibr" rid="B8">2000</xref>). Lewkowicz (<xref ref-type="bibr" rid="B8">2000</xref>) argued that the abilities to detect multisensory synchrony and to match &#x0201C;duration and rate&#x0201D; are established early in life. In the domain of speech, Rosenblum and colleagues have argued that infants are sensitive to the McGurk effect, as well as to matched vs. mismatched articulatory movements and speech sounds.</p>
<p>While these studies suggest some rigidity of priors, I would emphasize that prior probabilities or &#x0201C;internal rules&#x0201D; remain malleable into adulthood. This adaptive perspective finds support among Bayesian theorists, who argue that priors are continually updated in light of new evidence. Research indicates that the ability to detect subtle auditory-visual asynchronies continues to change into early adulthood (Hillock et al., <xref ref-type="bibr" rid="B7">2011</xref>). Additionally, perceptual learning and adaptation techniques can alter priors such that the perception of asynchrony is modified through practice (Fujisaki et al., <xref ref-type="bibr" rid="B6">2004</xref>; Vatakis et al., <xref ref-type="bibr" rid="B21">2007</xref>; Powers et al., <xref ref-type="bibr" rid="B12">2009</xref>) or through experience with a second language (Navarra et al., <xref ref-type="bibr" rid="B10">2010</xref>). Importantly, continual updating of &#x0201C;fixed&#x0201D; priors allows adult perceivers to (re)learn, fine-tune, and adapt to multimodal signals across listening conditions, variable talkers, and attentional loads. van Wassenhove (<xref ref-type="bibr" rid="B19">2013</xref>) discussed how subjects can &#x0201C;automatically&#x0201D; match pitch and spatial frequency patterns (Evans and Treisman, <xref ref-type="bibr" rid="B5">2010</xref>), which shows that subjects can match auditory and visual information on the basis of prior experience. Altieri et al. (<xref ref-type="bibr" rid="B2">2013</xref>) have also shown that adults can learn to match auditory and visual patterns more efficiently after only one day of practice: reaction times and EEG signals indicated rapid learning and higher integration efficiency after only 1 h of training, followed by a period of gradual learning that remained stable over 1 week.</p>
<p>Such findings appear consistent with a unified parallel framework where visual information influences auditory processing and where visual predictability can be reweighted through learning. Figure <xref ref-type="fig" rid="F1">1</xref> represents an attempt to couch predictive coding within adaptive parallel accounts of integration.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p><bold>Inputs interact with noise while evidence for a category (e.g., &#x0201C;ba&#x0201D;) accumulates toward threshold (&#x003B3;)</bold>. Once enough information in either modality reaches threshold, a decision is made (e.g., &#x0201C;ba&#x0201D; vs. &#x0201C;da&#x0201D;). Visual information interacts with auditory cortical regions (dotted line) leading to updated priors. This model does not rule out the possibility that auditory cues can reciprocally influence viseme recognition.</p></caption>
<graphic xlink:href="fpsyg-05-00257-g0001.tif"/>
</fig>
</sec>
</body>
<back>
<ack>
<p>The research was supported by the INBRE Program, NIH Grant Nos. P20 RR016454 (National Center for Research Resources) and P20 GM103408 (National Institute of General Medical Sciences).</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Altieri</surname> <given-names>N.</given-names></name> <name><surname>Pisoni</surname> <given-names>D. B.</given-names></name> <name><surname>Townsend</surname> <given-names>J. T.</given-names></name></person-group> (<year>2011</year>). <article-title>Behavioral, clinical, and neurobiological constraints on theories of audiovisual speech integration: a review and suggestions for new directions</article-title>. <source>Seeing Perceiving</source> <volume>24</volume>, <fpage>513</fpage>&#x02013;<lpage>539</lpage>. <pub-id pub-id-type="doi">10.1163/187847611X595864</pub-id><pub-id pub-id-type="pmid">21968081</pub-id></citation>
</ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Altieri</surname> <given-names>N.</given-names></name> <name><surname>Stevenson</surname> <given-names>R. A.</given-names></name> <name><surname>Wallace</surname> <given-names>M. T.</given-names></name> <name><surname>Wenger</surname> <given-names>M. J.</given-names></name></person-group> (<year>2013</year>). <article-title>Learning to associate auditory and visual stimuli: capacity and neural measures of efficiency</article-title>. <source>Brain Topogr</source>. [Epub ahead of print]. <pub-id pub-id-type="doi">10.1007/s10548-013-0333-7</pub-id><pub-id pub-id-type="pmid">24276220</pub-id></citation>
</ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Altieri</surname> <given-names>N.</given-names></name> <name><surname>Townsend</surname> <given-names>J. T.</given-names></name></person-group> (<year>2011</year>). <article-title>An assessment of behavioral dynamic information processing measures in audiovisual speech perception</article-title>. <source>Front. Psychol</source>. <volume>2</volume>:<issue>238</issue>. <pub-id pub-id-type="doi">10.3389/fpsyg.2011.00238</pub-id><pub-id pub-id-type="pmid">21980314</pub-id></citation>
</ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Conrey</surname> <given-names>B.</given-names></name> <name><surname>Pisoni</surname> <given-names>D. B.</given-names></name></person-group> (<year>2006</year>). <article-title>Auditory-visual speech perception and synchrony detection for speech and nonspeech signals</article-title>. <source>J. Acoust. Soc. Am</source>. <volume>119</volume>, <fpage>4065</fpage>. <pub-id pub-id-type="doi">10.1121/1.2195091</pub-id><pub-id pub-id-type="pmid">16838548</pub-id></citation>
</ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Evans</surname> <given-names>K. K.</given-names></name> <name><surname>Treisman</surname> <given-names>A.</given-names></name></person-group> (<year>2010</year>). <article-title>Natural cross-modal mappings between visual and auditory features</article-title>. <source>J. Vis</source>. <volume>10</volume>:<fpage>6</fpage>. <pub-id pub-id-type="doi">10.1167/10.1.6</pub-id><pub-id pub-id-type="pmid">21216758</pub-id></citation>
</ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fujisaki</surname> <given-names>W.</given-names></name> <name><surname>Shimojo</surname> <given-names>S.</given-names></name> <name><surname>Kashino</surname> <given-names>M.</given-names></name> <name><surname>Nishida</surname> <given-names>S.</given-names></name></person-group> (<year>2004</year>). <article-title>Recalibration of audiovisual simultaneity</article-title>. <source>Nat. Neurosci</source>. <volume>7</volume>, <fpage>773</fpage>&#x02013;<lpage>778</lpage>. <pub-id pub-id-type="doi">10.1038/nn1268</pub-id><pub-id pub-id-type="pmid">15195098</pub-id></citation>
</ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hillock</surname> <given-names>A. R.</given-names></name> <name><surname>Powers</surname> <given-names>A. R.</given-names></name> <name><surname>Wallace</surname> <given-names>M. T.</given-names></name></person-group> (<year>2011</year>). <article-title>Binding of sights and sounds: age-related changes in audiovisual temporal processing</article-title>. <source>Neuropsychologia</source> <volume>49</volume>, <fpage>461</fpage>&#x02013;<lpage>467</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuropsychologia.2010.11.041</pub-id><pub-id pub-id-type="pmid">21134385</pub-id></citation>
</ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lewkowicz</surname> <given-names>D. J.</given-names></name></person-group> (<year>2000</year>). <article-title>The development of intersensory temporal perception: an epigenetic systems/limitations view</article-title>. <source>Psychol. Bull</source>. <volume>126</volume>, <fpage>281</fpage>&#x02013;<lpage>308</lpage>. <pub-id pub-id-type="doi">10.1037/0033-2909.126.2.281</pub-id><pub-id pub-id-type="pmid">10748644</pub-id></citation>
</ref>
<ref id="B22">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Massaro</surname> <given-names>D. W.</given-names></name></person-group> (<year>1987</year>). <article-title>Speech perception by ear and eye</article-title>, in <source>Hearing by Eye: The Psychology of Lip-Reading</source>, eds <person-group person-group-type="editor"><name><surname>Dodd</surname> <given-names>B.</given-names></name> <name><surname>Campbell</surname> <given-names>R.</given-names></name></person-group> (<publisher-loc>Hillsdale, NJ</publisher-loc>: <publisher-name>Lawrence Erlbaum</publisher-name>), <fpage>53</fpage>&#x02013;<lpage>83</lpage>.</citation>
</ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>McGurk</surname> <given-names>H.</given-names></name> <name><surname>MacDonald</surname> <given-names>J. W.</given-names></name></person-group> (<year>1976</year>). <article-title>Hearing lips and seeing voices</article-title>. <source>Nature</source> <volume>264</volume>, <fpage>746</fpage>&#x02013;<lpage>748</lpage>. <pub-id pub-id-type="doi">10.1038/264746a0</pub-id><pub-id pub-id-type="pmid">1012311</pub-id></citation>
</ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nahorna</surname> <given-names>O.</given-names></name> <name><surname>Berthommier</surname> <given-names>F.</given-names></name> <name><surname>Schwartz</surname> <given-names>J. L.</given-names></name></person-group> (<year>2012</year>). <article-title>Binding and unbinding the auditory and visual streams in the McGurk effect</article-title>. <source>J. Acoust. Soc. Am</source>. <volume>132</volume>, <fpage>1061</fpage>&#x02013;<lpage>1077</lpage>. <pub-id pub-id-type="doi">10.1121/1.4728187</pub-id><pub-id pub-id-type="pmid">22894226</pub-id></citation>
</ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Navarra</surname> <given-names>J.</given-names></name> <name><surname>Alsius</surname> <given-names>A.</given-names></name> <name><surname>Velasco</surname> <given-names>I.</given-names></name> <name><surname>Soto-Faraco</surname> <given-names>S.</given-names></name> <name><surname>Spence</surname> <given-names>C.</given-names></name></person-group> (<year>2010</year>). <article-title>Perception of audiovisual speech synchrony for native and non-native language</article-title>. <source>Brain Res</source>. <volume>1323</volume>, <fpage>84</fpage>&#x02013;<lpage>93</lpage>. <pub-id pub-id-type="doi">10.1016/j.brainres.2010.01.059</pub-id><pub-id pub-id-type="pmid">20117103</pub-id></citation>
</ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Poeppel</surname> <given-names>D.</given-names></name> <name><surname>Idsardi</surname> <given-names>W. J.</given-names></name> <name><surname>van Wassenhove</surname> <given-names>V.</given-names></name></person-group> (<year>2008</year>). <article-title>Speech perception at the interface of neurobiology and linguistics</article-title>. <source>Philos. Trans. R. Soc. Lond. B Biol. Sci</source>. <volume>363</volume>, <fpage>1071</fpage>&#x02013;<lpage>1086</lpage>. <pub-id pub-id-type="doi">10.1098/rstb.2007.2160</pub-id><pub-id pub-id-type="pmid">17890189</pub-id></citation>
</ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Powers</surname> <given-names>A. R.</given-names> <suffix>3rd.</suffix></name> <name><surname>Hillock</surname> <given-names>A. R.</given-names></name> <name><surname>Wallace</surname> <given-names>M. T.</given-names></name></person-group> (<year>2009</year>). <article-title>Perceptual training narrows the temporal window of multisensory binding</article-title>. <source>J. Neurosci</source>. <volume>29</volume>, <fpage>12265</fpage>&#x02013;<lpage>12274</lpage>. <pub-id pub-id-type="doi">10.1523/JNEUROSCI.3501-09.2009</pub-id><pub-id pub-id-type="pmid">19793985</pub-id></citation>
</ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rosenblum</surname> <given-names>L.</given-names></name> <name><surname>Schmuckler</surname> <given-names>M. A.</given-names></name> <name><surname>Johnson</surname> <given-names>J. A.</given-names></name></person-group> (<year>1997</year>). <article-title>The McGurk effect in infants</article-title>. <source>Percept. Psychophys</source>. <volume>59</volume>, <fpage>347</fpage>&#x02013;<lpage>357</lpage>. <pub-id pub-id-type="doi">10.3758/BF03211902</pub-id><pub-id pub-id-type="pmid">9136265</pub-id></citation>
</ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schroeder</surname> <given-names>C.</given-names></name> <name><surname>Lakatos</surname> <given-names>P.</given-names></name> <name><surname>Kajikawa</surname> <given-names>Y.</given-names></name> <name><surname>Partan</surname> <given-names>S.</given-names></name> <name><surname>Puce</surname> <given-names>A.</given-names></name></person-group> (<year>2008</year>). <article-title>Neuronal oscillations and visual amplification of speech</article-title>. <source>Trends Cogn. Sci</source>. <volume>12</volume>, <fpage>106</fpage>&#x02013;<lpage>113</lpage>. <pub-id pub-id-type="doi">10.1016/j.tics.2008.01.002</pub-id><pub-id pub-id-type="pmid">18280772</pub-id></citation>
</ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stekelenburg</surname> <given-names>J. J.</given-names></name> <name><surname>Vroomen</surname> <given-names>J.</given-names></name></person-group> (<year>2007</year>). <article-title>Neural correlates of multisensory integration of ecologically valid audiovisual events</article-title>. <source>J. Cogn. Neurosci</source>. <volume>19</volume>, <fpage>1964</fpage>&#x02013;<lpage>1973</lpage>. <pub-id pub-id-type="doi">10.1162/jocn.2007.19.12.1964</pub-id><pub-id pub-id-type="pmid">17892381</pub-id></citation>
</ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stevenson</surname> <given-names>R. A.</given-names></name> <name><surname>Zemtsov</surname> <given-names>R. K.</given-names></name> <name><surname>Wallace</surname> <given-names>M. T.</given-names></name></person-group> (<year>2012</year>). <article-title>Individual differences in the multisensory temporal binding window predict susceptibility to audiovisual illusions</article-title>. <source>J. Exp. Psychol. Hum. Percept. Perform</source>. <volume>38</volume>, <fpage>1517</fpage>&#x02013;<lpage>1529</lpage>. <pub-id pub-id-type="doi">10.1037/a0027339</pub-id><pub-id pub-id-type="pmid">22390292</pub-id></citation>
</ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sumby</surname> <given-names>W. H.</given-names></name> <name><surname>Pollack</surname> <given-names>I.</given-names></name></person-group> (<year>1954</year>). <article-title>Visual contribution to speech intelligibility in noise</article-title>. <source>J. Acoust. Soc. Am</source>. <volume>26</volume>, <fpage>212</fpage>&#x02013;<lpage>215</lpage>. <pub-id pub-id-type="doi">10.1121/1.1907309</pub-id></citation>
</ref>
<ref id="B18">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Summerfield</surname> <given-names>Q.</given-names></name></person-group> (<year>1987</year>). <article-title>Some preliminaries to a comprehensive account of audio-visual speech perception</article-title>, in <source>The Psychology of Lip-Reading</source>, eds <person-group person-group-type="editor"><name><surname>Dodd</surname> <given-names>B.</given-names></name> <name><surname>Campbell</surname> <given-names>R.</given-names></name></person-group> (<publisher-loc>Hillsdale, NJ</publisher-loc>: <publisher-name>LEA</publisher-name>), <fpage>3</fpage>&#x02013;<lpage>50</lpage>.</citation>
</ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>van Wassenhove</surname> <given-names>V.</given-names></name></person-group> (<year>2013</year>). <article-title>Speech through ears and eyes: interfacing the senses with the supramodal brain</article-title>. <source>Front. Psychol</source>. <volume>4</volume>:<issue>388</issue>. <pub-id pub-id-type="doi">10.3389/fpsyg.2013.00388</pub-id><pub-id pub-id-type="pmid">23874309</pub-id></citation>
</ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>van Wassenhove</surname> <given-names>V.</given-names></name> <name><surname>Grant</surname> <given-names>K. W.</given-names></name> <name><surname>Poeppel</surname> <given-names>D.</given-names></name></person-group> (<year>2005</year>). <article-title>Visual speech speeds up the neural processing of auditory speech</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A</source>. <volume>102</volume>, <fpage>1181</fpage>&#x02013;<lpage>1186</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.0408949102</pub-id><pub-id pub-id-type="pmid">15647358</pub-id></citation>
</ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vatakis</surname> <given-names>A.</given-names></name> <name><surname>Navarra</surname> <given-names>J.</given-names></name> <name><surname>Soto-Faraco</surname> <given-names>S.</given-names></name> <name><surname>Spence</surname> <given-names>C.</given-names></name></person-group> (<year>2007</year>). <article-title>Temporal recalibration during asynchronous audiovisual speech perception</article-title>. <source>Exp. Brain Res</source>. <volume>181</volume>, <fpage>173</fpage>&#x02013;<lpage>181</lpage>. <pub-id pub-id-type="doi">10.1007/s00221-007-0918-z</pub-id><pub-id pub-id-type="pmid">17431598</pub-id></citation>
</ref>
</ref-list>
</back>
</article>