<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Psychology</journal-id>
<journal-title>Frontiers in Psychology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Psychology</abbrev-journal-title>
<issn pub-type="epub">1664-1078</issn>
<publisher>
<publisher-name>Frontiers Research Foundation</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpsyg.2010.00232</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Psychology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Segregation of Vowels and Consonants in Human Auditory Cortex: Evidence for Distributed Hierarchical Organization</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Obleser</surname> <given-names>Jonas</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="author-notes" rid="fn001">&#x0002A;</xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Leaver</surname> <given-names>Amber M.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>VanMeter</surname> <given-names>John</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
</contrib>
<contrib contrib-type="author">
<name><surname>Rauschecker</surname> <given-names>Josef P.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Laboratory of Integrative Neuroscience and Cognition, Department of Physiology and Biophysics, Georgetown University Medical Center</institution> <country>Washington, DC, USA</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Neuropsychology, Max Planck Institute for Human Cognitive and Brain Sciences</institution> <country>Leipzig, Germany</country></aff>
<aff id="aff3"><sup>3</sup><institution>Center for Functional and Molecular Imaging, Georgetown University Medical Center</institution> <country>Washington, DC, USA</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Micah M. Murray, Universit&#x000E9; de Lausanne, Switzerland</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Lee M. Miller, University of California Davis, USA; Elia Formisano, Maastricht University, Netherlands</p></fn>
<fn fn-type="corresp" id="fn001"><p>&#x0002A;Correspondence: Jonas Obleser, Max Planck Institute for Human Cognitive and Brain Sciences, Stephanstrasse 1a, 04103 Leipzig, Germany. e-mail: <email>obleser&#x00040;cbs.mpg.de</email></p></fn>
<fn fn-type="other" id="fn002"><p>This article was submitted to Frontiers in Auditory Cognitive Neuroscience, a specialty of Frontiers in Psychology.</p></fn>
</author-notes>
<pub-date pub-type="epreprint">
<day>30</day>
<month>10</month>
<year>2010</year>
</pub-date>
<pub-date pub-type="epub">
<day>24</day>
<month>12</month>
<year>2010</year>
</pub-date>
<pub-date pub-type="collection">
<year>2010</year>
</pub-date>
<volume>1</volume>
<elocation-id>232</elocation-id>
<history>
<date date-type="received">
<day>13</day>
<month>08</month>
<year>2010</year>
</date>
<date date-type="accepted">
<day>08</day>
<month>12</month>
<year>2010</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2010 Obleser, Leaver, VanMeter and Rauschecker.</copyright-statement>
<copyright-year>2010</copyright-year>
<license license-type="open-access" xlink:href="http://www.frontiersin.org/licenseagreement"><p>This is an open-access article subject to an exclusive license agreement between the authors and the Frontiers Research Foundation, which permits unrestricted use, distribution, and reproduction in any medium, provided the original authors and source are credited.</p></license>
</permissions>
<abstract>
<p>The speech signal consists of a continuous stream of consonants and vowels, which must be de- and encoded in human auditory cortex to ensure the robust recognition and categorization of speech sounds. We used small-voxel functional magnetic resonance imaging to study information encoded in local brain activation patterns elicited by consonant-vowel syllables, and by a control set of noise bursts. First, activation of anterior&#x02013;lateral superior temporal cortex was seen when controlling for unspecific acoustic processing (syllables versus band-passed noises, in a &#x0201C;classic&#x0201D; subtraction-based design). Second, a classifier algorithm, which was trained and tested iteratively on data from all subjects to discriminate local brain activation patterns, yielded separations of cortical patches discriminative of vowel category versus patches discriminative of stop-consonant category across the entire superior temporal cortex, yet with regional differences in average classification accuracy. Overlap (voxels correctly classifying both speech sound categories) was surprisingly sparse. Third, lending further plausibility to the results, classification of speech&#x02013;noise differences was generally superior to speech&#x02013;speech classifications, with the notable exception of a left anterior region, where speech&#x02013;speech classification accuracies were significantly better. These data demonstrate that acoustic&#x02013;phonetic features are encoded in complex yet sparsely overlapping local patterns of neural activity distributed hierarchically across different regions of the auditory cortex. The redundancy apparent in these multiple patterns may partly explain the robustness of phonemic representations.</p>
</abstract>
<kwd-group>
<kwd>auditory cortex</kwd>
<kwd>speech</kwd>
<kwd>multivariate pattern classification</kwd>
<kwd>fMRI</kwd>
<kwd>syllables</kwd>
<kwd>vowels</kwd>
<kwd>consonants</kwd>
</kwd-group>
<counts>
<fig-count count="9"/>
<table-count count="0"/>
<equation-count count="0"/>
<ref-count count="58"/>
<page-count count="14"/>
<word-count count="9368"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="introduction">
<title>Introduction</title>
<p>Speech perception requires a cascade of processing steps that lead to a surprisingly robust mapping of the acoustic speech stream onto phonological representations and, ultimately, meaning. This is only possible due to highly efficient acoustic decoding and neural encoding of the speech signal throughout various levels of the auditory pathway. Imagine listening to a stream of words beginning with <italic>dee</italic>&#x02026;, <italic>goo</italic>&#x02026;, or <italic>dow</italic>&#x02026;, uttered by different talkers: One usually does not experience any difficulty in perceiving, categorizing, and further processing these speech sounds, although they may be produced, for example, by a male, female, or child, whose voices differ vastly in fundamental frequency. The mechanisms by which the brain accomplishes the invariant categorization and identification of speech sounds, and which subareas of auditory cortex are crucially involved, remain largely unexplained. Some of the relevant cortical structures have been identified in recent years through microelectrode recordings in non-human primates and by using neuroimaging techniques in humans. It has been shown repeatedly that structures surrounding the primary (or core) areas of auditory cortex are critically involved in speech perception. 
In particular, research conducted over the last 10 years has demonstrated consistently that the anterior and lateral parts of the superior temporal gyrus (STG) and superior temporal sulcus (STS) are activated more vigorously by speech sounds than by non-speech noise or pseudo-speech of similar acoustic complexity (Binder et al., <xref ref-type="bibr" rid="B3">2000</xref>; Scott et al., <xref ref-type="bibr" rid="B46">2000</xref>; Davis and Johnsrude, <xref ref-type="bibr" rid="B8">2003</xref>; Liebenthal et al., <xref ref-type="bibr" rid="B24">2005</xref>; Warren et al., <xref ref-type="bibr" rid="B53">2005</xref>; Obleser et al., <xref ref-type="bibr" rid="B28">2006</xref>; Rauschecker and Scott, <xref ref-type="bibr" rid="B41">2009</xref>). However, there is an ongoing debate as to whether these anterior&#x02013;lateral areas actually house abstract and categorical representations of speech sounds. Other authors have argued for the importance of posterior STG/STS in phonetic&#x02013;phonological processing (e.g., Okada et al., <xref ref-type="bibr" rid="B34">2010</xref>). A third position would be that the neural speech sound code is completely distributed and does not have a defined locus of main representation at all.</p>
<p>We hypothesize that local activation patterns providing segregation of acoustic&#x02013;phonetic features occur most frequently in higher areas of auditory cortex (Wang, <xref ref-type="bibr" rid="B51">2000</xref>; Tian et al., <xref ref-type="bibr" rid="B50">2001</xref>; Read et al., <xref ref-type="bibr" rid="B42">2002</xref>; Zatorre et al., <xref ref-type="bibr" rid="B56">2004</xref>). A robust approach to test this hypothesis would be to analyze the anatomical distribution and mean accuracy of local classifying patterns across areas of the superior temporal cortex. Although functional magnetic resonance imaging (fMRI) is a technique that averages over a vast number of neurons with different response behaviors in each sampled voxel, it can be used to detect complex local patterns that extend over millimeters of cortex, especially when comparatively small voxels are sampled (here less than 2&#x02009;mm in each dimension) and multivariate analysis methods are used (Haxby et al., <xref ref-type="bibr" rid="B14">2001</xref>; Haynes and Rees, <xref ref-type="bibr" rid="B16">2005</xref>; Kriegeskorte et al., <xref ref-type="bibr" rid="B21">2006</xref>; Norman et al., <xref ref-type="bibr" rid="B27">2006</xref>). Particularly relevant to phoneme representation, these methods are capable of exploiting the richness and complexity of information across local arrays of voxels rather than being restricted to broad <sc>bold</sc> amplitude differences averaged across large units of voxels (for discussion see Obleser and Eisner, <xref ref-type="bibr" rid="B29">2009</xref>). Notably, a seminal study by Formisano et al. (<xref ref-type="bibr" rid="B11">2008</xref>) demonstrated in seven subjects robust above-chance classification for a set of isolated vowel stimuli (around 65% correct when training and testing the classifier on single trials of data) in temporal areas ranging from the lateral Heschl&#x00027;s gyri, both anteriorly and posteriorly, down into the superior temporal sulcus. 
The study demonstrated the power of the multivariate method and its applicability to problems of speech sound representation. However, one might expect regional variation in accuracy of classification across superior temporal cortex, an issue not specifically addressed by the study of Formisano et al. (<xref ref-type="bibr" rid="B11">2008</xref>). Moreover, in order to approach the robustness with which speech sounds are neurally encoded, it is important to consider that such sounds are rarely heard in isolation. Two next steps follow immediately from this and are covered in the present report.</p>
<p>First, the direct comparison of the information contained in activation patterns for speech versus noise (here, consonant-vowel syllables versus band-passed noise classification) to the information for within-speech activation patterns (e.g., vowel classification) will help us understand the hierarchy of subregions in superior temporal cortex. Second, the systematic variation of acoustic&#x02013;phonetic features not in isolated vowels, but in naturally articulated syllables, will put any machine-learning algorithm presented with such complex data to a more thorough test.</p>
<p>To this end, we chose naturally produced syllables, built from two categories of stop-consonants ([d] vs. [g]) and two broad categories of vowels ([u:, o:] vs. [i:, e:]). Importantly, such an orthogonal 2&#x02009;&#x000D7;&#x02009;2 design allows us to disentangle, within broad activations of the superior temporal cortex by speech, local voxel activation patterns most informative for decoding the heard <italic>vowel</italic> quality (back vowels [u:, o:] vs. front vowels [i:, e:]; see Figure <xref ref-type="fig" rid="F1">1</xref>; see also Obleser et al., <xref ref-type="bibr" rid="B32">2004b</xref>, <xref ref-type="bibr" rid="B28">2006</xref>) from patterns relevant for decoding the heard stop-consonant quality of a syllable. Fortunately, the acoustic correlates of the &#x0201C;place of articulation&#x0201D; feature, for instance, are well defined for both vowels and consonants (e.g., Blumstein and Stevens, <xref ref-type="bibr" rid="B5">1980</xref>; Lahiri et al., <xref ref-type="bibr" rid="B22">1984</xref>) and are therefore suitable to test this hypothesis. By including a conventional contrast (speech versus band-passed noise), this design also allows us to compare the relative gain offered by multivariate analysis methods over classical (univariate) analyses and helps to resolve questions of hierarchical processing in the auditory brain.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p><bold>Exemplary spectrograms of stimuli from the respective syllable conditions (waveform section of consonantal burst has been amplified by &#x0002B;6 dB before calculating spectrogram for illustration only)</bold>. The principal 2&#x02009;&#x000D7;&#x02009;2 design realized by the selection of phonetic features for the syllables is also shown.</p></caption>
<graphic xlink:href="fpsyg-01-00232-g001.tif"/>
</fig>
<p>By first training a classifier on the auditory brain data from a set of participants (independent observations) and testing it then on a new set of data (from another subject; repeating this procedure as many times as there are subjects), and by using the responses to natural consonant&#x02013;vowel combinations as data, this challenging classification problem is most suited to query the across-subjects consistency of neural information on speech sounds for defined subregions of the auditory cortex. Also, we will compare the speech&#x02013;speech classifier performance to speech&#x02013;noise classifier performance in select subregions of the superior temporal cortex in order to establish a profile of these regions&#x02019; response specificities.</p>
</sec>
<sec id="s1" sec-type="materials|methods">
<title>Materials and Methods</title>
<sec>
<title>Subjects</title>
<p>Sixteen subjects (8 females, mean age 24.5&#x02009;years, SD 4.9) took part in the study. All of the subjects were right-handed monolingual speakers of American English and reported no history of neurological or otological disorders. Informed consent was obtained from all subjects, and they received financial compensation of $20.</p>
</sec>
<sec>
<title>Stimuli</title>
<p>The speech sound set consisted of 64 syllable tokens; they were acoustically variant realizations of eight American English consonant-vowel (CV) syllables: [di:] (e.g., as in &#x0201C;<underline>dee</underline>per&#x0201D;), [de:] (<underline>dai</underline>sy), [du:] (&#x0201C;<underline>Doo</underline>little&#x0201D;), [do:] (&#x0201C;dope&#x0201D;), [gi:] (&#x0201C;<underline>gee</underline>zer&#x0201D;), [ge:] (&#x0201C;<underline>gait</underline>&#x0201D;), [gu:] (&#x0201C;<underline>Goo</underline>gle&#x0201D;), and [go:] (&#x0201C;<underline>goa</underline>t&#x0201D;; these words were articulated with exaggeratedly long vowels to allow for coarticulation-free edits). Each syllable was used in four different versions, edited from single-word utterances of four monolingual native speakers (two females and two males, recorded using a DAT-recorder and a microphone in a sound-proof chamber). Figure <xref ref-type="fig" rid="F1">1</xref> gives an overview of the spectro-temporal characteristics of the syllables. The CV syllables were selected to test for possible mapping mechanisms of speech sounds in the auditory cortices. See <xref ref-type="app" rid="A1">Appendix</xref> for an extensive description of the acoustic characteristics of the entire syllable set.</p>
</sec>
<sec>
<title>Experimental design and scanning</title>
<p>The paradigm contrasted the syllables in four conditions: [di:/de:], [du:/do:], [gi:/ge:], and [gu:/go:] (Figure <xref ref-type="fig" rid="F1">1</xref>). All syllables were instantly recognized as human speech and correctly identified when first heard by the subjects. Please note that there was considerable acoustic intra-conditional variance due to the four different speakers and the combined usage of [u:, o:] and [i:, e:].</p>
<p>As an additional non-speech reference condition, a set of eight different noise bursts was presented, comprising four different center frequencies (0.25, 0.5, 2, and 4&#x02009;kHz) and two different bandwidths (one-third and one-octave), expected to activate mainly early (core and belt) areas of the auditory cortex (Wessinger et al., <xref ref-type="bibr" rid="B54">2001</xref>).</p>
<p>All audio files were equalized with respect to sampling rate (22.05&#x02009;kHz), envelope (180&#x02009;ms length; no fade-in but cut at zero-crossing for syllables, 3&#x02009;ms Gaussian fade-in for noise bursts; 10&#x02009;ms Gaussian fade-out) and RMS intensity. Stimuli were presented binaurally using Presentation software (Neurobehavioral Systems Inc.) and a customized air-conduction sound delivery system at an average sound pressure level of 65&#x02009;dB.</p>
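<p>The level and envelope equalization described above can be sketched as follows (a minimal NumPy illustration under stated assumptions: the function names and the raised-cosine ramps are our own stand-ins for the Gaussian fades described in the text, and the fade-in shown corresponds to the treatment of the noise bursts, as the syllables were cut at zero-crossings without fade-in; this is not the authors&#x02019; actual processing script):</p>

```python
import numpy as np

SR = 22050  # common sampling rate for all stimuli (22.05 kHz)

def equalize_rms(signal, target_rms=0.1):
    """Scale a waveform so its root-mean-square level equals target_rms."""
    rms = np.sqrt(np.mean(signal ** 2))
    return signal * (target_rms / rms)

def apply_ramps(signal, fade_in_ms=3.0, fade_out_ms=10.0, sr=SR):
    """Apply short onset/offset ramps (raised-cosine here, as a stand-in
    for the Gaussian fades described in the text)."""
    out = signal.astype(float).copy()
    n_in = int(sr * fade_in_ms / 1000)
    n_out = int(sr * fade_out_ms / 1000)
    if n_in:  # fade from silence at the onset
        out[:n_in] *= 0.5 * (1 - np.cos(np.linspace(0, np.pi, n_in)))
    if n_out:  # fade back to silence at the offset
        out[-n_out:] *= 0.5 * (1 - np.cos(np.linspace(np.pi, 0, n_out)))
    return out
```

<p>Equalizing all stimuli to a common RMS level and identical envelope shape ensures that any differential activation reflects spectro-temporal content rather than loudness or onset artifacts.</p>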
<p>Functional magnetic resonance imaging was performed on a 3-Tesla Siemens Trio scanner using the standard volume head coil for radio frequency transmission. After positioning the subject, playing back a 16-item sequence of exemplary syllables from all four conditions and ensuring that subjects readily recognized all items as speech, a 42-min echo planar image acquisition period with small voxel sizes and a sparse sampling procedure started (TR&#x02009;&#x0003D;&#x02009;10&#x02009;s, TA&#x02009;&#x0003D;&#x02009;2.48&#x02009;s, 25 axial slices, resolution 1.5&#x02009;mm&#x02009;&#x000D7;&#x02009;1.5&#x02009;mm&#x02009;&#x000D7;&#x02009;1.9&#x02009;mm, no gap, TE&#x02009;&#x0003D;&#x02009;36&#x02009;ms, 90&#x000B0; flip angle, 192&#x02009;mm&#x02009;&#x000D7;&#x02009;192&#x02009;mm field of view, 128&#x02009;&#x000D7;&#x02009;128 matrix). The slices were positioned such as to cover the entire superior and middle temporal gyri and the inferior frontal gyrus, approximately parallel to the AC&#x02013;PC line. In total, 252 volumes of interest were acquired; 42 volumes per condition (four syllable conditions; band-passed noises; silent trials).</p>
<p>Subjects were instructed to listen attentively to the sequence of sounds: Between volume acquisitions, 7500&#x02009;ms of silence allowed presentation of eight acoustically different stimuli of one condition (either eight stimuli of a given syllable condition or eight noise bursts) with an onset asynchrony of 900&#x02009;ms. To exemplify, for the <italic>/d/&#x02013;front vowel</italic> condition in a given trial, a sequence of, e.g., [di:]<sub>male1</sub>, [de:]<sub>male2</sub>, [di:]<sub>female1</sub>, [di:]<sub>male2</sub>, [di:]<sub>female2</sub>, [de:]<sub>female2</sub>, [de:]<sub>male1</sub>, and [de:]<sub>female2</sub> was presented in silence, followed by volume acquisition; as evident from such an exemplary sequence, many salient acoustic effects (e.g., pitch; idiosyncrasies in consonant&#x02013;vowel coarticulation, etc.) will average out and the volume acquisition will primarily capture an &#x0201C;average&#x0201D; activation related to the &#x0201C;abstract&#x0201D; spectro-temporal features of the syllables&#x02019; stop-consonant (alveolar /d/ or velar /g/) and vowel category (front vowels /i:, e:/ or back vowels /u:, o:/) only.</p>
<p>Presentation of conditions was pseudo-randomized. Four different randomization lists were used counterbalanced across participants, avoiding any unintended systematic order effects. After functional data acquisition, a 3-D high-resolution anatomical T1-weighted scan (MP-RAGE, 256&#x02009;mm<sup>3</sup> field of view, and resolution 1&#x02009;mm&#x02009;&#x000D7;&#x02009;1&#x02009;mm&#x02009;&#x000D7;&#x02009;1&#x02009;mm) was also acquired. Total scanning time amounted to 55&#x02009;min.</p>
</sec>
<sec>
<title>Data analysis</title>
<p>All data were analyzed using SPM8 (Wellcome Department of Imaging Neuroscience). The first volume was discarded, and all further volumes were realigned to the first volume acquired and corrected for field inhomogeneities (&#x0201C;unwarped&#x0201D;). They were co-registered to the high-resolution anatomical scan. For further reference, a spatial normalization was performed (using the gray-matter segmentation and normalization approach as implemented in SPM). For further analysis strategies, mildly smoothed images (using a 3&#x02009;mm&#x02009;&#x000D7;&#x02009;3&#x02009;mm&#x02009;&#x000D7;&#x02009;4&#x02009;mm Gaussian kernel) as well as entirely non-smoothed images were retained.</p>
<p>In order to arrive at <italic>univariate</italic> estimates of activation in all conditions against silence, the native-space image time series of individuals were modeled in a general linear model using a finite impulse response (length 1&#x02009;s, first order). Scaling to the grand mean and 128-s high-pass filtering were applied. The resulting contrast images (especially the <italic>sound greater than silence</italic> contrast, the <italic>speech greater than noise</italic> contrast, as well as all four <italic>syllable greater silence</italic> contrasts) were transformed to MNI space using the normalization parameters derived in each individual earlier. Figure <xref ref-type="fig" rid="FA1">A1</xref>A of Appendix shows two examples of individuals&#x02019; non-smoothed activation maps in native space for the contrast <italic>sound greater than silence</italic>. Random-effects models of the univariate data were thresholded at <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.005 and a cluster extent of 30; a Monte Carlo simulation (Slotnick et al., <xref ref-type="bibr" rid="B48">2003</xref>) ensured that this combination, given our data acquisition parameters, protects against inflated type-I errors on the whole brain significance level of &#x003B1;&#x02009;&#x0003D;&#x02009;0.05.</p>
<p>Univariate fMRI analyses focus on differences in activation strength associated with the experimental conditions. <italic>Pattern</italic> or <italic>multivariate</italic> analysis, by contrast, focuses on the information contained in a region&#x00027;s activity pattern changes related to the experimental conditions, which allows inferences about the information content of a region (Kriegeskorte and Bandettini, <xref ref-type="bibr" rid="B20">2007</xref>; Formisano et al., <xref ref-type="bibr" rid="B11">2008</xref>). For our classification purposes, we used the un-smoothed and non-thresholded but MNI-normalized maps of individual brain activity patterns, estimated as condition-specific contrasts of each condition (i.e., the SPM maps of non-thresholded <italic>t</italic>-estimates from the condition-specific contrasts, e.g., /d/&#x02013;front against silence, /g/&#x02013;front against silence, and so forth; see Misaki et al., <xref ref-type="bibr" rid="B25">2010</xref> for evidence on the advantageous performance of <italic>t</italic>-value-based classification; note, however, that this approach is different from training and testing on single trial data, cf. Formisano et al., <xref ref-type="bibr" rid="B11">2008</xref>).</p>
<p>In order to ensure absolute independence of testing and training sets, we pursued an across-participants classification approach: as the data in our experiment had been acquired within one run, single trials of a given individual would have been too dependent on each other. We therefore split our subject sample into <italic>n</italic>&#x02009;&#x02212;&#x02009;1 <italic>training</italic> data sets and a single held-out <italic>testing</italic> data set. This procedure was repeated <italic>n</italic> times, yielding for each voxel and each classification task <italic>n</italic>&#x02009;&#x0003D;&#x02009;16 estimates, from which an average classification accuracy could be derived. Please note that successful (i.e., significantly above-chance) classification in such an approach is particularly meaningful, as it indicates that the information coded in a certain spatial location (voxel or group of voxels) is reliable across individuals.</p>
<p>A linear support vector machine (SVM) classifier was applied to analyze the brain activation patterns (LIBSVM Matlab-toolbox v2.89). Several studies in cognitive neuroscience have recently reported accurate classification performance using an SVM classifier (e.g., Haynes and Rees, <xref ref-type="bibr" rid="B16">2005</xref>; Formisano et al., <xref ref-type="bibr" rid="B11">2008</xref>), and SVM is one of the most widely used classification approaches across research fields. For each of our three main classification problems (accurate vowel classification from the syllable data; accurate stop-consonant classification from the same data; accurate noise versus speech classification), a feature vector was obtained using the <italic>t</italic>-estimates from a set of voxels (see &#x0201C;search-light&#x0201D; approach below) as feature values. In short, a linear SVM separates training data points <italic>x</italic> for two different given labels (e.g., consonant [d] vs. consonant [g]) by fitting a hyperplane <italic>w<sup>T</sup></italic> <italic>x</italic>&#x02009;&#x0002B;&#x02009;<italic>b</italic>&#x02009;&#x0003D;&#x02009;0 defined by the weight vector <italic>w</italic> and an offset <italic>b</italic>. The classification performance (accuracy) was tested, as outlined above, by using leave-one-out cross validation (LOOCV) across participants&#x02019; data sets: The classifier was trained on 15 data sets, while one data set was left out for later testing the classifier in &#x0201C;predicting&#x0201D; the labels from the brain activation pattern, ensuring strict independence of training and test data. Classification accuracies were obtained by comparing the predicted labels with the actual data labels and were averaged across the 16 leave-one-out iterations afterward, resulting in a mean classification accuracy value per voxel.</p>
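<p>The leave-one-subject-out scheme described above can be sketched as follows (a minimal sketch using scikit-learn&#x02019;s LinearSVC in place of the LIBSVM Matlab toolbox actually used; the function and variable names are our own):</p>

```python
import numpy as np
from sklearn.svm import LinearSVC

def loso_accuracy(X, y, groups):
    """Leave-one-subject-out cross-validation with a linear SVM.

    X      : (n_samples, n_features) feature vectors (here, t-estimates
             from the voxels of one search-light position)
    y      : (n_samples,) condition labels, e.g. 0 = /d/, 1 = /g/
    groups : (n_samples,) subject index for each sample
    Returns the classification accuracy averaged over held-out subjects.
    """
    accuracies = []
    for subject in np.unique(groups):
        train = groups != subject   # n - 1 subjects for training
        test = groups == subject    # one subject held out for testing
        clf = LinearSVC(C=1.0).fit(X[train], y[train])
        accuracies.append(clf.score(X[test], y[test]))
    return float(np.mean(accuracies))
```

<p>Averaging the 16 held-out accuracies in this way yields the per-voxel mean accuracy used in all subsequent analyses; training and test data never share a subject.</p>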
<p>We chose a multivariate so-called &#x0201C;search-light&#x0201D; approach to estimate the local discriminative pattern over the entire voxel space measured (Kriegeskorte et al., <xref ref-type="bibr" rid="B21">2006</xref>; Haynes, <xref ref-type="bibr" rid="B15">2009</xref>): multivariate pattern classifications were conducted for each voxel position, with the &#x0201C;search-light&#x0201D; feature vector containing <italic>t</italic>-estimates for that voxel and a defined group of its closest neighbors. Here, a search-light radius of 4.5&#x02009;mm (i.e., approximately three times the voxel length in each dimension) was selected, comprising about 60 (un-smoothed) voxels per search-light position. Thus, any significant voxel shown in the figures will represent a robust local pattern of its 60 nearest neighbors. We ensured &#x0201C;robustness&#x0201D; by constructing bootstrapped (<italic>n</italic>&#x02009;&#x0003D;&#x02009;1000) confidence intervals (CI) for each voxel patch&#x00027;s mean accuracy (as obtained in the leave-one-out-procedure). Thus, we were able to identify voxel patches with a mean accuracy above chance, that is, voxel patches whose 95% CI did not cover the 50% chance level. The procedure is exemplified in Figure <xref ref-type="fig" rid="FA2">A2</xref> of Appendix.</p>
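<p>The bootstrap criterion for retaining a voxel can be sketched as follows (our own minimal percentile-bootstrap implementation over the 16 leave-one-out accuracies of one search-light position; the exact resampling scheme used by the authors is not specified beyond <italic>n</italic>&#x02009;&#x0003D;&#x02009;1000 iterations):</p>

```python
import numpy as np

def bootstrap_ci(accuracies, n_boot=1000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the mean of the
    per-fold accuracies at one search-light position."""
    rng = np.random.default_rng(seed)
    acc = np.asarray(accuracies, dtype=float)
    boot_means = np.array([
        rng.choice(acc, size=acc.size, replace=True).mean()
        for _ in range(n_boot)
    ])
    return (np.percentile(boot_means, 100 * alpha / 2),
            np.percentile(boot_means, 100 * (1 - alpha / 2)))

def classifies_above_chance(accuracies, chance=0.5):
    """Retain a voxel only if the CI's lower bound exceeds chance level."""
    lower, _ = bootstrap_ci(accuracies)
    return lower > chance
```

<p>A voxel patch is declared a robust classifier only when the entire 95% CI lies above the 50% chance level, not merely when its mean accuracy does.</p>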
<p>As an additional control for a possible inflated alpha error due to multiple comparisons (which is often neglected when using CI; Benjamini and Yekutieli, <xref ref-type="bibr" rid="B2">2005</xref>), we used a procedure suggested in analogy to the established false discovery rate (FDR; e.g., Genovese et al., <xref ref-type="bibr" rid="B13">2002</xref>), called the <italic>false coverage-statement rate</italic> (FCR). In brief, we &#x0201C;selected&#x0201D; those voxels whose 95% CI for accuracy did not cover the 50% (chance) level in a first pass (see above). In a second, correcting pass, we (re-)constructed FCR-corrected CI for these selected voxels at a level of 1&#x02009;&#x02212;&#x02009;<italic>R</italic>&#x02009;&#x000D7;&#x02009;<italic>q</italic>/<italic>m</italic>, where <italic>R</italic> is the number of voxels selected at the first pass, <italic>m</italic> is the total number of voxels tested, and <italic>q</italic> is the tolerated rate for false coverage statements, here 0.05 (Benjamini and Yekutieli, <xref ref-type="bibr" rid="B2">2005</xref>). See also Figure <xref ref-type="fig" rid="FA2">A2</xref> of Appendix. Effectively, this procedure yielded FCR-corrected voxel-wise confidence limits at &#x003B1;&#x02009;&#x0223C;&#x02009;0.004 rather than 0.05; approximately two-thirds of all voxels declared robust classifiers at the first pass also survived this correcting second pass.</p>
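<p>The FCR correction amounts to a one-line adjustment of the confidence level (shown here as a hypothetical helper of our own; the selection counts in the usage note are illustrative, not the study&#x02019;s actual voxel counts):</p>

```python
def fcr_corrected_level(n_selected, n_total, q=0.05):
    """FCR correction (Benjamini & Yekutieli, 2005): after selecting
    R = n_selected of m = n_total voxels in a first pass, reconstruct
    the CIs of the selected voxels at level 1 - R*q/m rather than 1 - q."""
    return 1.0 - n_selected * q / n_total
```

<p>For example, if roughly 8% of the tested voxels were selected in the first pass, the corrected CIs would be built at the 99.6% level, i.e., a per-voxel &#x003B1; near 0.004, consistent with the value reported above.</p>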
<p>Note that all brain map overlays of average classification accuracy are thresholded to show only voxels with a bootstrapped CI lower limit not covering 50% (chance level; 1000 repetitions), while all bar graphs or numerical reports and ensuing inferences are based on the FCR-corrected data. Significant differences, as indicated by asterisks in the bar graphs, were assessed using Wilcoxon signed-rank tests, corrected for multiple comparisons using false discovery rate (FDR, <italic>q</italic>&#x02009;&#x0003D;&#x02009;0.05) correction.</p>
</sec>
</sec>
<sec>
<title>Results</title>
<sec>
<title>Univariate contrasts analysis</title>
<p>All 16 subjects were included in the analysis, as they exhibited bilateral activation of temporal lobe structures when global contrasts of any auditory activation were tested. Examples are shown in Figure <xref ref-type="fig" rid="FA1">A1</xref> of Appendix.</p>
<p>As predicted, the global contrast of speech sounds over band-pass filtered noise (random effects, using the mildly smoothed images) yielded focal bilateral activations of the lateral middle to anterior aspects of the STG extending into the upper bank of the STS (Figure <xref ref-type="fig" rid="F2">2</xref>).</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p><bold>Random effects (<italic>N</italic>&#x02009;&#x0003D;&#x02009;16) of the univariate analysis for <italic>speech</italic>&#x02009;&#x0003E;&#x02009;<italic>band-passed noise</italic> (red) and <italic>sound</italic>&#x02009;&#x0003E;&#x02009;<italic>silence</italic> (blue; overlay purple) based on smoothed and normalized individual contrast maps, thresholded at <italic>p</italic>&#x02009;&#x0003c;&#x02009;0.005 at the voxel level and a cluster extent of at least 30 voxels; displayed on an average of all 16 subjects&#x02019; T1-weighted images (normalized in MNI space)</bold>.</p></caption>
<graphic xlink:href="fpsyg-01-00232-g002.tif"/>
</fig>
<p>In order to reveal cortical patches that might show stronger activation of one vowel type over another or one stop-consonant type over another, we tested the direct univariate contrasts of syllable conditions or groups of conditions against each other. However, none of these contrasts yielded robust significant auditory activations at any reasonable statistical threshold (<italic>p</italic>&#x02009;&#x0003C;&#x02009;0.001 or <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.005 uncorrected). This negative result held true on the whole-volume level as well as within a search volume restricted by a liberally thresholded (<italic>p</italic>&#x02009;&#x0003C;&#x02009;0.01) &#x0201C;<italic>sound greater than silence</italic>&#x0201D; contrast. This outcome can be taken as a safe indication that, at this &#x0201C;macroscopic&#x0201D; resolution, which pools across tens or hundreds of voxels, the different classes of speech stimuli used here do not elicit broad differences in <sc>bold</sc> amplitudes. All further analyses thus focused on studying local <italic>multi-voxel patterns</italic> of activation (using the search-light approach described in Materials and Methods) rather than massed univariate tests on activation strengths.</p>
</sec>
<sec>
<title>Multivariate pattern classification results</title>
<p>Figures <xref ref-type="fig" rid="F3">3</xref>&#x02013;<xref ref-type="fig" rid="F6">6</xref> illustrate the results for robust (i.e., significantly above-chance) vowel&#x02013;vowel, stop-consonant&#x02013;stop-consonant, and noise&#x02013;speech classification from local patterns of brain activity. In Figure <xref ref-type="fig" rid="F3">3</xref>, all voxels marked in color represent centroids of local 60-voxel clusters from which correct prediction of the vowel (red) or stop-consonant (blue) category of a heard syllable was possible.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p><bold>Maps of correct classification for vowel and stop category</bold>. Display of significantly above-chance classification voxels from the leave-one-out across-subjects classification. Axial (top panels) and sagittal slices (bottom panels) are shown, arranged for left and right hemisphere view, and displayed on a standard T1 template brain. Note the sparse overlap in voxels significantly classifying vowel information (red) and stop information (blue; voxels allowing correct classification of both are marked in purple).</p></caption>
<graphic xlink:href="fpsyg-01-00232-g003.tif"/>
</fig>
<p>The displayed mean accuracies per voxel resulted from averaging the results of all 16 leave-one-out classifications per voxel; a 1000-iteration bootstrap resampling test was used to retain only those voxels where the 95%-CI of the mean accuracy did not include chance level (50%). As can be seen in Figure <xref ref-type="fig" rid="F3">3</xref>, voxels that allow for such robust classification are, first, distributed throughout the lateral superior temporal cortex. Second, these voxels appear to contain neural populations that are highly selective in their spectro-temporal response properties. This is evident from the fact that voxels robustly classifying both vowel and stop-consonant categories (i.e., steady-state, formant-like sounds with simple temporal structure versus non-steady-state, sweep-like sounds with complex temporal structure) are sparse (see also Figure <xref ref-type="fig" rid="F6">6</xref>). Also, subareas of auditory core cortex (here using the definition of Morosan et al., <xref ref-type="bibr" rid="B26">2001</xref>; Rademacher et al., <xref ref-type="bibr" rid="B39">2001</xref>; see Figure <xref ref-type="fig" rid="FA3">A3</xref> of Appendix) appear relatively spared in the left hemisphere (but see noise versus speech classification results below); in the right hemisphere, stop-consonant classification is above chance in TE 1.0 and vowel classification in TE 1.1 (Figure <xref ref-type="fig" rid="FA3">A3</xref>A of Appendix). Figure <xref ref-type="fig" rid="FA1">A1</xref>B of Appendix also shows individual vowel and stop-consonant classification results for four different subjects.</p>
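The voxel-retention criterion described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' actual code: the per-fold accuracies, function names, and seed are assumptions; only the logic (bootstrap the mean of the 16 fold accuracies, keep the voxel if the 95% CI excludes 50%) follows the text.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed for a reproducible sketch

def bootstrap_ci(accuracies, n_boot=1000, alpha=0.05):
    """Bootstrap 95% confidence interval for the mean of per-fold accuracies."""
    accuracies = np.asarray(accuracies, dtype=float)
    means = np.empty(n_boot)
    for i in range(n_boot):
        # resample the fold accuracies with replacement, record the mean
        resample = rng.choice(accuracies, size=accuracies.size, replace=True)
        means[i] = resample.mean()
    return np.quantile(means, alpha / 2), np.quantile(means, 1 - alpha / 2)

def survives_chance(accuracies, chance=0.5):
    """Retain a voxel only if the CI of its mean accuracy excludes chance (50%)."""
    lo, hi = bootstrap_ci(accuracies)
    return bool(lo > chance or hi < chance)
```

Applied per voxel to its 16 leave-one-out accuracies, this yields the thresholded maps of Figure 3.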
<p>These observations were followed up with a quantitative analysis of mean accuracy within regions of interest (Figure <xref ref-type="fig" rid="F4">4</xref>). Differences in mean accuracy between vowel and stop classification; differences between speech classification (average of vowel and stop classification) and noise versus speech classification; and the proportion of significantly classifying voxels were tested in pre-defined regions of interest. The outline of these regions, shown in Figure <xref ref-type="fig" rid="FA3">A3</xref> of the Appendix, directly resulted from the distribution of broadly &#x0201C;sound-activated&#x0201D; voxels (T15&#x02009;&#x0003E;&#x02009;3 in the &#x0201C;sound&#x02009;&#x0003E;&#x02009;silence&#x0201D; contrast); left and right voxels likely to be located in primary auditory cortex (PAC; TE 1.0&#x02013;1.2; Morosan et al., <xref ref-type="bibr" rid="B26">2001</xref>) were separated from regions anterior (ant) and posterior (post) to it, as well as lateral to it (mid).</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p><bold>Comparisons of classification accuracy in regions of interest</bold>. <bold>(A)</bold> Classification accuracies for vowel (red) and stop-consonant (blue) classification. Means of all FCR-corrected above-chance voxels within a region &#x000B1;1 standard error of the mean are shown. Asterisks indicate a FDR-corrected significant difference; &#x0002A;<italic>p</italic>&#x02009;&#x0003C;&#x02009;0.05, &#x0002A;&#x0002A;<italic>p</italic>&#x02009;&#x0003C;&#x02009;0.01, &#x0002A;&#x0002A;&#x0002A;<italic>p</italic>&#x02009;&#x0003C;&#x02009;0.001. <italic>L, R &#x02013; total average of significant left and right hemisphere voxels, respectively; for anatomical definition of subregions see Figure <xref ref-type="fig" rid="FA3">A3</xref></italic> <italic>of Appendix</italic>. <bold>(B)</bold> Classification accuracies for speech&#x02013;speech classification (accuracies averaged across vowel and stop; slate gray) and noise&#x02013;speech classification (purple). Means of all FCR-corrected above-chance voxels within a region&#x02009;&#x000B1;&#x02009;1 standard error of the mean are shown. Asterisks indicate a FDR-corrected significant difference; &#x0002A;<italic>p</italic>&#x02009;&#x0003C;&#x02009;0.05, &#x0002A;&#x0002A;<italic>p</italic>&#x02009;&#x0003C;&#x02009;0.01, &#x0002A;&#x0002A;&#x0002A;<italic>p</italic>&#x02009;&#x0003C;&#x02009;0.001. <bold>(C)</bold> Left&#x02013;Right hemisphere differences for speech&#x02013;speech classification (accuracies averaged across vowel and stop). Mean differences in accuracy &#x000B1;95% confidence limits are shown. Asterisks indicate a FDR-corrected significant difference; &#x0002A;<italic>p</italic>&#x02009;&#x0003C;&#x02009;0.05, &#x0002A;&#x0002A;<italic>p</italic>&#x02009;&#x0003C;&#x02009;0.01, &#x0002A;&#x0002A;&#x0002A;<italic>p</italic>&#x02009;&#x0003C;&#x02009;0.001. 
NB, the anterior regions show the most distinct left-bias for mean classification accuracy, while the primary areas show no significant bias.</p></caption>
<graphic xlink:href="fpsyg-01-00232-g004.tif"/>
</fig>
<p>Vowels and stops were classified significantly above chance throughout these various subregions (Figure <xref ref-type="fig" rid="F4">4</xref>). In each region, the difference in mean accuracy between vowel and stop classification was also assessed statistically to detect any accuracy advantage for either of the two broad speech sound categories. Particularly in the left auditory core region, stop-consonants were classified significantly better than vowels (Figure <xref ref-type="fig" rid="F4">4</xref>A).</p>
<p>Next, we directly compared accuracy in a speech versus band-passed noise classification analysis with the average accuracy in the two speech versus speech classification analyses (i.e., vowel classification and stop-consonant classification; Figure <xref ref-type="fig" rid="F5">5</xref>). Three findings from this analysis deserve elaboration.</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p><bold>Maps of correct noise versus speech classification</bold>. Display of significantly above-chance classification voxels from the leave-one-out across-subjects classification. Axial slices are shown, arranged for left and right hemisphere view, and displayed on a standard T1 template brain.</p></caption>
<graphic xlink:href="fpsyg-01-00232-g005.tif"/>
</fig>
<p>First, the noise versus speech classification (accomplished by randomly choosing one of the four speech conditions and presenting it alongside the band-passed noise condition to the classifier, in order to ensure balanced training and test sets) yielded somewhat higher average classification performance than the speech versus speech classifications described above (Figure <xref ref-type="fig" rid="F4">4</xref>B). This lends overall plausibility to the multivariate analysis: the band-passed noise condition has far less detailed spectro-temporal complexity than the various syllable conditions, so the classifier should distinguish its neural imprints from those of syllables quite robustly, which was indeed the case.</p>
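The balancing step in parentheses above can be illustrated as follows. This is a hypothetical sketch: the condition labels, data layout, and function name are assumptions; only the idea of pairing the noise patterns with one randomly drawn speech condition, so that the classifier sees the two classes equally often, follows the text.

```python
import random

# hypothetical labels for the four syllable conditions
SPEECH_CONDITIONS = ["du", "di", "gu", "gi"]

def balanced_noise_vs_speech_set(patterns, rng=random):
    """Build a balanced two-class set: the noise patterns versus the
    patterns of ONE randomly drawn speech condition.

    `patterns` maps a condition name to its list of activation patterns.
    Returns (X, y) with label 0 for noise and 1 for the chosen speech condition.
    """
    speech = rng.choice(SPEECH_CONDITIONS)
    X = patterns["noise"] + patterns[speech]
    y = [0] * len(patterns["noise"]) + [1] * len(patterns[speech])
    return X, y
```

Drawing a fresh speech condition on each repetition keeps the two-class problem balanced without discarding any noise data.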
<p>Second, the topographic distribution of local patterns that allowed noise versus speech classification (Figure <xref ref-type="fig" rid="F5">5</xref>) strikingly resembles the result from the univariate <sc>bold</sc> contrast analysis (Figure <xref ref-type="fig" rid="F2">2</xref>); most voxels robustly classifying noise from speech map to anterior and lateral parts of the superior temporal cortex; primary auditory areas also appear spared more clearly than was the case in the within-speech (vowel or consonant) classification analyses.</p>
<p>Third, we averaged the accuracy of vowel&#x02013;vowel and stop&#x02013;stop classification (speech&#x02013;speech classification; gray bars in Figure <xref ref-type="fig" rid="F4">4</xref>B) and compared it statistically to noise&#x02013;speech classification (purple bars). Only in the left anterior region was speech&#x02013;speech classification accuracy statistically better than noise&#x02013;speech classification (<italic>p</italic>&#x02009;&#x0003C;&#x02009;0.01). In PAC as well as the posterior regions, speech&#x02013;speech classification accuracies were not statistically different from noise&#x02013;speech classification accuracies.</p>
<p>A comparison of hemispheres did not yield a strong hemispheric bias in mean accuracy. However, when again averaging accuracies across stop and vowel classification and testing for left&#x02013;right differences, a lateralization to the left was seen across regions (leftmost bar in Figure <xref ref-type="fig" rid="F4">4</xref>C; <italic>p</italic>&#x02009;&#x0003C;&#x02009;0.05). Also, Figure <xref ref-type="fig" rid="F4">4</xref> shows a strong advantage in both vowel and stop classification accuracy for the left anterior region of interest (both mean stop and mean vowel accuracy &#x0003E;60%) when compared to its right hemisphere homolog (both &#x0003C;60%).</p>
<p>Figure <xref ref-type="fig" rid="F6">6</xref> gives a quantitative survey of the relatively sparse overlap in voxels that contribute accurately to both vowel and stop classification (cf. Figure <xref ref-type="fig" rid="F3">3</xref>). Plotted are the proportions of voxels in each subregion that allow above-chance classification (i.e., the number of FCR-corrected above-chance voxels divided by the number of all voxels, in % per region) of vowels (red), of stop-consonants (blue), or of both (purple). As can be seen, fewer than 10% of voxels in any region contribute significantly (corrected for multiple comparisons) to the accurate classification of speech sounds. More importantly, however, the proportion of voxels contributing accurately to both speech sound classification problems (i.e., the &#x0201C;overlap&#x0201D;) remains surprisingly low. This provides strong evidence in favor of local distributed patterns (&#x0201C;patches&#x0201D;) of spectro-temporal analysis that are tuned predominantly either to vocalic or to consonantal features, across several subregions of human auditory cortex.</p>
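The plotted quantities reduce to simple set arithmetic over voxel indices. The following is a minimal sketch under assumed data structures (sets of voxel indices; the function name is hypothetical), mirroring the definition in the text: significant voxels divided by all region voxels, in percent.

```python
def region_proportions(vowel_sig, stop_sig, region_voxels):
    """Percentage of a region's voxels that classify vowels, stops,
    or both significantly above chance (after correction).

    vowel_sig, stop_sig: sets of voxel indices surviving correction.
    region_voxels: set of all voxel indices in the region.
    Returns (vowel %, stop %, overlap %).
    """
    n = len(region_voxels)
    def pct(sig):
        # fraction of the region's voxels present in the significant set
        return len(sig & region_voxels) / n * 100
    return pct(vowel_sig), pct(stop_sig), pct(vowel_sig & stop_sig)
```

For example, with 8 vowel-significant and 8 stop-significant voxels sharing 3 members in a 100-voxel region, the proportions would be 8%, 8%, and 3%, with the small third number corresponding to the low "overlap" bars in Figure 6.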
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p><bold>Proportion of significantly above-chance voxels per region, for vowel (red) classification, stop (blue) classification, as well as voxels (and their immediate neighbors; see <xref ref-type="sec" rid="s1">Materials and Methods</xref>) that significantly classified both speech sound categories (&#x0201C;overlap,&#x0201D; shown in purple)</bold>. Note the low number of such &#x0201C;overlap&#x0201D; voxels across all subregions. <italic>For anatomical definition of subregions see Figure <xref ref-type="fig" rid="FA3">A3</xref> of Appendix</italic>.</p></caption>
<graphic xlink:href="fpsyg-01-00232-g006.tif"/>
</fig>
</sec>
</sec>
<sec sec-type="discussion">
<title>Discussion</title>
<p>In a small-voxel fMRI study, participants listened to a simple 2&#x02009;&#x000D7;&#x02009;2 array of stop-consonant and vowel features varying in natural spoken syllables, and we tested the superior temporal cortex for the accuracy with which its neural imprints allow the decoding of acoustic&#x02013;phonetic features &#x02013; across participants.</p>
<p>First, the univariate, by now &#x0201C;classic,&#x0201D; approach of directly contrasting speech with non-speech noise stimuli yielded a clear-cut hierarchy (Figure <xref ref-type="fig" rid="F2">2</xref>), in which all syllables activated the anterior and lateral parts of the superior temporal cortex more strongly than band-passed noise bursts did. This is in line with most current models of hierarchical processing in central auditory pathways (e.g., Hickok and Poeppel, <xref ref-type="bibr" rid="B18">2007</xref>; Rauschecker and Scott, <xref ref-type="bibr" rid="B41">2009</xref>). The peak coordinates of the univariate <italic>speech</italic>&#x02009;&#x0003E;&#x02009;<italic>band-passed noise</italic> contrast concur with areas in the lateral and somewhat anterior sections of the STS previously labeled as voice-sensitive as well as those labeled speech-sensitive (Belin et al., <xref ref-type="bibr" rid="B1">2000</xref>; Petkov et al., <xref ref-type="bibr" rid="B36">2009</xref>). This is not surprising, as there is, to date, no clear separation of speech-sensitive versus voice-sensitive areas in the literature, and our design was not tuned toward disentangling voice from speech sensitivity.</p>
<p>However, the more critical observation here is the inability of these subtraction-based analyses to yield consistent (i.e., across-participants) evidence for broad activation differences between acoustic&#x02013;phonetic categories. At an acceptable level of thresholding and using mildly smoothed data, no consistent activation differences between the various syllables, that is, between vowels or between consonants, could be revealed. This speaks to (i) a good overall matching of acoustic parameters that could have influenced the auditory cortical <sc>bold</sc> amplitudes (e.g., loudness, stimulus length), and (ii) a more microscopic level of encoding for the critical information of consonants and vowels, presumably distributed over various stages of the central auditory processing &#x0201C;hierarchy.&#x0201D; Previous MEG source localization studies, using isolated vowels or syllables, had proposed such comparably broad <italic>topographic</italic> differences, albeit at very isolated time steps (i.e., for a brief period 100&#x02009;ms after vowel onset; Obleser et al., <xref ref-type="bibr" rid="B33">2003</xref>, <xref ref-type="bibr" rid="B32">2004b</xref>; Shestakova et al., <xref ref-type="bibr" rid="B47">2004</xref>); however, no broad <italic>amplitude</italic> differences could be observed there either. Using a rather complex vowel-alternating design and specific vowel-detection tasks, similar shifts in topography had been elicited in MEG as well as in fMRI (Obleser et al., <xref ref-type="bibr" rid="B31">2004a</xref>, <xref ref-type="bibr" rid="B28">2006</xref>). In the latter study, we had observed hints of a vowel feature separation in the anterior temporal cortex using standard univariate group statistics, but the vowel material used there had much less acoustic variance, consisted of isolated vowels, and &#x02013; as mentioned above &#x02013; involved a very specific vowel-detection task. However, the carefully balanced syllable material, the much greater acoustic variance in stimulus tokens, and the &#x02013; arguably pivotal &#x02013; absence of a task (other than attentive listening) in the current study revealed the limits of univariate subtraction analysis for studying human speech sound representation.</p>
<p>Second, the success of the multivariate pattern classifier stands in contrast to this and provides good evidence for a model of speech sound representation that includes clusters of activity specific to particular speech sound categories, distributed over several areas of the superior temporal cortex. To reiterate, the classifier was trained on activation data from various participants, was tested on an independent, left-out set of data from another participant, and had to solve a challenging task (classifying broad vowel categories or stop-consonant categories from acoustically diverse syllables).</p>
<p>It was only in this type of analysis that voxels (or rather patches of voxels, following from the &#x0201C;search-light&#x0201D; approach) could be identified that allowed robust above-chance, across-individuals classification of vowel or stop category. The overall accuracy in classification was far from perfect, yet in a range (&#x0223C;55&#x02013;65%) comparable to the few previous reports of speech-related fMRI classification (Formisano et al., <xref ref-type="bibr" rid="B11">2008</xref>; Okada et al., <xref ref-type="bibr" rid="B34">2010</xref>; Herrmann et al., <xref ref-type="bibr" rid="B17">in press</xref>). However, in the current study the classifier arguably had to solve a harder problem, being trained on a variety of independent subjects and tested on another, also independent subject. Moreover, vowel and stop categories had to be classified from naturally coarticulated syllables. Adding further plausibility to these conclusions, the classifier performed significantly better overall on the acoustically simpler problem of distinguishing band-passed noise (i.e., uniform and simple spectro-temporal shape) from speech sounds (Figure <xref ref-type="fig" rid="F4">4</xref>B). It is also very likely that additional sophisticated algorithms, for example, recursive feature elimination (De Martino et al., <xref ref-type="bibr" rid="B9">2008</xref>), could improve the performance of the classifier further. The &#x0201C;search-light&#x0201D; approach employed here has the further limitation of ignoring the information entailed in the co-variance (connectivity) of voxels in remote regions or hemispheres. For the purposes of this study, however, the argument stands: a patch of voxels in, for example, the left anterior region of the STG that carries information to classify /d/ from /g/ above chance in more than 60% of all (independent) subjects can be taken as good evidence that the cortical volume represented by these voxels encodes relevant neural information for this stop-consonant percept.</p>
<p>Third, to go beyond the observation of such above-chance voxel patches across a wide range of superior temporal cortex (which has also been reported in the vowels-only study by Formisano et al., <xref ref-type="bibr" rid="B11">2008</xref>), we ran a set of region-specific analyses for the average classification accuracy. The main findings from this region-specific analysis can be summarized as follows:</p>
<p>All eight subregions (left and right primary auditory cortex [PAC]; anterior; mid [lateral to PAC]; and posterior) carry substantial information for correctly classifying vowel categories as well as stop-consonant categories, and also the more salient noise&#x02013;speech distinction (Figures <xref ref-type="fig" rid="F3">3</xref>&#x02013;<xref ref-type="fig" rid="F6">6</xref>). Thus, differentiation of abstract spectro-temporal features may already begin at the core and belt level, which is in line with recordings from non-human primates and rodents (e.g., Steinschneider et al., <xref ref-type="bibr" rid="B49">1995</xref>; Wang et al., <xref ref-type="bibr" rid="B52">1995</xref>; Tian et al., <xref ref-type="bibr" rid="B50">2001</xref>; Engineer et al., <xref ref-type="bibr" rid="B10">2008</xref>). Between regions, however, differences in average accuracy for vowels, stops, and noise&#x02013;speech classification were observed (Figure <xref ref-type="fig" rid="F4">4</xref>). All findings taken together single out the left anterior region (i.e., left-hemispheric voxels that were (i) activated by sound in a random-effects analysis and (ii) anterior to a probabilistic border of the PAC, as suggested by Morosan et al., <xref ref-type="bibr" rid="B26">2001</xref>; see Figure <xref ref-type="fig" rid="FA3">A3</xref> of Appendix).</p>
<p>To summarize these findings, the left anterior region</p>
<list list-type="bullet">
<list-item><p>showed the highest average classification accuracy for the vowel and stop classification (&#x0003E;60%; Figure <xref ref-type="fig" rid="F4">4</xref>A);</p></list-item>
<list-item><p>was the only region to show an average speech&#x02013;speech classification accuracy that was statistically superior to the less specific noise&#x02013;speech classification (Figure <xref ref-type="fig" rid="F4">4</xref>B);</p></list-item>
<list-item><p>showed the most pronounced leftward lateralization when based on average accuracy (Figure <xref ref-type="fig" rid="F4">4</xref>C; a &#x0223C;4% leftward bias). This might be taken as evidence corroborating a 2-FDG PET study of left-lateralized processing of monkey vocalizations (Poremba et al., <xref ref-type="bibr" rid="B38">2004</xref>).</p></list-item>
</list>
<p>In sum, our results for the left anterior region fit well with previous results on comparably high levels of complexity being processed in the (left) anterior parts of the superior temporal cortex (for imaging results from non-humans see, e.g., Poremba et al., <xref ref-type="bibr" rid="B38">2004</xref>; Petkov et al., <xref ref-type="bibr" rid="B35">2008</xref>; for imaging results in humans see, e.g., Binder et al., <xref ref-type="bibr" rid="B4">2004</xref>; Zatorre et al., <xref ref-type="bibr" rid="B56">2004</xref>; Obleser et al., <xref ref-type="bibr" rid="B28">2006</xref>; Leaver and Rauschecker, <xref ref-type="bibr" rid="B23">2010</xref>; a list that can be extended further if studies on more complex forms of linguistic information, mainly syntax, are taken into account, e.g., Friederici et al., <xref ref-type="bibr" rid="B12">2000</xref>; Rogalsky and Hickok, <xref ref-type="bibr" rid="B43">2009</xref>; Brennan et al., <xref ref-type="bibr" rid="B6">2010</xref>). Recall, however, that the current data do not allow us to draw any conclusions about possibly discriminative information being available in the inferior frontal or inferior parietal cortex (e.g., Raizada and Poldrack, <xref ref-type="bibr" rid="B40">2007</xref>), as these regions were not activated in the broad &#x0201C;sound&#x02009;&#x0003E;&#x02009;silence&#x0201D; comparison and were not fully covered by our chosen slices, respectively.</p>
<p>As for the ongoing debate over whether <italic>posterior</italic> (see, e.g., Okada et al., <xref ref-type="bibr" rid="B34">2010</xref>) or <italic>anterior</italic> (see evidence listed above) aspects of the central auditory processing pathways play the relatively more important role in processing the spectro-temporal characteristics of speech, i.e., vowel and consonant perception, our study reaffirms the evidence for the anteriority hypothesis. The left posterior region did not show a statistically significant advantage for speech&#x02013;speech over noise&#x02013;speech classification, and its voxels showed overall weaker classification accuracies than those of the left anterior region. The evidence presented in this study therefore adds compellingly to the existing data on a predominant role of left anterior regions in decoding, analyzing, and representing speech sounds.</p>
<p>Our conclusions on the differential distribution of voxel patches that accurately classify speech sound features gain plausibility from two ancillary findings reported above. First, above-chance classification in both hemispheres (combined with a moderate leftward bias in overall accuracy, which is in essence present across the entire superior temporal cortex) is a likely outcome given the mixed evidence on left-dominant versus bilateral speech sound processing (see Hickok and Poeppel, <xref ref-type="bibr" rid="B18">2007</xref>; Obleser and Eisner, <xref ref-type="bibr" rid="B29">2009</xref>; Petkov et al., <xref ref-type="bibr" rid="B36">2009</xref> for reviews). Second, left primary auditory cortex and the region lateral to it (&#x0201C;mid&#x0201D;), which probably includes human belt and parabelt cortex (Wessinger et al., <xref ref-type="bibr" rid="B54">2001</xref>; Humphries et al., <xref ref-type="bibr" rid="B19">2010</xref>), showed a significantly better accuracy in classifying stop-consonants than vowels. This is in line with a wide body of research suggesting a left-hemispheric bias toward a better temporal analysis of the signal, which has been claimed to be especially relevant for the analysis of stop-consonant formant transitions (e.g., Schwartz and Tallal, <xref ref-type="bibr" rid="B45">1980</xref>; Zatorre and Belin, <xref ref-type="bibr" rid="B55">2001</xref>; Poeppel, <xref ref-type="bibr" rid="B37">2003</xref>; Sch&#x000F6;nwiesner et al., <xref ref-type="bibr" rid="B44">2005</xref>; Obleser et al., <xref ref-type="bibr" rid="B30">2008</xref>).</p>
<p>Lastly, what can we infer from these data about the functional organization of speech sounds in the superior temporal cortex across participants? Microscopic (i.e., below-voxel-size) organization in the superior temporal areas is expected to be quite variable across participants. However, our data imply that there is enough spatial concordance within local topographical maps of acoustic&#x02013;phonetic feature sensitivity to produce classification accuracies above chance. Also recall that we did not submit single voxels and single trial data to the classifier, but patches of neighboring voxels (which essentially allows for co-registered and normalized participant data to vary to some extent and still contribute to the same voxel patch) and statistical <italic>t</italic>-values (see Misaki et al., <xref ref-type="bibr" rid="B25">2010</xref>), respectively. Therefore, the reported classification accuracies across participants form a lower bound of possible classification accuracy. What these data will not answer is the true &#x0201C;nature&#x0201D; or abstractness of features that aided successful classification in these various subareas of the auditory cortex. It is conceivable (and, in fact, highly likely) that different areas are coding different and differentially abstract features of the syllable material; which, again, would be testimony to the redundant or multi-level neural implementation of speech sound information.</p>
<sec>
<title>Conclusion</title>
<p>In sum, the reported results show that a widely distributed set of local cortical patches in the superior temporal cortex encodes critical information on vowels, consonants, and non-speech noise. Yet they assign a specific role to the left anterior superior temporal cortex in the processing of complex spectro-temporal patterns (Leaver and Rauschecker, <xref ref-type="bibr" rid="B23">2010</xref>). In this respect, our results extend previous evidence from subtraction-based designs.</p>
<p>Univariate analyses of broad <sc>bold</sc> differences and multivariate analyses of local patterns of small-voxel activations converge on a robust speech versus noise distinction. The wide distribution of information on vowel and stop category across regions of left and right superior temporal cortex accounts well for previous difficulties in pinpointing robust &#x0201C;phoneme areas&#x0201D; or &#x0201C;phonetic maps&#x0201D; in human auditory cortex. A closer inspection of average classification performance across select subregions of auditory cortex, however, singles out the left anterior region of the superior temporal cortex as containing the highest proportion of voxels bearing information for speech sound categorizations (Figure <xref ref-type="fig" rid="F6">6</xref>) and yielding the strongest lateralization toward the left (Figure <xref ref-type="fig" rid="F4">4</xref>). The consistent and non-overlapping classification of vowels and consonants in the current study was surprisingly robust and resonates with patient studies reporting selective deficits in the processing of vowels and consonants (Caramazza et al., <xref ref-type="bibr" rid="B7">2000</xref>). How exactly even finer phoneme categories or features (different types of vowels and consonants) are represented within these subregions escapes current technology and may have to await new approaches with even better resolution.</p>
<p>In sum, multivariate analysis of natural speech sounds opens a new level of sophistication in our conclusions on the topography of human speech sound processing in auditory cortical areas. First, it <italic>converges with subtraction-based analysis methods</italic> for broad, that is, acoustically very salient comparisons (as reported here for the speech versus noise comparisons). Second, it demonstrates that <italic>local activation patterns throughout auditory subregions in the superior temporal cortex</italic> contain robust (i.e., significantly above-chance and sufficiently consistent across participants) encodings of different categories of speech sounds, with a special emphasis on the role of the left anterior STG/STS region. Third, it yields, across a wide area of superior temporal cortex, a <italic>surprisingly low overlap</italic> between those local patterns best for classification of vowel information and those best for stop-consonant classification. Given the knowledge about the hierarchical nature of non-primary auditory cortex derived from non-human primate studies, we propose that complex sounds, including the speech sounds studied here, are represented in hierarchical networks distributed over a wide array of cortical areas. How these distributed &#x0201C;patches&#x0201D; communicate with each other to form a coherent percept will require further studies that can speak to dynamic functional connectivity.</p>
</sec>
</sec>
<sec>
<title>Conflict of Interest Statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<app-group>
<app id="A1">
<title>Appendix</title>
<p>Acoustic analysis of stimuli. Voiced stop consonants and vowels were selected to reflect either highly homogeneous or largely heterogeneous feature combinations with respect to acoustics and phonology: the alveolar consonant [d] and the front vowels [i] and [e] have more energy in the higher frequencies, while the velar consonant [g] and the back vowels [u] and [o] have more energy in the lower frequencies. Since the CV-syllable stimuli were naturally co-articulated, the acoustics of stop consonant and vowel were not independent but rather influenced by the actual combination within the syllable (Farnetani, <xref ref-type="bibr" rid="B57">1997</xref>; Fitch et al., <xref ref-type="bibr" rid="B58">1997</xref>), as shown by a spectral analysis of the syllables&#x02019; frequency spectrum and subsequent mixed-model analyses of variance with speaker as a random factor: The consonantal burst&#x00027;s center frequency was on average 300 Hz higher if followed by a front vowel with high F2 (i.e., [i], [e]) than if followed by a low-F2 back vowel ([u], [o]; <italic>F</italic>(1,3) = 13.96, <italic>p</italic> &#x0003C; 0.033). Conversely, the vowel&#x00027;s F2 frequency varied mildly with the preceding consonant, with the typically low F2 in the back vowels [u], [o] being on average 90 Hz higher if preceded by the alveolar [d] than by the velar [g] (<italic>F</italic>(1,3) = 8.04, <italic>p</italic> &#x0003C; 0.065).</p>
<fig id="FA1" position="float">
<label>Figure A1</label>
<caption><p><bold>(A)</bold> Exemplary single-subject activations in the <italic>sound</italic>&#x02009;&#x0003E;&#x02009;<italic>silence</italic> contrast, thresholded at <italic>p</italic> &#x0003C; 0.001 at the voxel level, using a Wavelet SPM analysis of individual, non-smoothed, native-space image time-series; displayed on individual T1-weighted structural images. <bold>(B)</bold> Examples of single-subject vowel (red) and stop (blue) classification performance. Voxels shown (and their immediate neighbors; see <xref ref-type="sec" rid="s1">Materials and Methods</xref>) could correctly classify vowel or stop category above chance (only voxels with performance &#x0003E;50% shown) in a given participant&#x00027;s data, when the classifier was trained on the 15 remaining participants.</p></caption>
<graphic xlink:href="fpsyg-01-00232-a001.tif"/>
</fig>
<fig id="FA2" position="float">
<label>Figure A2</label>
<caption><p><bold>Example of the classification procedure applied: a simplified sketch of stop consonant classification in one voxel patch</bold>. Main steps included selection of a voxel and its neighbors (&#x0201C;searchlight&#x0201D; approach, see text), using unsmoothed t-estimate vectors from each condition (versus baseline) and <italic>n</italic>&#x02009;&#x02212;&#x02009;1 independent subjects for training the classifier (upper panel). Testing and assessment of classification accuracy (here shown for the consonant) took place using the independent <italic>n</italic>th subject&#x00027;s data; the entire procedure at this voxel patch was repeated <italic>n</italic> times. The resulting average accuracies were bootstrapped (1000 repetitions) to obtain 95% confidence limit estimates, which, in a last step, were also corrected for multiple comparisons (false coverage statement rate, FCR, see text; dashed line in bottom panel). A voxel patch was termed &#x0201C;accurate&#x0201D; in classification only when its FCR-corrected confidence limit did not cover the 50% chance level.</p></caption>
<graphic xlink:href="fpsyg-01-00232-a002.tif"/>
</fig>
<fig id="FA3" position="float">
<label>Figure A3</label>
<caption><p><bold>(A)</bold> Overlay of classification results on the probability map of primary auditory cortex, using the labels of Morosan et al., <xref ref-type="bibr" rid="B26">2001</xref> (TE 1.2&#x02013;TE 1.0). Notably, TE 1.2 appears relatively spared by voxels that allow significant across-subjects classification of vowel or consonant category. <bold>(B)</bold> Illustration of the regions of interest used. A simple selection procedure was applied: voxels that fell within the probabilistic bounds of primary auditory cortex, following Rademacher et al., <xref ref-type="bibr" rid="B39">2001</xref>, defined lPAC and rPAC; regions anterior (lANT, rANT) and posterior (lPOST, rPOST) to these were defined accordingly, as were voxels within the posterior&#x02013;anterior bounds of PAC but more lateral (lMID, rMID).</p></caption>
<graphic xlink:href="fpsyg-01-00232-a003.tif"/>
</fig>
</app>
</app-group>
<ack><p>This study was supported by research grants from the National Science Foundation (BCS 0519127 and OISE-0730255; Josef P. Rauschecker) and from the German Research Foundation (DFG, SFB 471; University of Konstanz), and by a post-doctoral grant from the Baden-W&#x000FC;rttemberg Foundation, Germany (Jonas Obleser). Jonas Obleser is currently based at the Max Planck Institute for Human Cognitive and Brain Sciences, Leipzig, and is funded through the Max Planck Society, Germany. The authors are grateful to Laura E. Girton (Georgetown), Carsten Eulitz and Aditi Lahiri (Konstanz), Bj&#x000F6;rn Herrmann (Leipzig), and Angela D. Friederici (Leipzig) for their help and comments at various stages of this project. Elia Formisano and Lee Miller helped considerably improve this manuscript with their constructive suggestions.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Belin</surname> <given-names>P.</given-names></name> <name><surname>Zatorre</surname> <given-names>R. J.</given-names></name> <name><surname>Lafaille</surname> <given-names>P.</given-names></name> <name><surname>Ahad</surname> <given-names>P.</given-names></name> <name><surname>Pike</surname> <given-names>B.</given-names></name></person-group> (<year>2000</year>). <article-title>Voice-selective areas in human auditory cortex</article-title>. <source>Nature</source> <volume>403</volume>, <fpage>309</fpage>&#x02013;<lpage>312</lpage>.<pub-id pub-id-type="doi">10.1038/35002078</pub-id><pub-id pub-id-type="pmid">10659849</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Benjamini</surname> <given-names>Y.</given-names></name> <name><surname>Yekutieli</surname> <given-names>D.</given-names></name></person-group> (<year>2005</year>). <article-title>False discovery rate &#x02013; adjusted multiple confidence intervals for selected parameters</article-title>. <source>J. Am. Stat. Assoc.</source> <volume>100</volume>, <fpage>71</fpage>&#x02013;<lpage>81</lpage>.<pub-id pub-id-type="doi">10.1198/016214504000001907</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Binder</surname> <given-names>J. R.</given-names></name> <name><surname>Frost</surname> <given-names>J. A.</given-names></name> <name><surname>Hammeke</surname> <given-names>T. A.</given-names></name> <name><surname>Bellgowan</surname> <given-names>P. S.</given-names></name> <name><surname>Springer</surname> <given-names>J. A.</given-names></name> <name><surname>Kaufman</surname> <given-names>J. N.</given-names></name> <name><surname>Possing</surname> <given-names>E. T.</given-names></name></person-group> (<year>2000</year>). <article-title>Human temporal lobe activation by speech and nonspeech sounds</article-title>. <source>Cereb. Cortex</source> <volume>10</volume>, <fpage>512</fpage>&#x02013;<lpage>528</lpage>.<pub-id pub-id-type="doi">10.1093/cercor/10.5.512</pub-id><pub-id pub-id-type="pmid">10847601</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Binder</surname> <given-names>J. R.</given-names></name> <name><surname>Liebenthal</surname> <given-names>E.</given-names></name> <name><surname>Possing</surname> <given-names>E. T.</given-names></name> <name><surname>Medler</surname> <given-names>D. A.</given-names></name> <name><surname>Ward</surname> <given-names>B. D.</given-names></name></person-group> (<year>2004</year>). <article-title>Neural correlates of sensory and decision processes in auditory object identification</article-title>. <source>Nat. Neurosci.</source> <volume>7</volume>, <fpage>295</fpage>&#x02013;<lpage>301</lpage>.<pub-id pub-id-type="doi">10.1038/nn1198</pub-id><pub-id pub-id-type="pmid">14966525</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Blumstein</surname> <given-names>S. E.</given-names></name> <name><surname>Stevens</surname> <given-names>K. N.</given-names></name></person-group> (<year>1980</year>). <article-title>Perceptual invariance and onset spectra for stop consonants in different vowel environments</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>67</volume>, <fpage>648</fpage>&#x02013;<lpage>662</lpage>.<pub-id pub-id-type="doi">10.1121/1.383890</pub-id><pub-id pub-id-type="pmid">7358906</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brennan</surname> <given-names>J.</given-names></name> <name><surname>Nir</surname> <given-names>Y.</given-names></name> <name><surname>Hasson</surname> <given-names>U.</given-names></name> <name><surname>Malach</surname> <given-names>R.</given-names></name> <name><surname>Heeger</surname> <given-names>D. J.</given-names></name> <name><surname>Pylkkanen</surname> <given-names>L.</given-names></name></person-group> (<year>2010</year>). <article-title>Syntactic structure building in the anterior temporal lobe during natural story listening</article-title>. <source>Brain Lang.</source> [Epub ahead of print].<pub-id pub-id-type="doi">10.1016/j.bandl.2010.04.002</pub-id><pub-id pub-id-type="pmid">20472279</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Caramazza</surname> <given-names>A.</given-names></name> <name><surname>Chialant</surname> <given-names>D.</given-names></name> <name><surname>Capasso</surname> <given-names>R.</given-names></name> <name><surname>Miceli</surname> <given-names>G.</given-names></name></person-group> (<year>2000</year>). <article-title>Separable processing of consonants and vowels</article-title>. <source>Nature</source> <volume>403</volume>, <fpage>428</fpage>&#x02013;<lpage>430</lpage>.<pub-id pub-id-type="doi">10.1038/35000206</pub-id><pub-id pub-id-type="pmid">10667794</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Davis</surname> <given-names>M. H.</given-names></name> <name><surname>Johnsrude</surname> <given-names>I. S.</given-names></name></person-group> (<year>2003</year>). <article-title>Hierarchical processing in spoken language comprehension</article-title>. <source>J. Neurosci.</source> <volume>23</volume>, <fpage>3423</fpage>&#x02013;<lpage>3431</lpage>.<pub-id pub-id-type="pmid">12716950</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>De Martino</surname> <given-names>F.</given-names></name> <name><surname>Valente</surname> <given-names>G.</given-names></name> <name><surname>Staeren</surname> <given-names>N.</given-names></name> <name><surname>Ashburner</surname> <given-names>J.</given-names></name> <name><surname>Goebel</surname> <given-names>R.</given-names></name> <name><surname>Formisano</surname> <given-names>E.</given-names></name></person-group> (<year>2008</year>). <article-title>Combining multivariate voxel selection and Support Vector Machines for mapping and classification of fMRI spatial patterns</article-title>. <source>Neuroimage</source> <volume>43</volume>, <fpage>44</fpage>&#x02013;<lpage>58</lpage>.<pub-id pub-id-type="doi">10.1016/j.neuroimage.2008.06.037</pub-id><pub-id pub-id-type="pmid">18672070</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Engineer</surname> <given-names>C. T.</given-names></name> <name><surname>Perez</surname> <given-names>C. A.</given-names></name> <name><surname>Chen</surname> <given-names>Y. H.</given-names></name> <name><surname>Carraway</surname> <given-names>R. S.</given-names></name> <name><surname>Reed</surname> <given-names>A. C.</given-names></name> <name><surname>Shetake</surname> <given-names>J. A.</given-names></name> <name><surname>Jakkamsetti</surname> <given-names>V.</given-names></name> <name><surname>Chang</surname> <given-names>K. Q.</given-names></name> <name><surname>Kilgard</surname> <given-names>M. P.</given-names></name></person-group> (<year>2008</year>). <article-title>Cortical activity patterns predict speech discrimination ability</article-title>. <source>Nat. Neurosci.</source> <volume>11</volume>, <fpage>603</fpage>&#x02013;<lpage>608</lpage>.<pub-id pub-id-type="doi">10.1038/nn.2109</pub-id><pub-id pub-id-type="pmid">18425123</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Formisano</surname> <given-names>E.</given-names></name> <name><surname>De Martino</surname> <given-names>F.</given-names></name> <name><surname>Bonte</surname> <given-names>M.</given-names></name> <name><surname>Goebel</surname> <given-names>R.</given-names></name></person-group> (<year>2008</year>). <article-title>&#x0201C;Who&#x0201D; is saying &#x0201C;what&#x0201D;? Brain-based decoding of human voice and speech</article-title>. <source>Science</source> <volume>322</volume>, <fpage>970</fpage>&#x02013;<lpage>973</lpage>.<pub-id pub-id-type="doi">10.1126/science.1164318</pub-id><pub-id pub-id-type="pmid">18988858</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Friederici</surname> <given-names>A. D.</given-names></name> <name><surname>Meyer</surname> <given-names>M.</given-names></name> <name><surname>von Cramon</surname> <given-names>D. Y.</given-names></name></person-group> (<year>2000</year>). <article-title>Auditory language comprehension: an event-related fMRI study on the processing of syntactic and lexical information</article-title>. <source>Brain Lang.</source> <volume>74</volume>, <fpage>289</fpage>&#x02013;<lpage>300</lpage>.<pub-id pub-id-type="doi">10.1006/brln.2000.2313</pub-id><pub-id pub-id-type="pmid">10950920</pub-id></citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Genovese</surname> <given-names>C. R.</given-names></name> <name><surname>Lazar</surname> <given-names>N. A.</given-names></name> <name><surname>Nichols</surname> <given-names>T.</given-names></name></person-group> (<year>2002</year>). <article-title>Thresholding of statistical maps in functional neuroimaging using the false discovery rate</article-title>. <source>Neuroimage</source> <volume>15</volume>, <fpage>870</fpage>&#x02013;<lpage>878</lpage>.<pub-id pub-id-type="doi">10.1006/nimg.2001.1037</pub-id><pub-id pub-id-type="pmid">11906227</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Haxby</surname> <given-names>J. V.</given-names></name> <name><surname>Gobbini</surname> <given-names>M. I.</given-names></name> <name><surname>Furey</surname> <given-names>M. L.</given-names></name> <name><surname>Ishai</surname> <given-names>A.</given-names></name> <name><surname>Schouten</surname> <given-names>J. L.</given-names></name> <name><surname>Pietrini</surname> <given-names>P.</given-names></name></person-group> (<year>2001</year>). <article-title>Distributed and overlapping representations of faces and objects in ventral temporal cortex</article-title>. <source>Science</source> <volume>293</volume>, <fpage>2425</fpage>&#x02013;<lpage>2430</lpage>.<pub-id pub-id-type="doi">10.1126/science.1063736</pub-id><pub-id pub-id-type="pmid">11577229</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Haynes</surname> <given-names>J. D.</given-names></name></person-group> (<year>2009</year>). <article-title>Decoding visual consciousness from human brain signals</article-title>. <source>Trends Cogn. Sci.</source> <volume>13</volume>, <fpage>194</fpage>&#x02013;<lpage>202</lpage>.<pub-id pub-id-type="doi">10.1016/j.tics.2009.02.004</pub-id><pub-id pub-id-type="pmid">19375378</pub-id></citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Haynes</surname> <given-names>J. D.</given-names></name> <name><surname>Rees</surname> <given-names>G.</given-names></name></person-group> (<year>2005</year>). <article-title>Predicting the orientation of invisible stimuli from activity in human primary visual cortex</article-title>. <source>Nat. Neurosci.</source> <volume>8</volume>, <fpage>686</fpage>&#x02013;<lpage>691</lpage>.<pub-id pub-id-type="doi">10.1038/nn1445</pub-id><pub-id pub-id-type="pmid">15852013</pub-id></citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Herrmann</surname> <given-names>B.</given-names></name> <name><surname>Obleser</surname> <given-names>J.</given-names></name> <name><surname>Kalberlah</surname> <given-names>C.</given-names></name> <name><surname>Haynes</surname> <given-names>J. D.</given-names></name> <name><surname>Friederici</surname> <given-names>A. D.</given-names></name></person-group> (in press). <article-title>Dissociable neural imprints of perception and grammar in auditory functional imaging</article-title>. <source>Hum. Brain Mapp</source>.</citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hickok</surname> <given-names>G.</given-names></name> <name><surname>Poeppel</surname> <given-names>D.</given-names></name></person-group> (<year>2007</year>). <article-title>The cortical organization of speech processing</article-title>. <source>Nat. Rev. Neurosci.</source> <volume>8</volume>, <fpage>393</fpage>&#x02013;<lpage>402</lpage>.<pub-id pub-id-type="doi">10.1038/nrn2113</pub-id><pub-id pub-id-type="pmid">17431404</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Humphries</surname> <given-names>C.</given-names></name> <name><surname>Liebenthal</surname> <given-names>E.</given-names></name> <name><surname>Binder</surname> <given-names>J. R.</given-names></name></person-group> (<year>2010</year>). <article-title>Tonotopic organization of human auditory cortex</article-title>. <source>Neuroimage</source> <volume>50</volume>, <fpage>1202</fpage>&#x02013;<lpage>1211</lpage>.<pub-id pub-id-type="doi">10.1016/j.neuroimage.2010.01.046</pub-id><pub-id pub-id-type="pmid">20096790</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kriegeskorte</surname> <given-names>N.</given-names></name> <name><surname>Bandettini</surname> <given-names>P.</given-names></name></person-group> (<year>2007</year>). <article-title>Combining the tools: activation-and information-based fMRI analysis</article-title>. <source>Neuroimage</source> <volume>38</volume>, <fpage>666</fpage>&#x02013;<lpage>668</lpage>.<pub-id pub-id-type="doi">10.1016/j.neuroimage.2007.06.030</pub-id><pub-id pub-id-type="pmid">17976583</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kriegeskorte</surname> <given-names>N.</given-names></name> <name><surname>Goebel</surname> <given-names>R.</given-names></name> <name><surname>Bandettini</surname> <given-names>P.</given-names></name></person-group> (<year>2006</year>). <article-title>Information-based functional brain mapping</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>103</volume>, <fpage>3863</fpage>&#x02013;<lpage>3868</lpage>.<pub-id pub-id-type="doi">10.1073/pnas.0600244103</pub-id><pub-id pub-id-type="pmid">16537458</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lahiri</surname> <given-names>A.</given-names></name> <name><surname>Gewirth</surname> <given-names>L.</given-names></name> <name><surname>Blumstein</surname> <given-names>S. E.</given-names></name></person-group> (<year>1984</year>). <article-title>A reconsideration of acoustic invariance for place of articulation in diffuse stop consonants: evidence from a cross-language study</article-title>. <source>J. Acoust. Soc. Am.</source> <volume>76</volume>, <fpage>391</fpage>&#x02013;<lpage>404</lpage>.<pub-id pub-id-type="doi">10.1121/1.391580</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Leaver</surname> <given-names>A. M.</given-names></name> <name><surname>Rauschecker</surname> <given-names>J. P.</given-names></name></person-group> (<year>2010</year>). <article-title>Cortical representation of natural complex sounds: effects of acoustic features and auditory object category</article-title>. <source>J. Neurosci.</source> <volume>30</volume>, <fpage>7604</fpage>&#x02013;<lpage>7612</lpage>.<pub-id pub-id-type="doi">10.1523/JNEUROSCI.0296-10.2010</pub-id><pub-id pub-id-type="pmid">20519535</pub-id></citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liebenthal</surname> <given-names>E.</given-names></name> <name><surname>Binder</surname> <given-names>J. R.</given-names></name> <name><surname>Spitzer</surname> <given-names>S. M.</given-names></name> <name><surname>Possing</surname> <given-names>E. T.</given-names></name> <name><surname>Medler</surname> <given-names>D. A.</given-names></name></person-group> (<year>2005</year>). <article-title>Neural Substrates of Phonemic Perception</article-title>. <source>Cereb. Cortex</source> <volume>15</volume>, <fpage>1621</fpage>&#x02013;<lpage>1631</lpage>.<pub-id pub-id-type="doi">10.1093/cercor/bhi040</pub-id><pub-id pub-id-type="pmid">15703256</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Misaki</surname> <given-names>M.</given-names></name> <name><surname>Kim</surname> <given-names>Y.</given-names></name> <name><surname>Bandettini</surname> <given-names>P. A.</given-names></name> <name><surname>Kriegeskorte</surname> <given-names>N.</given-names></name></person-group> (<year>2010</year>). <article-title>Comparison of multivariate classifiers and response normalizations for pattern-information fMRI</article-title>. <source>Neuroimage</source> <volume>53</volume>, <fpage>103</fpage>&#x02013;<lpage>118</lpage>.<pub-id pub-id-type="doi">10.1016/j.neuroimage.2010.05.051</pub-id><pub-id pub-id-type="pmid">20580933</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Morosan</surname> <given-names>P.</given-names></name> <name><surname>Rademacher</surname> <given-names>J.</given-names></name> <name><surname>Schleicher</surname> <given-names>A.</given-names></name> <name><surname>Amunts</surname> <given-names>K.</given-names></name> <name><surname>Schormann</surname> <given-names>T.</given-names></name> <name><surname>Zilles</surname> <given-names>K.</given-names></name></person-group> (<year>2001</year>). <article-title>Human primary auditory cortex: cytoarchitectonic subdivisions and mapping into a spatial reference system</article-title>. <source>Neuroimage</source> <volume>13</volume>, <fpage>684</fpage>&#x02013;<lpage>701</lpage>.<pub-id pub-id-type="doi">10.1006/nimg.2000.0715</pub-id><pub-id pub-id-type="pmid">11305897</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Norman</surname> <given-names>K. A.</given-names></name> <name><surname>Polyn</surname> <given-names>S. M.</given-names></name> <name><surname>Detre</surname> <given-names>G. J.</given-names></name> <name><surname>Haxby</surname> <given-names>J. V.</given-names></name></person-group> (<year>2006</year>). <article-title>Beyond mind-reading: multi-voxel pattern analysis of fMRI data</article-title>. <source>Trends Cogn. Sci.</source> <volume>10</volume>, <fpage>424</fpage>&#x02013;<lpage>430</lpage>.<pub-id pub-id-type="doi">10.1016/j.tics.2006.07.005</pub-id><pub-id pub-id-type="pmid">16899397</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Obleser</surname> <given-names>J.</given-names></name> <name><surname>Boecker</surname> <given-names>H.</given-names></name> <name><surname>Drzezga</surname> <given-names>A.</given-names></name> <name><surname>Haslinger</surname> <given-names>B.</given-names></name> <name><surname>Hennenlotter</surname> <given-names>A.</given-names></name> <name><surname>Roettinger</surname> <given-names>M.</given-names></name> <name><surname>Eulitz</surname> <given-names>C.</given-names></name> <name><surname>Rauschecker</surname> <given-names>J. P.</given-names></name></person-group> (<year>2006</year>). <article-title>Vowel sound extraction in anterior superior temporal cortex</article-title>. <source>Hum. Brain Mapp.</source> <volume>27</volume>, <fpage>562</fpage>&#x02013;<lpage>571</lpage>.<pub-id pub-id-type="doi">10.1002/hbm.20201</pub-id><pub-id pub-id-type="pmid">16281283</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Obleser</surname> <given-names>J.</given-names></name> <name><surname>Eisner</surname> <given-names>F.</given-names></name></person-group> (<year>2009</year>). <article-title>Pre-lexical abstraction of speech in the auditory cortex</article-title>. <source>Trends Cogn. Sci.</source> <volume>13</volume>, <fpage>14</fpage>&#x02013;<lpage>19</lpage>.<pub-id pub-id-type="doi">10.1016/j.tics.2008.09.005</pub-id><pub-id pub-id-type="pmid">19070534</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Obleser</surname> <given-names>J.</given-names></name> <name><surname>Eisner</surname> <given-names>F.</given-names></name> <name><surname>Kotz</surname> <given-names>S. A.</given-names></name></person-group> (<year>2008</year>). <article-title>Bilateral speech comprehension reflects differential sensitivity to spectral and temporal features</article-title>. <source>J. Neurosci.</source> <volume>28</volume>, <fpage>8116</fpage>&#x02013;<lpage>8123</lpage>.<pub-id pub-id-type="doi">10.1523/JNEUROSCI.1290-08.2008</pub-id><pub-id pub-id-type="pmid">18685036</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Obleser</surname> <given-names>J.</given-names></name> <name><surname>Elbert</surname> <given-names>T.</given-names></name> <name><surname>Eulitz</surname> <given-names>C.</given-names></name></person-group> (<year>2004a</year>). <article-title>Attentional influences on functional mapping of speech sounds in human auditory cortex</article-title>. <source>BMC Neurosci.</source> <volume>5</volume>, <fpage>24</fpage>. <pub-id pub-id-type="doi">10.1186/1471-2202-5-24</pub-id></citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Obleser</surname> <given-names>J.</given-names></name> <name><surname>Lahiri</surname> <given-names>A.</given-names></name> <name><surname>Eulitz</surname> <given-names>C.</given-names></name></person-group> (<year>2004b</year>). <article-title>Magnetic brain response mirrors extraction of phonological features from spoken vowels</article-title>. <source>J. Cogn. Neurosci.</source> <volume>16</volume>, <fpage>31</fpage>&#x02013;<lpage>39</lpage>.<pub-id pub-id-type="doi">10.1162/089892904322755539</pub-id></citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Obleser</surname> <given-names>J.</given-names></name> <name><surname>Lahiri</surname> <given-names>A.</given-names></name> <name><surname>Eulitz</surname> <given-names>C.</given-names></name></person-group> (<year>2003</year>). <article-title>Auditory-evoked magnetic field codes place of articulation in timing and topography around 100 milliseconds post syllable onset</article-title>. <source>Neuroimage</source> <volume>20</volume>, <fpage>1839</fpage>&#x02013;<lpage>1847</lpage>.<pub-id pub-id-type="doi">10.1016/j.neuroimage.2003.07.019</pub-id><pub-id pub-id-type="pmid">14642493</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Okada</surname> <given-names>K.</given-names></name> <name><surname>Rong</surname> <given-names>F.</given-names></name> <name><surname>Venezia</surname> <given-names>J.</given-names></name> <name><surname>Matchin</surname> <given-names>W.</given-names></name> <name><surname>Hsieh</surname> <given-names>I. H.</given-names></name> <name><surname>Saberi</surname> <given-names>K.</given-names></name> <name><surname>Serences</surname> <given-names>J. T.</given-names></name> <name><surname>Hickok</surname> <given-names>G.</given-names></name></person-group> (<year>2010</year>). <article-title>Hierarchical organization of human auditory cortex: evidence from acoustic invariance in the response to intelligible speech</article-title>. <source>Cereb. Cortex.</source> [Epub ahead of print].<pub-id pub-id-type="doi">10.1093/cercor/bhp318</pub-id><pub-id pub-id-type="pmid">20100898</pub-id></citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Petkov</surname> <given-names>C. I.</given-names></name> <name><surname>Kayser</surname> <given-names>C.</given-names></name> <name><surname>Steudel</surname> <given-names>T.</given-names></name> <name><surname>Whittingstall</surname> <given-names>K.</given-names></name> <name><surname>Augath</surname> <given-names>M.</given-names></name> <name><surname>Logothetis</surname> <given-names>N. K.</given-names></name></person-group> (<year>2008</year>). <article-title>A voice region in the monkey brain</article-title>. <source>Nat. Neurosci.</source> <volume>11</volume>, <fpage>367</fpage>&#x02013;<lpage>374</lpage>.<pub-id pub-id-type="doi">10.1038/nn2043</pub-id><pub-id pub-id-type="pmid">18264095</pub-id></citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Petkov</surname> <given-names>C. I.</given-names></name> <name><surname>Logothetis</surname> <given-names>N. K.</given-names></name> <name><surname>Obleser</surname> <given-names>J.</given-names></name></person-group> (<year>2009</year>). <article-title>Where are the human speech and voice regions, and do other animals have anything like them?</article-title> <source>Neuroscientist</source> <volume>15</volume>, <fpage>419</fpage>&#x02013;<lpage>429</lpage>.<pub-id pub-id-type="doi">10.1177/1073858408326430</pub-id><pub-id pub-id-type="pmid">19516047</pub-id></citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Poeppel</surname> <given-names>D.</given-names></name></person-group> (<year>2003</year>). <article-title>The analysis of speech in different temporal integration windows: cerebral lateralization as &#x02018;asymmetric sampling in time&#x02019;</article-title>. <source>Speech Commun.</source> <volume>41</volume>, <fpage>245</fpage>&#x02013;<lpage>255</lpage>.<pub-id pub-id-type="doi">10.1016/S0167-6393(02)00107-3</pub-id></citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Poremba</surname> <given-names>A.</given-names></name> <name><surname>Malloy</surname> <given-names>M.</given-names></name> <name><surname>Saunders</surname> <given-names>R. C.</given-names></name> <name><surname>Carson</surname> <given-names>R. E.</given-names></name> <name><surname>Herscovitch</surname> <given-names>P.</given-names></name> <name><surname>Mishkin</surname> <given-names>M.</given-names></name></person-group> (<year>2004</year>). <article-title>Species-specific calls evoke asymmetric activity in the monkey&#x00027;s temporal poles</article-title>. <source>Nature</source> <volume>427</volume>, <fpage>448</fpage>&#x02013;<lpage>451</lpage>.<pub-id pub-id-type="doi">10.1038/nature02268</pub-id><pub-id pub-id-type="pmid">14749833</pub-id></citation></ref>
<ref id="B39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rademacher</surname> <given-names>J.</given-names></name> <name><surname>Morosan</surname> <given-names>P.</given-names></name> <name><surname>Schormann</surname> <given-names>T.</given-names></name> <name><surname>Schleicher</surname> <given-names>A.</given-names></name> <name><surname>Werner</surname> <given-names>C.</given-names></name> <name><surname>Freund</surname> <given-names>H. J.</given-names></name> <name><surname>Zilles</surname> <given-names>K.</given-names></name></person-group> (<year>2001</year>). <article-title>Probabilistic mapping and volume measurement of human primary auditory cortex</article-title>. <source>Neuroimage</source> <volume>13</volume>, <fpage>669</fpage>&#x02013;<lpage>683</lpage>.<pub-id pub-id-type="doi">10.1006/nimg.2000.0714</pub-id><pub-id pub-id-type="pmid">11305896</pub-id></citation></ref>
<ref id="B40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Raizada</surname> <given-names>R. D.</given-names></name> <name><surname>Poldrack</surname> <given-names>R. A.</given-names></name></person-group> (<year>2007</year>). <article-title>Selective amplification of stimulus differences during categorical processing of speech</article-title>. <source>Neuron</source> <volume>56</volume>, <fpage>726</fpage>&#x02013;<lpage>740</lpage>.<pub-id pub-id-type="doi">10.1016/j.neuron.2007.11.001</pub-id><pub-id pub-id-type="pmid">18031688</pub-id></citation></ref>
<ref id="B41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rauschecker</surname> <given-names>J. P.</given-names></name> <name><surname>Scott</surname> <given-names>S. K.</given-names></name></person-group> (<year>2009</year>). <article-title>Maps and streams in the auditory cortex: nonhuman primates illuminate human speech processing</article-title>. <source>Nat. Neurosci.</source> <volume>12</volume>, <fpage>718</fpage>&#x02013;<lpage>724</lpage>.<pub-id pub-id-type="doi">10.1038/nn.2331</pub-id><pub-id pub-id-type="pmid">19471271</pub-id></citation></ref>
<ref id="B42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Read</surname> <given-names>H. L.</given-names></name> <name><surname>Winer</surname> <given-names>J. A.</given-names></name> <name><surname>Schreiner</surname> <given-names>C. E.</given-names></name></person-group> (<year>2002</year>). <article-title>Functional architecture of auditory cortex</article-title>. <source>Curr. Opin. Neurobiol.</source> <volume>12</volume>, <fpage>433</fpage>&#x02013;<lpage>440</lpage>.<pub-id pub-id-type="doi">10.1016/S0959-4388(02)00342-2</pub-id><pub-id pub-id-type="pmid">12139992</pub-id></citation></ref>
<ref id="B43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rogalsky</surname> <given-names>C.</given-names></name> <name><surname>Hickok</surname> <given-names>G.</given-names></name></person-group> (<year>2009</year>). <article-title>Selective attention to semantic and syntactic features modulates sentence processing networks in anterior temporal cortex</article-title>. <source>Cereb. Cortex</source> <volume>19</volume>, <fpage>786</fpage>&#x02013;<lpage>796</lpage>.<pub-id pub-id-type="doi">10.1093/cercor/bhn126</pub-id><pub-id pub-id-type="pmid">18669589</pub-id></citation></ref>
<ref id="B44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sch&#x000F6;nwiesner</surname> <given-names>M.</given-names></name> <name><surname>R&#x000FC;bsamen</surname> <given-names>R.</given-names></name> <name><surname>von Cramon</surname> <given-names>D. Y.</given-names></name></person-group> (<year>2005</year>). <article-title>Hemispheric asymmetry for spectral and temporal processing in the human antero-lateral auditory belt cortex</article-title>. <source>Eur. J. Neurosci.</source> <volume>22</volume>, <fpage>1521</fpage>&#x02013;<lpage>1528</lpage>.<pub-id pub-id-type="doi">10.1111/j.1460-9568.2005.04315.x</pub-id><pub-id pub-id-type="pmid">16190905</pub-id></citation></ref>
<ref id="B45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schwartz</surname> <given-names>J.</given-names></name> <name><surname>Tallal</surname> <given-names>P.</given-names></name></person-group> (<year>1980</year>). <article-title>Rate of acoustic change may underlie hemispheric specialization for speech perception</article-title>. <source>Science</source> <volume>207</volume>, <fpage>1380</fpage>&#x02013;<lpage>1381</lpage>.<pub-id pub-id-type="pmid">7355297</pub-id></citation></ref>
<ref id="B46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Scott</surname> <given-names>S. K.</given-names></name> <name><surname>Blank</surname> <given-names>C. C.</given-names></name> <name><surname>Rosen</surname> <given-names>S.</given-names></name> <name><surname>Wise</surname> <given-names>R. J.</given-names></name></person-group> (<year>2000</year>). <article-title>Identification of a pathway for intelligible speech in the left temporal lobe</article-title>. <source>Brain</source> <volume>123</volume>, <fpage>2400</fpage>&#x02013;<lpage>2406</lpage>.<pub-id pub-id-type="doi">10.1093/brain/123.12.2400</pub-id><pub-id pub-id-type="pmid">11099443</pub-id></citation></ref>
<ref id="B47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shestakova</surname> <given-names>A.</given-names></name> <name><surname>Brattico</surname> <given-names>E.</given-names></name> <name><surname>Soloviev</surname> <given-names>A.</given-names></name> <name><surname>Klucharev</surname> <given-names>V.</given-names></name> <name><surname>Huotilainen</surname> <given-names>M.</given-names></name></person-group> (<year>2004</year>). <article-title>Orderly cortical representation of vowel categories presented by multiple exemplars</article-title>. <source>Brain Res. Cogn. Brain Res.</source> <volume>21</volume>, <fpage>342</fpage>&#x02013;<lpage>350</lpage>.<pub-id pub-id-type="doi">10.1016/j.cogbrainres.2004.06.011</pub-id><pub-id pub-id-type="pmid">15511650</pub-id></citation></ref>
<ref id="B48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Slotnick</surname> <given-names>S. D.</given-names></name> <name><surname>Moo</surname> <given-names>L. R.</given-names></name> <name><surname>Segal</surname> <given-names>J. B.</given-names></name> <name><surname>Hart</surname> <given-names>J.</given-names> <suffix>Jr.</suffix></name></person-group> (<year>2003</year>). <article-title>Distinct prefrontal cortex activity associated with item memory and source memory for visual shapes</article-title>. <source>Brain Res. Cogn. Brain Res.</source> <volume>17</volume>, <fpage>75</fpage>&#x02013;<lpage>82</lpage>.<pub-id pub-id-type="doi">10.1016/S0926-6410(03)00082-X</pub-id><pub-id pub-id-type="pmid">12763194</pub-id></citation></ref>
<ref id="B49"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Steinschneider</surname> <given-names>M.</given-names></name> <name><surname>Reser</surname> <given-names>D.</given-names></name> <name><surname>Schroeder</surname> <given-names>C. E.</given-names></name> <name><surname>Arezzo</surname> <given-names>J. C.</given-names></name></person-group> (<year>1995</year>). <article-title>Tonotopic organization of responses reflecting stop consonant place of articulation in primary auditory cortex (A1) of the monkey</article-title>. <source>Brain Res.</source> <volume>674</volume>, <fpage>147</fpage>&#x02013;<lpage>152</lpage>.<pub-id pub-id-type="doi">10.1016/0006-8993(95)00008-E</pub-id><pub-id pub-id-type="pmid">7773684</pub-id></citation></ref>
<ref id="B50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tian</surname> <given-names>B.</given-names></name> <name><surname>Reser</surname> <given-names>D.</given-names></name> <name><surname>Durham</surname> <given-names>A.</given-names></name> <name><surname>Kustov</surname> <given-names>A.</given-names></name> <name><surname>Rauschecker</surname> <given-names>J. P.</given-names></name></person-group> (<year>2001</year>). <article-title>Functional specialization in rhesus monkey auditory cortex</article-title>. <source>Science</source> <volume>292</volume>, <fpage>290</fpage>&#x02013;<lpage>293</lpage>.<pub-id pub-id-type="doi">10.1126/science.1058911</pub-id><pub-id pub-id-type="pmid">11303104</pub-id></citation></ref>
<ref id="B51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>X.</given-names></name></person-group> (<year>2000</year>). <article-title>On cortical coding of vocal communication sounds in primates</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A.</source> <volume>97</volume>, <fpage>11843</fpage>&#x02013;<lpage>11849</lpage>.<pub-id pub-id-type="doi">10.1073/pnas.97.22.11843</pub-id><pub-id pub-id-type="pmid">11050218</pub-id></citation></ref>
<ref id="B52"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>X.</given-names></name> <name><surname>Merzenich</surname> <given-names>M. M.</given-names></name> <name><surname>Beitel</surname> <given-names>R.</given-names></name> <name><surname>Schreiner</surname> <given-names>C. E.</given-names></name></person-group> (<year>1995</year>). <article-title>Representation of a species-specific vocalization in the primary auditory cortex of the common marmoset: temporal and spectral characteristics</article-title>. <source>J. Neurophysiol.</source> <volume>74</volume>, <fpage>2685</fpage>&#x02013;<lpage>2706</lpage>.<pub-id pub-id-type="pmid">8747224</pub-id></citation></ref>
<ref id="B53"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Warren</surname> <given-names>J. D.</given-names></name> <name><surname>Jennings</surname> <given-names>A. R.</given-names></name> <name><surname>Griffiths</surname> <given-names>T. D.</given-names></name></person-group> (<year>2005</year>). <article-title>Analysis of the spectral envelope of sounds by the human brain</article-title>. <source>Neuroimage</source> <volume>24</volume>, <fpage>1052</fpage>&#x02013;<lpage>1057</lpage>.<pub-id pub-id-type="doi">10.1016/j.neuroimage.2004.10.031</pub-id><pub-id pub-id-type="pmid">15670682</pub-id></citation></ref>
<ref id="B54"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wessinger</surname> <given-names>C. M.</given-names></name> <name><surname>VanMeter</surname> <given-names>J.</given-names></name> <name><surname>Tian</surname> <given-names>B.</given-names></name> <name><surname>Van Lare</surname> <given-names>J.</given-names></name> <name><surname>Pekar</surname> <given-names>J.</given-names></name> <name><surname>Rauschecker</surname> <given-names>J. P.</given-names></name></person-group> (<year>2001</year>). <article-title>Hierarchical organization of the human auditory cortex revealed by functional magnetic resonance imaging</article-title>. <source>J. Cogn. Neurosci.</source> <volume>13</volume>, <fpage>1</fpage>&#x02013;<lpage>7</lpage>.<pub-id pub-id-type="doi">10.1162/089892901564108</pub-id><pub-id pub-id-type="pmid">11224904</pub-id></citation></ref>
<ref id="B55"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zatorre</surname> <given-names>R. J.</given-names></name> <name><surname>Belin</surname> <given-names>P.</given-names></name></person-group> (<year>2001</year>). <article-title>Spectral and temporal processing in human auditory cortex</article-title>. <source>Cereb. Cortex</source> <volume>11</volume>, <fpage>946</fpage>&#x02013;<lpage>953</lpage>.<pub-id pub-id-type="doi">10.1093/cercor/11.10.946</pub-id><pub-id pub-id-type="pmid">11549617</pub-id></citation></ref>
<ref id="B56"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zatorre</surname> <given-names>R. J.</given-names></name> <name><surname>Bouffard</surname> <given-names>M.</given-names></name> <name><surname>Belin</surname> <given-names>P.</given-names></name></person-group> (<year>2004</year>). <article-title>Sensitivity to auditory object features in human temporal neocortex</article-title>. <source>J. Neurosci.</source> <volume>24</volume>, <fpage>3637</fpage>&#x02013;<lpage>3642</lpage>.<pub-id pub-id-type="doi">10.1523/JNEUROSCI.5458-03.2004</pub-id><pub-id pub-id-type="pmid">15071112</pub-id></citation></ref>
</ref-list>
<ref-list>
<title>References</title>
<ref id="B57"><citation citation-type="book"><person-group person-group-type="author"><name><surname>Farnetani</surname> <given-names>E.</given-names></name></person-group> (<year>1997</year>). <article-title>&#x0201C;Coarticulation and connected speech processes,&#x0201D;</article-title> in <source>The Handbook of Phonetic Sciences</source>, eds <person-group person-group-type="editor"><name><surname>Hardcastle</surname> <given-names>W. J.</given-names></name> <name><surname>Laver</surname> <given-names>J.</given-names></name></person-group> (<publisher-loc>Oxford</publisher-loc>: <publisher-name>Blackwell</publisher-name>), <fpage>371</fpage>&#x02013;<lpage>404</lpage>.</citation></ref>
<ref id="B58"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fitch</surname> <given-names>R. H.</given-names></name> <name><surname>Miller</surname> <given-names>S.</given-names></name> <name><surname>Tallal</surname> <given-names>P.</given-names></name></person-group> (<year>1997</year>). <article-title>Neurobiology of speech perception</article-title>. <source>Annu. Rev. Neurosci.</source> <volume>20</volume>, <fpage>331</fpage>&#x02013;<lpage>353</lpage>.<pub-id pub-id-type="doi">10.1146/annurev.neuro.20.1.331</pub-id><pub-id pub-id-type="pmid">9056717</pub-id></citation></ref>
</ref-list>
</back>
</article>