<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Neurosci.</journal-id>
<journal-title>Frontiers in Neuroscience</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Neurosci.</abbrev-journal-title>
<issn pub-type="epub">1662-453X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fnins.2021.665767</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>An Analytical Framework of Tonal and Rhythmic Hierarchy in Natural Music Using the Multivariate Temporal Response Function</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Leahy</surname> <given-names>Jasmine</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="author-notes" rid="fn002"><sup>&#x2020;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1102784/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Kim</surname> <given-names>Seung-Goo</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<xref ref-type="author-notes" rid="fn002"><sup>&#x2020;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/67302/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Wan</surname> <given-names>Jie</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1240385/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Overath</surname> <given-names>Tobias</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<xref ref-type="aff" rid="aff4"><sup>4</sup></xref>
<xref ref-type="corresp" rid="c002"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/46967/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Psychology and Neuroscience, Duke University</institution>, <addr-line>Durham, NC</addr-line>, <country>United States</country></aff>
<aff id="aff2"><sup>2</sup><institution>Department of Cognitive Sciences, University of California, Irvine</institution>, <addr-line>Irvine, CA</addr-line>, <country>United States</country></aff>
<aff id="aff3"><sup>3</sup><institution>Duke Institute for Brain Sciences, Duke University</institution>, <addr-line>Durham, NC</addr-line>, <country>United States</country></aff>
<aff id="aff4"><sup>4</sup><institution>Center for Cognitive Neuroscience, Duke University</institution>, <addr-line>Durham, NC</addr-line>, <country>United States</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Matthew Elliott Sachs, Columbia University, United States</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Dan Zhang, Tsinghua University, China; Koichi Yokosawa, Hokkaido University, Japan</p></fn>
<corresp id="c001">&#x002A;Correspondence: Seung-Goo Kim, <email>solleo@gmail.com</email></corresp>
<corresp id="c002">Tobias Overath, <email>t.overath@duke.edu</email></corresp>
<fn fn-type="other" id="fn002"><p><sup>&#x2020;</sup>These authors have contributed equally to this work and share first authorship</p></fn>
<fn fn-type="other" id="fn004"><p>This article was submitted to Auditory Cognitive Neuroscience, a section of the journal Frontiers in Neuroscience</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>16</day>
<month>07</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>15</volume>
<elocation-id>665767</elocation-id>
<history>
<date date-type="received">
<day>08</day>
<month>02</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>24</day>
<month>06</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2021 Leahy, Kim, Wan and Overath.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Leahy, Kim, Wan and Overath</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Even without formal training, humans experience a wide range of emotions in response to changes in musical features, such as tonality and rhythm, during music listening. While many studies have investigated how isolated elements of tonal and rhythmic properties are processed in the human brain, it remains unclear whether these findings with such controlled stimuli are generalizable to complex stimuli in the real world. In the current study, we present an analytical framework of a linearized encoding analysis based on a set of music information retrieval features to investigate the rapid cortical encoding of tonal and rhythmic hierarchies in natural music. We applied this framework to a public domain EEG dataset (OpenMIIR) to deconvolve overlapping EEG responses to various musical features in continuous music. In particular, the proposed framework investigated the EEG encoding of the following features: <italic>tonal stability</italic>, <italic>key clarity</italic>, <italic>beat</italic>, and <italic>meter</italic>. This analysis revealed a differential spatiotemporal neural encoding of <italic>beat</italic> and <italic>meter</italic>, but not of <italic>tonal stability</italic> and <italic>key clarity</italic>. The results demonstrate that this framework can uncover associations of ongoing brain activity with relevant musical features, which could be further extended to other relevant measures such as time-resolved emotional responses in future studies.</p>
</abstract>
<kwd-group>
<kwd>linearized encoding analysis</kwd>
<kwd>electroencephalography</kwd>
<kwd>tonal hierarchy</kwd>
<kwd>rhythmic hierarchy</kwd>
<kwd>naturalistic paradigm</kwd>
</kwd-group>
<contract-num rid="cn001">R21DC016386</contract-num>
<contract-sponsor id="cn001">National Institutes of Health<named-content content-type="fundref-id">10.13039/100000002</named-content></contract-sponsor>
<counts>
<fig-count count="5"/>
<table-count count="1"/>
<equation-count count="9"/>
<ref-count count="62"/>
<page-count count="13"/>
<word-count count="0"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1">
<title>Introduction</title>
<p>Music is a universal auditory experience known to evoke intense feelings. Even without musical training, humans not only connect to it on an emotional level but can also generate expectations as they listen to it (<xref ref-type="bibr" rid="B33">Koelsch et al., 2000</xref>). We gather clues from what we are listening to in real-time combined with internalized musical patterns, or schema, from our respective cultural settings to guess what will happen next, which ultimately results in a change in our emotions. Schemata consist of musical features, such as tonality (i.e., pitches and their relationship to one another) and rhythm. However, tonality has often been studied using heavily contrived chord progressions instead of more natural, original music in order to impose rigorous controls on the experiment (<xref ref-type="bibr" rid="B17">Fishman et al., 2001</xref>; <xref ref-type="bibr" rid="B41">Loui and Wessel, 2007</xref>; <xref ref-type="bibr" rid="B32">Koelsch and Jentschke, 2010</xref>). Likewise, beat perception studies have favored simplistic, isolated rhythms over complex patterns found in everyday music (<xref ref-type="bibr" rid="B52">Snyder and Large, 2005</xref>; <xref ref-type="bibr" rid="B18">Fujioka et al., 2009</xref>). Therefore, designs that take advantage of the multiple, complex features of natural music stimuli are needed to confirm the results of these experiments.</p>
<p>In order to devise a framework that can account for these complexities, we first considered how different musical features build up over the course of a piece of music. In everyday music, tonality and rhythm are constructed hierarchically, meaning some pitches in certain positions (e.g., in a bar) have more importance than others (<xref ref-type="bibr" rid="B37">Krumhansl and Shepard, 1979</xref>). One way that listeners assess this importance is <italic>via</italic> the temporal positions of pitches. Tones that occur at rhythmically critical moments in a piece allow us to more easily anticipate what we should hear next and when (<xref ref-type="bibr" rid="B35">Krumhansl, 1990</xref>; <xref ref-type="bibr" rid="B36">Krumhansl and Cuddy, 2010</xref>). This type of beat perception is considered hierarchical in the sense that it involves multiple layers of perception which interact with one another, namely beat and meter. Beat refers to the onset of every beat in a given measure, whereas meter refers to the importance of the beats relative to a given time signature (e.g., 4/4). Music listening has repeatedly been linked with activation of the motor cortex, in particular relating to anticipation of the beat (<xref ref-type="bibr" rid="B60">Zatorre et al., 2007</xref>; <xref ref-type="bibr" rid="B8">Chen et al., 2008</xref>; <xref ref-type="bibr" rid="B21">Gordon et al., 2018</xref>). The clarity of the beat matters during music perception as well; during moments of high beat saliency, functional connectivity increases from the basal ganglia and thalamus to the auditory and sensorimotor cortices and cerebellum, while low beat saliency correlates with increased connectivity between the auditory and motor cortices, indicating that we participate in an active search to find the beat when it becomes less predictable (<xref ref-type="bibr" rid="B57">Toiviainen et al., 2020</xref>). 
EEG studies, in particular, have shed light on how humans entrain beat and meter on both a micro-scale (e.g., milliseconds) (<xref ref-type="bibr" rid="B52">Snyder and Large, 2005</xref>; <xref ref-type="bibr" rid="B18">Fujioka et al., 2009</xref>) and macro-scale (e.g., years of genre-specific musical training) (<xref ref-type="bibr" rid="B4">Bianco et al., 2018</xref>). For example, it only requires a brief musical sequence to observe beta band activity (14&#x2013;30 Hz) that increases after each tone, then decreases, creating beta oscillations synchronized to the beat of the music (<xref ref-type="bibr" rid="B18">Fujioka et al., 2009</xref>). Gamma band activity (&#x223C;30&#x2013;60 Hz) also increases after each tone, even when a tone that was supposed to occur is omitted, suggesting that gamma oscillations represent an endogenous mechanism of beat anticipation (<xref ref-type="bibr" rid="B18">Fujioka et al., 2009</xref>). It was further found that phase-locked, evoked gamma band activity increases about 50 ms after tone onset and diminishes when tones are omitted, showing larger responses during accented beats vs. weak ones, which suggests a neural correlate for meter (<xref ref-type="bibr" rid="B52">Snyder and Large, 2005</xref>). Therefore, the aim of our study was to set up a continuous music framework that is not only able to detect encoding of beat and meter, but also able to distinguish between the two.</p>
<p>Tonality is another key component of real-life music listening. We learn what notes or chords will come next in a piece of music based, in part, on the statistical distribution, or frequency, of tones or sequences of tones (<xref ref-type="bibr" rid="B37">Krumhansl and Shepard, 1979</xref>). From these observations, <xref ref-type="bibr" rid="B36">Krumhansl and Cuddy (2010)</xref> derived the concept of tonal hierarchy, which describes the relative importance of tones in a musical context. By organizing tones in this way, humans assemble a psychological representation of the music based on tonality and rhythm. A few studies have attempted to develop multivariate frameworks that account for this prediction-driven, hierarchical nature of music. For example, <xref ref-type="bibr" rid="B12">Di Liberto et al. (2020)</xref> used EEG paired with continuous music stimuli to investigate the relative contributions of acoustic vs. melodic features of music to the cortical encoding of melodic expectation. However, they used monophonic melodies, rather than harmonic, complex music that we would hear in everyday life. They analyzed the EEG data with a useful tool for continuous stimuli, the Multivariate Temporal Response Function (mTRF) MATLAB toolbox, which maps stimulus features to EEG responses by estimating linear transfer functions (<xref ref-type="bibr" rid="B10">Crosse et al., 2016</xref>). <xref ref-type="bibr" rid="B56">Sturm et al. (2015)</xref> also used ridge regression with temporal embedding to calculate correlations between brain signal and music. Even though they used natural, complex piano music, they chose the power slope of the audio signal as a predictor, which is considered a basic acoustic measure that underlies more complex features such as beat and meter.</p>
<p>Building on the groundwork of these previous multivariate music analyses, we used the mTRF to analyze high-level tonal and rhythmic features of natural, continuous music stimuli extracted with the Music Information Retrieval (MIR) MATLAB toolbox (<xref ref-type="bibr" rid="B39">Lartillot and Toiviainen, 2007</xref>). The proposed framework aims to better understand how we process everyday music.</p>
<p>In an attempt to model Krumhansl&#x2019;s hierarchical organization of musical features, we also expanded on the features provided in the MIR toolbox to further enhance ecological validity. For example, <italic>key clarity</italic>, which measures how tonally similar a given frame of music is to a given key signature, has been used in several studies (<xref ref-type="bibr" rid="B2">Alluri et al., 2012</xref>; <xref ref-type="bibr" rid="B56">Sturm et al., 2015</xref>; <xref ref-type="bibr" rid="B7">Burunat et al., 2016</xref>), yet may not provide an accurate measurement of a musical event&#x2019;s tonality within the context of the entire musical excerpt. This motivated us to develop a novel feature called <italic>tonal stability</italic>, which contextualizes a particular musical event with respect to the tonal history thus far. <italic>Tonal stability</italic> quantifies the tonal hierarchy of a piece of music by taking the angular similarity between the <italic>key strength</italic> of a certain frame and the averaged <italic>key strength</italic> up until that frame. This allows us to determine how stable a musical event (or a frame) is within a given tonal hierarchy. In other words, it calculates how related the chord implied in an individual frame is to the overarching key, which is derived from a cumulative moving average. By continuously measuring local changes in tonal key centers with respect to the whole musical excerpt, we approximated the ongoing perception of tonal stability. To our knowledge, no prior study has developed such an analytical framework for combining MIR toolbox features with the mTRF to investigate how tonal and rhythmic features are encoded in the EEG signal while listening to natural music.</p>
<p>We applied our framework to a public domain EEG dataset, the Open Music Imagery Information Retrieval dataset (<xref ref-type="bibr" rid="B54">Stober, 2017</xref>), to test the differential cortical encoding of tonal and rhythmic hierarchies. Using model comparisons, we inferred the contribution of individual features in EEG prediction. We show novel ecological evidence confirming and expanding Krumhansl&#x2019;s theory on how frequency and placement of musical features affect our perception and predictions (<xref ref-type="bibr" rid="B36">Krumhansl and Cuddy, 2010</xref>).</p>
</sec>
<sec id="S2">
<title>Materials and Methods</title>
<p>The approach we used in the current study is known as linearized modeling of a sensory system (<xref ref-type="bibr" rid="B59">Wu et al., 2006</xref>), which has been successfully applied to M/EEG data (<xref ref-type="bibr" rid="B38">Lalor et al., 2006</xref>; <xref ref-type="bibr" rid="B11">Di Liberto et al., 2015</xref>; <xref ref-type="bibr" rid="B5">Brodbeck et al., 2018</xref>) as well as fMRI data (<xref ref-type="bibr" rid="B31">Kay et al., 2008</xref>; <xref ref-type="bibr" rid="B27">Huth et al., 2016</xref>) in response to naturalistic visual and auditory stimuli. The key idea of the approach is a linearization function (<xref ref-type="bibr" rid="B59">Wu et al., 2006</xref>), which captures the nonlinearity of stimulus-response mapping and provides an efficient parameterization of relevant aspects of a stimulus that can be linearly associated with its corresponding response. In this section, we will explain our linearization functions (i.e., musical features), the specifications of the analyzed public data, and practical details of the analysis, which was carried out using MATLAB (<ext-link ext-link-type="uri" xlink:href="https://scicrunch.org/resolver/RRID:SCR_001622">RRID:SCR_001622</ext-link>; R2020a) unless otherwise noted.</p>
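<p>The core of the linearized encoding approach can be illustrated in a few lines: a stimulus feature is expanded into time-lagged copies and ridge-regressed onto a response channel. The following is a minimal Python sketch of this idea (it is not the mTRF toolbox itself; the function names, toy signal, and regularization value are our own illustrative choices):</p>

```python
import numpy as np

def lagged_design(x, lags):
    """Expand a 1-D stimulus feature into time-lagged copies (one column per lag)."""
    n = len(x)
    X = np.zeros((n, len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j] = x[:n - lag]
        else:
            X[:n + lag, j] = x[-lag:]
    return X

def ridge_trf(x, y, lags, lam=1.0):
    """Estimate a temporal response function (TRF) by ridge regression."""
    X = lagged_design(x, lags)
    return np.linalg.solve(X.T @ X + lam * np.eye(X.shape[1]), X.T @ y)

# toy check: the "EEG" is the stimulus delayed by 2 samples plus noise
rng = np.random.default_rng(0)
x = rng.standard_normal(2000)
y = np.roll(x, 2) + 0.1 * rng.standard_normal(2000)
w = ridge_trf(x, y, lags=range(5), lam=1e-3)
print(int(np.argmax(w)))  # the largest TRF weight sits at the true 2-sample delay
```

<p>In the actual analysis, multiple feature channels are concatenated into one design matrix, and the regularization parameter is tuned by cross-validation; the sketch above shows only the single-feature case.</p>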
<sec id="S2.SS1">
<title>Musical Features</title>
<p>We considered a variety of features for the construction of our analytical models. The foundational feature in all models was the temporal <italic>envelope</italic> of the auditory stimulus, which contains low-level acoustic features such as amplitude variations. For tonal hierarchy, we used <italic>key clarity</italic> and <italic>tonal stability</italic> as our two additional features. For rhythmic hierarchy, we used the onsets of <italic>beats</italic> and their relative strengths within the given <italic>meter</italic>.</p>
<sec id="S2.SS1.SSS1">
<title>Acoustic Feature</title>
<p>A whole-spectrum <italic>envelope</italic> was calculated as the absolute value of the Hilbert transform of the musical signal. The envelope was down-sampled to the EEG&#x2019;s sampling rate after anti-aliasing low-pass filtering. This feature, describing whole-spectrum acoustic energy, served as a baseline for the other models, which added tonal and rhythmic features.</p>
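<p>As a brief Python/SciPy sketch of this computation (not the study&#x2019;s MATLAB code; the sampling rates and test signal are toy values with an integer ratio, not those of the dataset):</p>

```python
import numpy as np
from scipy.signal import hilbert, decimate

def broadband_envelope(audio, fs_audio, fs_eeg):
    """Broadband amplitude envelope resampled to the EEG sampling rate."""
    env = np.abs(hilbert(audio))       # magnitude of the analytic signal
    q = int(round(fs_audio / fs_eeg))  # integer decimation factor
    # FIR anti-aliasing low-pass filter, then downsample
    return decimate(env, q, ftype="fir")

fs_audio, fs_eeg = 4096, 64  # toy rates chosen so the ratio is an integer
t = np.arange(fs_audio) / fs_audio
# one second of a 440-Hz tone with slow amplitude modulation
audio = np.sin(2 * np.pi * 440 * t) * (1 + 0.5 * np.sin(2 * np.pi * 2 * t))
env = broadband_envelope(audio, fs_audio, fs_eeg)
print(len(env))  # 64: one second of envelope at the EEG rate
```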
</sec>
<sec id="S2.SS1.SSS2">
<title>Tonal Features</title>
<p>As for high-level tonal features, we computed <italic>key clarity</italic> and <italic>tonal stability</italic>. <italic>Key clarity</italic> was derived from the MIR toolbox<sup><xref ref-type="fn" rid="footnote1">1</xref></sup> (v1.7) function mirkeystrength, which computes a 24-dimensional vector of Pearson correlation coefficients, one for each of the 24 possible keys (12 major and 12 minor); this vector is called a <italic>key strength</italic> vector (<xref ref-type="bibr" rid="B20">G&#x00F3;mez, 2006</xref>). <italic>Key clarity</italic> is defined as the maximal correlation coefficient, which measures how strongly a certain key is implied in a given frame of interest (<xref ref-type="bibr" rid="B39">Lartillot and Toiviainen, 2007</xref>).</p>
<p>Our novel <italic>tonal stability</italic> feature was designed to contextualize the <italic>key strength</italic> with respect to the overall <italic>key strength</italic> of a given musical piece. It is computed with an angular similarity between the <italic>key strength</italic> vector of a single frame and a cumulative average of <italic>key strength</italic> vectors up until the adjacent previous frame as:</p>
<disp-formula id="S2.E1">
<label>(1)</label>
<mml:math id="M1">
<mml:mrow>
<mml:mrow>
<mml:mi mathvariant="normal">s</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi mathvariant="normal">t</mml:mi>
<mml:mo rspace="5.8pt" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo rspace="5.8pt">=</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mpadded width="+3.3pt">
<mml:mfrac>
<mml:mrow>
<mml:msup>
<mml:mi>cos</mml:mi>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>cos</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">&#x03B8;</mml:mi>
</mml:mrow>
<mml:mi mathvariant="normal">&#x03C0;</mml:mi>
</mml:mfrac>
</mml:mpadded>
</mml:mrow>
<mml:mo rspace="5.8pt">=</mml:mo>
<mml:mrow>
<mml:mn>1</mml:mn>
<mml:mo>-</mml:mo>
<mml:mrow>
<mml:mfrac>
<mml:mn>1</mml:mn>
<mml:mi mathvariant="normal">&#x03C0;</mml:mi>
</mml:mfrac>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mi>cos</mml:mi>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2061;</mml:mo>
<mml:mfrac>
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mtext mathvariant="bold">v</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi mathvariant="normal">t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo>&#x22C5;</mml:mo>
<mml:mover accent="true">
<mml:mtext mathvariant="bold">v</mml:mtext>
<mml:mo>&#x00AF;</mml:mo>
</mml:mover>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">t</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mo fence="true">||</mml:mo>
<mml:mrow>
<mml:mtext mathvariant="bold">v</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi mathvariant="normal">t</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo fence="true">||</mml:mo>
</mml:mrow>
<mml:mo>&#x22C5;</mml:mo>
<mml:mrow>
<mml:mo fence="true">||</mml:mo>
<mml:mrow>
<mml:mover accent="true">
<mml:mtext mathvariant="bold">v</mml:mtext>
<mml:mo>&#x00AF;</mml:mo>
</mml:mover>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">t</mml:mi>
<mml:mo>-</mml:mo>
<mml:mn>1</mml:mn>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo fence="true">||</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mfrac>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where s(t) is the <italic>tonal stability</italic> of the <italic>t</italic>-th frame, &#x03B8; is the angle between the two <italic>key strength</italic> vectors, <bold>v</bold>(t) is a <italic>key strength</italic> vector of the <italic>t</italic>-th frame, and <inline-formula><mml:math id="INEQ3"><mml:mrow><mml:mrow><mml:mover accent="true"><mml:mtext mathvariant="bold">v</mml:mtext><mml:mo>&#x00AF;</mml:mo></mml:mover><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mtext>j</mml:mtext><mml:mo rspace="5.8pt" stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo rspace="5.8pt">=</mml:mo><mml:mrow><mml:msubsup><mml:mo largeop="true" symmetric="true">&#x2211;</mml:mo><mml:mrow><mml:mpadded width="+3.3pt"><mml:mi>i</mml:mi></mml:mpadded><mml:mo rspace="5.8pt">=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>j</mml:mi></mml:msubsup><mml:mrow><mml:mrow><mml:mtext mathvariant="bold">v</mml:mtext><mml:mo>&#x2062;</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi>i</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>/</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:mrow></mml:mrow></mml:math></inline-formula> is a cumulative moving average of <italic>key strength</italic> vectors from the first to the <italic>j</italic>-th frame. The angular similarity is bounded between 0 and 1, inclusively (1 when two vectors are parallel, 0.5 when orthogonal, and 0 when opposite). Thus, <italic>tonal stability</italic> is also bounded between 0 and 1: it equals 0 when the key strength vectors point in opposite directions (i.e., the implied keys are maximally distant on the circle of fifths and thus share few common tones).</p>
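<p>Equation (1) can be sketched directly in Python/NumPy (an illustrative re-implementation, not the authors&#x2019; code; the 2-dimensional toy vectors below stand in for 24-dimensional <italic>key strength</italic> vectors):</p>

```python
import numpy as np

def angular_similarity(u, v):
    """1 - theta/pi, where theta is the angle between vectors u and v."""
    cos_t = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return 1.0 - np.arccos(np.clip(cos_t, -1.0, 1.0)) / np.pi

def tonal_stability(key_strengths):
    """Stability of each frame vs. the cumulative mean of the preceding frames.

    key_strengths: (n_frames, n_dims) array of key strength vectors.
    """
    s = np.full(len(key_strengths), np.nan)  # undefined for the first frame
    for t in range(1, len(key_strengths)):
        vbar = key_strengths[:t].mean(axis=0)  # cumulative moving average
        s[t] = angular_similarity(key_strengths[t], vbar)
    return s

# toy 2-D "key strength" vectors: parallel, then opposite, to the running mean
v = np.array([[1.0, 0.0], [1.0, 0.0], [-1.0, 0.0]])
s = tonal_stability(v)
print(s[1], s[2])  # 1.0 (parallel) and 0.0 (opposite)
```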
<p>Using the tonal hierarchy profile (<xref ref-type="bibr" rid="B37">Krumhansl and Shepard, 1979</xref>) as an ideal chromagram, which yields a maximal <italic>key strength</italic> of one (<xref ref-type="fig" rid="F1">Figure 1A</xref>), it can be shown that if a chromagram implies a distant key (e.g., C-major key in the F#-major key context), its <italic>tonal stability</italic> would be close to zero. A geometrical appreciation of the relations of <italic>key strength</italic> vectors can be gained from a low-dimensional projection using principal component analysis (PCA). The first two principal components explained 65% of the total variance of all <italic>key strength</italic> vectors. When the <italic>key strength</italic> vectors of the 12 major keys are projected to the 2-dimensional plane of the first two principal components (<xref ref-type="fig" rid="F1">Figure 1B</xref>), it becomes clear that the <italic>key strength</italic> vectors of C-major and F#-major are geometrically opposing. Therefore, the (high-dimensional) angular similarity between them would be close to zero (<xref ref-type="fig" rid="F1">Figure 1C</xref>, marked by an arrow; not exactly zero because of higher dimensions that are not visualized), which is our definition of the <italic>tonal stability</italic> feature. On the other hand, the <italic>key clarity</italic> can be seen as the maximal projection to any of the 24 possible dimensions (i.e., maximal intensity projection). Therefore, <italic>key clarity</italic> is constant regardless of context. In other words, the <italic>tonal stability</italic> quantifies how tonally stable a particular frame is within the context of the entire piece, whereas <italic>key clarity</italic> describes how strongly a tonal structure is implied in an absolute sense (see <xref ref-type="supplementary-material" rid="DS1">Supplementary Figure 1</xref> for an example comparison).</p>
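<p>This geometric opposition of C-major and F#-major can be reproduced in a short Python/NumPy sketch (not the MIR toolbox; the profile values are the commonly reported major-key tonal hierarchy ratings and should be treated as illustrative). It builds <italic>key strength</italic> vectors for the 12 major keys, projects them onto the first two principal components via SVD, and checks that the C-major and F#-major projections are mirror images:</p>

```python
import numpy as np

# Major-key tonal hierarchy profile (C major), used here as an ideal chromagram
MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                  2.52, 5.19, 2.39, 3.66, 2.29, 2.88])

def key_strength_major(chroma):
    """Correlate a 12-bin chromagram with all 12 transposed major-key profiles."""
    return np.array([np.corrcoef(chroma, np.roll(MAJOR, k))[0, 1]
                     for k in range(12)])

# key strength vectors of the 12 major-key profiles (12 x 12 matrix)
K = np.vstack([key_strength_major(np.roll(MAJOR, k)) for k in range(12)])

# PCA via SVD of the centered matrix; project onto the first two components
Kc = K - K.mean(axis=0)
U, S, Vt = np.linalg.svd(Kc, full_matrices=False)
pc = Kc @ Vt[:2].T

# C major (row 0) and F# major (row 6) land on opposite sides of the plane
print(np.allclose(pc[0], -pc[6], atol=1e-6))
```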
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>Key strength and tonal stability. <bold>(A)</bold> <italic>Key strength</italic> of the C-major profile (red) and F#-major profile (blue) are computed for all 24 target keys (i.e., Pearson correlation between profiles). The profiles were used as ideal chromagrams that yield a maximal <italic>key clarity</italic> of one. Upper cases represent major keys, and lower cases represent minor keys. <bold>(B)</bold> For geometrical intuition of <italic>tonal stability</italic>, the <italic>key strength</italic> vectors of 12 major keys are projected on the 2-dimensional plane of the first two principal components, which together explained 65% of the total variance. <bold>(C)</bold> <italic>Tonal stability</italic> of the C-major profile with respect to all 24 reference keys (i.e., in all contexts) are computed (i.e., the angular similarity between <italic>key strength</italic> vectors) and sorted in descending order. Note that <italic>key clarity</italic> of the C-major profile is always one while its <italic>tonal stability</italic> varies depending on the reference key (i.e., context).</p></caption>
<graphic xlink:href="fnins-15-665767-g001.tif"/>
</fig>
<p>The length of a time window to compute spectrograms should be long enough to cover the lower bound of pitch (i.e., 30 Hz; <xref ref-type="bibr" rid="B48">Pressnitzer et al., 2001</xref>) but not so long as to exceed the physiologically relevant spectral range. In the current work, we used a sparse encoding of tonal features based on the estimated beats and measures (see section &#x201C;Rhythmic Features&#x201D;). Specifically, at each beat (or measure), a time window was defined from the current to the next beat (or measure). For each time window, the spectrogram, cochleogram, and <italic>key strength</italic> vectors were estimated using the MIR function mirkeystrength, and the <italic>key clarity</italic> and <italic>tonal stability</italic> were calculated as described above. This sparse encoding approach is similar to assigning the &#x201C;semantic dissimilarity&#x201D; value of a word at its onset in a natural speech study, where N400-like temporal response functions (TRFs) were found (<xref ref-type="bibr" rid="B6">Broderick et al., 2018</xref>), and to modeling the melodic entropy at the onset of a note (<xref ref-type="bibr" rid="B12">Di Liberto et al., 2020</xref>). Previous studies have found an early component (i.e., ERAN; <xref ref-type="bibr" rid="B34">Koelsch et al., 2003</xref>) in response to violations within local tonal contexts and a late component (i.e., N400; <xref ref-type="bibr" rid="B61">Zhang et al., 2018</xref>) during more global contexts. Therefore, <italic>tonal stability</italic> was expected to be encoded within these latencies.</p>
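<p>The sparse encoding itself amounts to placing each window&#x2019;s feature value at the corresponding event onset in an otherwise zero-valued regressor. A minimal Python sketch (the onset indices and clarity values below are hypothetical, chosen only to illustrate the encoding):</p>

```python
import numpy as np

def sparse_feature(values, onsets, n_samples):
    """Impulse-coded feature: the feature value at each event onset, zeros elsewhere."""
    x = np.zeros(n_samples)
    x[np.asarray(onsets)] = values
    return x

# hypothetical beat onsets (in EEG samples) and per-beat key clarity values
onsets = [10, 74, 138, 202]
clarity = [0.81, 0.77, 0.90, 0.64]
x = sparse_feature(clarity, onsets, 256)
print(x[74], int(np.count_nonzero(x)))  # 0.77 4
```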
</sec>
<sec id="S2.SS1.SSS3">
<title>Rhythmic Features</title>
<p>As a low-level rhythmic feature, we used <italic>beats</italic> (<xref ref-type="bibr" rid="B22">Grahn and McAuley, 2009</xref>; <xref ref-type="bibr" rid="B54">Stober, 2017</xref>). <italic>Beats</italic> were extracted using the dynamic beat tracker in the Librosa library<sup><xref ref-type="fn" rid="footnote2">2</xref></sup> and were included in the shared public data. We modeled <italic>beats</italic> using a unit impulse function (i.e., 1&#x2019;s at beats, 0&#x2019;s otherwise).</p>
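<p>Because the beat times were already shipped with the public data, constructing this regressor reduces to converting beat times into a unit impulse train at the EEG sampling rate. A Python sketch (the beat times and sampling rate below are hypothetical toy values):</p>

```python
import numpy as np

def beat_impulses(beat_times, fs_eeg, n_samples):
    """Unit impulse function: 1 at the sample nearest each beat, 0 otherwise."""
    x = np.zeros(n_samples)
    idx = np.round(np.asarray(beat_times) * fs_eeg).astype(int)
    x[idx[idx < n_samples]] = 1.0
    return x

# hypothetical beat times (s) at 120 BPM, with EEG sampled at 64 Hz
beats = [0.0, 0.5, 1.0, 1.5]
x = beat_impulses(beats, fs_eeg=64, n_samples=64 * 2)
print(int(x.sum()), int(np.flatnonzero(x)[1]))  # 4 32
```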
<p>As a high-level rhythmic feature, we used <italic>meter</italic>, which was based on <italic>beats</italic>. We weighted the strength of each beat in a musical excerpt, according to a beat accent system that is most prevalent in Western classical music, by separating beats into three tiers: strong, middle, and weak (<xref ref-type="bibr" rid="B23">Grahn and Rowe, 2009</xref>; <xref ref-type="bibr" rid="B58">Vuust and Witek, 2014</xref>). A separate unit impulse function was created for each of the three tiers. Note that the tiers correspond to the strength of a beat, not its position (or phase) within a measure. The breakdown is as follows:</p>
<p>4/4 meter signature: beat 1=strong; beat 2=weak; beat 3=middle; and beat 4=weak.</p>
<p>3/4 meter signature: beat 1=strong; beat 2=weak; and beat 3=weak.</p>
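<p>The tier assignment above can be sketched as follows (an illustrative Python sketch; the dictionary structure, function names, and toy beat positions are our own, not part of the original analysis code):</p>

```python
import numpy as np

TIERS = {  # beat position within the measure (1-based) -> accent tier
    "4/4": {1: "strong", 2: "weak", 3: "middle", 4: "weak"},
    "3/4": {1: "strong", 2: "weak", 3: "weak"},
}

def meter_impulses(beat_samples, signature, n_samples):
    """One unit impulse function per accent tier (strong/middle/weak)."""
    tiers = TIERS[signature]
    bpb = len(tiers)  # beats per bar
    out = {name: np.zeros(n_samples) for name in ("strong", "middle", "weak")}
    for i, s in enumerate(beat_samples):
        out[tiers[i % bpb + 1]][s] = 1.0
    return out

# two hypothetical 4/4 measures, beats every 16 samples
x = meter_impulses([0, 16, 32, 48, 64, 80, 96, 112], "4/4", 128)
print(int(x["strong"].sum()), int(x["middle"].sum()), int(x["weak"].sum()))  # 2 2 4
```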
</sec>
</sec>
<sec id="S2.SS2">
<title>OpenMIIR Dataset</title>
<p>We used the public domain Open Music Imagery Information Retrieval Dataset available on GitHub<sup><xref ref-type="fn" rid="footnote3">3</xref></sup>, which is designed to facilitate music cognition research involving EEG and the extraction of musical features. Because we analyzed only a subset of the dataset, we summarize only the relevant materials and methods here. Complete details of the experimental procedure can be found in the original study (<xref ref-type="bibr" rid="B54">Stober, 2017</xref>).</p>
<sec id="S2.SS2.SSS1">
<title>Participants</title>
<p>Data was collected from ten participants. One participant was excluded from the dataset due to coughing and movement-related artifacts, resulting in a total of nine participants. Seven participants were female, and two were male. The average age of the participants was 23. Participants filled out a questionnaire asking about their musical playing and listening background. Seven out of the nine participants were musicians, which was defined as having engaged in a regular, daily practice of a musical instrument (including voice) for one or more years. The average number of years of daily musical practice was 5.4 years. The average number of formal years of musical training was 4.9 years.</p>
<p>Prior to the EEG recording, participants were asked to name the 12 stimuli of the experiment and to rate their familiarity with each. They were also asked to tap or clap along to the beat, and the researcher scored their accuracy: seven participants scored 100% and two scored 92%. All participants were familiar with at least 80% of the musical stimuli.</p>
</sec>
<sec id="S2.SS2.SSS2">
<title>Stimuli</title>
<p>The stimuli were 12 different, highly familiar musical excerpts ranging between 6.9 and 13.9 s in duration (mean 10.5 s). Exactly half of the songs had a 3/4 time signature, and the other half had a 4/4 time signature. <xref ref-type="table" rid="T1">Table 1</xref> lists the popular songs from which the stimuli were taken. The tonal features of these stimuli are shown in <xref ref-type="fig" rid="F2">Figure 2</xref>. The two features were not significantly correlated in any of the stimuli (minimum uncorrected <italic>p</italic> = 0.08).</p>
<table-wrap position="float" id="T1">
<label>TABLE 1</label>
<caption><p>Stimulus information and descriptive statistics of tonal features.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Stim#</td>
<td valign="top" align="center">Title</td>
<td valign="top" align="center">Duration (sec)</td>
<td valign="top" align="center">BPM</td>
<td valign="top" align="center">BPB</td>
<td valign="top" align="center">Key clarity (mean &#x00B1; SD)</td>
<td valign="top" align="center">Tonal stability (mean &#x00B1; SD)</td>
<td valign="top" align="center">Corr. (<italic>p</italic>-value)</td>
<td valign="top" align="center">Key clarity (mean &#x00B1; SD)</td>
<td valign="top" align="center">Tonal stability (mean &#x00B1; SD)</td>
<td valign="top" align="center">Corr. (<italic>p</italic>-value)</td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">1</td>
<td valign="top" align="center">Chim Chim Cheree (lyrics)</td>
<td valign="top" align="center">13.3</td>
<td valign="top" align="center">213</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">0.53 &#x00B1; 0.15</td>
<td valign="top" align="center">0.63 &#x00B1; 0.19</td>
<td valign="top" align="center">0.17 (0.26)</td>
<td valign="top" align="center">0.52 &#x00B1; 0.13</td>
<td valign="top" align="center">0.60 &#x00B1; 0.24</td>
<td valign="top" align="center">0.16 (0.57)</td>
</tr>
<tr>
<td valign="top" align="left">2</td>
<td valign="top" align="center">Take me out to the ballgame (lyrics)</td>
<td valign="top" align="center">7.7</td>
<td valign="top" align="center">189</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">0.56 &#x00B1; 0.14</td>
<td valign="top" align="center">0.63 &#x00B1; 0.18</td>
<td valign="top" align="center">0.19 (0.42)</td>
<td valign="top" align="center">0.59 &#x00B1; 0.15</td>
<td valign="top" align="center">0.60 &#x00B1; 0.29</td>
<td valign="top" align="center">0.64 (0.12)</td>
</tr>
<tr>
<td valign="top" align="left">3</td>
<td valign="top" align="center">Jingle Bells (lyrics)</td>
<td valign="top" align="center">9.7</td>
<td valign="top" align="center">200</td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">0.51 &#x00B1; 0.16</td>
<td valign="top" align="center">0.65 &#x00B1; 0.18</td>
<td valign="top" align="center">&#x2212;0.09 (0.66)</td>
<td valign="top" align="center">0.55 &#x00B1; 0.12</td>
<td valign="top" align="center">0.58 &#x00B1; 0.28</td>
<td valign="top" align="center">0.14 (0.73)</td>
</tr>
<tr>
<td valign="top" align="left">4</td>
<td valign="top" align="center">Mary Had a Little Lamb (lyrics)</td>
<td valign="top" align="center">11.6</td>
<td valign="top" align="center">160</td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">0.65 &#x00B1; 0.11</td>
<td valign="top" align="center">0.68 &#x00B1; 0.16</td>
<td valign="top" align="center">0.25 (0.19)</td>
<td valign="top" align="center">0.67 &#x00B1; 0.12</td>
<td valign="top" align="center">0.64 &#x00B1; 0.28</td>
<td valign="top" align="center">0.10 (0.81)</td>
</tr>
<tr>
<td valign="top" align="left">11</td>
<td valign="top" align="center">Chim Chim Cheree (no lyrics)</td>
<td valign="top" align="center">13.9</td>
<td valign="top" align="center">206</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">0.69 &#x00B1; 0.10</td>
<td valign="top" align="center">0.72 &#x00B1; 0.21</td>
<td valign="top" align="center">&#x2212;0.00 (1.00)</td>
<td valign="top" align="center">0.70 &#x00B1; 0.11</td>
<td valign="top" align="center">0.69 &#x00B1; 0.27</td>
<td valign="top" align="center">&#x2212;0.34 (0.22)</td>
</tr>
<tr>
<td valign="top" align="left">12</td>
<td valign="top" align="center">Take me out to the ballgame (no lyrics)</td>
<td valign="top" align="center">7.9</td>
<td valign="top" align="center">185</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">0.65 &#x00B1; 0.12</td>
<td valign="top" align="center">0.69 &#x00B1; 0.18</td>
<td valign="top" align="center">0.02 (0.95)</td>
<td valign="top" align="center">0.71 &#x00B1; 0.04</td>
<td valign="top" align="center">0.67 &#x00B1; 0.28</td>
<td valign="top" align="center">&#x2212;0.08 (0.85)</td>
</tr>
<tr>
<td valign="top" align="left">13</td>
<td valign="top" align="center">Jingle Bells (no lyrics)</td>
<td valign="top" align="center">9.0</td>
<td valign="top" align="center">200</td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">0.60 &#x00B1; 0.12</td>
<td valign="top" align="center">0.72 &#x00B1; 0.19</td>
<td valign="top" align="center">0.34 (0.08)</td>
<td valign="top" align="center">0.54 &#x00B1; 0.10</td>
<td valign="top" align="center">0.63 &#x00B1; 0.31</td>
<td valign="top" align="center">0.14 (0.76)</td>
</tr>
<tr>
<td valign="top" align="left">14</td>
<td valign="top" align="center">Mary Had a Little Lamb (no lyrics)</td>
<td valign="top" align="center">12.2</td>
<td valign="top" align="center">160</td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">0.76 &#x00B1; 0.09</td>
<td valign="top" align="center">0.81 &#x00B1; 0.16</td>
<td valign="top" align="center">0.15 (0.44)</td>
<td valign="top" align="center">0.71 &#x00B1; 0.10</td>
<td valign="top" align="center">0.78 &#x00B1; 0.32</td>
<td valign="top" align="center">&#x2212;0.33 (0.43)</td>
</tr>
<tr>
<td valign="top" align="left">21</td>
<td valign="top" align="center">Emperor Waltz</td>
<td valign="top" align="center">8.3</td>
<td valign="top" align="center">175</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">0.76 &#x00B1; 0.11</td>
<td valign="top" align="center">0.76 &#x00B1; 0.19</td>
<td valign="top" align="center">0.14 (0.54)</td>
<td valign="top" align="center">0.78 &#x00B1; 0.14</td>
<td valign="top" align="center">0.71 &#x00B1; 0.30</td>
<td valign="top" align="center">&#x2212;0.15 (0.72)</td>
</tr>
<tr>
<td valign="top" align="left">22</td>
<td valign="top" align="center">Harry Potter theme</td>
<td valign="top" align="center">16.0</td>
<td valign="top" align="center">166</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">0.67 &#x00B1; 0.16</td>
<td valign="top" align="center">0.68 &#x00B1; 0.26</td>
<td valign="top" align="center">&#x2212;0.03 (0.84)</td>
<td valign="top" align="center">0.72 &#x00B1; 0.13</td>
<td valign="top" align="center">0.63 &#x00B1; 0.31</td>
<td valign="top" align="center">&#x2212;0.12 (0.69)</td>
</tr>
<tr>
<td valign="top" align="left">23</td>
<td valign="top" align="center">Star Wars theme</td>
<td valign="top" align="center">9.2</td>
<td valign="top" align="center">104</td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">0.66 &#x00B1; 0.16</td>
<td valign="top" align="center">0.70 &#x00B1; 0.25</td>
<td valign="top" align="center">0.19 (0.50)</td>
<td valign="top" align="center">0.65 &#x00B1; 0.11</td>
<td valign="top" align="center">0.68 &#x00B1; 0.46</td>
<td valign="top" align="center">0.48 (0.52)</td>
</tr>
<tr>
<td valign="top" align="left">24</td>
<td valign="top" align="center">Eine Kleine Nachtmusik</td>
<td valign="top" align="center">6.9</td>
<td valign="top" align="center">140</td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">0.64 &#x00B1; 0.07</td>
<td valign="top" align="center">0.67 &#x00B1; 0.23</td>
<td valign="top" align="center">0.10 (0.75)</td>
<td valign="top" align="center">0.69 &#x00B1; 0.10</td>
<td valign="top" align="center">0.51 &#x00B1; 0.36</td>
<td valign="top" align="center">&#x2212;0.09 (0.91)</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<attrib><italic>BPM, beats per minute; BPB, beats per bar; Corr., Pearson correlation coefficient between <italic>key clarity</italic> and <italic>tonal stability</italic>.</italic></attrib>
</table-wrap-foot>
</table-wrap>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>Tonal features of musical stimuli. <italic>Envelope</italic> (gray), <italic>key clarity</italic> (green), and <italic>tonal stability</italic> (orange) are shown. Envelopes outside the analysis window are shown in dashed lines. Stimulus IDs are noted on the left.</p></caption>
<graphic xlink:href="fnins-15-665767-g002.tif"/>
</fig>
</sec>
<sec id="S2.SS2.SSS3">
<title>Procedure</title>
<p>We analyzed the &#x201C;Perception&#x201D; condition, which was the first out of four experimental conditions. The rest of the conditions involved musical imagery tasks, which we did not include in our analysis. Each condition consisted of five blocks. All 12 stimuli were played in a randomized order once per block. This resulted in a total of 60 trials for each condition (i.e., five repetitions per stimulus). In each trial, a stimulus was preceded by two measures of cue beats.</p>
</sec>
<sec id="S2.SS2.SSS4">
<title>Data Acquisition and Preprocessing</title>
<p>Neural signals were measured during the experiment using a 64-channel BioSemi Active-Two EEG system at a sampling rate of 512 Hz. Independent components associated with ocular and cardiac artifacts were detected using MNE-Python<sup><xref ref-type="fn" rid="footnote4">4</xref></sup> (<ext-link ext-link-type="uri" xlink:href="https://scicrunch.org/resolver/RRID:SCR_005972">RRID:SCR_005972</ext-link>; v0.20.7) (<xref ref-type="bibr" rid="B25">Gramfort et al., 2013</xref>); the corresponding demixing matrices were also included in the open dataset. After projecting out the artifact-related components using mne.preprocessing.ica.apply, the EEG signal was converted for further processing in EEGLAB<sup><xref ref-type="fn" rid="footnote5">5</xref></sup> (<ext-link ext-link-type="uri" xlink:href="https://scicrunch.org/resolver/RRID:SCR_007292">RRID:SCR_007292</ext-link>; v14.1.2). The data were then bandpass-filtered between 1 and 8 Hz with a Hamming-windowed sinc finite impulse response (FIR) filter using pop_eegfiltnew, as low-frequency activity was previously found to encode music-related information (<xref ref-type="bibr" rid="B12">Di Liberto et al., 2020</xref>). Trials were epoched using pop_epoch between 100 ms after music onset (i.e., after the beat cues) and 100 ms before music offset, with a window length of 200 ms for tonal feature extraction. The EEG signal was then down-sampled to 128 Hz (pop_resample) and normalized by <italic>Z</italic>-scoring each trial.</p>
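As an approximate, outside-EEGLAB illustration of the filtering, down-sampling, and normalization steps (SciPy stands in for pop_eegfiltnew and pop_resample; the filter length heuristic is our assumption, not the EEGLAB default):

```python
import numpy as np
from scipy.signal import filtfilt, firwin, resample_poly

def preprocess_trial(eeg, fs_in=512, fs_out=128, band=(1.0, 8.0)):
    """Band-pass (1-8 Hz), down-sample to 128 Hz, and Z-score one trial.

    eeg: (n_channels, n_samples) array, already cleaned of ICA artifacts.
    """
    # Hamming-windowed sinc FIR band-pass, applied forward and backward
    # (zero-phase), loosely mirroring EEGLAB's pop_eegfiltnew.
    numtaps = int(3 * fs_in / band[0]) | 1   # odd tap count (heuristic length)
    taps = firwin(numtaps, band, pass_zero=False, window="hamming", fs=fs_in)
    filtered = filtfilt(taps, 1.0, eeg, axis=-1)
    # Down-sample with polyphase filtering, then Z-score per channel.
    down = resample_poly(filtered, fs_out, fs_in, axis=-1)
    return (down - down.mean(axis=-1, keepdims=True)) / down.std(
        axis=-1, keepdims=True)
```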
</sec>
</sec>
<sec id="S2.SS3">
<title>mTRF Analysis</title>
<sec id="S2.SS3.SSS1">
<title>Model Prediction</title>
<p>The linearized encoding analysis was carried out using the mTRF MATLAB Toolbox<sup><xref ref-type="fn" rid="footnote6">6</xref></sup> (v2.1) created by <xref ref-type="bibr" rid="B10">Crosse et al. (2016)</xref>. In an FIR model, we fit a set of lagged stimulus features to a response timeseries to estimate the time-varying causal impact of the features on the response:</p>
<disp-formula id="S2.Ex1">
<mml:math id="M2">
<mml:mrow>
<mml:mrow>
<mml:mi mathvariant="normal">y</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi mathvariant="normal">t</mml:mi>
<mml:mo rspace="5.8pt" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo rspace="5.8pt">=</mml:mo>
<mml:mrow>
<mml:mstyle displaystyle="true">
<mml:munderover>
<mml:mo largeop="true" movablelimits="false" symmetric="true">&#x2211;</mml:mo>
<mml:mrow>
<mml:mpadded width="+3.3pt">
<mml:mi>d</mml:mi>
</mml:mpadded>
<mml:mo rspace="5.8pt">=</mml:mo>
<mml:mn>0</mml:mn>
</mml:mrow>
<mml:mi>D</mml:mi>
</mml:munderover>
</mml:mstyle>
<mml:mrow>
<mml:mi>x</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mrow>
<mml:mi>t</mml:mi>
<mml:mo>-</mml:mo>
<mml:mi>d</mml:mi>
</mml:mrow>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>b</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi>d</mml:mi>
<mml:mo stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where y(t) and x(t) are a response and a feature at a time point <italic>t</italic>, respectively, and <italic>b</italic>(<italic>d</italic>) is a weight that represents the impact of a feature at a delay <italic>d</italic>. A timeseries of these weights (i.e., a transfer function, or the kernel of a linear filter) is called a TRF. In the mTRF encoding analysis, we use regularized regression (e.g., ridge regression) to estimate TRFs when multicollinearity exists among multiple features. The encoding analysis is performed one channel at a time (i.e., multiple independent variables and a univariate dependent variable). The validity of the estimated TRFs is often tested <italic>via</italic> cross-validation (i.e., convolving test features with a kernel estimated from a training set to predict test responses).</p>
<p>When considering multiple features, the FIR model can be expressed in a matrix form:</p>
<disp-formula id="S2.E2">
<label>(2)</label>
<mml:math id="M3">
<mml:mrow>
<mml:mtext mathvariant="bold">y</mml:mtext>
<mml:mo rspace="5.8pt">=</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mo>&#x2062;</mml:mo>
<mml:mi mathvariant="normal">&#x03B2;</mml:mi>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="normal">&#x03B5;</mml:mi>
</mml:mrow>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <bold>y</bold> &#x2208; <italic>&#x211D;</italic><sup><italic>T</italic> &#x00D7; 1</sup> is an EEG response vector from a given channel over <italic>T</italic> time points, <bold>X</bold> &#x2208; <italic>&#x211D;</italic><sup><italic>T</italic> &#x00D7; <italic>PD</italic></sup> is a feature matrix whose columns are <italic>P</italic> features lagged over <italic>D</italic> time points (i.e., a Toeplitz matrix), &#x03B2; &#x2208; <italic>&#x211D;</italic><sup><italic>PD</italic> &#x00D7; 1</sup> is a vector of unknown weights, and &#x03B5; &#x2208; <italic>&#x211D;</italic><sup><italic>T</italic> &#x00D7; 1</sup> is a vector of Gaussian noise with unknown serial correlation. Note that a feature set could consist of multiple sub-features (e.g., a 16-channel cochleogram and a 3-channel meter). The vector &#x03B2; concatenates the weights over <italic>D</italic> delays for all <italic>P</italic> features. In the current analysis, we column-wise normalized <bold>y</bold> and <bold>X</bold> by taking <italic>Z</italic>-scores per trial.</p>
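A minimal sketch of how such a lagged, Toeplitz-structured feature matrix can be assembled (illustrative code, not the mTRF Toolbox implementation; samples shifted past the trial boundary are zero-padded here):

```python
import numpy as np

def lagged_design(features, lags):
    """Build a lagged feature matrix from stimulus features.

    features: (T, P) array, one column per stimulus feature.
    lags: integer sample delays (positive = stimulus precedes response).
    Returns X with shape (T, P * len(lags)).
    """
    T, P = features.shape
    cols = []
    for p in range(P):            # concatenate features . . .
        for d in lags:            # . . . over all delays
            col = np.zeros(T)
            if d >= 0:
                col[d:] = features[:T - d, p]
            else:
                col[:d] = features[-d:, p]
            cols.append(col)
    return np.column_stack(cols)
```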
<p>A ridge solution of Eq. 2 is given (<xref ref-type="bibr" rid="B26">Hoerl and Kennard, 1970</xref>) as:</p>
<disp-formula id="S2.E3">
<label>(3)</label>
<mml:math id="M4">
<mml:mrow>
<mml:mrow>
<mml:mrow>
<mml:mover accent="true">
<mml:mi mathvariant="normal">&#x03B2;</mml:mi>
<mml:mo>^</mml:mo>
</mml:mover>
<mml:mo>&#x2062;</mml:mo>
<mml:mrow>
<mml:mo stretchy="false">(</mml:mo>
<mml:mi mathvariant="normal">&#x03BB;</mml:mi>
<mml:mo rspace="5.8pt" stretchy="false">)</mml:mo>
</mml:mrow>
</mml:mrow>
<mml:mo rspace="5.8pt">=</mml:mo>
<mml:mrow>
<mml:msup>
<mml:mrow>
<mml:mo>(</mml:mo>
<mml:mrow>
<mml:mrow>
<mml:msup>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mi mathvariant="bold">T</mml:mi>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mtext mathvariant="bold">X</mml:mtext>
</mml:mrow>
<mml:mo rspace="5.8pt">+</mml:mo>
<mml:mrow>
<mml:mi mathvariant="normal">&#x03BB;</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mtext mathvariant="bold">I</mml:mtext>
</mml:mrow>
</mml:mrow>
<mml:mo>)</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>-</mml:mo>
<mml:mn mathvariant="bold">1</mml:mn>
</mml:mrow>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:msup>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mi mathvariant="bold">T</mml:mi>
</mml:msup>
<mml:mo>&#x2062;</mml:mo>
<mml:mtext mathvariant="bold">y</mml:mtext>
</mml:mrow>
</mml:mrow>
<mml:mo>,</mml:mo>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <bold>I</bold> &#x2208; <italic>&#x211D;</italic><sup><italic>PD</italic> &#x00D7; <italic>PD</italic></sup> is an identity matrix and &#x03BB; &#x2265; 0 is a regularization parameter that penalizes (i.e., shrinks) the estimates. That is, the ridge estimates depend on the selection of the regularization parameter. The lambda was optimized on training data (i.e., the lambda that yielded the maximal prediction accuracy for each channel), and the validity of the model was tested on held-out data (i.e., predicting the EEG response from the given features) through a leave-one-out cross-validation scheme using mTRFcrossval, mTRFtrain, and mTRFevaluate. Prediction accuracy was measured by the Pearson correlation coefficient. Specifically, we used 79 delays from &#x2212;150 ms to 450 ms and 21 log-linearly spaced lambda values from 2<sup>&#x2212;10</sup> to 2<sup>10</sup>. We discarded time points where the kernel exceeded trial boundaries (i.e., a valid boundary condition) to avoid zero-padding artifacts (e.g., high peaks at zero lag from short trials).</p>
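The lambda search described above can be sketched as follows (a simplified stand-in for mTRFcrossval/mTRFtrain/mTRFevaluate; here X_trials are precomputed lagged feature matrices and y_trials are single-channel EEG vectors):

```python
import numpy as np

def pearson_r(a, b):
    """Pearson correlation between two 1-D arrays."""
    a = a - a.mean()
    b = b - b.mean()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def loo_lambda_search(X_trials, y_trials, lambdas):
    """Leave-one-trial-out cross-validation over a lambda grid.

    For each lambda, a ridge TRF (Eq. 3) is fit on the training trials
    and scored by the Pearson correlation on the held-out trial.
    Returns (best lambda, mean accuracy per lambda).
    """
    n = len(X_trials)
    acc = np.zeros(len(lambdas))
    for i, lam in enumerate(lambdas):
        rs = []
        for test in range(n):
            Xtr = np.vstack([X_trials[j] for j in range(n) if j != test])
            ytr = np.concatenate([y_trials[j] for j in range(n) if j != test])
            # Closed-form ridge estimate: (X'X + lam * I)^-1 X'y
            beta = np.linalg.solve(Xtr.T @ Xtr + lam * np.eye(Xtr.shape[1]),
                                   Xtr.T @ ytr)
            rs.append(pearson_r(X_trials[test] @ beta, y_trials[test]))
        acc[i] = np.mean(rs)
    return lambdas[int(np.argmax(acc))], acc

# The grid used in the paper: 21 log-linearly spaced values, 2^-10 to 2^10.
lambdas = 2.0 ** np.linspace(-10, 10, 21)
```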
</sec>
<sec id="S2.SS3.SSS2">
<title>Model Comparison</title>
<p>We created multiple models with varying terms and compared prediction accuracies to infer the significance of encoding of a specific feature in the responses. The families of models were:</p>
<disp-formula id="S2.E4">
<label>(4)</label>
<mml:math id="M5">
<mml:mrow>
<mml:mtext mathvariant="bold">y</mml:mtext>
<mml:mo rspace="5.8pt">=</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:msub>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>v</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mtext>[</mml:mtext>
<mml:msub>
<mml:mi mathvariant="normal">&#x03B2;</mml:mi>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>n</mml:mi>
<mml:mo>&#x2062;</mml:mo>
<mml:mi>v</mml:mi>
</mml:mrow>
</mml:msub>
<mml:mo rspace="5.8pt" stretchy="false">]</mml:mo>
<mml:mo rspace="5.8pt">+</mml:mo>
<mml:mi mathvariant="normal">&#x03B5;</mml:mi>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S2.E5">
<label>(5-1)</label>
<mml:math id="M6">
<mml:mrow>
<mml:mtext mathvariant="bold">y</mml:mtext>
<mml:mo rspace="5.8pt">=</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>v</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mtext mathvariant="bold">&#x00A0;X</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>b</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mo>&#x03B2;</mml:mo>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>v</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mo>&#x03B2;</mml:mo>
<mml:mrow>
<mml:mi>b</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>t</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="normal">&#x03B5;</mml:mi>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S2.E6">
<label>(5-2)</label>
<mml:math id="M7">
<mml:mrow>
<mml:mtext mathvariant="bold">y</mml:mtext>
<mml:mo rspace="5.8pt">=</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>v</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mtext mathvariant="bold">&#x00A0;X</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>r</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mo>&#x03B2;</mml:mo>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>v</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mo>&#x03B2;</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>r</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="normal">&#x03B5;</mml:mi>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S2.E7">
<label>(6-1)</label>
<mml:math id="M8">
<mml:mrow>
<mml:mtext mathvariant="bold">y</mml:mtext>
<mml:mo rspace="5.8pt">=</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>v</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mtext mathvariant="bold">&#x00A0;X</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>r</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mtext mathvariant="bold">&#x00A0;X</mml:mtext>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>y</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mo>&#x03B2;</mml:mo>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>v</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mo>&#x03B2;</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>r</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mo>&#x03B2;</mml:mo>
<mml:mrow>
<mml:mi>c</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>r</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>y</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="normal">&#x03B5;</mml:mi>
</mml:mrow>
</mml:math>
</disp-formula>
<disp-formula id="S2.E8">
<label>(6-2)</label>
<mml:math id="M9">
<mml:mrow>
<mml:mtext mathvariant="bold">y</mml:mtext>
<mml:mo rspace="5.8pt">=</mml:mo>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:msub>
<mml:mtext mathvariant="bold">X</mml:mtext>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>v</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mrow>
<mml:mtext mathvariant="bold">&#x00A0;X</mml:mtext>
</mml:mrow>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>r</mml:mi>
</mml:mrow>
</mml:msub>
<mml:msub>
<mml:mtext mathvariant="bold">&#x00A0;X</mml:mtext>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>b</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>y</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mrow>
<mml:mo>[</mml:mo>
<mml:mrow>
<mml:mtable>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mo>&#x03B2;</mml:mo>
<mml:mrow>
<mml:mi>e</mml:mi>
<mml:mi>n</mml:mi>
<mml:mi>v</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mo>&#x03B2;</mml:mo>
<mml:mrow>
<mml:mi>m</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>e</mml:mi>
<mml:mi>r</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
<mml:mtr>
<mml:mtd>
<mml:mrow>
<mml:msub>
<mml:mo>&#x03B2;</mml:mo>
<mml:mrow>
<mml:mi>s</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>a</mml:mi>
<mml:mi>b</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>l</mml:mi>
<mml:mi>i</mml:mi>
<mml:mi>t</mml:mi>
<mml:mi>y</mml:mi>
</mml:mrow>
</mml:msub>
</mml:mrow>
</mml:mtd>
</mml:mtr>
</mml:mtable>
</mml:mrow>
<mml:mo>]</mml:mo>
</mml:mrow>
<mml:mo>+</mml:mo>
<mml:mi mathvariant="normal">&#x03B5;</mml:mi>
</mml:mrow>
</mml:math>
</disp-formula>
<p>where <bold>X<sub>i</sub></bold> and &#x03B2;<bold><sub>i</sub></bold> are a Toeplitz matrix and a weight vector for the <italic>i</italic>-th feature, respectively. Equation 4 served as a baseline model, Eqs. 5-1 and 5-2 are rhythmic models, and Eqs. 6-1 and 6-2 are tonal models that covary the rhythmic features. The comparisons of interest were: (a) Eq. 5-1 vs. Eq. 4, (b) Eq. 5-2 vs. Eq. 5-1, (c) Eq. 6-1 vs. Eq. 5-2, and (d) Eq. 6-2 vs. Eq. 5-2. Note that the comparisons were made to infer the effect of adding each feature despite their multicollinearity. That is, if the last term explains no unique variance, the full model (with the last term) cannot yield greater prediction accuracy than the reduced model (without the last term).</p>
<p>A cluster-based Monte Carlo permutation test (<xref ref-type="bibr" rid="B43">Maris and Oostenveld, 2007</xref>), using ft_statistics_montecarlo in FieldTrip<sup><xref ref-type="fn" rid="footnote7">7</xref></sup> (<ext-link ext-link-type="uri" xlink:href="https://scicrunch.org/resolver/RRID:SCR_004849">RRID:SCR_004849</ext-link>; v20180903), was used to calculate cluster-wise <italic>p</italic>-values of paired <italic>t</italic>-tests on the differences in prediction accuracies across all channels, with the summed <italic>t</italic>-statistic as the cluster statistic. Null distributions were generated from 10,000 permutations with replacement. In permutation tests, the cluster-forming threshold does not affect the family-wise error rate (FWER) but only the sensitivity (see <xref ref-type="bibr" rid="B42">Maris, 2019</xref> for a formal proof). Thus, clusters were defined at an arbitrary threshold of the alpha-level of 0.05, and the cluster-wise <italic>p</italic>-values were thresholded at the alpha-level of 0.05 to control the FWER at 0.05.</p>
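For a single cluster-mean difference, the permutation logic reduces to sign-flipping the per-participant differences; the sketch below is this simplified, cluster-free version, not the full spatial clustering performed by ft_statistics_montecarlo:

```python
import numpy as np

def signflip_perm_test(diff, n_perm=10000, seed=0):
    """Paired permutation test on per-participant accuracy differences
    (full minus reduced model) for one channel or cluster mean.

    Under the null hypothesis, the sign of each participant's
    difference is exchangeable, so the null distribution is built by
    randomly flipping signs and recomputing the t-statistic.
    """
    rng = np.random.default_rng(seed)
    t_obs = diff.mean() / (diff.std(ddof=1) / np.sqrt(len(diff)))
    null = np.empty(n_perm)
    for k in range(n_perm):
        flipped = diff * rng.choice([-1.0, 1.0], size=len(diff))
        null[k] = flipped.mean() / (flipped.std(ddof=1) / np.sqrt(len(flipped)))
    # Two-sided p-value, counting the observed statistic in the null set.
    return (1 + np.sum(np.abs(null) >= abs(t_obs))) / (n_perm + 1)
```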
<p>To estimate the variation of the point estimate of a prediction accuracy difference, we bootstrapped cluster-mean prediction accuracies 10,000 times to compute 95% confidence intervals. For the visualization of results, modified versions of topoplot in EEGLAB and cat_plot_boxplot in CAT12<sup><xref ref-type="fn" rid="footnote8">8</xref></sup> were used.</p>
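The percentile bootstrap of a cluster-mean accuracy difference can be sketched as follows (illustrative; the exact resampling details of the original analysis may differ):

```python
import numpy as np

def bootstrap_ci(diff, n_boot=10000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for the mean
    cluster-wise accuracy difference across participants."""
    rng = np.random.default_rng(seed)
    means = np.array([rng.choice(diff, size=len(diff), replace=True).mean()
                      for _ in range(n_boot)])
    return np.quantile(means, [alpha / 2, 1 - alpha / 2])
```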
</sec>
<sec id="S2.SS3.SSS3">
<title>Control Analysis</title>
<p>To demonstrate the false positive control and the sensitivity of the current procedure, we randomized the phases of the envelopes (<xref ref-type="bibr" rid="B44">Menon and Levitin, 2005</xref>; <xref ref-type="bibr" rid="B1">Abrams et al., 2013</xref>; <xref ref-type="bibr" rid="B16">Farbood et al., 2015</xref>; <xref ref-type="bibr" rid="B30">Kaneshiro et al., 2020</xref>) to create control features with disrupted temporal structure but identical spectra. If the prediction were not due to the encoding of temporal information, this control feature (i.e., a phase-randomized envelope) would be expected to explain the EEG data as well as the original envelope does. Specifically, the phases of the envelopes were randomized <italic>via</italic> the fast Fourier transform (FFT) and inverse FFT for each stimulus; within each randomization, the randomized envelope was identical across its repeated presentations over trials. MATLAB&#x2019;s fft and ifft were used. The phase randomization, model optimization, and model evaluation processes were repeated 50 times across all participants. The prediction accuracies averaged across phase randomizations were then compared with those obtained with the actual envelopes using the cluster-based Monte Carlo permutation test with the same alpha-levels as in the main analysis.</p>
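The phase randomization can be sketched in Python (NumPy's FFT in place of MATLAB's fft/ifft); the amplitude spectrum is preserved exactly while the temporal structure is destroyed:

```python
import numpy as np

def phase_randomize(env, seed=0):
    """Randomize the Fourier phases of a 1-D envelope while keeping its
    amplitude spectrum (and hence power spectrum) identical."""
    rng = np.random.default_rng(seed)
    spec = np.fft.rfft(env)
    phases = rng.uniform(0.0, 2.0 * np.pi, size=spec.shape)
    phases[0] = 0.0                      # keep the DC bin real
    if env.size % 2 == 0:
        phases[-1] = 0.0                 # keep the Nyquist bin real
    rand_spec = np.abs(spec) * np.exp(1j * phases)
    return np.fft.irfft(rand_spec, n=env.size)
```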
</sec>
</sec>
</sec>
<sec id="S3">
<title>Results</title>
<sec id="S3.SS1">
<title>Envelope Tracking</title>
<p>The control analysis revealed that the mTRF analysis sensitively detects envelope tracking relative to models with phase-randomized envelopes (<xref ref-type="fig" rid="F3">Figure 3</xref>). In a cluster of 38 channels over the central and frontal scalp regions, the prediction accuracy with the observed envelopes was significantly higher than that with the randomized envelopes [cluster-mean r<sub>rand</sub> = 0.0517; r<sub>obs</sub> = 0.0670; r<sub>obs</sub> &#x2212; r<sub>rand</sub> = 0.0153, 95% CI = (0.0091, 0.0217); summed statistic &#x03A3;T = 143.7; cluster-<italic>p</italic> = 0.0001]. As discussed above (see section &#x201C;Model Comparison&#x201D;), a higher prediction accuracy for the full model than for the reduced (or null) model indicates that the term unique to the full model makes a unique contribution to the prediction, reflecting neural encoding of the corresponding information. Here, the results suggest that the sound envelope is encoded in the cluster. Note that the peaks at zero lag in the TRFs (<xref ref-type="fig" rid="F3">Figure 3E</xref>) are due to the free boundary condition (zero-padding at the boundaries of trials; &#x201C;condition&#x201D; here refers to a mathematical constraint, not to an experimental condition), which predicted trial-onset responses in the phase-randomized models. When a weaker null model without the trial onset was compared (i.e., the valid boundary condition), the test revealed increased prediction accuracy in 56 electrodes (cluster-<italic>p</italic> = 0.0001), presumably reflecting widespread auditory activity <italic>via</italic> volume conduction (figure not shown).</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p>Envelope encoding. <bold>(A,B)</bold> Mean prediction accuracies averaged across subjects are shown in topoplots for a null model (E<sub>rand</sub>, phase-randomized envelope) and a faithful model (E<sub>obs</sub>, observed envelope), respectively. <bold>(C)</bold> <italic>t</italic>-statistics comparing differences in prediction accuracies are shown. Channels included in significant clusters (cluster-<italic>p</italic> &#x003C; 0.05) are marked in white. <bold>(D)</bold> Prediction accuracies averaged within the cluster with the smallest <italic>p</italic>-value are plotted for each participant. <bold>(E)</bold> Temporal response functions averaged within the cluster are shown. Shades mark one standard error of the mean across participants.</p></caption>
<graphic xlink:href="fnins-15-665767-g003.tif"/>
</fig>
</sec>
<sec id="S3.SS2">
<title>Rhythmic Hierarchy</title>
<p>With respect to the low-level rhythmic feature, the analysis revealed significant encoding of <italic>beat</italic> (Eq. 5-1 vs. Eq. 4; <xref ref-type="fig" rid="F4">Figure 4</xref>) in a cluster of 20 central channels [cluster-mean r<sub>reduced</sub> = 0.0314; r<sub>full</sub> = 0.0341; r<sub>full</sub> &#x2212; r<sub>reduced</sub> = 0.0027, 95% CI = (0.0013, 0.0042); &#x03A3;T = 52.4; cluster-<italic>p</italic> = 0.0234]. As with the envelope-tracking results, a significant increase in prediction accuracy indicates a unique contribution of <italic>beat</italic> over and above <italic>envelope</italic>.</p>
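The nested model comparison underlying these results (does adding a feature improve held-out prediction accuracy?) can be illustrated with a minimal NumPy sketch of lagged ridge regression. The function names and the single train/test split are our own simplifications of the mTRF Toolbox's cross-validation, not its implementation:

```python
import numpy as np

def lagged_design(stim, lags):
    """Stack time-lagged copies of stimulus features (FIR design matrix)."""
    n, k = stim.shape
    X = np.zeros((n, k * len(lags)))
    for j, lag in enumerate(lags):
        if lag >= 0:
            X[lag:, j * k:(j + 1) * k] = stim[:n - lag]
        else:
            X[:lag, j * k:(j + 1) * k] = stim[-lag:]
    return X

def ridge_predict(X_tr, y_tr, X_te, lam):
    """Closed-form ridge fit on training data; prediction on test data."""
    w = np.linalg.solve(X_tr.T @ X_tr + lam * np.eye(X_tr.shape[1]),
                        X_tr.T @ y_tr)
    return X_te @ w

def pearson_r(a, b):
    """Prediction accuracy: correlation between predicted and observed EEG."""
    a, b = a - a.mean(), b - b.mean()
    return (a @ b) / np.sqrt((a @ a) * (b @ b))
```

In the actual analysis, the difference r<sub>full</sub> &#x2212; r<sub>reduced</sub> is computed per channel and participant and then tested with cluster-based permutation statistics.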
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption><p>Beat encoding. <bold>(A,B)</bold> Mean prediction accuracies of a reduced model (E, Envelope) and a full model (E + B, Envelope + Beat), respectively. <bold>(C)</bold> <italic>t</italic>-statistics comparing differences in prediction accuracies are shown. Channels included in significant clusters (cluster-<italic>p</italic> &#x003C; 0.05) are marked in white. <bold>(D)</bold> Prediction accuracies averaged within the cluster with the smallest <italic>p</italic>-value are plotted for each subject. <bold>(E,F)</bold> Temporal response functions of features averaged across electrodes within the cluster are shown. TRFs are <italic>Z</italic>-scored across lags to account for differing regularization parameters across electrodes/participants.</p></caption>
<graphic xlink:href="fnins-15-665767-g004.tif"/>
</fig>
<p>With respect to the high-level rhythmic feature, the analysis revealed significant encoding of <italic>meter</italic> (Eq. 5-2 vs. Eq. 5-1; <xref ref-type="fig" rid="F5">Figure 5</xref>) in a cluster of 16 frontal and central channels [cluster-mean r<sub>reduced</sub> = 0.0337; r<sub>full</sub> = 0.0398; r<sub>full</sub> &#x2212; r<sub>reduced</sub> = 0.0062, 95% CI = (0.0023, 0.0099); &#x03A3;T = 34.6; cluster-<italic>p</italic> = 0.0137]. Likewise, a significant increase in prediction accuracy indicates a unique contribution of <italic>meter</italic> over and above <italic>envelope</italic> and <italic>beat</italic>. The TRFs for <italic>meter</italic> showed distinct patterns across accent levels.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption><p>Meter encoding. <bold>(A,B)</bold> Mean prediction accuracies of a reduced model (E + B, Envelope + Beat) and a full model (E + B + M, Envelope + Beat + Meter), respectively. <bold>(C)</bold> <italic>t</italic>-statistics comparing differences in prediction accuracies are shown. Channels included in significant clusters (cluster-<italic>p</italic> &#x003C; 0.05) are marked in white. <bold>(D)</bold> Prediction accuracies averaged within the cluster with the smallest <italic>p</italic>-value are plotted for each subject. <bold>(E,F)</bold> Temporal response functions of features averaged within the cluster are shown. <bold>(G)</bold> Temporal response functions of the meter features (strong beat, middle beat, and weak beat; e.g., 4/4: strong-weak-middle-weak) averaged within the cluster are shown. TRFs are <italic>Z</italic>-scored across lags to account for differing regularization parameters across electrodes/participants.</p></caption>
<graphic xlink:href="fnins-15-665767-g005.tif"/>
</fig>
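The <italic>Z</italic>-scoring of TRFs across lags mentioned in the figure captions removes amplitude differences caused by different regularization strengths across electrodes and participants, so that only the temporal shape of each TRF is compared. A minimal sketch (our own helper, not toolbox code):

```python
import numpy as np

def zscore_trf(trf, axis=-1):
    """Z-score a TRF along the lag axis so that amplitude scales induced by
    different per-electrode/per-participant regularization are comparable."""
    m = trf.mean(axis=axis, keepdims=True)
    s = trf.std(axis=axis, keepdims=True)
    return (trf - m) / s
```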
</sec>
<sec id="S3.SS3">
<title>Tonal Hierarchy</title>
<p>We did not find a significant increase of prediction accuracy for either <italic>key clarity</italic> or <italic>tonal stability</italic> calculated on each beat or measure (Eq. 6-1 vs. Eq. 5-2, minimum cluster-<italic>p</italic> = 0.1211; Eq. 6-2 vs. Eq. 5-2, minimum cluster-<italic>p</italic> = 0.0762; <xref ref-type="supplementary-material" rid="DS1">Supplementary Figures 2</xref>&#x2013;<xref ref-type="supplementary-material" rid="DS1">5</xref>).</p>
</sec>
</sec>
<sec id="S4">
<title>Discussion</title>
<sec id="S4.SS1">
<title>Validity of the Proposed Framework</title>
<p>The view that individual elements in natural music may not produce the same effect as they do in isolation is not new. It has been claimed that music is not an objective entity but rather something that is experienced and perceived, raising the need for a dynamic, event-based processing framework (<xref ref-type="bibr" rid="B49">Reybrouck, 2005</xref>). The fundamental issue, however, is that diverse musical features often covary to maximize emotional effect (e.g., slow and elegiac melodies in Niccol&#x00F2; Paganini&#x2019;s Caprice for Solo Violin Op. 1 No. 3 in E minor and energetic arpeggios and triple stops in No. 1 in E major; or subdued, low vocals in Nirvana&#x2019;s melancholic &#x201C;Something In The Way&#x201D; and loud, angry drums in &#x201C;Smells Like Teen Spirit&#x201D;). Unless hundreds (if not thousands) of natural stimuli are used (<xref ref-type="bibr" rid="B15">Eerola et al., 2009</xref>; <xref ref-type="bibr" rid="B9">Cowen et al., 2020</xref>), it is impossible to tease apart the effect of one element (or an independent component of elements) from another. For this reason, isolating and orthogonalizing musical features or acoustic properties has been an established tradition in music psychology and the cognitive neuroscience of music. However, now that computational models can translate naturalistic stimuli into relevant features (i.e., linearizing functions), recent human neuroimaging studies have shown that it is possible to analyze complex interactions among natural features while taking advantage of the salience of naturalistic stimuli, which evoke intense emotions and provide ecologically valid contexts (<xref ref-type="bibr" rid="B19">Goldberg et al., 2014</xref>; <xref ref-type="bibr" rid="B53">Sonkusare et al., 2019</xref>; <xref ref-type="bibr" rid="B28">J&#x00E4;&#x00E4;skel&#x00E4;inen et al., 2021</xref>).</p>
<p>In the current study, we demonstrated a simple yet powerful framework for linearized encoding analysis by combining the MIR toolbox (a battery of model-based features) and the mTRF Toolbox (FIR modeling with ridge regression). First, we showed that ridge regression successfully predicted envelope-triggered cortical responses in the ongoing EEG signal in comparison to null models with a phase-randomized envelope. Furthermore, our proposed framework detected cortical encoding of rhythmic, but not tonal, features during naturalistic music listening. In addition, the estimated transfer functions and the spatial distribution of the prediction accuracies made the results readily interpretable in neuroscientific terms. These findings set the current approach apart from previous studies that applied similar regression analyses but used only monophonic music or simple, low-level acoustic features, such as note onsets (<xref ref-type="bibr" rid="B56">Sturm et al., 2015</xref>; <xref ref-type="bibr" rid="B12">Di Liberto et al., 2020</xref>).</p>
</sec>
<sec id="S4.SS2">
<title>Cortical Encoding of Musical Features</title>
<p>We showed cortical encoding of beats and meter while participants listened to everyday, continuous musical examples. This encoding was observed most strongly over frontal and central EEG channels, which have long been implicated as markers of auditory processing activity (<xref ref-type="bibr" rid="B45">N&#x00E4;&#x00E4;t&#x00E4;nen and Picton, 1987</xref>; <xref ref-type="bibr" rid="B62">Zouridakis et al., 1998</xref>; <xref ref-type="bibr" rid="B55">Stropahl et al., 2018</xref>). However, <italic>key clarity</italic> and <italic>tonal stability</italic> were not conclusively represented in the cortical signal in our models.</p>
<p>Unlike the tonal features, both rhythmic features (<italic>beat</italic> and <italic>meter</italic>) were strongly encoded in the neural signal. The TRF for <italic>beat</italic> showed a steady periodic signal, consistent with the finding in the original OpenMIIR dataset publication by <xref ref-type="bibr" rid="B54">Stober (2017)</xref> that the peaks of the event-related potentials (ERPs) corresponded to the beat of the music. Both the ERPs in the study by <xref ref-type="bibr" rid="B54">Stober (2017)</xref> and our <italic>beat</italic> TRFs displayed large peaks at zero lag, implying that beats may be anticipated. The possibility of an anticipatory mechanism for beats is consistent with the view that humans may possess an endogenous mechanism of beat anticipation that remains active even when tones are unexpectedly omitted (<xref ref-type="bibr" rid="B18">Fujioka et al., 2009</xref>). The relatively early latency of the additional TRF peaks between 100 and 200 ms suggests that beats may also be processed in a bottom-up fashion. When the beat becomes less predictable, humans actively search for it by adaptively shifting their predictions based on the salience of the beats in the music, suggesting that beats also provide useful exogenous cues (<xref ref-type="bibr" rid="B57">Toiviainen et al., 2020</xref>). The use of continuous music and EEG in the proposed framework lends itself particularly well to disentangling these various mechanisms of beat perception.</p>
<p>It has also been shown that different populations of neurons entrain to beats and meter (<xref ref-type="bibr" rid="B46">Nozaradan et al., 2017</xref>). Moreover, phase-locked gamma-band activity has further suggested a unique neural correlate of meter (<xref ref-type="bibr" rid="B52">Snyder and Large, 2005</xref>). Extending these previous findings, the current results in the low-frequency band (1&#x2013;8 Hz) revealed this dichotomy between beats and meter through their different topographies. <italic>Beat</italic> was encoded over a tight cluster of central channels, whereas <italic>meter</italic> was encoded over a large cluster of frontal channels. The significant increase in prediction accuracy observed over widespread frontal channels for <italic>meter</italic> might suggest a distant source, although it is not possible to uniquely determine the source location from the sensor topography alone (i.e., the inverse problem); the topography could also be due to widely spread but synchronized cortical sources. However, there is evidence based on deep brain stimulation and scalp recording that EEG is sensitive to subcortical sources (<xref ref-type="bibr" rid="B51">Seeber et al., 2019</xref>). The putamen, in particular, has been proposed as a region of meter entrainment, while the cortical supplementary motor area is more associated with beats (<xref ref-type="bibr" rid="B46">Nozaradan et al., 2017</xref>; <xref ref-type="bibr" rid="B40">Li et al., 2019</xref>). The distinct topographies observed for the beat and meter features are especially intriguing given the relatively short duration of each stimulus (10.5 s on average).</p>
<p>It was unexpected that neither of the tonal features was significantly correlated with the EEG signal, given that previous studies suggested that information about these tonal structures is reflected in non-invasive neural recordings. For instance, previous ERP studies showed stronger responses to deviant harmonies than to normative ones (<xref ref-type="bibr" rid="B3">Besson and Fa&#x00EF;ta, 1995</xref>; <xref ref-type="bibr" rid="B29">Janata, 1995</xref>; <xref ref-type="bibr" rid="B34">Koelsch et al., 2003</xref>). Additionally, in a recent MEG study (<xref ref-type="bibr" rid="B50">Sankaran et al., 2020</xref>), a representational similarity analysis of responses to isolated chord sequences and probe tones played by a synthesized piano revealed that distinctive cortical activity patterns at an early stage (around 200 ms) reflected the absolute pitch (i.e., fundamental frequencies) of presented tones, whereas later stages (from 200 ms onward) reflected their relative pitch with respect to the established tonal context (i.e., tonal hierarchy). In a study with more naturalistic musical stimuli (<xref ref-type="bibr" rid="B12">Di Liberto et al., 2020</xref>), the cortical encoding of melodic expectation, defined by how surprising a pitch or note onset is within a given melody, was demonstrated using EEG and the TRF while participants listened to monophonic MIDI piano excerpts generated from J. S. Bach chorales. With respect to <italic>key clarity</italic>, it has been shown that <italic>key clarity</italic> correlates significantly with behavioral ratings (<xref ref-type="bibr" rid="B13">Eerola, 2012</xref>) and is anti-correlated with the fMRI signal timeseries in specific brain regions, including the Rolandic operculum, insula, and precentral gyrus, during listening to modern Argentine tango (<xref ref-type="bibr" rid="B2">Alluri et al., 2012</xref>). In a replication study with identical stimuli (<xref ref-type="bibr" rid="B7">Burunat et al., 2016</xref>), <italic>key clarity</italic> showed scattered encoding patterns across all brain regions with weaker correlations; however, such an association with evoked EEG responses (or the absence thereof) has not been previously reported. One explanation for the current negative finding is that the musical stimuli in this dataset, given their tonal simplicity, might not have been well suited to the tonal analysis (see section &#x201C;Limitations&#x201D; for further discussion).</p>
</sec>
<sec id="S4.SS3">
<title>Limitations</title>
<p>The stimuli were relatively short (about 10 s on average) and often repetitious in nature. These characteristics limited our ability to observe responses to larger changes in <italic>key clarity</italic> and <italic>tonal stability</italic>. For instance, the ranges of the standard deviation of <italic>key clarity</italic> and <italic>tonal stability</italic>, calculated on beats, were (0.0667, 0.1638) and (0.1570, 0.2607), respectively. These were narrower than in typical musical stimulus sets [e.g., 360 emotional 15-s soundtrack excerpts (<xref ref-type="bibr" rid="B14">Eerola and Vuoskoski, 2011</xref>); (0.0423, 0.2303) and (0.11882, 0.3441) for <italic>key clarity</italic> and <italic>tonal stability</italic>, respectively]. These limitations (short durations and limited variation in tonality) might have contributed to the negative findings in the current study. Another limitation of the dataset was the small number of participants (<italic>n</italic> = 9), which limited statistical power. Creators of future public neuro-music datasets (e.g., the one developed by <xref ref-type="bibr" rid="B24">Grahn et al., 2018</xref>) may consider recruiting more participants and using longer, more dynamic musical excerpts, especially ones with more dramatic shifts in tonality. The dataset also did not contain simultaneous behavioral ratings of the music, which prevented us from analyzing the neural data alongside measures such as emotion.</p>
<p>One limitation of our analysis is that we used a single regularization parameter for all features, as currently implemented in the mTRF Toolbox. However, it has been shown that using an independent regularization parameter for each feature set (&#x201C;banded ridge&#x201D;) can improve the prediction and interpretability of joint modeling in fMRI encoding analyses (<xref ref-type="bibr" rid="B47">Nunez-Elizalde et al., 2019</xref>). Thus, a systematic investigation of the merits of banded ridge regression for mTRF analyses of M/EEG data would benefit the community.</p>
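Banded ridge differs from standard ridge only in the penalty term: a block-diagonal penalty assigns each feature set its own &#x03BB;. A minimal NumPy sketch with our own (hypothetical) function naming; in practice, the per-band &#x03BB; values would be tuned by cross-validated grid search, as in the cited fMRI work:

```python
import numpy as np

def banded_ridge(X_bands, y, lams):
    """Ridge regression with a separate penalty per feature band.

    X_bands : list of (n_times, n_feat_b) arrays, one per feature set
    lams    : list of penalties, one per band
    """
    X = np.hstack(X_bands)
    # Diagonal penalty vector: each band's columns get that band's lambda
    penalty = np.concatenate([np.full(b.shape[1], l)
                              for b, l in zip(X_bands, lams)])
    return np.linalg.solve(X.T @ X + np.diag(penalty), X.T @ y)
```

With equal &#x03BB; values for all bands, this reduces to ordinary ridge regression; the potential gain comes from letting strongly and weakly predictive feature sets receive different amounts of shrinkage.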
</sec>
</sec>
<sec id="S5">
<title>Future Directions and Conclusion</title>
<p>Ultimately, we hope that this framework can serve two broad purposes. The first is to enhance the ecological validity of future music experiments. The second is to serve as a tool that can be paired with other metrics of interest. Emotion is perhaps the most fitting application of this framework, given the special ability of music to make us experience intense feelings. Combining the current analytic framework with behavioral measures such as emotion could shed light on the factors that interact with our anticipation of tonality and rhythm during music listening. In particular, when combined with continuous behavioral measures, such as emotion or tension, the framework might one day be used to elucidate how changes in certain musical features make us happy or sad, which could deepen our knowledge of how music can be used therapeutically or clinically. Furthermore, some current limitations of the <italic>tonal stability</italic> measure provide future researchers with opportunities for innovation. Looking forward, it would be useful to create a <italic>tonal stability</italic> measure that can account for multiple (shifting) tonal centers within a single piece of music.</p>
<p>In summary, we presented an analytical framework to investigate the tonal and rhythmic hierarchies encoded in neural signals during listening to homophonic music. Though the model did not detect encoding of the proposed <italic>tonal stability</italic> measure, it successfully captured cortical encoding of the rhythmic hierarchy. Moreover, the framework differentiated the spatial encoding of low- and high-level features, as represented by the separate encoding of beat and meter, suggesting distinct neural processes. The current framework is applicable to any form of music, as audio signals are fed directly into the linearizing model. In addition, it can incorporate other time-resolved measures to appropriately address the complexity and multivariate nature of music and other affective naturalistic stimuli. This will bring us closer to a complete understanding of how tonality and rhythm are processed over time and why the anticipation and perception of these features can induce a variety of emotional responses within us.</p>
</sec>
<sec id="S6">
<title>Data Availability Statement</title>
<p>The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/<xref ref-type="supplementary-material" rid="DS1">Supplementary Material</xref>.</p>
</sec>
<sec id="S7">
<title>Ethics Statement</title>
<p>The studies involving human participants were reviewed and approved by the Ethics Board at the University of Western Ontario. The patients/participants provided their written informed consent to participate in this study.</p>
</sec>
<sec id="S8">
<title>Author Contributions</title>
<p>JL and S-GK conceived the ideas, developed the analytic framework, analyzed the public data, and wrote the first draft together. S-GK formulated models and wrote code for analysis and visualization. JW and TO contributed to conceiving ideas, interpreting results, and writing the manuscript. All authors contributed to the article and approved the submitted version.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<fn-group>
<fn fn-type="financial-disclosure">
<p><bold>Funding.</bold> This work was supported by Duke University startup funds to TO.</p>
</fn>
</fn-group>
<ack>
<p>We thank Sebastian Stober for sharing the OpenMIIR dataset, code, and documentation. We also thank Michael Broderick for technical discussion on the mTRF analysis. The editor and reviewers helped us in improving the earlier version of the manuscript.</p>
</ack>
<sec id="S11" sec-type="supplementary material"><title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fnins.2021.665767/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fnins.2021.665767/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Data_Sheet_1.PDF" id="DS1" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Abrams</surname> <given-names>D. A.</given-names></name> <name><surname>Ryali</surname> <given-names>S.</given-names></name> <name><surname>Chen</surname> <given-names>T.</given-names></name> <name><surname>Chordia</surname> <given-names>P.</given-names></name> <name><surname>Khouzam</surname> <given-names>A.</given-names></name> <name><surname>Levitin</surname> <given-names>D. J.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>Inter-subject synchronization of brain responses during natural music listening.</article-title> <source><italic>Eur. J. Neurosci.</italic></source> <volume>37</volume> <fpage>1458</fpage>&#x2013;<lpage>1469</lpage>. <pub-id pub-id-type="doi">10.1111/ejn.12173</pub-id> <pub-id pub-id-type="pmid">23578016</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alluri</surname> <given-names>V.</given-names></name> <name><surname>Toiviainen</surname> <given-names>P.</given-names></name> <name><surname>J&#x00E4;&#x00E4;skel&#x00E4;inen</surname> <given-names>I. P.</given-names></name> <name><surname>Glerean</surname> <given-names>E.</given-names></name> <name><surname>Sams</surname> <given-names>M.</given-names></name> <name><surname>Brattico</surname> <given-names>E.</given-names></name></person-group> (<year>2012</year>). <article-title>Large-scale brain networks emerge from dynamic processing of musical timbre, key and rhythm.</article-title> <source><italic>NeuroImage</italic></source> <volume>59</volume> <fpage>3677</fpage>&#x2013;<lpage>3689</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2011.11.019</pub-id> <pub-id pub-id-type="pmid">22116038</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Besson</surname> <given-names>M.</given-names></name> <name><surname>Fa&#x00EF;ta</surname> <given-names>F.</given-names></name></person-group> (<year>1995</year>). <article-title>An event-related potential (ERP) study of musical expectancy: comparison of musicians with nonmusicians.</article-title> <source><italic>J. Exp. Psychol.</italic></source> <volume>21</volume>:<issue>1278</issue>. <pub-id pub-id-type="doi">10.1037/0096-1523.21.6.1278</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bianco</surname> <given-names>R.</given-names></name> <name><surname>Novembre</surname> <given-names>G.</given-names></name> <name><surname>Keller</surname> <given-names>P. E.</given-names></name> <name><surname>Villringer</surname> <given-names>A.</given-names></name> <name><surname>Sammler</surname> <given-names>D.</given-names></name></person-group> (<year>2018</year>). <article-title>Musical genre-dependent behavioural and EEG signatures of action planning. A comparison between classical and jazz pianists.</article-title> <source><italic>Neuroimage</italic></source> <volume>169</volume> <fpage>383</fpage>&#x2013;<lpage>394</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2017.12.058</pub-id> <pub-id pub-id-type="pmid">29277649</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brodbeck</surname> <given-names>C.</given-names></name> <name><surname>Presacco</surname> <given-names>A.</given-names></name> <name><surname>Simon</surname> <given-names>J. Z.</given-names></name></person-group> (<year>2018</year>). <article-title>Neural source dynamics of brain responses to continuous stimuli: speech processing from acoustics to comprehension.</article-title> <source><italic>NeuroImage</italic></source> <volume>172</volume> <fpage>162</fpage>&#x2013;<lpage>174</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2018.01.042</pub-id> <pub-id pub-id-type="pmid">29366698</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Broderick</surname> <given-names>M. P.</given-names></name> <name><surname>Anderson</surname> <given-names>A. J.</given-names></name> <name><surname>Di Liberto</surname> <given-names>G. M.</given-names></name> <name><surname>Crosse</surname> <given-names>M. J.</given-names></name> <name><surname>Lalor</surname> <given-names>E. C.</given-names></name></person-group> (<year>2018</year>). <article-title>Electrophysiological correlates of semantic dissimilarity reflect the comprehension of natural, narrative speech.</article-title> <source><italic>Curr. Biol.</italic></source> <volume>28</volume> <fpage>803</fpage>&#x2013;<lpage>809.e3</lpage>.</citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Burunat</surname> <given-names>I.</given-names></name> <name><surname>Toiviainen</surname> <given-names>P.</given-names></name> <name><surname>Alluri</surname> <given-names>V.</given-names></name> <name><surname>Bogert</surname> <given-names>B.</given-names></name> <name><surname>Ristaniemi</surname> <given-names>T.</given-names></name> <name><surname>Sams</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>The reliability of continuous brain responses during naturalistic listening to music.</article-title> <source><italic>Neuroimage</italic></source> <volume>124</volume> <fpage>224</fpage>&#x2013;<lpage>231</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2015.09.005</pub-id> <pub-id pub-id-type="pmid">26364862</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>J. L.</given-names></name> <name><surname>Penhune</surname> <given-names>V. B.</given-names></name> <name><surname>Zatorre</surname> <given-names>R. J.</given-names></name></person-group> (<year>2008</year>). <article-title>Listening to musical rhythms recruits motor regions of the brain.</article-title> <source><italic>Cereb. Cortex</italic></source> <volume>18</volume> <fpage>2844</fpage>&#x2013;<lpage>2854</lpage>. <pub-id pub-id-type="doi">10.1093/cercor/bhn042</pub-id> <pub-id pub-id-type="pmid">18388350</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cowen</surname> <given-names>A. S.</given-names></name> <name><surname>Fang</surname> <given-names>X.</given-names></name> <name><surname>Sauter</surname> <given-names>D.</given-names></name> <name><surname>Keltner</surname> <given-names>D.</given-names></name></person-group> (<year>2020</year>). <article-title>What music makes us feel: at least 13 dimensions organize subjective experiences associated with music across different cultures.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>117</volume> <fpage>1924</fpage>&#x2013;<lpage>1934</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1910704117</pub-id> <pub-id pub-id-type="pmid">31907316</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Crosse</surname> <given-names>M. J.</given-names></name> <name><surname>Di Liberto</surname> <given-names>G. M.</given-names></name> <name><surname>Bednar</surname> <given-names>A.</given-names></name> <name><surname>Lalor</surname> <given-names>E. C.</given-names></name></person-group> (<year>2016</year>). <article-title>The multivariate temporal response function (mTRF) toolbox: a MATLAB toolbox for relating neural signals to continuous stimuli.</article-title> <source><italic>Front. Hum. Neurosci.</italic></source> <volume>10</volume>:<issue>604</issue>. <pub-id pub-id-type="doi">10.3389/fnhum.2016.00604</pub-id> <pub-id pub-id-type="pmid">27965557</pub-id></citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Di Liberto</surname> <given-names>G. M.</given-names></name> <name><surname>O&#x2019;sullivan</surname> <given-names>J. A.</given-names></name> <name><surname>Lalor</surname> <given-names>E. C.</given-names></name></person-group> (<year>2015</year>). <article-title>Low-frequency cortical entrainment to speech reflects phoneme-level processing.</article-title> <source><italic>Curr. Biol.</italic></source> <volume>25</volume> <fpage>2457</fpage>&#x2013;<lpage>2465</lpage>. <pub-id pub-id-type="doi">10.1016/j.cub.2015.08.030</pub-id> <pub-id pub-id-type="pmid">26412129</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Di Liberto</surname> <given-names>G. M.</given-names></name> <name><surname>Pelofi</surname> <given-names>C.</given-names></name> <name><surname>Bianco</surname> <given-names>R.</given-names></name> <name><surname>Patel</surname> <given-names>P.</given-names></name> <name><surname>Mehta</surname> <given-names>A. D.</given-names></name> <name><surname>Herrero</surname> <given-names>J. L.</given-names></name><etal/></person-group> (<year>2020</year>). <article-title>Cortical encoding of melodic expectations in human temporal cortex.</article-title> <source><italic>eLife</italic></source> <volume>9</volume>:<issue>e51784</issue>.</citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Eerola</surname> <given-names>T.</given-names></name></person-group> (<year>2012</year>). <article-title>Modeling listeners&#x2019; emotional response to music.</article-title> <source><italic>Top. Cogn. Sci.</italic></source> <volume>4</volume> <fpage>607</fpage>&#x2013;<lpage>624</lpage>. <pub-id pub-id-type="doi">10.1111/j.1756-8765.2012.01188.x</pub-id> <pub-id pub-id-type="pmid">22389191</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Eerola</surname> <given-names>T.</given-names></name> <name><surname>Vuoskoski</surname> <given-names>J. K.</given-names></name></person-group> (<year>2011</year>). <article-title>A comparison of the discrete and dimensional models of emotion in music.</article-title> <source><italic>Psychol. Music</italic></source> <volume>39</volume> <fpage>18</fpage>&#x2013;<lpage>49</lpage>. <pub-id pub-id-type="doi">10.1177/0305735610362821</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Eerola</surname> <given-names>T.</given-names></name> <name><surname>Lartillot</surname> <given-names>O.</given-names></name> <name><surname>Toiviainen</surname> <given-names>P.</given-names></name></person-group> (<year>2009</year>). &#x201C;<article-title>Prediction of multidimensional emotional ratings in music from audio using multivariate regression models</article-title>,&#x201D; in <source><italic>Proceedings of the International Society for Music Information Retrieval (ISMIR)</italic></source> (<publisher-loc>Kobe</publisher-loc>), <fpage>621</fpage>&#x2013;<lpage>626</lpage>.</citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Farbood</surname> <given-names>M. M.</given-names></name> <name><surname>Heeger</surname> <given-names>D. J.</given-names></name> <name><surname>Marcus</surname> <given-names>G.</given-names></name> <name><surname>Hasson</surname> <given-names>U.</given-names></name> <name><surname>Lerner</surname> <given-names>Y.</given-names></name></person-group> (<year>2015</year>). <article-title>The neural processing of hierarchical structure in music and speech at different timescales.</article-title> <source><italic>Front. Neurosci.</italic></source> <volume>9</volume>:<issue>157</issue>. <pub-id pub-id-type="doi">10.3389/fnins.2015.00157</pub-id> <pub-id pub-id-type="pmid">26029037</pub-id></citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fishman</surname> <given-names>Y. I.</given-names></name> <name><surname>Volkov</surname> <given-names>I. O.</given-names></name> <name><surname>Noh</surname> <given-names>M. D.</given-names></name> <name><surname>Garell</surname> <given-names>P. C.</given-names></name> <name><surname>Bakken</surname> <given-names>H.</given-names></name> <name><surname>Arezzo</surname> <given-names>J. C.</given-names></name><etal/></person-group> (<year>2001</year>). <article-title>Consonance and dissonance of musical chords: neural correlates in auditory cortex of monkeys and humans.</article-title> <source><italic>J. Neurophysiol.</italic></source> <volume>86</volume> <fpage>2761</fpage>&#x2013;<lpage>2788</lpage>. <pub-id pub-id-type="doi">10.1152/jn.2001.86.6.2761</pub-id> <pub-id pub-id-type="pmid">11731536</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fujioka</surname> <given-names>T.</given-names></name> <name><surname>Trainor</surname> <given-names>L. J.</given-names></name> <name><surname>Large</surname> <given-names>E. W.</given-names></name> <name><surname>Ross</surname> <given-names>B.</given-names></name></person-group> (<year>2009</year>). <article-title>Beta and gamma rhythms in human auditory cortex during musical beat processing.</article-title> <source><italic>Ann. N. Y. Acad. Sci.</italic></source> <volume>1169</volume> <fpage>89</fpage>&#x2013;<lpage>92</lpage>. <pub-id pub-id-type="doi">10.1111/j.1749-6632.2009.04779.x</pub-id> <pub-id pub-id-type="pmid">19673759</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Goldberg</surname> <given-names>H.</given-names></name> <name><surname>Preminger</surname> <given-names>S.</given-names></name> <name><surname>Malach</surname> <given-names>R.</given-names></name></person-group> (<year>2014</year>). <article-title>The emotion&#x2013;action link? Naturalistic emotional stimuli preferentially activate the human dorsal visual stream.</article-title> <source><italic>NeuroImage</italic></source> <volume>84</volume> <fpage>254</fpage>&#x2013;<lpage>264</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2013.08.032</pub-id> <pub-id pub-id-type="pmid">23994457</pub-id></citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>G&#x00F3;mez</surname> <given-names>E.</given-names></name></person-group> (<year>2006</year>). <article-title>Tonal description of polyphonic audio for music content processing.</article-title> <source><italic>INFORMS J. Comput.</italic></source> <volume>18</volume> <fpage>294</fpage>&#x2013;<lpage>304</lpage>. <pub-id pub-id-type="doi">10.1287/ijoc.1040.0126</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gordon</surname> <given-names>C. L.</given-names></name> <name><surname>Cobb</surname> <given-names>P. R.</given-names></name> <name><surname>Balasubramaniam</surname> <given-names>R.</given-names></name></person-group> (<year>2018</year>). <article-title>Recruitment of the motor system during music listening: an ALE meta-analysis of fMRI data.</article-title> <source><italic>PLoS One</italic></source> <volume>13</volume>:<issue>e0207213</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0207213</pub-id> <pub-id pub-id-type="pmid">30452442</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Grahn</surname> <given-names>J. A.</given-names></name> <name><surname>McAuley</surname> <given-names>J. D.</given-names></name></person-group> (<year>2009</year>). <article-title>Neural bases of individual differences in beat perception.</article-title> <source><italic>NeuroImage</italic></source> <volume>47</volume> <fpage>1894</fpage>&#x2013;<lpage>1903</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2009.04.039</pub-id> <pub-id pub-id-type="pmid">19376241</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Grahn</surname> <given-names>J. A.</given-names></name> <name><surname>Rowe</surname> <given-names>J. B.</given-names></name></person-group> (<year>2009</year>). <article-title>Feeling the beat: premotor and striatal interactions in musicians and nonmusicians during beat perception.</article-title> <source><italic>J. Neurosci.</italic></source> <volume>29</volume> <fpage>7540</fpage>&#x2013;<lpage>7548</lpage>. <pub-id pub-id-type="doi">10.1523/jneurosci.2018-08.2009</pub-id> <pub-id pub-id-type="pmid">19515922</pub-id></citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Grahn</surname> <given-names>J.</given-names></name> <name><surname>Diedrichsen</surname> <given-names>J.</given-names></name> <name><surname>Gati</surname> <given-names>J.</given-names></name> <name><surname>Henry</surname> <given-names>M.</given-names></name> <name><surname>Zatorre</surname> <given-names>R.</given-names></name> <name><surname>Poline</surname> <given-names>J.-B.</given-names></name><etal/></person-group> (<year>2018</year>). <source><italic>OMMABA: The Open Multimodal Music and Auditory Brain Archive Project Summaries.</italic></source> <publisher-loc>London, ON</publisher-loc>: <publisher-name>Western University</publisher-name>.</citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gramfort</surname> <given-names>A.</given-names></name> <name><surname>Luessi</surname> <given-names>M.</given-names></name> <name><surname>Larson</surname> <given-names>E.</given-names></name> <name><surname>Engemann</surname> <given-names>D.</given-names></name> <name><surname>Strohmeier</surname> <given-names>D.</given-names></name> <name><surname>Brodbeck</surname> <given-names>C.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>MEG and EEG data analysis with MNE-Python.</article-title> <source><italic>Front. Neurosci.</italic></source> <volume>7</volume>:<issue>267</issue>. <pub-id pub-id-type="doi">10.3389/fnins.2013.00267</pub-id> <pub-id pub-id-type="pmid">24431986</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hoerl</surname> <given-names>A. E.</given-names></name> <name><surname>Kennard</surname> <given-names>R. W.</given-names></name></person-group> (<year>1970</year>). <article-title>Ridge regression: biased estimation for nonorthogonal problems.</article-title> <source><italic>Technometrics</italic></source> <volume>12</volume> <fpage>55</fpage>&#x2013;<lpage>67</lpage>. <pub-id pub-id-type="doi">10.1080/00401706.1970.10488634</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Huth</surname> <given-names>A. G.</given-names></name> <name><surname>De Heer</surname> <given-names>W. A.</given-names></name> <name><surname>Griffiths</surname> <given-names>T. L.</given-names></name> <name><surname>Theunissen</surname> <given-names>F. E.</given-names></name> <name><surname>Gallant</surname> <given-names>J. L.</given-names></name></person-group> (<year>2016</year>). <article-title>Natural speech reveals the semantic maps that tile human cerebral cortex.</article-title> <source><italic>Nature</italic></source> <volume>532</volume> <fpage>453</fpage>&#x2013;<lpage>458</lpage>. <pub-id pub-id-type="doi">10.1038/nature17637</pub-id> <pub-id pub-id-type="pmid">27121839</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>J&#x00E4;&#x00E4;skel&#x00E4;inen</surname> <given-names>I. P.</given-names></name> <name><surname>Sams</surname> <given-names>M.</given-names></name> <name><surname>Glerean</surname> <given-names>E.</given-names></name> <name><surname>Ahveninen</surname> <given-names>J.</given-names></name></person-group> (<year>2021</year>). <article-title>Movies and narratives as naturalistic stimuli in neuroimaging.</article-title> <source><italic>NeuroImage</italic></source> <volume>224</volume>:<issue>117445</issue>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2020.117445</pub-id> <pub-id pub-id-type="pmid">33059053</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Janata</surname> <given-names>P.</given-names></name></person-group> (<year>1995</year>). <article-title>ERP measures assay the degree of expectancy violation of harmonic contexts in music.</article-title> <source><italic>J. Cogn. Neurosci.</italic></source> <volume>7</volume> <fpage>153</fpage>&#x2013;<lpage>164</lpage>. <pub-id pub-id-type="doi">10.1162/jocn.1995.7.2.153</pub-id> <pub-id pub-id-type="pmid">23961821</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kaneshiro</surname> <given-names>B.</given-names></name> <name><surname>Nguyen</surname> <given-names>D. T.</given-names></name> <name><surname>Norcia</surname> <given-names>A. M.</given-names></name> <name><surname>Dmochowski</surname> <given-names>J. P.</given-names></name> <name><surname>Berger</surname> <given-names>J.</given-names></name></person-group> (<year>2020</year>). <article-title>Natural music evokes correlated EEG responses reflecting temporal structure and beat.</article-title> <source><italic>NeuroImage</italic></source> <volume>214</volume>:<issue>116559</issue>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2020.116559</pub-id> <pub-id pub-id-type="pmid">31978543</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kay</surname> <given-names>K. N.</given-names></name> <name><surname>Naselaris</surname> <given-names>T.</given-names></name> <name><surname>Prenger</surname> <given-names>R. J.</given-names></name> <name><surname>Gallant</surname> <given-names>J. L.</given-names></name></person-group> (<year>2008</year>). <article-title>Identifying natural images from human brain activity.</article-title> <source><italic>Nature</italic></source> <volume>452</volume> <fpage>352</fpage>&#x2013;<lpage>355</lpage>. <pub-id pub-id-type="doi">10.1038/nature06713</pub-id> <pub-id pub-id-type="pmid">18322462</pub-id></citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Koelsch</surname> <given-names>S.</given-names></name> <name><surname>Jentschke</surname> <given-names>S.</given-names></name></person-group> (<year>2010</year>). <article-title>Differences in electric brain responses to melodies and chords.</article-title> <source><italic>J. Cogn. Neurosci.</italic></source> <volume>22</volume> <fpage>2251</fpage>&#x2013;<lpage>2262</lpage>. <pub-id pub-id-type="doi">10.1162/jocn.2009.21338</pub-id> <pub-id pub-id-type="pmid">19702466</pub-id></citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Koelsch</surname> <given-names>S.</given-names></name> <name><surname>Gunter</surname> <given-names>T.</given-names></name> <name><surname>Friederici</surname> <given-names>A. D.</given-names></name> <name><surname>Schr&#x00F6;ger</surname> <given-names>E.</given-names></name></person-group> (<year>2000</year>). <article-title>Brain indices of music processing: &#x201C;nonmusicians&#x201D; are musical.</article-title> <source><italic>J. Cogn. Neurosci.</italic></source> <volume>12</volume> <fpage>520</fpage>&#x2013;<lpage>541</lpage>. <pub-id pub-id-type="doi">10.1162/089892900562183</pub-id> <pub-id pub-id-type="pmid">10931776</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Koelsch</surname> <given-names>S.</given-names></name> <name><surname>Gunter</surname> <given-names>T.</given-names></name> <name><surname>Schr&#x00F6;ger</surname> <given-names>E.</given-names></name> <name><surname>Friederici</surname> <given-names>A. D.</given-names></name></person-group> (<year>2003</year>). <article-title>Processing tonal modulations: an ERP study.</article-title> <source><italic>J. Cogn. Neurosci.</italic></source> <volume>15</volume> <fpage>1149</fpage>&#x2013;<lpage>1159</lpage>. <pub-id pub-id-type="doi">10.1162/089892903322598111</pub-id> <pub-id pub-id-type="pmid">14709233</pub-id></citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Krumhansl</surname> <given-names>C. L.</given-names></name></person-group> (<year>1990</year>). <article-title>Tonal hierarchies and rare intervals in music cognition.</article-title> <source><italic>Music Percept.</italic></source> <volume>7</volume> <fpage>309</fpage>&#x2013;<lpage>324</lpage>. <pub-id pub-id-type="doi">10.2307/40285467</pub-id></citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Krumhansl</surname> <given-names>C. L.</given-names></name> <name><surname>Cuddy</surname> <given-names>L. L.</given-names></name></person-group> (<year>2010</year>). &#x201C;<article-title>A theory of tonal hierarchies in music</article-title>,&#x201D; in <source><italic>Music Perception</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Riess Jones</surname> <given-names>M.</given-names></name> <name><surname>Fay</surname> <given-names>R. R.</given-names></name> <name><surname>Popper</surname> <given-names>A. N.</given-names></name></person-group> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>51</fpage>&#x2013;<lpage>87</lpage>. <pub-id pub-id-type="doi">10.1007/978-1-4419-6114-3_3</pub-id></citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Krumhansl</surname> <given-names>C. L.</given-names></name> <name><surname>Shepard</surname> <given-names>R. N.</given-names></name></person-group> (<year>1979</year>). <article-title>Quantification of the hierarchy of tonal functions within a diatonic context.</article-title> <source><italic>J. Exp. Psychol. Hum. Percept. Perform.</italic></source> <volume>5</volume> <fpage>579</fpage>&#x2013;<lpage>594</lpage>. <pub-id pub-id-type="doi">10.1037/0096-1523.5.4.579</pub-id> <pub-id pub-id-type="pmid">528960</pub-id></citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lalor</surname> <given-names>E. C.</given-names></name> <name><surname>Pearlmutter</surname> <given-names>B. A.</given-names></name> <name><surname>Reilly</surname> <given-names>R. B.</given-names></name> <name><surname>Mcdarby</surname> <given-names>G.</given-names></name> <name><surname>Foxe</surname> <given-names>J. J.</given-names></name></person-group> (<year>2006</year>). <article-title>The VESPA: a method for the rapid estimation of a visual evoked potential.</article-title> <source><italic>NeuroImage</italic></source> <volume>32</volume> <fpage>1549</fpage>&#x2013;<lpage>1561</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2006.05.054</pub-id> <pub-id pub-id-type="pmid">16875844</pub-id></citation></ref>
<ref id="B39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lartillot</surname> <given-names>O.</given-names></name> <name><surname>Toiviainen</surname> <given-names>P.</given-names></name></person-group> (<year>2007</year>). &#x201C;<article-title>A Matlab toolbox for musical feature extraction from audio</article-title>,&#x201D; in <source><italic>Proceedings of the International Conference on Digital Audio Effects (DAFx)</italic></source> (<publisher-loc>Bordeaux</publisher-loc>), <fpage>237</fpage>&#x2013;<lpage>244</lpage>.</citation></ref>
<ref id="B40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>Q.</given-names></name> <name><surname>Liu</surname> <given-names>G.</given-names></name> <name><surname>Wei</surname> <given-names>D.</given-names></name> <name><surname>Liu</surname> <given-names>Y.</given-names></name> <name><surname>Yuan</surname> <given-names>G.</given-names></name> <name><surname>Wang</surname> <given-names>G.</given-names></name></person-group> (<year>2019</year>). <article-title>Distinct neuronal entrainment to beat and meter: revealed by simultaneous EEG-fMRI.</article-title> <source><italic>NeuroImage</italic></source> <volume>194</volume> <fpage>128</fpage>&#x2013;<lpage>135</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2019.03.039</pub-id> <pub-id pub-id-type="pmid">30914384</pub-id></citation></ref>
<ref id="B41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Loui</surname> <given-names>P.</given-names></name> <name><surname>Wessel</surname> <given-names>D.</given-names></name></person-group> (<year>2007</year>). <article-title>Harmonic expectation and affect in Western music: effects of attention and training.</article-title> <source><italic>Percept. Psychophys.</italic></source> <volume>69</volume> <fpage>1084</fpage>&#x2013;<lpage>1092</lpage>. <pub-id pub-id-type="doi">10.3758/bf03193946</pub-id> <pub-id pub-id-type="pmid">18038947</pub-id></citation></ref>
<ref id="B42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Maris</surname> <given-names>E.</given-names></name></person-group> (<year>2019</year>). <article-title>Enlarging the scope of randomization and permutation tests in neuroimaging and neuroscience.</article-title> <source><italic>bioRxiv</italic></source> [<comment>Preprint</comment>] <pub-id pub-id-type="doi">10.1101/685560</pub-id></citation></ref>
<ref id="B43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Maris</surname> <given-names>E.</given-names></name> <name><surname>Oostenveld</surname> <given-names>R.</given-names></name></person-group> (<year>2007</year>). <article-title>Nonparametric statistical testing of EEG- and MEG-data.</article-title> <source><italic>J. Neurosci. Methods</italic></source> <volume>164</volume> <fpage>177</fpage>&#x2013;<lpage>190</lpage>. <pub-id pub-id-type="doi">10.1016/j.jneumeth.2007.03.024</pub-id> <pub-id pub-id-type="pmid">17517438</pub-id></citation></ref>
<ref id="B44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Menon</surname> <given-names>V.</given-names></name> <name><surname>Levitin</surname> <given-names>D. J.</given-names></name></person-group> (<year>2005</year>). <article-title>The rewards of music listening: response and physiological connectivity of the mesolimbic system.</article-title> <source><italic>NeuroImage</italic></source> <volume>28</volume> <fpage>175</fpage>&#x2013;<lpage>184</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2005.05.053</pub-id> <pub-id pub-id-type="pmid">16023376</pub-id></citation></ref>
<ref id="B45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>N&#x00E4;&#x00E4;t&#x00E4;nen</surname> <given-names>R.</given-names></name> <name><surname>Picton</surname> <given-names>T.</given-names></name></person-group> (<year>1987</year>). <article-title>The N1 wave of the human electric and magnetic response to sound: a review and an analysis of the component structure.</article-title> <source><italic>Psychophysiology</italic></source> <volume>24</volume> <fpage>375</fpage>&#x2013;<lpage>425</lpage>. <pub-id pub-id-type="doi">10.1111/j.1469-8986.1987.tb00311.x</pub-id> <pub-id pub-id-type="pmid">3615753</pub-id></citation></ref>
<ref id="B46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nozaradan</surname> <given-names>S.</given-names></name> <name><surname>Schwartze</surname> <given-names>M.</given-names></name> <name><surname>Obermeier</surname> <given-names>C.</given-names></name> <name><surname>Kotz</surname> <given-names>S. A.</given-names></name></person-group> (<year>2017</year>). <article-title>Specific contributions of basal ganglia and cerebellum to the neural tracking of rhythm.</article-title> <source><italic>Cortex</italic></source> <volume>95</volume> <fpage>156</fpage>&#x2013;<lpage>168</lpage>. <pub-id pub-id-type="doi">10.1016/j.cortex.2017.08.015</pub-id> <pub-id pub-id-type="pmid">28910668</pub-id></citation></ref>
<ref id="B47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nunez-Elizalde</surname> <given-names>A. O.</given-names></name> <name><surname>Huth</surname> <given-names>A. G.</given-names></name> <name><surname>Gallant</surname> <given-names>J. L.</given-names></name></person-group> (<year>2019</year>). <article-title>Voxelwise encoding models with non-spherical multivariate normal priors.</article-title> <source><italic>NeuroImage</italic></source> <volume>197</volume> <fpage>482</fpage>&#x2013;<lpage>492</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2019.04.012</pub-id> <pub-id pub-id-type="pmid">31075394</pub-id></citation></ref>
<ref id="B48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pressnitzer</surname> <given-names>D.</given-names></name> <name><surname>Patterson</surname> <given-names>R. D.</given-names></name> <name><surname>Krumbholz</surname> <given-names>K.</given-names></name></person-group> (<year>2001</year>). <article-title>The lower limit of melodic pitch.</article-title> <source><italic>J. Acoust. Soc. Am.</italic></source> <volume>109</volume> <fpage>2074</fpage>&#x2013;<lpage>2084</lpage>. <pub-id pub-id-type="doi">10.1121/1.1359797</pub-id></citation></ref>
<ref id="B49"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Reybrouck</surname> <given-names>M.</given-names></name></person-group> (<year>2005</year>). <article-title>A biosemiotic and ecological approach to music cognition: event perception between auditory listening and cognitive economy.</article-title> <source><italic>Axiomathes</italic></source> <volume>15</volume> <fpage>229</fpage>&#x2013;<lpage>266</lpage>. <pub-id pub-id-type="doi">10.1007/s10516-004-6679-4</pub-id></citation></ref>
<ref id="B50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sankaran</surname> <given-names>N.</given-names></name> <name><surname>Carlson</surname> <given-names>T. A.</given-names></name> <name><surname>Thompson</surname> <given-names>W. F.</given-names></name></person-group> (<year>2020</year>). <article-title>The rapid emergence of musical pitch structure in human cortex.</article-title> <source><italic>J. Neurosci.</italic></source> <volume>40</volume> <fpage>2108</fpage>&#x2013;<lpage>2118</lpage>. <pub-id pub-id-type="doi">10.1523/jneurosci.1399-19.2020</pub-id> <pub-id pub-id-type="pmid">32001611</pub-id></citation></ref>
<ref id="B51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Seeber</surname> <given-names>M.</given-names></name> <name><surname>Cantonas</surname> <given-names>L.-M.</given-names></name> <name><surname>Hoevels</surname> <given-names>M.</given-names></name> <name><surname>Sesia</surname> <given-names>T.</given-names></name> <name><surname>Visser-Vandewalle</surname> <given-names>V.</given-names></name> <name><surname>Michel</surname> <given-names>C. M.</given-names></name></person-group> (<year>2019</year>). <article-title>Subcortical electrophysiological activity is detectable with high-density EEG source imaging.</article-title> <source><italic>Nat. Commun.</italic></source> <volume>10</volume>:<issue>753</issue>.</citation></ref>
<ref id="B52"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Snyder</surname> <given-names>J. S.</given-names></name> <name><surname>Large</surname> <given-names>E. W.</given-names></name></person-group> (<year>2005</year>). <article-title>Gamma-band activity reflects the metric structure of rhythmic tone sequences.</article-title> <source><italic>Cogn. Brain Res.</italic></source> <volume>24</volume> <fpage>117</fpage>&#x2013;<lpage>126</lpage>. <pub-id pub-id-type="doi">10.1016/j.cogbrainres.2004.12.014</pub-id> <pub-id pub-id-type="pmid">15922164</pub-id></citation></ref>
<ref id="B53"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sonkusare</surname> <given-names>S.</given-names></name> <name><surname>Breakspear</surname> <given-names>M.</given-names></name> <name><surname>Guo</surname> <given-names>C.</given-names></name></person-group> (<year>2019</year>). <article-title>Naturalistic stimuli in neuroscience: critically acclaimed.</article-title> <source><italic>Trends Cogn. Sci.</italic></source> <volume>23</volume> <fpage>699</fpage>&#x2013;<lpage>714</lpage>. <pub-id pub-id-type="doi">10.1016/j.tics.2019.05.004</pub-id> <pub-id pub-id-type="pmid">31257145</pub-id></citation></ref>
<ref id="B54"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stober</surname> <given-names>S.</given-names></name></person-group> (<year>2017</year>). <article-title>Toward studying music cognition with information retrieval techniques: lessons learned from the OpenMIIR initiative.</article-title> <source><italic>Front. Psychol.</italic></source> <volume>8</volume>:<issue>1255</issue>. <pub-id pub-id-type="doi">10.3389/fpsyg.2017.01255</pub-id> <pub-id pub-id-type="pmid">28824478</pub-id></citation></ref>
<ref id="B55"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stropahl</surname> <given-names>M.</given-names></name> <name><surname>Bauer</surname> <given-names>A.-K. R.</given-names></name> <name><surname>Debener</surname> <given-names>S.</given-names></name> <name><surname>Bleichner</surname> <given-names>M. G.</given-names></name></person-group> (<year>2018</year>). <article-title>Source-modeling auditory processes of EEG data using EEGLAB and brainstorm.</article-title> <source><italic>Front. Neurosci.</italic></source> <volume>12</volume>:<issue>309</issue>. <pub-id pub-id-type="doi">10.3389/fnins.2018.00309</pub-id> <pub-id pub-id-type="pmid">29867321</pub-id></citation></ref>
<ref id="B56"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sturm</surname> <given-names>I.</given-names></name> <name><surname>D&#x00E4;hne</surname> <given-names>S.</given-names></name> <name><surname>Blankertz</surname> <given-names>B.</given-names></name> <name><surname>Curio</surname> <given-names>G.</given-names></name></person-group> (<year>2015</year>). <article-title>Multi-variate EEG analysis as a novel tool to examine brain responses to naturalistic music stimuli.</article-title> <source><italic>PLoS One</italic></source> <volume>10</volume>:<issue>e0141281</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0141281</pub-id> <pub-id pub-id-type="pmid">26510120</pub-id></citation></ref>
<ref id="B57"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Toiviainen</surname> <given-names>P.</given-names></name> <name><surname>Burunat</surname> <given-names>I.</given-names></name> <name><surname>Brattico</surname> <given-names>E.</given-names></name> <name><surname>Vuust</surname> <given-names>P.</given-names></name> <name><surname>Alluri</surname> <given-names>V.</given-names></name></person-group> (<year>2020</year>). <article-title>The chronnectome of musical beat.</article-title> <source><italic>NeuroImage</italic></source> <volume>216</volume>:<issue>116191</issue>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2019.116191</pub-id> <pub-id pub-id-type="pmid">31525500</pub-id></citation></ref>
<ref id="B58"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vuust</surname> <given-names>P.</given-names></name> <name><surname>Witek</surname> <given-names>M. A.</given-names></name></person-group> (<year>2014</year>). <article-title>Rhythmic complexity and predictive coding: a novel approach to modeling rhythm and meter perception in music.</article-title> <source><italic>Front. Psychol.</italic></source> <volume>5</volume>:<issue>1111</issue>. <pub-id pub-id-type="doi">10.3389/fpsyg.2014.01111</pub-id> <pub-id pub-id-type="pmid">25324813</pub-id></citation></ref>
<ref id="B59"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wu</surname> <given-names>M. C.-K.</given-names></name> <name><surname>David</surname> <given-names>S. V.</given-names></name> <name><surname>Gallant</surname> <given-names>J. L.</given-names></name></person-group> (<year>2006</year>). <article-title>Complete functional characterization of sensory neurons by system identification.</article-title> <source><italic>Annu. Rev. Neurosci.</italic></source> <volume>29</volume> <fpage>477</fpage>&#x2013;<lpage>505</lpage>. <pub-id pub-id-type="doi">10.1146/annurev.neuro.29.051605.113024</pub-id> <pub-id pub-id-type="pmid">16776594</pub-id></citation></ref>
<ref id="B60"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zatorre</surname> <given-names>R. J.</given-names></name> <name><surname>Chen</surname> <given-names>J. L.</given-names></name> <name><surname>Penhune</surname> <given-names>V. B.</given-names></name></person-group> (<year>2007</year>). <article-title>When the brain plays music: auditory-motor interactions in music perception and production.</article-title> <source><italic>Nat. Rev. Neurosci.</italic></source> <volume>8</volume> <fpage>547</fpage>&#x2013;<lpage>558</lpage>. <pub-id pub-id-type="doi">10.1038/nrn2152</pub-id> <pub-id pub-id-type="pmid">17585307</pub-id></citation></ref>
<ref id="B61"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Zhou</surname> <given-names>X.</given-names></name> <name><surname>Chang</surname> <given-names>R.</given-names></name> <name><surname>Yang</surname> <given-names>Y.</given-names></name></person-group> (<year>2018</year>). <article-title>Effects of global and local contexts on chord processing: an ERP study.</article-title> <source><italic>Neuropsychologia</italic></source> <volume>109</volume> <fpage>149</fpage>&#x2013;<lpage>154</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuropsychologia.2017.12.016</pub-id> <pub-id pub-id-type="pmid">29246486</pub-id></citation></ref>
<ref id="B62"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zouridakis</surname> <given-names>G.</given-names></name> <name><surname>Simos</surname> <given-names>P. G.</given-names></name> <name><surname>Papanicolaou</surname> <given-names>A. C.</given-names></name></person-group> (<year>1998</year>). <article-title>Multiple bilaterally asymmetric cortical sources account for the auditory N1m component.</article-title> <source><italic>Brain Topogr.</italic></source> <volume>10</volume> <fpage>183</fpage>&#x2013;<lpage>189</lpage>.</citation></ref>
</ref-list>
<fn-group>
<fn id="footnote1">
<label>1</label>
<p><ext-link ext-link-type="uri" xlink:href="http://bit.ly/mirtoolbox">http://bit.ly/mirtoolbox</ext-link></p></fn>
<fn id="footnote2">
<label>2</label>
<p><ext-link ext-link-type="uri" xlink:href="https://github.com/bmcfee/librosa">https://github.com/bmcfee/librosa</ext-link></p></fn>
<fn id="footnote3">
<label>3</label>
<p><ext-link ext-link-type="uri" xlink:href="https://github.com/sstober/openmiir">https://github.com/sstober/openmiir</ext-link></p></fn>
<fn id="footnote4">
<label>4</label>
<p><ext-link ext-link-type="uri" xlink:href="https://mne.tools/stable/index.html">https://mne.tools/stable/index.html</ext-link></p></fn>
<fn id="footnote5">
<label>5</label>
<p><ext-link ext-link-type="uri" xlink:href="https://sccn.ucsd.edu/eeglab/index.php">https://sccn.ucsd.edu/eeglab/index.php</ext-link></p></fn>
<fn id="footnote6">
<label>6</label>
<p><ext-link ext-link-type="uri" xlink:href="https://github.com/mickcrosse/mTRF-Toolbox">https://github.com/mickcrosse/mTRF-Toolbox</ext-link></p></fn>
<fn id="footnote7">
<label>7</label>
<p><ext-link ext-link-type="uri" xlink:href="https://www.fieldtriptoolbox.org/">https://www.fieldtriptoolbox.org/</ext-link></p></fn>
<fn id="footnote8">
<label>8</label>
<p><ext-link ext-link-type="uri" xlink:href="http://www.neuro.uni-jena.de/cat/">http://www.neuro.uni-jena.de/cat/</ext-link></p></fn>
</fn-group>
</back>
</article>