<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Neurosci.</journal-id>
<journal-title>Frontiers in Neuroscience</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Neurosci.</abbrev-journal-title>
<issn pub-type="epub">1662-453X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fnins.2017.00550</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Kernel-Based Relevance Analysis with Enhanced Interpretability for Detection of Brain Activity Patterns</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Alvarez-Meza</surname> <given-names>Andres M.</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="author-notes" rid="fn001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/414307/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Orozco-Gutierrez</surname> <given-names>Alvaro</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/480573/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Castellanos-Dominguez</surname> <given-names>German</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/291489/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Automatics Research G., Universidad Tecnologica de Pereira</institution>, <addr-line>Pereira</addr-line>, <country>Colombia</country></aff>
<aff id="aff2"><sup>2</sup><institution>Signal Processing and Recognition G., Universidad Nacional de Colombia</institution>, <addr-line>Manizales</addr-line>, <country>Colombia</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Jose Manuel Ferrandez, Universidad Polit&#x000E9;cnica de Cartagena, Spain</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Dezhong Yao, University of Electronic Science and Technology of China, China; Manuel Grana, University of the Basque Country (UPV/EHU), Spain</p></fn>
<fn fn-type="corresp" id="fn001"><p>&#x0002A;Correspondence: Andres M. Alvarez-Meza <email>andres.alvarez1&#x00040;utp.edu.co</email></p></fn>
<fn fn-type="other" id="fn002"><p>This article was submitted to Brain Imaging Methods, a section of the journal Frontiers in Neuroscience</p></fn></author-notes>
<pub-date pub-type="epub">
<day>06</day>
<month>10</month>
<year>2017</year>
</pub-date>
<pub-date pub-type="collection">
<year>2017</year>
</pub-date>
<volume>11</volume>
<elocation-id>550</elocation-id>
<history>
<date date-type="received">
<day>23</day>
<month>05</month>
<year>2017</year>
</date>
<date date-type="accepted">
<day>20</day>
<month>09</month>
<year>2017</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2017 Alvarez-Meza, Orozco-Gutierrez and Castellanos-Dominguez.</copyright-statement>
<copyright-year>2017</copyright-year>
<copyright-holder>Alvarez-Meza, Orozco-Gutierrez and Castellanos-Dominguez</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract><p>We introduce <italic>Enhanced Kernel-based Relevance Analysis</italic> (EKRA), which aims to support the automatic identification of brain activity patterns using electroencephalographic recordings. EKRA is a data-driven strategy that incorporates two kernel functions to take advantage of the available joint information, associating neural responses to a given stimulus condition. In this regard, a <italic>Centered Kernel Alignment functional</italic> is adjusted to learn the linear projection that best discriminates the input feature set, optimizing the required free parameters automatically. Our approach is carried out in two scenarios: (i) feature selection by computing a relevance vector from extracted neural features to facilitate the physiological interpretation of a given brain activity task, and (ii) enhanced feature selection to perform an additional transformation of relevant features aiming to improve the overall identification accuracy. Accordingly, we provide an alternative feature relevance analysis strategy that improves the system performance while favoring data interpretability. For validation purposes, EKRA is tested in two well-known tasks of brain activity: motor imagery discrimination and epileptic seizure detection. The obtained results show that the EKRA approach estimates a relevant representation space extracted from the provided supervised information, emphasizing the salient input features. As a result, our proposal outperforms the state-of-the-art methods regarding brain activity discrimination accuracy with the benefit of enhanced physiological interpretation of the task at hand.</p></abstract>
<kwd-group>
<kwd>relevance analysis</kwd>
<kwd>kernel method</kwd>
<kwd>brain activity</kwd>
<kwd>motor imagery</kwd>
<kwd>epileptic seizure detection</kwd>
</kwd-group>
<contract-sponsor id="cn001">Departamento Administrativo de Ciencia, Tecnolog&#x000ED;a e Innovaci&#x000F3;n<named-content content-type="fundref-id">10.13039/100007637</named-content></contract-sponsor>
<counts>
<fig-count count="5"/>
<table-count count="3"/>
<equation-count count="8"/>
<ref-count count="66"/>
<page-count count="14"/>
<word-count count="10472"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>The electroencephalogram (EEG) is the electrical activity of neurons in subcortical structures recorded by a noninvasive electrode array placed on the scalp surface. Because of their high temporal resolution and low cost, EEG recordings have proven effective in many neurophysiological applications related to brain-computer interfaces (Nicolas-Alonso and Gomez-Gil, <xref ref-type="bibr" rid="B47">2012</xref>), automated diagnosis of neurological diseases like epilepsy (Faust et al., <xref ref-type="bibr" rid="B25">2015</xref>), neuromarketing (Vecchiato et al., <xref ref-type="bibr" rid="B55">2015</xref>), and sensorimotor restoration (Pisotta et al., <xref ref-type="bibr" rid="B48">2015</xref>), just to mention a few examples. As part of the data analysis in these applications, however, it is essential to handle input feature spaces that frequently have a large dimension (&#x0007E;10<sup>2</sup>&#x02013;10<sup>3</sup> features) but a limited number of samples (up to a few dozen) (Wang et al., <xref ref-type="bibr" rid="B56">2015</xref>). As a consequence of such high-dimensional data, most machine learning algorithms may suffer from inefficiency and low accuracy (Fang et al., <xref ref-type="bibr" rid="B24">2016</xref>).</p>
<p>In practice, feature extraction from EEG recordings is a particular form of data reduction, which is carried out to represent the brain states, enabling efficient pattern identification and translation of mental states. A variety of EEG features may be extracted to feed machine learning classifiers. Linear extraction methods, which are better suited to stationary signal processing, are widespread; examples include Fourier-based spectral analysis, auto-regressive models, and Time-Frequency Distributions (Al-Fahoum and Al-Fraihat, <xref ref-type="bibr" rid="B3">2014</xref>). In the case of sudden and transient signal changes, more popular methods are Wavelet decomposition (Duque-Mu&#x000F1;oz et al., <xref ref-type="bibr" rid="B22">2014</xref>) and empirical mode decomposition (Zhang et al., <xref ref-type="bibr" rid="B64">2016</xref>). Due to the high non-stationarity and non-linearity of EEG data, nonlinear extraction methods (e.g., entropic and complexity measures) are also employed; they usually provide a higher classification accuracy, but at the cost of an increased computational burden (Acharya et al., <xref ref-type="bibr" rid="B1">2012</xref>; Chen et al., <xref ref-type="bibr" rid="B16">2015</xref>), not to mention their difficult application to multi-channel setups (Chen et al., <xref ref-type="bibr" rid="B17">2016</xref>). Therefore, each method has particular strengths and weaknesses, meaning that the effectiveness of each feature extraction method depends on the application.</p>
<sec>
<title>1.1. Related work</title>
<p>There are two alternative approaches to addressing the problem of a large amount of EEG data (Naeem et al., <xref ref-type="bibr" rid="B45">2009</xref>): (i) <italic>Channel selection</italic>, which intends to choose the subset of electrodes contributing the most to the desired performance. Besides avoiding redundancy of non-focal/unnecessary channels, this procedure makes visual EEG monitoring more practical when the number of needed channels becomes very small (Alotaiby et al., <xref ref-type="bibr" rid="B5">2015</xref>). A significant disadvantage of decreasing the number of EEG channels is the unrealistic assumption that the EEG signal at each electrode is produced by cortical activity coming only from its immediate vicinity (Haufe et al., <xref ref-type="bibr" rid="B34">2014</xref>). (ii) <italic>Dimensionality Reduction</italic>, which projects the original feature space into a smaller space representation, aiming to reduce the overwhelming number of extracted features (Birjandtalab et al., <xref ref-type="bibr" rid="B11">2017</xref>).</p>
<p>Although either approach can be performed separately, there is a growing interest in jointly minimizing the number of channels and features to be handled by the classification algorithms (Martinez-Leon et al., <xref ref-type="bibr" rid="B43">2015</xref>). According to the way the input data points are mapped into a lower-dimensional space, dimensionality reduction methods can be categorized as linear or non-linear. The former approaches (like Principal Component Analysis (Zajacova et al., <xref ref-type="bibr" rid="B61">2015</xref>), Discriminant and Common Spatial Patterns (Liao et al., <xref ref-type="bibr" rid="B41">2007</xref>; Zhang et al., <xref ref-type="bibr" rid="B65">2015</xref>), and Spatio-Spectral Decomposition) are popular choices for either EEG representation case (channels or features) with the benefit of computational efficiency, numerical stabilization, and denoising capability. Nevertheless, they face a deficiency: the feature spaces extracted from EEG signals can induce significant and complex variations regarding the nonlinearity and sparsity of the manifolds, which can hardly be encoded by linear decompositions (Sturm et al., <xref ref-type="bibr" rid="B52">2016</xref>). Moreover, linear dimensionality reduction methods usually select the most compact and relevant set of features based on their contribution to a linear regression model, which might not be the best option for a non-linear classifier (Adeli et al., <xref ref-type="bibr" rid="B2">2017</xref>).</p>
<p>In turn, a non-linear mapping can more precisely preserve the information about the local neighborhoods of data points by introducing either locally linearized structures or pairwise distances along the subtle non-linear manifold, attempting to unfold more complex high-dimensional data as separable groups (Lee and Verleysen, <xref ref-type="bibr" rid="B40">2007</xref>). Among machine learning approaches to dimensionality reduction, kernel-based analysis is promising because of the following properties (Chu et al., <xref ref-type="bibr" rid="B18">2011</xref>): (i) kernel methods apply a non-linear mapping to a higher dimensional space where the original non-linear data become linear or near-linear. (ii) The kernel trick decreases the computational complexity of high dimensional data as the parameter evaluation domain is reduced from the explicit feature space to the kernel space. In practice, an open issue is the definition of the kernel transformation that best matches the type of nonlinearity of the application (Zimmer et al., <xref ref-type="bibr" rid="B66">2015</xref>). Nevertheless, more effort is devoted to developing metric learning approaches that allow a kernel to adjust the importance of individual features for the task under consideration, usually exploiting a given amount of supervisory information (Hurtado-Rinc&#x000F3;n et al., <xref ref-type="bibr" rid="B37">2016</xref>). Hence, kernel-based relevance analysis can use the estimated weights to highlight the features or dimensions relevant for improving the classification performance (Brockmeier et al., <xref ref-type="bibr" rid="B13">2014</xref>).</p>
</sec>
<sec>
<title>1.2. Our contribution</title>
<p>Regarding channel selection and dimensionality reduction of EEG data, some issues remain open when employing kernel-based metric learning algorithms: (i) Adaptation to the complex neural relationships is far from being an easy task, in particular, in taking advantage of the supervised information by non-linear methods (Fukumizu et al., <xref ref-type="bibr" rid="B27">2004</xref>). (ii) A direct physiological interpretation of a non-linear mapping is not always possible. (iii) The selected or reduced feature sets with the smallest size can provide a high rate of false alarms and missed detections. This situation hinders a solid interpretation of the mechanisms underlying the problem (Gajic et al., <xref ref-type="bibr" rid="B28">2014</xref>). (iv) The computational complexity is a strong constraint because of the extensive processing time and parameter tuning (mainly performed using heuristic methods). Thus, there is a need for identifying the most discriminating features by finding a trade-off between system complexity and accuracy (Bhattacharyya et al., <xref ref-type="bibr" rid="B10">2014</xref>). (v) Channel performance varies widely across subjects (Feess et al., <xref ref-type="bibr" rid="B26">2013</xref>). Owing to individual peculiarities, this high inter-subject variability poses one of the biggest challenges in brain activity identification.</p>
<p>In this work, we develop a kernel-based approach for enhanced feature relevance analysis, termed <italic>Enhanced Kernel Relevance Analysis</italic> (EKRA), aiming to identify brain activity patterns automatically. In particular, the proposed relevance analysis comprises two kernel functions to take advantage of the available joint information, associating neural responses with a given stimulus/condition. In this regard, we employ a <italic>Centered Kernel Alignment</italic> (CKA)-based functional to learn a linear projection that encodes all discriminative input features, benefiting from the non-linear notion of similarity behind the studied kernels (Cortes et al., <xref ref-type="bibr" rid="B19">2012</xref>). The present approach is an extension of our previous <italic>Kernel Relevance Analysis</italic> strategy described in (Arias-Mora et al., <xref ref-type="bibr" rid="B8">2015</xref>; Hurtado-Rinc&#x000F3;n et al., <xref ref-type="bibr" rid="B37">2016</xref>). In particular, EKRA can be implemented both as a feature selection (EKRA-S) and as an enhanced feature selection (EKRA-ES) tool. The former introduces a feature relevance index vector to quantify the contribution of each input feature to discriminating the possible stimuli/conditions. The latter adapts the relevance index vector to compute a representation space that favors the separability of brain activity patterns without redundancy. Besides, an iterative gradient descent optimization is applied to calculate the EKRA projection matrix and the required kernel free parameters. From the attained results we conclude that EKRA finds a suitable feature representation space that ensures the identification of brain activity patterns while favoring the physiological interpretation of the studied phenomenon. Indeed, our proposal outperforms state-of-the-art approaches that carry out multivariate feature selection and dimensionality reduction.</p>
<p>The rest of the paper is organized as follows: Section 2 develops the theoretical background of the introduced EKRA, Section 3 describes the experimental set-up for the identification of brain activity patterns, Section 4 discusses the obtained results, and Section 5 outlines the concluding remarks.</p>
</sec>
</sec>
<sec sec-type="methods" id="s2">
<title>2. Methods</title>
<sec>
<title>2.1. Feature extraction using centered kernel alignment</title>
<p>Given that neural responses to brain activity tasks are contained in a domain <inline-formula><mml:math id="M1"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">X</mml:mi></mml:mrow></mml:math></inline-formula>, a kernel <inline-formula><mml:math id="M2"><mml:msub><mml:mrow><mml:mi>&#x003BA;</mml:mi></mml:mrow><mml:mrow><mml:mi>X</mml:mi></mml:mrow></mml:msub><mml:mo>:</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">X</mml:mi></mml:mrow><mml:mo>&#x000D7;</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">X</mml:mi></mml:mrow><mml:mo>&#x02192;</mml:mo><mml:mi>&#x0211D;</mml:mi></mml:math></inline-formula> is assumed to be a positive-definite function, which reflects an implicit mapping <inline-formula><mml:math id="M3"><mml:mi>&#x003D5;</mml:mi><mml:mo>:</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">X</mml:mi></mml:mrow><mml:mo>&#x02192;</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">H</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>X</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>, associating an element <inline-formula><mml:math id="M4"><mml:mi>x</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">X</mml:mi></mml:mrow></mml:math></inline-formula> with the element <inline-formula><mml:math id="M5"><mml:mi>&#x003D5;</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">H</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>X</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula> that belongs to the Reproducing Kernel Hilbert Space (RKHS), <inline-formula><mml:math id="M6"><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">H</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>X</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. 
For associating elements, kernel functions are built using several bivariate measures of similarity, which are based on the inner product between samples contained in an RKHS. Although various functions have been tested, the Gaussian function is preferred in pattern classification and machine learning applications since it aims at finding an RKHS with universal approximating ability, not to mention its mathematical tractability (Liu et al., <xref ref-type="bibr" rid="B42">2011</xref>; Brockmeier et al., <xref ref-type="bibr" rid="B14">2013</xref>). For Gaussian kernels, the pairwise similarity between samples <inline-formula><mml:math id="M7"><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup><mml:mo>&#x02208;</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">X</mml:mi></mml:mrow></mml:math></inline-formula> is computed as follows (Wang et al., <xref ref-type="bibr" rid="B57">2016</xref>):</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M8"><mml:mrow><mml:msub><mml:mi>&#x003BA;</mml:mi><mml:mi>X</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo>;</mml:mo><mml:mi>&#x003C3;</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mi>exp</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msup><mml:mi>d</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo stretchy='false'>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo stretchy='false'>)</mml:mo><mml:mo>/</mml:mo><mml:mn>2</mml:mn><mml:msup><mml:mi>&#x003C3;</mml:mi><mml:mn>2</mml:mn></mml:msup><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M9"><mml:mo class="qopname">d</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mo>&#x000B7;</mml:mo><mml:mo>,</mml:mo><mml:mo>&#x000B7;</mml:mo></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>:</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">X</mml:mi></mml:mrow><mml:mo>&#x000D7;</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">X</mml:mi></mml:mrow><mml:mo>&#x021A6;</mml:mo><mml:mi>&#x0211D;</mml:mi></mml:math></inline-formula> is a distance operator defined on the neural response domain <inline-formula><mml:math id="M10"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">X</mml:mi></mml:mrow></mml:math></inline-formula>, and &#x003C3; &#x02208; &#x0211D;<sup>&#x0002B;</sup> is the kernel bandwidth that rules the observation window within which the similarity is assessed. Likewise, on the neural stimulus/condition space <inline-formula><mml:math id="M11"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">L</mml:mi></mml:mrow></mml:math></inline-formula>, which contains the target membership of the neural responses (e.g., brain activity task labels), we also set a positive definite kernel <inline-formula><mml:math id="M12"><mml:msub><mml:mrow><mml:mi>&#x003BA;</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi></mml:mrow></mml:msub><mml:mo>:</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">L</mml:mi></mml:mrow><mml:mo>&#x000D7;</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">L</mml:mi></mml:mrow><mml:mo>&#x021A6;</mml:mo><mml:mi>&#x0211D;</mml:mi></mml:math></inline-formula> that reflects a non-linear mapping <inline-formula><mml:math id="M13"><mml:mi>&#x003C6;</mml:mi><mml:mo>:</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">L</mml:mi></mml:mrow><mml:mo>&#x021A6;</mml:mo><mml:msub><mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">H</mml:mi></mml:mrow></mml:mrow><mml:mrow><mml:mi>L</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. Thus, given a pair of samples <inline-formula><mml:math id="M14"><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup><mml:mo>&#x02208;</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">L</mml:mi></mml:mrow></mml:math></inline-formula>, the pairwise similarity for neural stimuli/conditions is defined as <inline-formula><mml:math id="M15"><mml:msub><mml:mrow><mml:mi>&#x003BA;</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003C0;</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:msup><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:msub></mml:math></inline-formula>, where the delta function is <inline-formula><mml:math id="M16"><mml:msub><mml:mrow><mml:mi>&#x003C0;</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:msup><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:math></inline-formula> if <italic>l</italic>&#x0003D;<italic>l</italic>&#x02032;, and <inline-formula><mml:math id="M17"><mml:msub><mml:mrow><mml:mi>&#x003C0;</mml:mi></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:msup><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:math></inline-formula> otherwise.</p>
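<p>As an illustration only, the two kernels defined above can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions rather than the implementation used in this work: the distance d(&#x000B7;, &#x000B7;) is taken to be the Euclidean distance, and the function names, the toy feature values, and the bandwidth &#x003C3; = 1 are hypothetical.</p>
<preformat>
```python
import math


def gaussian_kernel_matrix(samples, sigma):
    """Gaussian kernel of Eq. (1) for every pair of rows in `samples`.

    Assumption: d(x, x') is the Euclidean distance.
    """
    n = len(samples)
    kernel = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            squared_distance = sum(
                (a - b) ** 2 for a, b in zip(samples[i], samples[j])
            )
            kernel[i][j] = math.exp(-squared_distance / (2.0 * sigma ** 2))
    return kernel


def delta_kernel_matrix(labels):
    """Label kernel: 1 when two labels coincide, 0 otherwise."""
    return [[1.0 if a == b else 0.0 for b in labels] for a in labels]


# Hypothetical toy data: three trials, two features each, two conditions.
features = [[0.0, 0.0], [0.0, 1.0], [3.0, 4.0]]
labels = [0, 0, 1]

K_X = gaussian_kernel_matrix(features, sigma=1.0)
K_L = delta_kernel_matrix(labels)
```
</preformat>
<p>In this toy case, the first two trials are close in the feature space and share a label, so both kernels assign them a high similarity, whereas the third trial is nearly dissimilar to the others under either kernel.</p>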
<p>It is worth noting that each defined above kernel reflects a different notion of similarity. So, we apply two kernel functions sequentially to assess the shared information between the neural responses to a particular stimulus/condition and the corresponding labels. Therefore, we must still evaluate how well the kernel function, &#x003BA;<sub><italic>X</italic></sub>, matches the target kernel of labels, &#x003BA;<sub><italic>L</italic></sub>. To this end, we introduce a kernel target alignment to appraise the similarity between a couple of characterizing kernel functions, employing the inner product of both kernels to estimate the dependence between the jointly sampled data (Gretton et al., <xref ref-type="bibr" rid="B32">2005</xref>). Thus, the statistical alignment between &#x003BA;<sub><italic>X</italic></sub> and &#x003BA;<sub><italic>L</italic></sub> (termed <italic>Centered Kernel Alignment</italic> &#x02013; CKA) is computed as their normalized inner product averaged across all realization pairs as below (Cortes et al., <xref ref-type="bibr" rid="B19">2012</xref>):</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="M18"><mml:mrow><mml:mi>&#x003C1;</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003BA;</mml:mi><mml:mi>X</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x003BA;</mml:mi><mml:mi>L</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mi>x</mml:mi><mml:mi>x</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo>,</mml:mo><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>l</mml:mi><mml:mo>&#x02032;</mml:mo></mml:mrow></mml:msub><mml:mo>&#x0007B;</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>&#x003BA;</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mi>X</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo>;</mml:mo><mml:mi>&#x003C3;</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:msub><mml:mover accent='true'><mml:mi>&#x003BA;</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mi>L</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>l</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x0007D;</mml:mo></mml:mrow><mml:mrow><mml:msqrt><mml:mrow><mml:msub><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mi>x</mml:mi><mml:mi>x</mml:mi><mml:mo>&#x02032;</mml:mo></mml:mrow></mml:msub><mml:mo>&#x0007B;</mml:mo><mml:msubsup><mml:mover accent='true'><mml:mi>&#x003BA;</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mi>X</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mo stretchy='false'>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo>;</mml:mo><mml:mi>&#x003C3;</mml:mi><mml:mo 
stretchy='false'>)</mml:mo><mml:mo>&#x0007D;</mml:mo><mml:msub><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:msup><mml:mi>l</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup></mml:mrow></mml:msub><mml:mo>&#x0007B;</mml:mo><mml:msubsup><mml:mover accent='true'><mml:mi>&#x003BA;</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mi>L</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mo stretchy='false'>(</mml:mo><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>l</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x0007D;</mml:mo></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
<p>where notation &#x1D53C;<sub><italic>x</italic></sub>{&#x000B7;} stands for the expectation value operator calculated over the random variable <inline-formula><mml:math id="M19"><mml:mi>x</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">X</mml:mi></mml:mrow></mml:math></inline-formula>, and notation <inline-formula><mml:math id="M20"><mml:mover accent="true"><mml:mrow><mml:mi>&#x003BA;</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mo>&#x000B7;</mml:mo></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> stands for the centered version of each kernel that is estimated as follows, respectively:</p>
<disp-formula id="E3"><label>(3a)</label><mml:math id="M21"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:msub><mml:mover accent='true'><mml:mi>&#x003BA;</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mi>X</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo>;</mml:mo><mml:mi>&#x003C3;</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>&#x003BA;</mml:mi><mml:mi>X</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo>;</mml:mo><mml:mi>&#x003C3;</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mi>x</mml:mi><mml:mo>&#x02032;</mml:mo></mml:mrow></mml:msub><mml:mo>&#x0007B;</mml:mo><mml:msub><mml:mi>&#x003BA;</mml:mi><mml:mi>X</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo>;</mml:mo><mml:mi>&#x003C3;</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x0007D;</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x1D53C;</mml:mi><mml:mi>x</mml:mi></mml:msub><mml:mo>&#x0007B;</mml:mo><mml:msub><mml:mi>&#x003BA;</mml:mi><mml:mi>X</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo>;</mml:mo><mml:mi>&#x003C3;</mml:mi><mml:mo 
stretchy='false'>)</mml:mo><mml:mo>&#x0007D;</mml:mo><mml:mo>+</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:msub><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mi>x</mml:mi><mml:mi>x</mml:mi><mml:mo>&#x02032;</mml:mo></mml:mrow></mml:msub><mml:mo>&#x0007B;</mml:mo><mml:msub><mml:mi>&#x003BA;</mml:mi><mml:mi>X</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>x</mml:mi><mml:mo>,</mml:mo><mml:mi>x</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo>;</mml:mo><mml:mi>&#x003C3;</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x0007D;</mml:mo><mml:mo>,</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E4"><label>(3b)</label><mml:math id="M22"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:msub><mml:mover accent='true'><mml:mi>&#x003BA;</mml:mi><mml:mo>&#x000AF;</mml:mo></mml:mover><mml:mi>L</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>l</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:msub><mml:mi>&#x003BA;</mml:mi><mml:mi>L</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>l</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x02032;</mml:mo></mml:mrow></mml:msub><mml:mo>&#x0007B;</mml:mo><mml:msub><mml:mi>&#x003BA;</mml:mi><mml:mi>L</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>l</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x0007D;</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x1D53C;</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mo>&#x0007B;</mml:mo><mml:msub><mml:mi>&#x003BA;</mml:mi><mml:mi>L</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>l</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x0007D;</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>+</mml:mo><mml:msub><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mi>l</mml:mi><mml:mi>l</mml:mi><mml:mo>&#x02032;</mml:mo></mml:mrow></mml:msub><mml:mo>&#x0007B;</mml:mo><mml:msub><mml:mi>&#x003BA;</mml:mi><mml:mi>L</mml:mi></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi>l</mml:mi><mml:mo>,</mml:mo><mml:mi>l</mml:mi><mml:mo>&#x02032;</mml:mo><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x0007D;</mml:mo><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Hence, the more closely the pairwise similarities in the <inline-formula><mml:math id="M23"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">X</mml:mi></mml:mrow></mml:math></inline-formula> and <inline-formula><mml:math id="M24"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">L</mml:mi></mml:mrow></mml:math></inline-formula> spaces agree, the higher their CKA alignment value &#x003C1;&#x02208;&#x0211D;[0, 1].</p>
<p>In practice, provided an input representation set <italic><bold>X</bold></italic>&#x02208;&#x0211D;<sup><italic>N</italic>&#x000D7;<italic>P</italic></sup> (with <inline-formula><mml:math id="M25"><mml:mstyle mathvariant="bold"><mml:mi>X</mml:mi></mml:mstyle><mml:mo>&#x02282;</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">X</mml:mi></mml:mrow><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mi>P</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>) together with a vector of respective stimulus/condition labels <italic><bold>l</bold></italic>&#x02208;&#x02124;<sup><italic>N</italic></sup> (<inline-formula><mml:math id="M26"><mml:mstyle mathvariant="bold"><mml:mi>l</mml:mi></mml:mstyle><mml:mo>&#x02282;</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">L</mml:mi></mml:mrow><mml:mo>&#x02208;</mml:mo><mml:mi>&#x02124;</mml:mi></mml:math></inline-formula>), we extract each characterizing kernel matrix, <inline-formula><mml:math id="M27"><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>K</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>X</mml:mi></mml:mstyle></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> and <inline-formula><mml:math id="M28"><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>K</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>l</mml:mi></mml:mstyle></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula>. 
The former matrix holds elements <inline-formula><mml:math id="M29"><mml:msubsup><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:msup><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>X</mml:mi></mml:mstyle></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003BA;</mml:mi></mml:mrow><mml:mrow><mml:mi>X</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>x</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>x</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:msup><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> with <inline-formula><mml:math id="M30"><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>x</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>x</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:msup><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mstyle mathvariant="bold-italic"><mml:mi>X</mml:mi></mml:mstyle></mml:math></inline-formula>, while the latter matrix has elements <inline-formula><mml:math id="M31"><mml:msubsup><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:msup><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mstyle 
mathvariant="bold"><mml:mi>l</mml:mi></mml:mstyle></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003BA;</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:msup><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> with <inline-formula><mml:math id="M32"><mml:msub><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:msup><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mstyle mathvariant="bold"><mml:mi>l</mml:mi></mml:mstyle></mml:math></inline-formula> (<italic>n, n</italic>&#x02032;&#x02208;&#x02115;[1, <italic>N</italic>]), where <italic>N</italic>&#x02208;&#x02115; is the number of neural response samples and <italic>P</italic>&#x02208;&#x02115; is the number of extracted features. Then, using Equation (3), the empirical estimate of the CKA alignment can be calculated as follows:</p>
<disp-formula id="E5"><label>(4)</label><mml:math id="M33"><mml:mrow><mml:mover accent='true'><mml:mi>&#x003C1;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>K</mml:mi><mml:mi>X</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>K</mml:mi><mml:mi>l</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mo>&#x02329;</mml:mo><mml:msub><mml:mrow><mml:mover accent='true'><mml:mi>K</mml:mi><mml:mo stretchy='true'>&#x000AF;</mml:mo></mml:mover></mml:mrow><mml:mi>X</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mover accent='true'><mml:mi>K</mml:mi><mml:mo stretchy='true'>&#x000AF;</mml:mo></mml:mover></mml:mrow><mml:mi>l</mml:mi></mml:msub><mml:mo>&#x0232A;</mml:mo></mml:mrow><mml:mtext style="font-family:courier">F</mml:mtext></mml:msub></mml:mrow><mml:mrow><mml:msqrt><mml:mrow><mml:msub><mml:mrow><mml:mo>&#x02329;</mml:mo><mml:msub><mml:mrow><mml:mover accent='true'><mml:mi>K</mml:mi><mml:mo stretchy='true'>&#x000AF;</mml:mo></mml:mover></mml:mrow><mml:mi>X</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mover accent='true'><mml:mi>K</mml:mi><mml:mo stretchy='true'>&#x000AF;</mml:mo></mml:mover></mml:mrow><mml:mi>X</mml:mi></mml:msub><mml:mo>&#x0232A;</mml:mo></mml:mrow><mml:mtext style="font-family:courier">F</mml:mtext></mml:msub><mml:msub><mml:mrow><mml:mo>&#x02329;</mml:mo><mml:msub><mml:mrow><mml:mover accent='true'><mml:mi>K</mml:mi><mml:mo stretchy='true'>&#x000AF;</mml:mo></mml:mover></mml:mrow><mml:mi>l</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mover accent='true'><mml:mi>K</mml:mi><mml:mo stretchy='true'>&#x000AF;</mml:mo></mml:mover></mml:mrow><mml:mi>l</mml:mi></mml:msub><mml:mo>&#x0232A;</mml:mo></mml:mrow><mml:mtext style="font-family:courier">F</mml:mtext></mml:msub></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
<p>where notation <inline-formula><mml:math id="M34"><mml:mover accent="true"><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>K</mml:mi></mml:mstyle></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:math></inline-formula> stands for the centered kernel matrix calculated as <inline-formula><mml:math id="M35"><mml:mover accent="true"><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>K</mml:mi></mml:mstyle></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mover accent="true"><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>I</mml:mi></mml:mstyle></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover><mml:mstyle mathvariant="bold-italic"><mml:mi>K</mml:mi></mml:mstyle><mml:mover accent="true"><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>I</mml:mi></mml:mstyle></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover></mml:math></inline-formula>, where <inline-formula><mml:math id="M36"><mml:mover accent="true"><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>I</mml:mi></mml:mstyle></mml:mrow><mml:mo>&#x0007E;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mstyle mathvariant="bold-italic"><mml:mi>I</mml:mi></mml:mstyle><mml:mo>-</mml:mo><mml:msup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mn>1</mml:mn></mml:mstyle></mml:mrow><mml:mrow><mml:mo>&#x022A4;</mml:mo></mml:mrow></mml:msup><mml:mstyle mathvariant="bold-italic"><mml:mn>1</mml:mn></mml:mstyle><mml:mo>/</mml:mo><mml:mi>N</mml:mi></mml:math></inline-formula> is the empirical centering matrix, <italic><bold>I</bold></italic>&#x02208;&#x0211D;<sup><italic>N</italic>&#x000D7;<italic>N</italic></sup> is the identity matrix, and <bold>1</bold>&#x02208;&#x0211D;<sup><italic>N</italic></sup> is the all-ones vector. Notation &#x02329;&#x000B7;, &#x000B7;&#x0232A;<sub><monospace>F</monospace></sub> denotes the Frobenius inner product between matrices.</p>
<p>Consequently, the alignment in Equation (4) is a data-driven estimator that, from the input matrix <italic><bold>X</bold></italic>, allows quantifying the similarity between the input sample space and the stimulus/condition labels <italic><bold>l</bold></italic>.</p>
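<p>As a rough illustration of Equation (4), the following NumPy sketch centers two kernel matrices and computes their empirical alignment. The function names (<monospace>center_kernel</monospace>, <monospace>cka</monospace>) and the toy data are hypothetical and introduced only for this example. In particular, the Gaussian kernel on the inputs and the delta kernel on the labels are assumed choices that may differ from the kernels used in the reported experiments.</p>
<preformat>
```python
import numpy as np


def center_kernel(K):
    """Return the centered kernel matrix I_tilde @ K @ I_tilde."""
    n = K.shape[0]
    # Empirical centering matrix: identity minus the (N x N) matrix of 1/N.
    I_tilde = np.eye(n) - np.ones((n, n)) / n
    return I_tilde @ K @ I_tilde


def cka(K_x, K_l):
    """Empirical centered kernel alignment of two N x N kernel matrices."""
    Kx_bar = center_kernel(K_x)
    Kl_bar = center_kernel(K_l)
    # Frobenius inner products, i.e., sums of element-wise products.
    numerator = np.sum(Kx_bar * Kl_bar)
    denominator = np.sqrt(np.sum(Kx_bar * Kx_bar) * np.sum(Kl_bar * Kl_bar))
    return numerator / denominator


# Toy data (assumed): N = 6 samples, P = 2 features, two conditions.
X = np.array([[0.0, 0.1], [0.2, 0.0], [0.1, 0.2],
              [3.0, 3.1], [3.2, 2.9], [2.9, 3.0]])
l = np.array([0, 0, 0, 1, 1, 1])

# Assumed Gaussian kernel on the inputs (bandwidth sigma = 1).
sq_dist = np.sum((X[:, None, :] - X[None, :, :]) ** 2, axis=2)
K_x = np.exp(-sq_dist / 2.0)
# Assumed delta kernel on the labels: 1 if both labels match, else 0.
K_l = (l[:, None] == l[None, :]).astype(float)

rho = cka(K_x, K_l)
```
</preformat>
<p>Because the two toy clusters are well separated, the input kernel nearly reproduces the label kernel and <monospace>rho</monospace> lies close to its upper bound of one; aligning a kernel matrix with itself returns exactly one up to rounding.</p>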
</sec>
<sec>
<title>2.2. Enhanced kernel-based relevance analysis</title>
<p>To implement the selected Gaussian kernel &#x003BA;<sub><italic>X</italic></sub>, we further rely on the Mahalanobis distance to perform the pairwise comparison between samples <italic><bold>x</bold></italic><sub><italic>n</italic></sub> and <inline-formula><mml:math id="M37"><mml:msub><mml:mrow><mml:mstyle mathvariant="bold"><mml:mi>x</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:msup><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup></mml:mrow></mml:msub></mml:math></inline-formula>. Namely, the distance function in Equation (4) is fixed as follows:</p>
<disp-formula id="E6"><label>(5)</label><mml:math id="M38"><mml:mrow><mml:msubsup><mml:mtext>d</mml:mtext><mml:mi>A</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>x</mml:mi></mml:mstyle><mml:mi>n</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>x</mml:mi></mml:mstyle><mml:msup><mml:mi>n</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>x</mml:mi></mml:mstyle><mml:mi>n</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>x</mml:mi></mml:mstyle><mml:msup><mml:mi>n</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:msup><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mo>&#x022A4;</mml:mo></mml:msup><mml:msup><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>x</mml:mi></mml:mstyle><mml:mi>n</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>x</mml:mi></mml:mstyle><mml:msup><mml:mi>n</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow><mml:mo>&#x022A4;</mml:mo></mml:msup><mml:mo>,</mml:mo><mml:mo>&#x02200;</mml:mo><mml:mi>n</mml:mi><mml:mo>,</mml:mo><mml:msup><mml:mi>n</mml:mi><mml:mo>&#x02032;</mml:mo></mml:msup><mml:mo>&#x02208;</mml:mo><mml:mi>N</mml:mi></mml:mrow></mml:math></disp-formula>
<p>where matrix <italic><bold>A</bold></italic>&#x02208;&#x0211D;<sup><italic>P</italic>&#x000D7;<italic>M</italic></sup> holds the linear projection in the form <italic><bold>y</bold></italic><sub><italic>n</italic></sub> &#x0003D; <italic><bold>x</bold></italic><sub><italic>n</italic></sub><italic><bold>A</bold></italic>, with <inline-formula><mml:math id="M39"><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>y</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mi>M</mml:mi></mml:mrow></mml:msup><mml:mo>,</mml:mo></mml:math></inline-formula> where <italic><bold>AA</bold></italic><sup>&#x022A4;</sup> is the corresponding inverse covariance matrix and <italic>M</italic>&#x02264;<italic>P</italic> is assumed. Therefore, to compute the projection matrix <italic><bold>A</bold></italic>, the CKA-based cost function in Equation (4) can be integrated into the following kernel-based learning algorithm:</p>
<disp-formula id="E7"><label>(6)</label><mml:math id="M40"><mml:mrow><mml:mover accent='true'><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mo stretchy='true'>&#x0005E;</mml:mo></mml:mover><mml:mo>=</mml:mo><mml:mi>arg</mml:mi><mml:mtext>&#x000A0;</mml:mtext><mml:munder><mml:mrow><mml:mi>max</mml:mi></mml:mrow><mml:mrow><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle></mml:mrow></mml:munder><mml:mi>log</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mover accent='true'><mml:mi>&#x003C1;</mml:mi><mml:mo>&#x0005E;</mml:mo></mml:mover><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>K</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>X</mml:mi></mml:mstyle></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>A</mml:mi></mml:mstyle><mml:mo>,</mml:mo><mml:mi>&#x003C3;</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>K</mml:mi></mml:mstyle><mml:mstyle mathvariant='bold-italic' mathsize='normal'><mml:mi>l</mml:mi></mml:mstyle></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo stretchy='false'>)</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
<p>where the logarithm is used only for mathematical convenience. Note that we highlight the dependence of the kernel matrix <italic><bold>K</bold></italic><sub><italic><bold>X</bold></italic></sub>(<italic><bold>A</bold></italic>, &#x003C3;) on both the projection matrix <italic><bold>A</bold></italic> and the Gaussian kernel bandwidth &#x003C3;. In this work, we propose to solve the optimization problem in Equation (6) using an iterative solution based on the well-known gradient descent approach (see Appendix <xref ref-type="supplementary-material" rid="SM1">A</xref> in Supplementary Material for further details).</p>
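<p>The interplay between Equations (5) and (6) can be sketched as follows. This is a minimal illustration, not the procedure of the Supplementary Material: the gradient is approximated by central finite differences instead of the analytic expression, the Gaussian kernel is assumed to take the form exp(&#x02212;d<sup>2</sup>/2&#x003C3;<sup>2</sup>), the label kernel is assumed to be a delta kernel, and every function name and the toy data are hypothetical.</p>
<preformat>
```python
import numpy as np


def cka(K1, K2):
    """Empirical centered kernel alignment, as in Equation (4)."""
    n = K1.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    K1c, K2c = H @ K1 @ H, H @ K2 @ H
    return np.sum(K1c * K2c) / np.sqrt(np.sum(K1c * K1c) * np.sum(K2c * K2c))


def projected_kernel(X, A, sigma):
    """Gaussian kernel built on the Mahalanobis distance of Equation (5)."""
    Y = X @ A  # rows are y_n = x_n A
    diff = Y[:, None, :] - Y[None, :, :]
    d2 = np.sum(diff ** 2, axis=2)  # equals (x_n - x_n') A A^T (x_n - x_n')^T
    return np.exp(-d2 / (2.0 * sigma ** 2))  # assumed Gaussian form


def objective(A, X, K_l, sigma):
    """Cost function of Equation (6): logarithm of the alignment."""
    return np.log(cka(projected_kernel(X, A, sigma), K_l))


def numerical_gradient(A, X, K_l, sigma, eps=1e-6):
    """Central finite differences; a stand-in for the analytic gradient."""
    grad = np.zeros_like(A)
    for idx in np.ndindex(*A.shape):
        step = np.zeros_like(A)
        step[idx] = eps
        grad[idx] = (objective(A + step, X, K_l, sigma)
                     - objective(A - step, X, K_l, sigma)) / (2.0 * eps)
    return grad


def fit_projection(X, K_l, sigma, A0, n_iter=30, lr=0.5):
    """Gradient ascent on Equation (6), keeping only improving steps."""
    A, best = A0.copy(), objective(A0, X, K_l, sigma)
    for _ in range(n_iter):
        candidate = A + lr * numerical_gradient(A, X, K_l, sigma)
        value = objective(candidate, X, K_l, sigma)
        if value > best:
            A, best = candidate, value
        else:
            lr *= 0.5  # shrink the step when no improvement is found
    return A


# Toy problem (assumed): feature 0 separates the classes, feature 1 is noise.
rng = np.random.default_rng(0)
l = np.repeat([0, 1], 10)
X = np.column_stack([l + 0.1 * rng.standard_normal(20),
                     rng.standard_normal(20)])
K_l = (l[:, None] == l[None, :]).astype(float)
A0 = 0.5 * np.ones((2, 1))
A_hat = fit_projection(X, K_l, sigma=1.0, A0=A0)
```
</preformat>
<p>Since a step is accepted only when it increases the cost function, the alignment reached by <monospace>A_hat</monospace> is never lower than the one obtained with the initial matrix <monospace>A0</monospace>. For realistic feature dimensions, an analytic gradient would be far cheaper than the finite differences used here.</p>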
<p>After estimating the projection matrix <inline-formula><mml:math id="M41"><mml:mover accent="true"><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>A</mml:mi></mml:mstyle></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:math></inline-formula>, we assess the relevance of the <italic>P</italic> input features extracted from <italic><bold>X</bold></italic>. To this end, we assume that the most contributing features should show a higher similarity relationship with the provided neural stimuli/conditions. Specifically, the CKA-based relevance analysis calculates the feature relevance vector index, &#x003F1;&#x02208;&#x0211D;<sup><italic>P</italic></sup>, having elements <inline-formula><mml:math id="M42"><mml:msub><mml:mrow><mml:mi>&#x003F1;</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mo>&#x0002B;</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula> that measure the contribution of each <italic>p</italic>-th input feature in building the projection matrix <inline-formula><mml:math id="M43"><mml:mover accent="true"><mml:mrow><mml:mstyle mathvariant="bold"><mml:mi>A</mml:mi></mml:mstyle></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:math></inline-formula>. So, we use the stochastic measure of variability proposed by Daza-Santacoloma et al. (<xref ref-type="bibr" rid="B21">2009</xref>) as follows:</p>
<disp-formula id="E8"><label>(7)</label><mml:math id="M44"><mml:mrow><mml:msub><mml:mi>&#x003F1;</mml:mi><mml:mi>p</mml:mi></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mi>&#x1D53C;</mml:mi><mml:mrow><mml:mi>m</mml:mi><mml:mo>&#x02208;</mml:mo><mml:mi>M</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007B;</mml:mo><mml:mo>&#x0007C;</mml:mo><mml:msub><mml:mi>a</mml:mi><mml:mrow><mml:mi>p</mml:mi><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007C;</mml:mo><mml:mo>&#x0007D;</mml:mo><mml:mo>,</mml:mo></mml:mrow></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M45"><mml:msub><mml:mrow><mml:mi>a</mml:mi></mml:mrow><mml:mrow><mml:mi>p</mml:mi><mml:mi>m</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mover accent="true"><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>A</mml:mi></mml:mstyle></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:math></inline-formula> indexes every element of matrix <inline-formula><mml:math id="M46"><mml:mover accent="true"><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>A</mml:mi></mml:mstyle></mml:mrow><mml:mo>^</mml:mo></mml:mover></mml:math></inline-formula> (<italic>p</italic>&#x02208;<italic>P</italic>, <italic>m</italic>&#x02208;<italic>M</italic>).</p>
<p>Therefore, this improvement of the feature extraction using centered kernel alignment (termed <italic>Enhanced Kernel-based Relevance Analysis</italic>, EKRA) relies on the interpretability provided by its two central stages: (i) Seeking a feature relevance vector &#x003F1; from the averaged weight magnitudes of the CKA-based rotation matrix, which are directly related to the separability contribution of each <italic>p</italic>-th input feature (see Equation 7). In fact, the larger the &#x003F1;<sub><italic>p</italic></sub> value, the greater the contribution of the <italic>p</italic>-th feature to matching the neural responses in the input space with the stimulus/condition label set. So, we build the matrix <inline-formula><mml:math id="M47"><mml:msup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>X</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:msub><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>S</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:math></inline-formula> by selecting the <italic>M</italic><sub><italic>S</italic></sub>&#x0003C;<italic>P</italic> features that satisfy the threshold of separability contribution: &#x003F1;<sub><italic>p</italic></sub>&#x0003E;&#x1D53C;<sub><italic>m</italic>&#x02208;<italic>M</italic></sub>{&#x003F1;<sub><italic>p</italic></sub>}. As a result, EKRA explains the discriminating capability provided by each feature, since the obtained relevance vector preserves a one-to-one relationship with the input space. 
(ii) Linear projection of the selected CKA relevance subset, intended to further enhance the stimulus/condition separability based on the explained discrimination (ruled by the introduced constraint &#x003F1;<sub><italic>p</italic></sub>&#x0003E;&#x1D53C;<sub><italic>m</italic>&#x02208;<italic>M</italic></sub>{&#x003F1;<sub><italic>p</italic></sub>}) measured from the input neural responses. Accordingly, the mapped feature matrix <inline-formula><mml:math id="M48"><mml:mstyle mathvariant="bold-italic"><mml:mi>Y</mml:mi></mml:mstyle><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:msub><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>E</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:math></inline-formula> is calculated as <italic><bold>Y</bold></italic> &#x0003D; <italic><bold>X</bold></italic>&#x02032;<italic><bold>A</bold></italic>&#x02032;, where <italic>M</italic><sub><italic>E</italic></sub> is the resulting number of relevant features, and <inline-formula><mml:math id="M49"><mml:msup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>A</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>&#x02032;</mml:mi></mml:mrow></mml:msup><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>S</mml:mi></mml:mrow></mml:msub><mml:mo>&#x000D7;</mml:mo><mml:msub><mml:mrow><mml:mi>M</mml:mi></mml:mrow><mml:mrow><mml:mi>E</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup></mml:math></inline-formula> is a rotation matrix computed from <italic><bold>X</bold></italic>&#x02032; using Equation (6), under the assumption that <italic>M</italic><sub><italic>S</italic></sub>&#x0003C;<italic>M</italic><sub><italic>E</italic></sub>.</p>
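<p>The first EKRA stage can be sketched in a few lines. The example below computes the relevance vector of Equation (7) as the mean absolute weight of each row of the projection matrix and keeps the features whose relevance exceeds the average relevance. Reading the threshold as the mean taken over the <italic>P</italic> features is an assumption of this sketch, and the function names and the toy matrices are hypothetical.</p>
<preformat>
```python
import numpy as np


def relevance_vector(A):
    """Equation (7): mean absolute weight of each input feature (row of A)."""
    return np.mean(np.abs(A), axis=1)


def select_relevant(X, A):
    """Keep the features whose relevance exceeds the average relevance."""
    rho = relevance_vector(A)
    mask = rho > rho.mean()  # assumed reading of the threshold
    return X[:, mask], mask, rho


# Toy projection matrix (assumed): P = 3 input features, M = 2 components.
A_hat = np.array([[1.0, -1.0],
                  [0.0, 0.2],
                  [3.0, 1.0]])
X = np.arange(12, dtype=float).reshape(4, 3)  # N = 4 samples

X_sel, mask, rho = select_relevant(X, A_hat)
```
</preformat>
<p>Here the relevance values are 1.0, 0.1, and 2.0, so only the third feature exceeds their mean and is retained in <monospace>X_sel</monospace>. In the second stage, a new rotation matrix would be fitted on <monospace>X_sel</monospace> with Equation (6); that step is not shown.</p>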
</sec>
</sec>
<sec id="s3">
<title>3. Experimental set-up</title>
<p>To identify the tested brain activity patterns, we validate the proposed <italic>Enhanced kernel-based relevance analysis</italic>, which comprises the following training stages: (i) Feature extraction from the preprocessed EEG recordings, (ii) EKRA computation for the extracted feature set, and (iii) Detection performance of brain patterns under stimuli/conditions.</p>
<sec>
<title>3.1. Testing datasets and preprocessing</title>
<p>To test two different brain activity tasks, the validation experiments are carried out on each of the following EEG databases:</p>
<p><bold><italic>Motor Imagery Database</italic> (MIDB)</bold><xref ref-type="fn" rid="fn0001"><sup>1</sup></xref>. This collection, widely used in motor imagery studies, holds seven subjects with EEG signals recorded from 59 channels. First, all recordings are passed through a band-pass filter with a passband ranging from 0.05 to 200 <italic>Hz</italic>, and then through a 10th-order low-pass Chebyshev II filter with a stopband attenuation of 50 <italic>dB</italic> and a stopband edge frequency of 49 <italic>Hz</italic>. To emphasize the information enclosed in the &#x003B1; and &#x003B2; rhythms extracted from the EEG data, a 5th-order band-pass Butterworth filter is further applied over the band ranging from 8 to 30 <italic>Hz</italic>. Besides, an average reference is employed and the <italic>Common Spatial Patterns</italic> algorithm is carried out as a data-driven supervised decomposition of the multi-channel EEG data (He et al., <xref ref-type="bibr" rid="B35">2012</xref>). All recordings are digitized at 1000 <italic>Hz</italic> and down-sampled to a sampling frequency of 100 <italic>Hz</italic>. For each Motor Imagery (MI) class, the whole session was conducted without feedback, recording 100 repetitions per subject. Within the segment of interest lasting 4 <italic>s</italic>, the subject was instructed to perform the MI task indicated by a pointing arrow on a screen. These segments are interleaved with periods lasting 2 <italic>s</italic> that show a blank screen and a fixation cross in the screen center.</p>
<p><bold>&#x0201C;<italic>Klinik f&#x000FC;r Epileptology&#x0201D; Database</italic> (KEDB)</bold> (Andrzejak et al., <xref ref-type="bibr" rid="B7">2001</xref>). This dataset, widely used in automated epileptic seizure detection, consists of five subsets denoted as <monospace>A</monospace>, <monospace>B</monospace>, <monospace>C</monospace>, <monospace>D</monospace>, and <monospace>E</monospace>. Each subset holds 100 single-channel EEG segments lasting 23.6 <italic>s</italic>. Subsets <monospace>A</monospace> and <monospace>B</monospace> were acquired from five healthy subjects with eyes open and closed, respectively. All signals from subsets <monospace>C, D</monospace>, and <monospace>E</monospace> came from five epileptic subjects. Subsets <monospace>C</monospace> and <monospace>D</monospace> included seizure-free interictal signals, which were measured on the epileptic zone and on the hippocampal formation of the opposite hemisphere. Set <monospace>E</monospace> contained epileptic signals recorded from each aforementioned location during an ictal seizure. Subsets <monospace>C, D</monospace>, and <monospace>E</monospace> were recorded intracranially. Besides, all provided EEG signals in KEDB were digitized at 173.61 <italic>Hz</italic> with 12-bit resolution. To retain the relevant EEG information related to the studied normal and epileptic conditions, an average reference was used and all signals were filtered through a low-pass filter with a 40 <italic>Hz</italic> cutoff frequency. 
For validation purposes, these data are tested on two problems of medical interest (Tzallas et al., <xref ref-type="bibr" rid="B54">2009</xref>): <italic>Bi-class</italic> (2C), in which normal (<monospace>A</monospace>-type) and seizure (<monospace>E</monospace>-type) labeled recordings are distinguished; and <italic>Three-class</italic> (3C), which more closely represents actual medical applications and includes three categories: normal (<monospace>A</monospace>-type EEG segments), seizure-free interictal (<monospace>D</monospace>-type EEG segments), and seizure (<monospace>E</monospace>-type EEG segments).</p>
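<p>The bookkeeping behind both KEDB problems can be written out explicitly. The sketch below derives the number of samples per segment from the reported sampling frequency and duration, and lists the segments entering the 2C and 3C problems. The integer class encoding and all names are hypothetical and serve only as an illustration.</p>
<preformat>
```python
FS_HZ = 173.61            # sampling frequency reported for KEDB
SEGMENT_S = 23.6          # duration of each single-channel segment
SEGMENTS_PER_SUBSET = 100

# Hypothetical integer encoding of the classes in each problem.
PROBLEMS = {
    "2C": {"A": 0, "E": 1},
    "3C": {"A": 0, "D": 1, "E": 2},
}


def build_labels(problem):
    """Return (subset, segment index, class label) for every segment."""
    return [(subset, index, label)
            for subset, label in PROBLEMS[problem].items()
            for index in range(SEGMENTS_PER_SUBSET)]


samples_per_segment = round(FS_HZ * SEGMENT_S)
labels_2c = build_labels("2C")
labels_3c = build_labels("3C")
```
</preformat>
<p>With these figures, each segment holds 4097 samples, the 2C problem comprises 200 segments, and the 3C problem comprises 300 segments.</p>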
</sec>
<sec>
<title>3.2. Extracted feature sets for identification of brain activity patterns</title>
<p>For either tested neural activity task (motor imagery or epileptic seizure detection), we validate the introduced EKRA approach on the following feature sets extracted from each data collection:</p>
<sec>
<title>3.2.1. Testing feature set extracted from MIDB</title>
<p>Let <inline-formula><mml:math id="M50"><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>Z</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mi>C</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:mi>T</mml:mi></mml:mrow></mml:msup></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:math></inline-formula> (<italic>n</italic>&#x02208;[1, <italic>N</italic>]) be a set of <italic>N</italic> raw EEG trials for each subject, where <italic>C</italic>&#x02208;&#x02115; and <italic>T</italic>&#x02208;&#x02115; are the number of EEG channels and the number of time samples, respectively. To discriminate the MI classes, we obtain a set of short-time features from each EEG trial <italic><bold>Z</bold></italic><sub><italic>n</italic></sub> using the following extraction principles (Alvarez Meza et al., <xref ref-type="bibr" rid="B6">2015</xref>):</p>
<list list-type="simple">
<list-item><p>&#x02013;<italic>Power Spectral Density-based parameters</italic> (PSD): We estimate the PSD of each EEG channel with the nonparametric Welch&#x00027;s method, which relies on the widely known Fast Fourier Transform algorithm. The number of frequency bins is fixed according to the spectral band of interest [(4&#x02013;30) <italic>Hz</italic>], where the most discriminative information for MI is concentrated (Rodr&#x000ED;guez-Berm&#x000FA;dez et al., <xref ref-type="bibr" rid="B50">2013</xref>). Due to the non-stationary nature of EEG data, a piecewise stationary analysis is carried out over a set of extracted overlapping segments that are further weighted by a smooth time window. Finally, the PSD is estimated as a modified periodogram vector computed from the Discrete Fourier Transform.</p></list-item>
<list-item><p>&#x02013;<italic>Hjorth-based parameters</italic>: For each EEG channel, we compute this representation from a set of extracted overlapping segments. The set of Hjorth parameters comprises <italic>Activity</italic>, which is directly described by the signal power (variance), <italic>Mobility</italic>, which measures the signal mean frequency, and <italic>Complexity</italic>, which estimates the frequency variations as the signal deviation from the sine shape (Arias-Mora et al., <xref ref-type="bibr" rid="B8">2015</xref>).</p></list-item>
<list-item><p>&#x02013;<italic>Continuous Wavelet Transform</italic> (CWT) parameters. Wavelet-based methods have been heavily exploited in MI research to capture the spectral dynamics of EEG trials, which usually hold non-stationary spectral components. The CWT comprises an inner-product-based transformation that quantifies the similarity between a given time-series and a previously fixed base function, termed <italic>mother wavelet</italic>. Namely, the CWT-based parameters are extracted from each EEG channel by convolving it with the scaled and shifted mother wavelet. We select the Morlet wavelet for the CWT analysis because its wave shape resembles EEG signals and it allows extracting features better localized in the frequency domain (Aydemir and Kayikcioglu, <xref ref-type="bibr" rid="B9">2011</xref>).</p></list-item>
<list-item><p>&#x02013;<italic>Discrete Wavelet Transform</italic> (DWT) parameters. This principle adequately addresses the trade-off between time and frequency resolution in non-stationary signal analysis. So, DWT provides a multi-resolution and non-redundant representation by decomposing the considered time-series into several sub-bands at different scales, yielding more precise time-frequency information about the data. Aiming to extract suitable time-frequency information from each EEG channel, we compute the detail coefficients so as to cover the (4&#x02013;30) <italic>Hz</italic> band, which corresponds to the second and third levels of decomposition. Although a large selection of mother functions is available, we test the Symlet wavelet (Sym-7) that is closely associated with the electrical brain activity and has proved to be appropriate in similar applications (Alomari et al., <xref ref-type="bibr" rid="B4">2014</xref>).</p></list-item>
</list>
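<p>Two of the extraction principles listed above lend themselves to a short numerical check. The sketch below computes the three Hjorth parameters of a single segment using first differences as a discrete derivative, and lists the nominal frequency band of the DWT detail coefficients at each decomposition level under the usual dyadic rule. The function names are hypothetical, the test signal is an assumed sinusoid, and Mobility is expressed in radians per sample rather than in Hz.</p>
<preformat>
```python
import numpy as np


def hjorth_parameters(x):
    """Activity, Mobility, and Complexity of a one-dimensional segment."""
    dx = np.diff(x)    # discrete first derivative
    ddx = np.diff(dx)  # discrete second derivative
    activity = np.var(x)
    mobility = np.sqrt(np.var(dx) / activity)
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity


def dwt_detail_bands(fs, levels):
    """Nominal frequency band (Hz) of the detail coefficients per level."""
    return {j: (fs / 2 ** (j + 1), fs / 2 ** j) for j in range(1, levels + 1)}


# Assumed test segment: a 10 Hz sinusoid sampled at 100 Hz for 4 s.
fs = 100.0
t = np.arange(400) / fs
activity, mobility, complexity = hjorth_parameters(np.sin(2 * np.pi * 10 * t))

bands = dwt_detail_bands(fs, levels=3)
```
</preformat>
<p>For the pure sinusoid, Complexity stays close to one, in line with its reading as the deviation from the sine shape. At a sampling frequency of 100 <italic>Hz</italic>, the second and third detail levels nominally span 12.5 to 25 <italic>Hz</italic> and 6.25 to 12.5 <italic>Hz</italic>, both inside the (4&#x02013;30) <italic>Hz</italic> band of interest.</p>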
<p>Once we calculate all the short-time parameters, several of their statistical measures are computed to build the input feature matrix <italic><bold>X</bold></italic>&#x02208;&#x0211D;<sup><italic>N</italic>&#x000D7;<italic>P</italic></sup>. Namely, the mean, the variance, and the maximum values are estimated. Consequently, the row vector <inline-formula><mml:math id="M51"><mml:msub><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>x</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mi>P</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> (<italic>P</italic> &#x0003D; <italic>C</italic> &#x000D7; <italic>Q</italic>) concatenates all features extracted per channel from the <italic>n</italic>-th MI trial, where <italic>Q</italic>&#x02208;&#x02115; is the number of features per channel. Thus, the total number of features per channel is 27 (6 for PSD, 9 for Hjorth, 6 for CWT, and 6 for DWT). So, the size of the concatenated feature vector is <italic>P</italic> &#x0003D; 59 &#x000D7; 27, and the number of samples is <italic>N</italic> &#x0003D; 200.</p>
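<p>The construction of one row of the feature matrix can be sketched as follows. The number of short-time parameters assigned to each principle (two for PSD, CWT, and DWT, three for Hjorth) is an assumption chosen so that three statistics per parameter reproduce the reported counts of 6, 9, 6, and 6 features; the random arrays merely stand in for the actual short-time parameters, and all names are hypothetical.</p>
<preformat>
```python
import numpy as np

N_CHANNELS = 59  # EEG channels in MIDB
# Hypothetical number of short-time parameters per principle and channel,
# chosen so that three statistics each reproduce the reported feature counts.
N_PARAMS = {"PSD": 2, "Hjorth": 3, "CWT": 2, "DWT": 2}


def summarize(params):
    """Mean, variance, and maximum along the last (segment) axis."""
    stats = [params.mean(axis=-1), params.var(axis=-1), params.max(axis=-1)]
    return np.stack(stats, axis=-1)


def trial_feature_vector(short_time):
    """Concatenate the summary statistics of every principle and channel."""
    per_channel = [summarize(p).reshape(p.shape[0], -1)
                   for p in short_time.values()]
    return np.concatenate(per_channel, axis=1).ravel()


# Random stand-in for the short-time parameters of one trial (10 segments).
rng = np.random.default_rng(0)
short_time = {name: rng.standard_normal((N_CHANNELS, k, 10))
              for name, k in N_PARAMS.items()}
x_n = trial_feature_vector(short_time)
```
</preformat>
<p>The resulting vector has 59 &#x000D7; 27 &#x0003D; 1593 entries, so stacking the vectors of the 200 trials of one subject yields a feature matrix of size 200 &#x000D7; 1593.</p>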
</sec>
<sec>
<title>3.2.2. Testing feature set extracted from KEDB.</title>
<p>The rhythms of clinical and physiological interest fall primarily within the following four spectral sub-bands: Delta denoted as &#x003B4; with frequencies <italic>f</italic> &#x0003C; 4 <italic>Hz</italic>, Theta (&#x003B8;, <italic>f</italic>&#x02208;[4, 8] <italic>Hz</italic>), Alpha (&#x003B1;, <italic>f</italic>&#x02208;[8, 13] <italic>Hz</italic>), and Beta rhythms (&#x003B2;, <italic>f</italic>&#x02208;[14, 30] <italic>Hz</italic>). We select a linear filter bank for the representation of EEG signals because it may be more accurately tuned to the frequency bandwidth of each rhythm. Therefore, we use five cepstral coefficients associated with &#x003B4;, &#x003B8;, &#x003B1;, and &#x003B2; rhythms, extracted as dynamic features as in Duque-Mu&#x000F1;oz et al. (<xref ref-type="bibr" rid="B22">2014</xref>). As a result, instead of a widely used scalar-valued parameter set extracted from the EEG signal, the neural activity relating to epileptic seizures is detected by using a vector set of short-time rhythms.</p>
<p>All the baseline algorithms required to compute the features were developed using the Signal Processing toolbox of Matlab 2013b. Note that we perform validation on two different training sets, intending to test the EKRA approach on very diverse neural activity data, namely, motor imagery discrimination and epileptic seizure detection. Both sets are multiclass and cover all the brain regions. Besides, the variability of the former data collection is very broad, while the latter data hold more concentrated dynamics over time. This aspect is important to test since EKRA benefits from the information about the variability of the input space. Another consideration is that both databases are publicly available and widely used in the state-of-the-art literature, making it possible to compare the obtained performance with that of other training approaches.</p>
</sec>
</sec>
<sec>
<title>3.3. Validation of the enhanced kernel-based relevance analysis</title>
<p>With the purpose of assessing the influence of each one of the computation stages explained above, we conduct validation of the proposed EKRA method for the following two training scenarios of identification of brain activity patterns:</p>
<p>(i) The EKRA <italic>Feature selection</italic> method (noted as <italic>EKRA&#x02013;S</italic>) provides a better understanding of the input feature set, and thus, facilitates the physiological interpretation task. To this end, all features are sorted in decreasing order of their relevance amplitude, yielding the ranking EKRA vector &#x003F1;. Then, the ranked features are supplied one by one to the validated classifier, and the dependence of the accuracy on the number of ranked features is assessed through a nested 10-fold cross-validation scheme. It is worth noting that the use of the ranking vector avoids any heuristic or greedy feature selection search (like the conventional forward-backward approaches), which would demand a huge computational burden. For comparison regarding the physiological interpretation, the proposed EKRA is contrasted with a baseline <italic>Variance-based Relevance Analysis</italic> (VRA) that ranks the short-time input features according to a variability criterion. Namely, VRA computes a relevance vector based on a linear transformation of the input representation (Daza-Santacoloma et al., <xref ref-type="bibr" rid="B21">2009</xref>).</p>
<p>(ii) The EKRA <italic>Enhanced feature selection</italic> (<italic>EKRA&#x02013;ES</italic>) adds the projection to the EKRA&#x02013;S procedure, aiming to raise the accuracy of brain activity identification (see section 2.2). So, a mapped feature set is estimated to encode the available discriminant neural patterns more accurately through the embedding matrix <italic><bold>Y</bold></italic>, as explained above.</p>
<p>For the EKRA calculation in Appendix <xref ref-type="supplementary-material" rid="SM1">A</xref> in Supplementary Material, the free parameters are fixed as follows: the initial guess <italic><bold>A</bold></italic><sup>0</sup> is set by the well-known principal component analysis approach, &#x003C3;<sup>0</sup> is fixed to the median of the Euclidean distances between the input data, the gradient descent tolerance is set to 1<italic>e</italic> &#x02212; 6, the maximum number of iterations is empirically limited to 300, and the numbers of relevant dimensions (<italic>M</italic><sub><italic>S</italic></sub> and <italic>M</italic><sub><italic>E</italic></sub>) are adjusted so as to retain 95% of the explained variance. Note that <italic>EKRA</italic> learns two projection matrices (<italic><bold>A</bold></italic> and <italic><bold>A</bold></italic>&#x02032;), resulting in a computational complexity of <inline-formula><mml:math id="M52"><mml:mrow><mml:mi mathvariant="-tex-caligraphic">O</mml:mi></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. However, this procedure is carried out off-line. The EKRA&#x02013;S and EKRA&#x02013;ES Matlab implementation codes are publicly available<xref ref-type="fn" rid="fn0002"><sup>2</sup></xref>.</p>
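<p>Two of the settings above, the median-distance bandwidth and the number of dimensions retaining 95% of the explained variance, can be illustrated with a short NumPy sketch. The function names are hypothetical and the code is only an illustration under these assumptions; it is not the publicly available Matlab implementation and does not reproduce the gradient descent stage.</p>
<preformat>
```python
import numpy as np

def median_pairwise_distance(X):
    """Median of the Euclidean distances between all distinct row pairs."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T
    sq_dists = np.maximum(sq_dists, 0.0)      # clip round-off negatives
    upper = np.triu_indices(X.shape[0], k=1)  # each pair once, no diagonal
    return float(np.median(np.sqrt(sq_dists[upper])))

def n_components_for_variance(X, fraction=0.95):
    """Smallest number of principal components explaining `fraction` of variance."""
    Xc = X - X.mean(axis=0)
    singular_values = np.linalg.svd(Xc, compute_uv=False)
    explained = singular_values ** 2
    cumulative = np.cumsum(explained) / explained.sum()
    return int(np.searchsorted(cumulative, fraction) + 1)

# Usage on placeholder data (random values, for illustration only).
rng = np.random.default_rng(0)
X_demo = rng.standard_normal((50, 10))
sigma0 = median_pairwise_distance(X_demo)
n_dims = n_components_for_variance(X_demo, 0.95)
print(sigma0, n_dims)
```
</preformat>
<p>Using only the upper triangle of the distance matrix keeps the zero self-distances out of the median, which would otherwise bias the bandwidth downward.</p>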
<p>In each task at hand, we first perform a visual inspection as the simplest way of graphically representing the dependence of the classification performance on the relevant features, taking into consideration the applied feature extraction principle as well as the measuring EEG channel or brain region. Besides, the classification performance is also estimated. In either validation scenario, a k-nearest neighbor classifier is employed to identify the brain patterns under stimuli/conditions, for which the number of nearest neighbors is tuned within the set {1, 3, 5, 7, 9, 11} based on the achieved accuracy, using a nested 10-fold cross-validation to avoid over-fitted results.</p>
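<p>The evaluation protocol described above can be sketched as follows. This is a minimal NumPy illustration rather than the authors' Matlab implementation: the helper names are hypothetical, the folds are drawn at random without stratification, and class labels are assumed to be non-negative integers.</p>
<preformat>
```python
import numpy as np

K_GRID = (1, 3, 5, 7, 9, 11)  # candidate numbers of neighbors

def knn_predict(X_train, y_train, X_test, k):
    """Majority vote among the k nearest training samples (Euclidean)."""
    d = np.linalg.norm(X_test[:, None, :] - X_train[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    votes = y_train[nearest]
    return np.array([np.bincount(row).argmax() for row in votes])

def folds(n, n_folds, rng):
    """Random partition of range(n) into n_folds index arrays."""
    return np.array_split(rng.permutation(n), n_folds)

def cv_accuracy(X, y, k, n_folds, rng):
    """Mean held-out accuracy of k-nn over n_folds folds."""
    scores = []
    for test_idx in folds(len(y), n_folds, rng):
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        pred = knn_predict(X[train], y[train], X[test_idx], k)
        scores.append(np.mean(pred == y[test_idx]))
    return float(np.mean(scores))

def nested_cv_accuracy(X, y, n_folds=10, seed=0):
    """Outer loop estimates accuracy; inner loop picks k on training data only."""
    rng = np.random.default_rng(seed)
    outer_scores = []
    for test_idx in folds(len(y), n_folds, rng):
        train = np.ones(len(y), dtype=bool)
        train[test_idx] = False
        X_tr, y_tr = X[train], y[train]
        best_k = max(K_GRID, key=lambda k: cv_accuracy(X_tr, y_tr, k, n_folds, rng))
        pred = knn_predict(X_tr, y_tr, X[test_idx], best_k)
        outer_scores.append(np.mean(pred == y[test_idx]))
    return float(np.mean(outer_scores))

def accuracy_curve(X, y, ranking, max_features):
    """Nested-CV accuracy after adding the ranked features one by one."""
    return [nested_cv_accuracy(X[:, ranking[:m]], y) for m in range(1, max_features + 1)]
```
</preformat>
<p>Because the number of neighbors is selected inside each outer training fold, the outer test samples never influence the tuning, which is the property that guards against optimistic accuracy estimates.</p>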
</sec>
</sec>
<sec id="s4">
<title>4. Results and discussion</title>
<sec>
<title>4.1. Validation results on motor imagery tasks</title>
<sec>
<title>4.1.1. Relevance analysis performed by the feature selection scenario</title>
<p>We initially consider the relevance analysis performed for feature selection, as shown in Figure <xref ref-type="fig" rid="F1">1</xref>, which displays the estimated relevance averaged over all tested subjects. In each 2D representation, the vertical axis indicates the index assigned to each of the 27 features, while the horizontal axis represents the index of the 59 channels labeled by the international 10&#x02013;20 electrode location montage. The proposed subspace-based methodology rests on measuring the covariance-based evolution of underlying time-varying signals, providing the best discrimination among the classes. To this end, EKRA quantifies the short-time relevance using the centered kernel alignment, which is adjusted to learn the linear projection that best discriminates a given input feature space by assessing the similarity between the projected input data and the stimulus/condition labels. For the sake of comparison, validation also comprises the baseline variance-based relevance analysis (VRA), which only estimates the time-varying feature contribution in the original input space in an unsupervised manner, that is, without using the label information. Although the validation performed on two data collections shows that the obtained preliminary results are encouraging, some additional aspects should be further considered:</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>2D representation (top and bottom rows) and topoplots (middle row) of the performed relevance analysis on motor imagery task. The plot at the top shows the marginal relevance (mean and standard deviation) per channel. The topoplots are computed by averaging the input feature relevance values in &#x003F1; regarding the EEG montage (channels). The right-side plot shows the marginal relevance averaged for all considered principles of feature extraction.</p></caption>
<graphic xlink:href="fnins-11-00550-g0001.tif"/>
</fig>
<p>The relevance estimated by the VRA algorithm (left column) allows appraising the contribution of every single electrode position, resulting in three channel groups of relevance that can be spatially identified, namely, the channels numbered as &#x00023;1&#x02013;13, &#x00023;14&#x02013;33, and &#x00023;34&#x02013;59. To quantify the contribution, we calculate the marginal relevance per channel averaged over all involved features. Thus, the first 13 labeled channels (placed over the association cortex) make the strongest contribution and also have the lowest dispersion, as can be seen in the top plot of Figure <xref ref-type="fig" rid="F1">1</xref>. A lower relevance value is supplied by the electrode positions that collect the neural activity in the anterior parts of the anterior parietal cortex (&#x00023;33&#x02013;59). The lowest relevance (with the highest variance) is produced by the electrode positions that relate to the precentral gyrus (&#x00023;14&#x02013;28) and the dorsal lateral premotor area (&#x00023;30&#x02013;33). In the case of EKRA&#x02013;S, the three spatially distinct groups remain, but the estimated relevance per channel differs from the one assessed by VRA. As seen in the top plot of Figure <xref ref-type="fig" rid="F1">1</xref>, the electrode positions labeled as &#x00023;14&#x02013;32 now become the most important, followed by the group &#x00023;33&#x02013;59. In contrast to the VRA approach, the channels placed over the association cortex (&#x00023;1&#x02013;13) show the lowest relevance. As regards the relevance assessed by each principle of feature extraction, the tested feature sets do not cluster so distinctly for either selection algorithm, though most of the characteristics behave differently depending on the electrode position (see right-side plots of Figure <xref ref-type="fig" rid="F1">1</xref>).</p>
<p>With the aim of further exploring the spatial distribution of the carried out feature selection, all computed relevance values are arranged in the 10&#x02013;20 channel montage as displayed in the topoplots of the middle row of Figure <xref ref-type="fig" rid="F1">1</xref>. It is worth noting that we describe the MI brain activity of a hypothetical average person because the estimated relevance planes are averaged across all subjects. So, VRA produces the highest contribution of relevance for the middle frontal gyrus, which is represented by channels F5, F3, F4, and F6. However, the middle frontal gyrus should not be related to any imagery stimulation (Hanakawa et al., <xref ref-type="bibr" rid="B33">2003</xref>). Rather, this brain area activates as a response to body movements, e.g., powerful EEG artifacts. In other words, the presence of EEG channels with high-energy disturbances may mislead the VRA estimator, identifying the MI patterns wrongly. Instead, EKRA&#x02013;S assigns the largest values of relevance to the EEG channels placed over two brain areas that indeed are commonly related to MI tasks, namely, the posterior superior parietal cortex (P3, P1, PZ, and P2), and the left precentral sulcus at the level of the middle frontal gyrus (CFC5, C3, CFC3, C1, and CFC1). Furthermore, the middle frontal gyrus has the lowest contribution, weakening the influence of movement artifacts. To better visualize the joint channel-feature relationship, we rearrange each 2D representation so that the relevance estimates are now ranked in decreasing order along the channel and feature extraction principle axes. Importantly, all features that do not contribute to 95% of the variance explained are zero-valued. Comparison of the 2D representations in the bottom row of Figure <xref ref-type="fig" rid="F1">1</xref> allows concluding that each employed relevance estimator associates the input training set differently, playing a very significant role in appraising the channel-feature contribution.</p>
<p>Therefore, the baseline VRA algorithm produces higher values of the marginal relevance (see the top and right-side plots of each 2D representation) in comparison to the proposed EKRA&#x02013;S, suggesting that the latter approach encodes the whole brain activity task into a lower number of features. The following two facts can explain this advantage of EKRA&#x02013;S: (i) The use of the MI label information reveals the features that are salient regarding the studied paradigm. Thus, the brain activity patterns are better localized. (ii) Representation through the enhanced RKHS allows dealing with complex neighboring data dependencies, rejecting redundant features more efficiently and highlighting coherent spatial regions (EEG channels) regarding the studied MI paradigm. In contrast, VRA mainly explains the relevance through its energy-based cost functional, which emphasizes the brain regions with intense activity that are activated while the stimulus lasts. Yet, this assumption does not necessarily hold for MI tasks.</p>
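<p>To make the role of the label information concrete, a centered kernel alignment score between a feature kernel and a label kernel can be sketched as below. This is a generic illustration under stated assumptions (a Gaussian kernel over the features, a delta kernel over the labels, hypothetical function names); it omits the learned projection and is not the EKRA functional of the public Matlab code.</p>
<preformat>
```python
import numpy as np

def gaussian_kernel(X, sigma):
    """Gaussian kernel matrix over the rows of X with bandwidth sigma."""
    sq_norms = np.sum(X ** 2, axis=1)
    sq_dists = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-sq_dists / (2.0 * sigma ** 2))

def label_kernel(y):
    """Delta kernel: 1 when two samples share a label, 0 otherwise."""
    y = np.asarray(y)
    return (y[:, None] == y[None, :]).astype(float)

def centered_alignment(K, L):
    """Normalized Frobenius inner product of the two centered kernel matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    Kc, Lc = H @ K @ H, H @ L @ H
    return float(np.sum(Kc * Lc) / (np.linalg.norm(Kc) * np.linalg.norm(Lc)))

# Usage on toy data: one feature that separates the two classes.
y_demo = np.array([0, 0, 1, 1])
X_demo = np.array([[0.0], [0.0], [5.0], [5.0]])
score = centered_alignment(gaussian_kernel(X_demo, 1.0), label_kernel(y_demo))
print(score)
```
</preformat>
<p>A feature space whose kernel mirrors the class structure scores close to one, whereas a variance criterion such as VRA would rate a high-energy but label-independent channel just as highly as an informative one.</p>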
<p>With the aim of estimating the classification performance of the contemplated MI tasks, we take the selected training set as the one containing the minimum number of features needed to reach the maximum classification accuracy. To this end, the <italic>k</italic>-nearest neighbor classifier is fed by adding one by one the relevant features, which have been previously ranked in decreasing order. On average over all subjects, VRA reaches an accuracy of &#x0007E;85.16 &#x000B1; 3.88% and clearly falls behind the EKRA&#x02013;S algorithm, which reaches &#x0007E;95.71 &#x000B1; 3.01%, as seen in Figure <xref ref-type="fig" rid="F2">2</xref>. Moreover, the number of selected training features is also shown for each subject. Regardless of the feature selection strategy used, the accuracy fluctuates because of the inter-subject variability that has already been reported for spatial patterns and spectro-temporal characteristics of brain signals in motor imagery tasks (Blankertz et al., <xref ref-type="bibr" rid="B12">2007</xref>). One more reason for the performance fluctuations is the simplicity of the employed <italic>k-nn</italic> classifier. Therefore, the use of more elaborate classifiers (like SVM) should deal better with the fluctuations. Along with the MI discrimination performance, another important aspect to explain is the number of training features selected from the whole input set (1593). As displayed, VRA chooses about 1,410 features, whereas EKRA&#x02013;S selects only 275. Consequently, the dimension reduction is close to one and 5.8, respectively, averaged over all subjects. Therefore, EKRA&#x02013;S is about five times more efficient than the contrasted baseline estimator regarding dimension reduction.</p>
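<p>The dimension reduction figures quoted above follow directly from the feature counts. The short check below only reproduces that arithmetic from the numbers given in the text; the variable names are illustrative.</p>
<preformat>
```python
total_features = 59 * 27  # channels x features per channel = 1593
selected = {"VRA": 1410, "EKRA-S": 275}

# Reduction factor: size of the full set divided by the selected subset.
for name, kept in selected.items():
    print(f"{name}: reduction factor {total_features / kept:.2f}")
# VRA: reduction factor 1.13
# EKRA-S: reduction factor 5.79

# Relative compactness of the two selected sets.
print(f"ratio: {selected['VRA'] / selected['EKRA-S']:.2f}")  # ratio: 5.13
```
</preformat>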
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Motor Imagery discrimination accuracy performed by the feature selection strategy. The achieved average accuracy of classification is computed using a nested cross-validation approach, adding one by one the features ranked. VRA, left column; EKRA-S, middle column; DR, right column. Results are displayed concerning each studied subject.</p></caption>
<graphic xlink:href="fnins-11-00550-g0002.tif"/>
</fig>
<p>For both selected feature sets, a detailed analysis results in the following findings:</p>
<list list-type="simple">
<list-item><p>&#x02013; Apparently, the Hjorth principle of extraction supplies the features with the highest relevance, contributing the most to the discrimination of MI classes regardless of the used relevance estimator. The remaining spectral characteristics have a comparable contribution, though some differences apply for EKRA&#x02013;S.</p></list-item>
<list-item><p>&#x02013; Regarding the proportion of features encoding MI information (the <italic>superior parietal</italic> plus <italic>middle frontal gyrus</italic>), EKRA&#x02013;S produces a higher number of salient features.</p></list-item>
<list-item><p>&#x02013; As one of the biggest challenges in BCI research, it is worth mentioning the inter-subject variability of spatial patterns and spectro-temporal characteristics of brain signals (Blankertz et al., <xref ref-type="bibr" rid="B12">2007</xref>). In the contemplated MI task, some subjects might not focus their gaze in the proper direction, and thus, the EEG recordings will not be reliable for meaningful interpretation. From the comparison plots of relevance in Figure <xref ref-type="fig" rid="F3">3</xref>, it follows that EKRA&#x02013;S better adapts the BCI system to each particular subject, at least in terms of revealing the most discriminating features. Also, the layout of Figure <xref ref-type="fig" rid="F1">1</xref> is enhanced by including a circular representation that embraces the subject variability estimated for each channel, pointing out the places where the proposed technique better captures the individual patterns.</p></list-item>
<list-item><p>&#x02013; Either relevance approach (VRA or EKRA&#x02013;S) provides a high variability among subjects, mainly focusing on two brain areas: Posterior parietal cortex (SP) and left precentral sulcus at the level of the middle frontal gyrus (MF). Note that both areas are commonly related to MI tasks. However, the relevance, given by EKRA&#x02013;S to brain areas related to MI tasks (SP and MF), is higher in each subject than the relevance assigned by the VRA approach. In other words, though the variability among subjects is similar in both methods, EKRA&#x02013;S enhances the relevance of brain areas that are more related to MI tasks.</p></list-item>
</list>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Contribution of the selected feature set to the Motor Imagery discrimination performance. VRA, left column; EKRA-S, right column. Feature extraction principle (PSD, Hjorth, CWT, and DWT parameters), top row; brain area (SP and MF), bottom row. Results are displayed concerning each studied subject.</p></caption>
<graphic xlink:href="fnins-11-00550-g0003.tif"/>
</fig>
</sec>
<sec>
<title>4.1.2. Relevance analysis performed in enhanced feature selection scenario</title>
<p>In this case, we calculate the performance of the proposed relevance analysis for both cases under consideration: feature selection (EKRA-S) and enhanced feature selection (EKRA-ES). Results are given in terms of the classifier accuracy achieved in the contemplated MI task for every subject. Our proposal reaches an average accuracy of 95.71&#x000B1;03.01 and 96.71&#x000B1;01.84 for EKRA-S and EKRA-ES, respectively, outperforming the contrasted baseline VRA that produces 92.86&#x000B1;03.77. For the sake of comparison, we also include the accuracy estimated by the approach proposed in Zhang et al. (<xref ref-type="bibr" rid="B63">2012</xref>), which selects an extracted spatio-temporal feature set and then applies a non-linear regression to predict the time-series of class labels. Another compared work is the one in He et al. (<xref ref-type="bibr" rid="B35">2012</xref>), which uses an adaptive frequency band selection of the spatially preprocessed features that feed an SVM classifier. Lastly, we consider the approach in Higashi and Tanaka (<xref ref-type="bibr" rid="B36">2013</xref>) involving common space-time-frequency patterns to design the time-windows used for the MI task. As seen in Table <xref ref-type="table" rid="T1">1</xref>, all referred training strategies underperform the proposed relevance analysis method.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Performed classification accuracy for Motor Imagery discrimination (average &#x000B1; standard deviation [%]).</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Subject</bold></th>
<th valign="top" align="center"><bold>VRA</bold></th>
<th valign="top" align="center"><bold>EKRA-S</bold></th>
<th valign="top" align="center"><bold>He et al., <xref ref-type="bibr" rid="B35">2012</xref></bold></th>
<th valign="top" align="center"><bold>Zhang et al., <xref ref-type="bibr" rid="B63">2012</xref></bold></th>
<th valign="top" align="center"><bold>Higashi and Tanaka, <xref ref-type="bibr" rid="B36">2013</xref></bold></th>
<th valign="top" align="center"><bold>EKRA-ES</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">&#x00023; 1</td>
<td valign="top" align="center">91.50 &#x000B1; 05.29</td>
<td valign="top" align="center"><underline>94.16 &#x000B1; 05.30</underline></td>
<td valign="top" align="center">67.70 &#x000B1; 02.20</td>
<td valign="top" align="center">77.20 &#x000B1; 00.03</td>
<td valign="top" align="center">92.30 &#x000B1; 02.50</td>
<td valign="top" align="center"><bold>98.00</bold> &#x000B1; <bold>02.58</bold></td>
</tr>
<tr>
<td valign="top" align="left">&#x00023; 2</td>
<td valign="top" align="center"><bold>96.50</bold> &#x000B1; <bold>03.37</bold></td>
<td valign="top" align="center">90.16 &#x000B1; 05.88</td>
<td valign="top" align="center">70.70 &#x000B1; 01.20</td>
<td valign="top" align="center">70.80 &#x000B1; 00.02</td>
<td valign="top" align="center">90.60 &#x000B1; 07.20</td>
<td valign="top" align="center"><underline>93.00 &#x000B1; 05.37</underline></td>
</tr>
<tr>
<td valign="top" align="left">&#x00023; 3</td>
<td valign="top" align="center">91.50 &#x000B1; 04.74</td>
<td valign="top" align="center"><bold>98.50</bold> &#x000B1; <bold>08.57</bold></td>
<td valign="top" align="center">83.90 &#x000B1; 01.30</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center"><underline>97.50 &#x000B1; 03.54</underline></td>
</tr>
<tr>
<td valign="top" align="left">&#x00023; 4</td>
<td valign="top" align="center">87.00 &#x000B1; 06.32</td>
<td valign="top" align="center"><underline>94.50 &#x000B1; 07.01</underline></td>
<td valign="top" align="center">93.00 &#x000B1; 01.20</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center"><bold>96.50</bold> &#x000B1; <bold>04.12</bold></td>
</tr>
<tr>
<td valign="top" align="left">&#x00023; 5</td>
<td valign="top" align="center">91.50 &#x000B1; 07.47</td>
<td valign="top" align="center"><bold>98.50</bold> &#x000B1; <bold>04.60</bold></td>
<td valign="top" align="center">93.20 &#x000B1; 01.20</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center"><underline>97.50 &#x000B1; 03.54</underline></td>
</tr>
<tr>
<td valign="top" align="left">&#x00023; 6</td>
<td valign="top" align="center"><underline>98.50 &#x000B1; 02.42</underline></td>
<td valign="top" align="center">97.66 &#x000B1; 04.82</td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">76.80 &#x000B1; 00.03</td>
<td valign="top" align="center">93.30 &#x000B1; 03.60</td>
<td valign="top" align="center"><bold>98.50</bold> &#x000B1; <bold>00.01</bold></td>
</tr>
<tr>
<td valign="top" align="left">&#x00023; 7</td>
<td valign="top" align="center">93.50 &#x000B1; 07.09</td>
<td valign="top" align="center"><bold>96.50</bold> &#x000B1; <bold>03.45</bold></td>
<td valign="top" align="center">&#x02013;</td>
<td valign="top" align="center">80.00 &#x000B1; 00.03</td>
<td valign="top" align="center">94.10 &#x000B1; 04.10</td>
<td valign="top" align="center"><underline>96.00 &#x000B1; 03.16</underline></td>
</tr>
<tr style="border-top: thin solid #000000;">
<td valign="top" align="left"><bold>Mean</bold></td>
<td valign="top" align="center">92.86 &#x000B1; 03.77</td>
<td valign="top" align="center"><underline>95.71 &#x000B1; 03.01</underline></td>
<td valign="top" align="center">81.70 &#x000B1; 12.06</td>
<td valign="top" align="center">76.20 &#x000B1; 03.87</td>
<td valign="top" align="center">92.58 &#x000B1; 01.51</td>
<td valign="top" align="center"><bold>96.71</bold> &#x000B1; <bold>01.84</bold></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>Notation (-) stands for Not provided. Note that the accuracy of EKRA-S, EKRA-ES, and VRA is estimated as the highest value performed for each tested subject under a nested cross-validation scheme. Bold values indicate the best results</italic>.</p>
</table-wrap-foot>
</table-wrap>
</sec>
</sec>
<sec>
<title>4.2. Validation results on epileptic seizure detection</title>
<sec>
<title>4.2.1. Relevance analysis performed in the feature selection scenario</title>
<p>We test both approaches, VRA and EKRA-S, as a feature selection tool for the spectral coefficients extracted from the physiological rhythms (&#x003B4;, &#x003B8;, &#x003B1;, and &#x003B2;). Since the KEDB dataset only has one-channel EEG recordings, the physiological interpretation of the selected feature set only covers the influence of the physiological waveforms on the two considered tasks of epileptic seizure detection. The selected feature set is calculated as in the motor imagery task, for which the accuracy of the <italic>k</italic>-nearest neighbor classifier is also estimated through the nested 10-fold cross-validation scheme.</p>
<p>As seen in Figure <xref ref-type="fig" rid="F4">4</xref>, both compared feature selection approaches attain the highest accuracy (100%) for the bi-class task. Further, EKRA-S outperforms the baseline VRA for the three-class task (96.00 vs. 90.78%, respectively). Regarding the number of selected training features, once again the EKRA-S approach outperforms VRA in all tasks. Note that the VRA classification accuracy increases as the number of features grows, requiring the full input feature set to reach the maximum performance. Meanwhile, once the EKRA-S approach reaches its highest accuracy, the addition of more features lowers the performance, indicating that the inclusion of other features may be redundant. As a result, the dimension reduction is two and three times larger than the one obtained by VRA for the 2C and 3C tasks, respectively. This aspect can be of benefit for reliable online monitoring of traces of interictal/ictal states of epilepsy since the required time-window of EEG analysis may be remarkably shortened.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Performed accuracy for epileptic seizure detection using each compared approach of feature selection. The achieved average accuracy of classification is computed using a nested cross-validation approach, adding one by one the features ranked. VRA (continuous lines), EKRA&#x02013;S (dashed lines). The blue and green colors hold for the 2C and the 3C problem, respectively.</p></caption>
<graphic xlink:href="fnins-11-00550-g0004.tif"/>
</fig>
<p>Figure <xref ref-type="fig" rid="F5">5</xref> shows the normalized relevance values that are estimated for each rhythm. With the VRA estimator, the selected features make the &#x003B1; and &#x003B2; waveforms the most relevant for all considered tasks. At the same time, low-frequency rhythms (&#x003B4;, &#x003B8;) exhibit modest values of relevance. Although EKRA-S infers a similar contribution of the rhythms, the relevance of the high-frequency rhythms relative to the low-frequency ones decreases as the number of classes increases. This result points to an energy redistribution that takes place as the complexity of the task increases, as has been explained in similar works (Duque Mu&#x000F1;oz et al., <xref ref-type="bibr" rid="B23">2015</xref>).</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Relevant rhythms regarding the seizure detection tasks. DR, left column: the number of selected features is shown for each provided classification problem (2C and 3C), in blue and red for VRA and EKRA, respectively. VRA-based rhythms selection, middle column; EKRA-S-based rhythms selection, right column. The percentage of selected rhythms is shown in colors for the 2C and the 3C problems.</p></caption>
<graphic xlink:href="fnins-11-00550-g0005.tif"/>
</fig>
<p>For the sake of comparison, both proposed strategies for relevance analysis (EKRA-S and EKRA-ES) are contrasted with some recent approaches for epileptic seizure detection. Although this comparison may not be entirely fair because the testing procedures differ in their details (Kumar et al., <xref ref-type="bibr" rid="B39">2010</xref>; Zandi et al., <xref ref-type="bibr" rid="B62">2010</xref>), it seems to be the best possible option. The best classification accuracy achieved by each contrasted approach of epileptic seizure detection is displayed in Table <xref ref-type="table" rid="T2">2</xref>, showing that the benchmarked approaches provide a high classification accuracy that ranges from 99.5 to 100% for a <italic>Bi-class</italic> problem, and from 90.78 to 100% for a <italic>Three-class</italic> task. Note that we employ a nested 10-fold cross-validation to avoid any over-fitting of the discrimination system.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Accomplished classification results for epileptic seizure detection.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="center" colspan="3" style="border-bottom: thin solid #000000;"><bold>2-class</bold></th>
<th valign="top" align="center" colspan="3" style="border-bottom: thin solid #000000;"><bold>3-class</bold></th>
</tr>
<tr>
<th valign="top" align="left"><bold>Authors</bold></th>
<th valign="top" align="left"><bold>Features/Classifier</bold></th>
<th valign="top" align="center"><bold>Accuracy</bold></th>
<th valign="top" align="left"><bold>Authors</bold></th>
<th valign="top" align="left"><bold>Features/Classifier</bold></th>
<th valign="top" align="center"><bold>Accuracy</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Srinivasan et al., <xref ref-type="bibr" rid="B51">2005</xref></td>
<td valign="top" align="left"><italic>t-f</italic> analysis/RNN</td>
<td valign="top" align="center">99.6</td>
<td valign="top" align="left">Ghosh-Dastidar et al., <xref ref-type="bibr" rid="B30">2008</xref></td>
<td valign="top" align="left">PCA-RBF/ANN</td>
<td valign="top" align="center">96.60</td>
</tr>
<tr>
<td valign="top" align="left">Gandhi et al., <xref ref-type="bibr" rid="B29">2011</xref></td>
<td valign="top" align="left">WT/PNN</td>
<td valign="top" align="center">99.99</td>
<td valign="top" align="left">Naghsh-Nilchi and Aghashahi, <xref ref-type="bibr" rid="B46">2010</xref></td>
<td valign="top" align="left">EV/MLP NN</td>
<td valign="top" align="center">97.50</td>
</tr>
<tr>
<td valign="top" align="left">Polat and Gunes, <xref ref-type="bibr" rid="B49">2007</xref></td>
<td valign="top" align="left">PCA FFT/AIRS</td>
<td valign="top" align="center">100</td>
<td valign="top" align="left">Tang and Durand, <xref ref-type="bibr" rid="B53">2012</xref></td>
<td valign="top" align="left">PSD&#x0002B;CLZ/SVMA</td>
<td valign="top" align="center">98.72</td>
</tr>
<tr>
<td valign="top" align="left">Zafer et al., <xref ref-type="bibr" rid="B60">2011</xref></td>
<td valign="top" align="left">CC&#x0002B;PSD/vot. rule</td>
<td valign="top" align="center">100</td>
<td valign="top" align="left">Mart&#x000ED;nez-Vargas et al., <xref ref-type="bibr" rid="B44">2012</xref></td>
<td valign="top" align="left">TFR-2DPCA/<italic>k-nn</italic></td>
<td valign="top" align="center">98.80</td>
</tr>
<tr>
<td valign="top" align="left">Mart&#x000ED;nez-Vargas et al., <xref ref-type="bibr" rid="B44">2012</xref></td>
<td valign="top" align="left">TFR-2DPCA/<italic>k-nn</italic></td>
<td valign="top" align="center">100</td>
<td valign="top" align="left">Tzallas et al., <xref ref-type="bibr" rid="B54">2009</xref></td>
<td valign="top" align="left"><italic>t-f</italic> analysis/ANN</td>
<td valign="top" align="center">100</td>
</tr>
<tr>
<td valign="top" align="left">Tzallas et al., <xref ref-type="bibr" rid="B54">2009</xref></td>
<td valign="top" align="left"><italic>t-f</italic> analysis/ANN</td>
<td valign="top" align="center">100</td>
<td valign="top" align="left">Duque-Mu&#x000F1;oz et al., <xref ref-type="bibr" rid="B22">2014</xref></td>
<td valign="top" align="left">short-time/<italic>k-nn</italic></td>
<td valign="top" align="center">100</td>
</tr>
<tr>
<td valign="top" align="left">Duque-Mu&#x000F1;oz et al., <xref ref-type="bibr" rid="B22">2014</xref></td>
<td valign="top" align="left">short-time/<italic>k-nn</italic></td>
<td valign="top" align="center">99.5</td>
<td valign="top" align="left"><italic>Proposal</italic></td>
<td valign="top" align="left">EKRA-S</td>
<td valign="top" align="center"><bold>90.78</bold></td>
</tr>
<tr>
<td valign="top" align="left"><italic>Proposal</italic></td>
<td valign="top" align="left">EKRA-S</td>
<td valign="top" align="center"><bold>100</bold></td>
<td valign="top" align="left"><italic>Proposal</italic></td>
<td valign="top" align="left">EKRA-ES</td>
<td valign="top" align="center"><bold>96.00</bold></td>
</tr>
<tr>
<td valign="top" align="left"><italic>Proposal</italic></td>
<td valign="top" align="left">EKRA-ES</td>
<td valign="top" align="center"><bold>100</bold></td>
<td/>
<td/>
<td/>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>The EKRA-S and EKRA-ES approaches are compared against state-of-the-art methods concerning the average classification accuracy [%]. Bold values indicate the best results</italic>.</p>
</table-wrap-foot>
</table-wrap>
<p>Another aspect to consider is the influence of parameter tuning. Table <xref ref-type="table" rid="T3">3</xref>, which reports the tuned EKRA free parameters and thereby indicates the confidence of the point classification estimates, shows only small fluctuations among the training folds. These folds are generated by the nested cross-validation strategy to avoid overestimating the attained accuracy. Hence, the proposed approach, together with its optimization strategy, allows extracting relevant brain patterns while providing a solid classification accuracy.</p>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>EKRA free parameter values for each brain activity task, presented as the average &#x000B1; the standard deviation.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="center" colspan="2"><bold>Brain activity task</bold></th>
<th valign="top" align="center"><bold><italic>M</italic><sub><italic>S</italic></sub></bold></th>
<th valign="top" align="center"><bold><italic>M</italic><sub><italic>E</italic></sub></bold></th>
<th valign="top" align="center"><bold><italic>k</italic>-neighbors</bold></th>
<th valign="top" align="center"><bold>&#x003C3;</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Motor imagery</td>
<td valign="top" align="left">S1</td>
<td valign="top" align="center">796.00 &#x000B1; 0.00</td>
<td valign="top" align="center">94.70 &#x000B1; 1.64</td>
<td valign="top" align="center">4.20 &#x000B1; 1.93</td>
<td valign="top" align="center">0.91 &#x000B1; 0.00</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">S2</td>
<td valign="top" align="center">796.00 &#x000B1; 0.00</td>
<td valign="top" align="center">108.90 &#x000B1; 0.32</td>
<td valign="top" align="center">3.20 &#x000B1; 0.63</td>
<td valign="top" align="center">0.91 &#x000B1; 0.01</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">S3</td>
<td valign="top" align="center">796.00 &#x000B1; 0.00</td>
<td valign="top" align="center">122.10 &#x000B1; 0.32</td>
<td valign="top" align="center">3.40 &#x000B1; 0.84</td>
<td valign="top" align="center">0.91 &#x000B1; 0.00</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">S4</td>
<td valign="top" align="center">796.00 &#x000B1; 0.00</td>
<td valign="top" align="center">116.90 &#x000B1; 0.32</td>
<td valign="top" align="center">3.40 &#x000B1; 1.26</td>
<td valign="top" align="center">0.93 &#x000B1; 0.00</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">S5</td>
<td valign="top" align="center">796.00 &#x000B1; 0.00</td>
<td valign="top" align="center">117.80 &#x000B1; 0.42</td>
<td valign="top" align="center">3.20 &#x000B1; 0.63</td>
<td valign="top" align="center">0.93 &#x000B1; 0.00</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">S6</td>
<td valign="top" align="center">796.00 &#x000B1; 0.00</td>
<td valign="top" align="center">104.20 &#x000B1; 0.63</td>
<td valign="top" align="center">3.00 &#x000B1; 0.00</td>
<td valign="top" align="center">0.92 &#x000B1; 0.01</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">S7</td>
<td valign="top" align="center">796.00 &#x000B1; 0.00</td>
<td valign="top" align="center">90.90 &#x000B1; 1.45</td>
<td valign="top" align="center">3.80 &#x000B1; 1.93</td>
<td valign="top" align="center">0.86 &#x000B1; 0.01</td>
</tr>
<tr style="border-top: thin solid #000000;">
<td valign="top" align="left">Epileptic seizure detection</td>
<td valign="top" align="left">2-Class</td>
<td valign="top" align="center">516.00 &#x000B1; 0.00</td>
<td valign="top" align="center">111.90 &#x000B1; 0.74</td>
<td valign="top" align="center">3.00 &#x000B1; 0.00</td>
<td valign="top" align="center">1.07 &#x000B1; 0.02</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">3-Class</td>
<td valign="top" align="center">516.00 &#x000B1; 0.00</td>
<td valign="top" align="center">143.50 &#x000B1; 0.85</td>
<td valign="top" align="center">4.20 &#x000B1; 2.15</td>
<td valign="top" align="center">1.10 &#x000B1; 0.00</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p><italic>The number of selected and projected features in EKRA (M<sub>S</sub> and M<sub>E</sub>), as well as the number of neighbors and the Gaussian kernel bandwidth in the k-nearest neighbor classifier (k and &#x003C3;), are studied according to a nested cross-validation scheme</italic>.</p>
</table-wrap-foot>
</table-wrap>
</sec>
</sec>
</sec>
<sec sec-type="conclusions" id="s5">
<title>5. Conclusions</title>
<p>We discuss a novel kernel-based approach to feature relevance analysis that enhances the automatic identification of brain activity patterns. To this end, the proposed relevance analysis incorporates two kernel functions to take advantage of the available joint information, associating the neural responses to a given stimulus condition with the corresponding labels. Then, kernel alignment learns the relevant patterns from the short-time input features. The proposed Kernel-based Relevance Analysis is validated in two training scenarios: feature selection and enhanced feature selection. In particular, two brain activity identification tasks that exhibit highly non-stationary behavior are studied: motor imagery discrimination and epileptic seizure detection.</p>
<p>Because two different notions of similarity must be encoded, the need to handle a pair of kernels encourages the use of the well-known kernel alignment to unify both tasks into a single optimization framework. Nonetheless, the choice of the distances that implement each aligned kernel, as well as the alignment itself, largely determines the effectiveness of the kernel-based approach for a given application. In the particular case of brain activity identification, we rely on the Mahalanobis distance to carry out the pairwise comparison between samples within the Gaussian kernel. Thus, a linear projection is learned from the employed CKA-based functional to highlight the salient input features, taking advantage of the non-linear notion of similarity behind the selected kernels. For the sake of simplicity, iterative gradient descent optimization is employed to calculate the projection matrix and the Gaussian kernel free parameter.</p>
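<p>As a minimal illustrative sketch of the alignment summarized above (not the authors&#x00027; implementation), the following Python code evaluates the centered kernel alignment between a label kernel and a Gaussian kernel computed on linearly projected samples. Euclidean distances between the projected samples equal Mahalanobis distances with matrix AA<sup>T</sup> in the input space. The delta label kernel, the function names, and the toy data are assumptions introduced here for illustration only; the gradient descent update of the projection matrix and of the kernel bandwidth is omitted.</p>
<preformat><![CDATA[
```python
import numpy as np


def gaussian_kernel(X, A, sigma):
    """Gaussian kernel on the projected samples Y = X A.

    Squared Euclidean distances between rows of Y equal squared
    Mahalanobis distances between rows of X with matrix A A^T.
    """
    Y = X @ A
    sq = np.sum(Y ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * (Y @ Y.T), 0.0)
    return np.exp(-d2 / (2.0 * sigma ** 2))


def label_kernel(y):
    """Delta kernel on labels (assumed): 1 if two labels match, else 0."""
    y = np.asarray(y)
    return (y[:, None] == y[None, :]).astype(float)


def centered_alignment(K, L):
    """Centered kernel alignment between two kernel matrices."""
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    Kc, Lc = H @ K @ H, H @ L @ H
    # Frobenius inner product normalized by the Frobenius norms.
    return np.sum(Kc * Lc) / (np.linalg.norm(Kc) * np.linalg.norm(Lc))


def toy_example(seed=0, n_per_class=20):
    """Hypothetical data: feature 0 carries the label, feature 1 is noise."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0, 1], n_per_class)
    X = np.column_stack([5.0 * y + 0.1 * rng.standard_normal(y.size),
                         rng.standard_normal(y.size)])
    L = label_kernel(y)
    keep_informative = np.array([[1.0], [0.0]])  # projection onto feature 0
    keep_noise = np.array([[0.0], [1.0]])        # projection onto feature 1
    rho_inf = centered_alignment(gaussian_kernel(X, keep_informative, 1.0), L)
    rho_noise = centered_alignment(gaussian_kernel(X, keep_noise, 1.0), L)
    return rho_inf, rho_noise
```
]]></preformat>
<p>With the hypothetical data of this sketch, the projection that keeps the label-carrying feature attains a higher alignment with the label kernel than the projection that keeps the noise feature, which is the behavior that an alignment-driven relevance analysis exploits.</p>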
<p>To implement EKRA as a feature selection tool (EKRA-S), we introduce a feature relevance vector index that measures the contribution of each input feature to building the CKA-based projection matrix. By ranking this contribution, we retain the feature set that satisfies a given stopping criterion (namely, a fixed proportion of explained variance). Thus, feature selection using EKRA-S yields small feature sets, with the benefit of providing a better interpretation of the spatial distribution of brain activity and of the employed feature extraction principle. Besides, the EKRA-based ranking separates redundant features, which usually tend to lower the system accuracy. As another advantage, the EKRA-S approach adapts the relevance analysis to account for inter-subject variability, which remains one of the most challenging issues in training BCI systems. To improve interpretation on a neurophysiological basis, the proposed Kernel-based Relevance Analysis is designed to take advantage of the measured brain activity, associating neural responses with a given stimulus condition and assessing the contribution of each spatial electrode location to the identification performance. As a result, the EKRA-based relevance mainly highlights those regions that are indeed neurophysiologically related to the motor imagery tasks, with the benefit of providing a confident and competitive accuracy. So, EKRA enhances the interpretability of brain activity patterns, making it possible to discuss, verify, and even improve the obtained results. Therefore, EKRA as a feature selection tool can reach a suitable classification accuracy with a high dimension reduction factor, providing a better physiological interpretation of the brain activity patterns.</p>
<p>In the other training scenario, enhanced feature selection (EKRA-ES), we use the relevance index vector to estimate the representation space that optimizes a trade-off between separability and non-redundancy of the available neural patterns. As a result, our proposal outperforms the compared approaches that carry out multivariate feature selection and/or dimension reduction. Indeed, the EKRA-based enhanced feature spaces handle the brain activity complexity to support further classification stages in terms of system accuracy and reliability.</p>
<p>Regarding the EKRA shortcomings, we must clarify that its current implementation, based on gradient descent optimization, incurs a considerable computational load when the number of samples is large (thousands). Besides, the EKRA-based relevance analysis could be biased under imbalanced data scenarios because the CKA-based cost function tends to accentuate data relations within the class with the highest number of samples.</p>
<p>As future work, the authors plan to improve the EKRA algorithm by introducing more elaborate alignment functions and different kernel mappings, with the aim of obtaining a better description of non-stationary signals, which can be immersed in either Gaussian or non-Gaussian noise conditions. For example, measures based on information theory would be of benefit (Giraldo et al., <xref ref-type="bibr" rid="B31">2015</xref>). Besides, hidden inter-channel relationships should be estimated to enhance the extraction of brain activity patterns within the EKRA-based framework (Dauwan et al., <xref ref-type="bibr" rid="B20">2016</xref>). Likewise, a more elaborate study of the EKRA optimization process must be carried out, including second derivatives and low-rank approximations of the kernel matrices to favor its convergence, as well as weighting approaches to deal with imbalanced tasks (Jian et al., <xref ref-type="bibr" rid="B38">2016</xref>). Furthermore, the EKRA benefits should be studied with respect to the EEG reference used, e.g., the average reference vs. the Reference Electrode Standardization Technique, in particular if we intend to extend our proposal to other salient brain activity applications (Yao, <xref ref-type="bibr" rid="B58">2001</xref>; Chella et al., <xref ref-type="bibr" rid="B15">2016</xref>; Yao, <xref ref-type="bibr" rid="B59">2017</xref>).</p>
</sec>
<sec id="s6">
<title>Author contributions</title>
<p>AA: EKRA theoretical development and coding, motor imagery database tests and analysis, and support of the manuscript writing. AO: epileptic database tests and analysis, and support of the manuscript writing. GC: EKRA theoretical development, BCI database analysis, and support of the manuscript writing.</p>
<sec>
<title>Conflict of interest statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</sec>
</body>
<back>
<ack>
<p>This work is supported by COLCIENCIAS grant 111074455778: &#x0201C;Desarrollo de un sistema de apoyo al diagn&#x000F3;stico no invasivo de pacientes con epilepsia f&#x000E1;rmaco-resistente asociada a displasis corticales cerebrales: m&#x000E9;todo costo-efectivo basado en procesamiento de im&#x000E1;genes de resonancia magn&#x000E9;tica.&#x0201D;</p>
</ack>
<sec sec-type="supplementary-material" id="s7">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fnins.2017.00550/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fnins.2017.00550/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="DataSheet1.docx" id="SM1" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Acharya</surname> <given-names>U. R.</given-names></name> <name><surname>Molinari</surname> <given-names>F.</given-names></name> <name><surname>Sree</surname> <given-names>S. V.</given-names></name> <name><surname>Chattopadhyay</surname> <given-names>S.</given-names></name> <name><surname>Ng</surname> <given-names>K.-H.</given-names></name> <name><surname>Suri</surname> <given-names>J. S.</given-names></name></person-group> (<year>2012</year>). <article-title>Automated diagnosis of epileptic EEG using entropies</article-title>. <source>Biomed. Signal Proc. Control</source> <volume>7</volume>, <fpage>401</fpage>&#x02013;<lpage>408</lpage>. <pub-id pub-id-type="doi">10.1016/j.bspc.2011.07.007</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Adeli</surname> <given-names>E.</given-names></name> <name><surname>Wu</surname> <given-names>G.</given-names></name> <name><surname>Saghafi</surname> <given-names>B.</given-names></name> <name><surname>An</surname> <given-names>L.</given-names></name> <name><surname>Shi</surname> <given-names>F.</given-names></name> <name><surname>Shen</surname> <given-names>D.</given-names></name></person-group> (<year>2017</year>). <article-title>Kernel-based joint feature selection and max-margin classification for early diagnosis of Parkinson&#x00027;s disease</article-title>. <source>Sci. Rep.</source> <volume>7</volume>:<fpage>41069</fpage>. <pub-id pub-id-type="doi">10.1038/srep41069</pub-id><pub-id pub-id-type="pmid">28120883</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Al-Fahoum</surname> <given-names>A. S.</given-names></name> <name><surname>Al-Fraihat</surname> <given-names>A. A.</given-names></name></person-group> (<year>2014</year>). <article-title>Methods of EEG signal features extraction using linear analysis in frequency and time-frequency domains</article-title>. <source>ISRN Neurosci.</source> <volume>2014</volume>:<fpage>730218</fpage>. <pub-id pub-id-type="doi">10.1155/2014/730218</pub-id><pub-id pub-id-type="pmid">24967316</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alomari</surname> <given-names>M. H.</given-names></name> <name><surname>Awada</surname> <given-names>E. A.</given-names></name> <name><surname>Samaha</surname> <given-names>A.</given-names></name> <name><surname>Alkamha</surname> <given-names>K.</given-names></name></person-group> (<year>2014</year>). <article-title>Wavelet-based feature extraction for the analysis of EEG signals associated with imagined fists and feet movements</article-title>. <source>Comput. Inf. Sci.</source> <volume>7</volume>:<fpage>17</fpage>. <pub-id pub-id-type="doi">10.5539/cis.v7n2p17</pub-id></citation></ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alotaiby</surname> <given-names>T.</given-names></name> <name><surname>El-Samie</surname> <given-names>F. E. A.</given-names></name> <name><surname>Alshebeili</surname> <given-names>S. A.</given-names></name> <name><surname>Ahmad</surname> <given-names>I.</given-names></name></person-group> (<year>2015</year>). <article-title>A review of channel selection algorithms for EEG signal processing</article-title>. <source>EURASIP J. Adv. Signal Proc.</source> <volume>2015</volume>:<fpage>66</fpage>. <pub-id pub-id-type="doi">10.1186/s13634-015-0251-9</pub-id></citation></ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alvarez Meza</surname> <given-names>A.</given-names></name> <name><surname>Velasquez Martinez</surname> <given-names>L.</given-names></name> <name><surname>Castellanos Dominguez</surname> <given-names>G.</given-names></name></person-group> (<year>2015</year>). <article-title>Time-series discrimination using feature relevance analysis in motor imagery classification</article-title>. <source>Neurocomputing</source> <volume>151</volume>, <fpage>122</fpage>&#x02013;<lpage>129</lpage>. <pub-id pub-id-type="doi">10.1016/j.neucom.2014.07.077</pub-id></citation></ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Andrzejak</surname> <given-names>R. G.</given-names></name> <name><surname>Lehnertz</surname> <given-names>K.</given-names></name> <name><surname>Mormann</surname> <given-names>F.</given-names></name> <name><surname>Rieke</surname> <given-names>C.</given-names></name> <name><surname>David</surname> <given-names>P.</given-names></name> <name><surname>Elger</surname> <given-names>C. E.</given-names></name></person-group> (<year>2001</year>). <article-title>Indications of non-linear deterministic and finite-dimensional structures in time series of brain electrical activity: dependence on recording region and brain state</article-title>. <source>Phys. Rev. E</source> <volume>64</volume>:<fpage>061907</fpage>. <pub-id pub-id-type="doi">10.1103/PhysRevE.64.061907</pub-id></citation></ref>
<ref id="B8">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Arias-Mora</surname> <given-names>L.</given-names></name> <name><surname>L&#x000F3;pez-R&#x000ED;os</surname> <given-names>L.</given-names></name> <name><surname>C&#x000E9;spedes-Villar</surname> <given-names>Y.</given-names></name> <name><surname>Velasquez-Martinez</surname> <given-names>L. F.</given-names></name> <name><surname>Alvarez-Meza</surname> <given-names>A. M.</given-names></name> <name><surname>Castellanos-Dominguez</surname> <given-names>G.</given-names></name></person-group> (<year>2015</year>). <article-title>Kernel-based relevant feature extraction to support motor imagery classification</article-title>, in <source>Signal Processing, Images and Computer Vision (STSIVA)</source> (<publisher-loc>Bogot&#x000E1;</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>1</fpage>&#x02013;<lpage>6</lpage>.</citation></ref>
<ref id="B9">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Aydemir</surname> <given-names>O.</given-names></name> <name><surname>Kayikcioglu</surname> <given-names>T.</given-names></name></person-group> (<year>2011</year>). <article-title>Wavelet transform based classification of invasive brain computer interface data</article-title>. <source>Radioengineering</source> <volume>20</volume>, <fpage>31</fpage>&#x02013;<lpage>38</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.radioeng.cz/fulltexts/2011/11_01_031_038.pdf">https://www.radioeng.cz/fulltexts/2011/11_01_031_038.pdf</ext-link></citation></ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bhattacharyya</surname> <given-names>S.</given-names></name> <name><surname>Sengupta</surname> <given-names>A.</given-names></name> <name><surname>Chakraborti</surname> <given-names>T.</given-names></name> <name><surname>Konar</surname> <given-names>A.</given-names></name> <name><surname>Tibarewala</surname> <given-names>D. N.</given-names></name></person-group> (<year>2014</year>). <article-title>Automatic feature selection of motor imagery EEG signals using differential evolution and learning automata</article-title>. <source>Med. Biol. Eng. Comput.</source> <volume>52</volume>, <fpage>131</fpage>&#x02013;<lpage>139</lpage>. <pub-id pub-id-type="doi">10.1007/s11517-013-1123-9</pub-id><pub-id pub-id-type="pmid">24165805</pub-id></citation></ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Birjandtalab</surname> <given-names>J.</given-names></name> <name><surname>Pouyan</surname> <given-names>M. B.</given-names></name> <name><surname>Cogan</surname> <given-names>D.</given-names></name> <name><surname>Nourani</surname> <given-names>M.</given-names></name> <name><surname>Harvey</surname> <given-names>J.</given-names></name></person-group> (<year>2017</year>). <article-title>Automated seizure detection using limited-channel EEG and non-linear dimension reduction</article-title>. <source>Comput. Biol. Med.</source> <volume>82</volume>, <fpage>49</fpage>&#x02013;<lpage>58</lpage>. <pub-id pub-id-type="doi">10.1016/j.compbiomed.2017.01.011</pub-id><pub-id pub-id-type="pmid">28161592</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Blankertz</surname> <given-names>B.</given-names></name> <name><surname>Dornhege</surname> <given-names>G.</given-names></name> <name><surname>Krauledat</surname> <given-names>M.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K.-R.</given-names></name> <name><surname>Curio</surname> <given-names>G.</given-names></name></person-group> (<year>2007</year>). <article-title>The non-invasive Berlin brain-computer interface: fast acquisition of effective performance in untrained subjects</article-title>. <source>NeuroImage</source> <volume>37</volume>, <fpage>539</fpage>&#x02013;<lpage>550</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2007.01.051</pub-id><pub-id pub-id-type="pmid">17475513</pub-id></citation></ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Brockmeier</surname> <given-names>A. J.</given-names></name> <name><surname>Choi</surname> <given-names>J. S.</given-names></name> <name><surname>Kriminger</surname> <given-names>E. G.</given-names></name> <name><surname>Francis</surname> <given-names>J. T.</given-names></name> <name><surname>Principe</surname> <given-names>J. C.</given-names></name></person-group> (<year>2014</year>). <article-title>Neural decoding with kernel-based metric learning</article-title>. <source>Neural Comput.</source> <volume>26</volume>, <fpage>1080</fpage>&#x02013;<lpage>1107</lpage>. <pub-id pub-id-type="doi">10.1162/NECO_a_00591</pub-id><pub-id pub-id-type="pmid">24684447</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Brockmeier</surname> <given-names>A. J.</given-names></name> <name><surname>Sanchez Giraldo</surname> <given-names>L. G.</given-names></name> <name><surname>Emigh</surname> <given-names>M. S.</given-names></name> <name><surname>Bae</surname> <given-names>J.</given-names></name> <name><surname>Choi</surname> <given-names>J. S.</given-names></name> <name><surname>Francis</surname> <given-names>J. T.</given-names></name> <etal/></person-group>. (<year>2013</year>). <article-title>Information-theoretic metric learning: 2-D linear projections of neural data for visualization</article-title>, in <source>Engineering in Medicine and Biology Society (EMBC), 2013 35th Annual International Conference of the IEEE</source> (<publisher-loc>Osaka</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>5586</fpage>&#x02013;<lpage>5589</lpage>. <pub-id pub-id-type="pmid">24111003</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chella</surname> <given-names>F.</given-names></name> <name><surname>Pizzella</surname> <given-names>V.</given-names></name> <name><surname>Zappasodi</surname> <given-names>F.</given-names></name> <name><surname>Marzetti</surname> <given-names>L.</given-names></name></person-group> (<year>2016</year>). <article-title>Impact of the reference choice on scalp EEG connectivity estimation</article-title>. <source>J. Neural Eng.</source> <volume>13</volume>:<fpage>036016</fpage>. <pub-id pub-id-type="doi">10.1088/1741-2560/13/3/036016</pub-id><pub-id pub-id-type="pmid">27138114</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>L.-I.</given-names></name> <name><surname>Zhao</surname> <given-names>Y.</given-names></name> <name><surname>Zhang</surname> <given-names>J.</given-names></name> <name><surname>Zou</surname> <given-names>J.-Z.</given-names></name></person-group> (<year>2015</year>). <article-title>Automatic detection of alertness/drowsiness from physiological signals using wavelet-based non-linear features and machine learning</article-title>. <source>Exp. Syst. Appl.</source> <volume>42</volume>, <fpage>7344</fpage>&#x02013;<lpage>7355</lpage>. <pub-id pub-id-type="doi">10.1016/j.eswa.2015.05.028</pub-id></citation></ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>S.</given-names></name> <name><surname>Luo</surname> <given-names>Z.</given-names></name> <name><surname>Gan</surname> <given-names>H.</given-names></name></person-group> (<year>2016</year>). <article-title>An entropy fusion method for feature extraction of EEG</article-title>. <source>Neural Comput. Appl.</source> <volume>1</volume>, <fpage>1</fpage>&#x02013;<lpage>7</lpage>. <pub-id pub-id-type="doi">10.1007/s00521-016-2594-z</pub-id></citation></ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chu</surname> <given-names>C.</given-names></name> <name><surname>Ni</surname> <given-names>Y.</given-names></name> <name><surname>Tan</surname> <given-names>G.</given-names></name> <name><surname>Saunders</surname> <given-names>C. J.</given-names></name> <name><surname>Ashburner</surname> <given-names>J.</given-names></name></person-group> (<year>2011</year>). <article-title>Kernel regression for fMRI pattern prediction</article-title>. <source>NeuroImage</source> <volume>56</volume>, <fpage>662</fpage>&#x02013;<lpage>673</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2010.03.058</pub-id><pub-id pub-id-type="pmid">20348000</pub-id></citation></ref>
<ref id="B19">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Cortes</surname> <given-names>C.</given-names></name> <name><surname>Mohri</surname> <given-names>M.</given-names></name> <name><surname>Rostamizadeh</surname> <given-names>A.</given-names></name></person-group> (<year>2012</year>). <article-title>Algorithms for learning kernels based on centered alignment</article-title>. <source>J. Mach. Learn. Res.</source> <volume>13</volume>, <fpage>795</fpage>&#x02013;<lpage>828</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="http://www.jmlr.org/papers/volume13/cortes12a/cortes12a.pdf">http://www.jmlr.org/papers/volume13/cortes12a/cortes12a.pdf</ext-link></citation></ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dauwan</surname> <given-names>M.</given-names></name> <name><surname>van Dellen</surname> <given-names>E.</given-names></name> <name><surname>van Boxtel</surname> <given-names>L.</given-names></name> <name><surname>van Straaten</surname> <given-names>E. C.</given-names></name> <name><surname>de Waal</surname> <given-names>H.</given-names></name> <name><surname>Lemstra</surname> <given-names>A. W.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>EEG-directed connectivity from posterior brain regions is decreased in dementia with Lewy bodies: a comparison with Alzheimer&#x00027;s disease and controls</article-title>. <source>Neurobiol. Aging</source> <volume>41</volume>, <fpage>122</fpage>&#x02013;<lpage>129</lpage>. <pub-id pub-id-type="doi">10.1016/j.neurobiolaging.2016.02.017</pub-id><pub-id pub-id-type="pmid">27103525</pub-id></citation></ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Daza-Santacoloma</surname> <given-names>G.</given-names></name> <name><surname>Arias-Londo&#x000F1;o</surname> <given-names>J. D.</given-names></name> <name><surname>Godino-Llorente</surname> <given-names>J. I.</given-names></name> <name><surname>S&#x000E1;enz-Lech&#x000F3;n</surname> <given-names>N.</given-names></name> <name><surname>Osma-Ruiz</surname> <given-names>V.</given-names></name> <name><surname>Castellanos-Dominguez</surname> <given-names>G.</given-names></name></person-group> (<year>2009</year>). <article-title>Dynamic feature extraction: an application to voice pathology detection</article-title>. <source>Intell. Autom. Soft Comput.</source> <volume>15</volume>, <fpage>667</fpage>&#x02013;<lpage>682</lpage>. <pub-id pub-id-type="doi">10.1080/10798587.2009.10643056</pub-id></citation></ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Duque-Mu&#x000F1;oz</surname> <given-names>L.</given-names></name> <name><surname>Espinosa-Oviedo</surname> <given-names>J. J.</given-names></name> <name><surname>Castellanos-Dominguez</surname> <given-names>C. G.</given-names></name></person-group> (<year>2014</year>). <article-title>Identification and monitoring of brain activity based on stochastic relevance analysis of short-time EEG rhythms</article-title>. <source>Biomed. Eng. Online</source> <volume>13</volume>:<fpage>123</fpage>. <pub-id pub-id-type="doi">10.1186/1475-925X-13-123</pub-id><pub-id pub-id-type="pmid">25168571</pub-id></citation></ref>
<ref id="B23">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Duque Mu&#x000F1;oz</surname> <given-names>L.</given-names></name> <name><surname>Pinzon Morales</surname> <given-names>R.</given-names></name> <name><surname>Castellanos Dominguez</surname> <given-names>G.</given-names></name></person-group> (<year>2015</year>). <article-title>EEG rhythm extraction based on relevance analysis and customized wavelet transform</article-title>, in <source>Artificial Computation in Biology and Medicine</source> (<publisher-loc>Elche</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>419</fpage>&#x02013;<lpage>428</lpage>.</citation></ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fang</surname> <given-names>R.</given-names></name> <name><surname>Pouyanfar</surname> <given-names>S.</given-names></name> <name><surname>Yang</surname> <given-names>Y.</given-names></name> <name><surname>Chen</surname> <given-names>S.-C.</given-names></name> <name><surname>Iyengar</surname> <given-names>S.</given-names></name></person-group> (<year>2016</year>). <article-title>Computational health informatics in the big data age: a survey</article-title>. <source>ACM Comput. Surveys (CSUR)</source> <volume>49</volume>:<fpage>12</fpage>. <pub-id pub-id-type="doi">10.1145/2932707</pub-id></citation></ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Faust</surname> <given-names>O.</given-names></name> <name><surname>Acharya</surname> <given-names>U. R.</given-names></name> <name><surname>Adeli</surname> <given-names>H.</given-names></name> <name><surname>Adeli</surname> <given-names>A.</given-names></name></person-group> (<year>2015</year>). <article-title>Wavelet-based EEG processing for computer-aided seizure detection and epilepsy diagnosis</article-title>. <source>Seizure</source> <volume>26</volume>, <fpage>56</fpage>&#x02013;<lpage>64</lpage>. <pub-id pub-id-type="doi">10.1016/j.seizure.2015.01.012</pub-id><pub-id pub-id-type="pmid">25799903</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Feess</surname> <given-names>D.</given-names></name> <name><surname>Krell</surname> <given-names>M. M.</given-names></name> <name><surname>Metzen</surname> <given-names>J. H.</given-names></name></person-group> (<year>2013</year>). <article-title>Comparison of sensor selection mechanisms for an ERP-based brain-computer interface</article-title>. <source>PLoS ONE</source> <volume>8</volume>:<fpage>e67543</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0067543</pub-id><pub-id pub-id-type="pmid">23844021</pub-id></citation></ref>
<ref id="B27">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Fukumizu</surname> <given-names>K.</given-names></name> <name><surname>Bach</surname> <given-names>F. R.</given-names></name> <name><surname>Jordan</surname> <given-names>M. I.</given-names></name></person-group> (<year>2004</year>). <article-title>Dimensionality reduction for supervised learning with reproducing kernel Hilbert spaces</article-title>. <source>J. Mach. Learn. Res.</source> <volume>5</volume>, <fpage>73</fpage>&#x02013;<lpage>99</lpage>. Available online at: <ext-link ext-link-type="uri" xlink:href="http://www.jmlr.org/papers/volume5/fukumizu04a/fukumizu04a.pdf">http://www.jmlr.org/papers/volume5/fukumizu04a/fukumizu04a.pdf</ext-link></citation></ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gajic</surname> <given-names>D.</given-names></name> <name><surname>Djurovic</surname> <given-names>Z.</given-names></name> <name><surname>Di Gennaro</surname> <given-names>S.</given-names></name> <name><surname>Gustafsson</surname> <given-names>F.</given-names></name></person-group> (<year>2014</year>). <article-title>Classification of EEG signals for detection of epileptic seizures based on wavelets and statistical pattern recognition</article-title>. <source>Biomed. Eng. Appl. Basis Commun.</source> <volume>26</volume>:<fpage>1450021</fpage>. <pub-id pub-id-type="doi">10.4015/S1016237214500215</pub-id></citation></ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gandhi</surname> <given-names>T.</given-names></name> <name><surname>Panigrahi</surname> <given-names>B. K.</given-names></name> <name><surname>Anand</surname> <given-names>S.</given-names></name></person-group> (<year>2011</year>). <article-title>A comparative study of wavelet families for EEG signal classification</article-title>. <source>Neurocomputing</source> <volume>74</volume>, <fpage>3051</fpage>&#x02013;<lpage>3057</lpage>. <pub-id pub-id-type="doi">10.1016/j.neucom.2011.04.029</pub-id></citation></ref>
<ref id="B30">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ghosh-Dastidar</surname> <given-names>S.</given-names></name> <name><surname>Adeli</surname> <given-names>H.</given-names></name> <name><surname>Dadmehr</surname> <given-names>N.</given-names></name></person-group> (<year>2008</year>). <article-title>Principal component analysis-enhanced cosine radial basis function neural network for robust epilepsy and seizure detection</article-title>. <source>IEEE Trans. Biomed. Eng.</source> <volume>55</volume>, <fpage>512</fpage>&#x02013;<lpage>518</lpage>. <pub-id pub-id-type="doi">10.1109/TBME.2007.905490</pub-id><pub-id pub-id-type="pmid">18269986</pub-id></citation></ref>
<ref id="B31">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Giraldo</surname> <given-names>L. G. S.</given-names></name> <name><surname>Rao</surname> <given-names>M.</given-names></name> <name><surname>Principe</surname> <given-names>J. C.</given-names></name></person-group> (<year>2015</year>). <article-title>Measures of entropy from data using infinitely divisible kernels</article-title>. <source>IEEE Trans. Inform. Theory</source> <volume>61</volume>, <fpage>535</fpage>&#x02013;<lpage>548</lpage>. <pub-id pub-id-type="doi">10.1109/TIT.2014.2370058</pub-id></citation></ref>
<ref id="B32">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Gretton</surname> <given-names>A.</given-names></name> <name><surname>Bousquet</surname> <given-names>O.</given-names></name> <name><surname>Smola</surname> <given-names>A.</given-names></name> <name><surname>Sch&#x000F6;lkopf</surname> <given-names>B.</given-names></name></person-group> (<year>2005</year>). <article-title>Measuring statistical dependence with Hilbert-Schmidt norms</article-title>, in <source>Algorithmic Learning Theory</source> (<publisher-loc>Singapore</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>63</fpage>&#x02013;<lpage>77</lpage>.</citation></ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hanakawa</surname> <given-names>T.</given-names></name> <name><surname>Immisch</surname> <given-names>I.</given-names></name> <name><surname>Toma</surname> <given-names>K.</given-names></name> <name><surname>Dimyan</surname> <given-names>M. A.</given-names></name> <name><surname>Van Gelderen</surname> <given-names>P.</given-names></name> <name><surname>Hallett</surname> <given-names>M.</given-names></name></person-group> (<year>2003</year>). <article-title>Functional properties of brain areas associated with motor execution and imagery</article-title>. <source>J. Neurophysiol.</source> <volume>89</volume>, <fpage>989</fpage>&#x02013;<lpage>1002</lpage>. <pub-id pub-id-type="doi">10.1152/jn.00132.2002</pub-id><pub-id pub-id-type="pmid">12574475</pub-id></citation></ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Haufe</surname> <given-names>S.</given-names></name> <name><surname>D&#x000E4;hne</surname> <given-names>S.</given-names></name> <name><surname>Nikulin</surname> <given-names>V. V.</given-names></name></person-group> (<year>2014</year>). <article-title>Dimensionality reduction for the analysis of brain oscillations</article-title>. <source>NeuroImage</source> <volume>101</volume>, <fpage>583</fpage>&#x02013;<lpage>597</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2014.06.073</pub-id><pub-id pub-id-type="pmid">25003816</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>He</surname> <given-names>W.</given-names></name> <name><surname>Wei</surname> <given-names>P.</given-names></name> <name><surname>Wang</surname> <given-names>L.</given-names></name> <name><surname>Zou</surname> <given-names>Y.</given-names></name></person-group> (<year>2012</year>). <article-title>A novel EMD-based common spatial pattern for motor imagery brain-computer interface</article-title>, in <source>Proceedings of 2012 IEEE-EMBS International Conference on Biomedical and Health Informatics</source> (<publisher-loc>Hong Kong</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>216</fpage>&#x02013;<lpage>219</lpage>.</citation></ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Higashi</surname> <given-names>H.</given-names></name> <name><surname>Tanaka</surname> <given-names>T.</given-names></name></person-group> (<year>2013</year>). <article-title>Common spatio-time-frequency patterns for motor imagery-based brain machine interfaces</article-title>. <source>Comput. Intell. Neurosci.</source> <volume>2013</volume>:<fpage>537218</fpage>. <pub-id pub-id-type="doi">10.1155/2013/537218</pub-id><pub-id pub-id-type="pmid">24302929</pub-id></citation></ref>
<ref id="B37">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hurtado-Rinc&#x000F3;n</surname> <given-names>J. V.</given-names></name> <name><surname>Mart&#x000ED;nez-Vargas</surname> <given-names>J. D.</given-names></name> <name><surname>Rojas-Jaramillo</surname> <given-names>S.</given-names></name> <name><surname>Giraldo</surname> <given-names>E.</given-names></name> <name><surname>Castellanos-Dominguez</surname> <given-names>G.</given-names></name></person-group> (<year>2016</year>). <article-title>Identification of Relevant Inter-channel EEG Connectivity Patterns: A Kernel-Based Supervised Approach</article-title>, in <source>International Conference on Brain and Health Informatics</source> (<publisher-loc>Omaha, NE</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>14</fpage>&#x02013;<lpage>23</lpage>.</citation></ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jian</surname> <given-names>C.</given-names></name> <name><surname>Gao</surname> <given-names>J.</given-names></name> <name><surname>Ao</surname> <given-names>Y.</given-names></name></person-group> (<year>2016</year>). <article-title>A new sampling method for classifying imbalanced data based on support vector machine ensemble</article-title>. <source>Neurocomputing</source> <volume>193</volume>, <fpage>115</fpage>&#x02013;<lpage>122</lpage>. <pub-id pub-id-type="doi">10.1016/j.neucom.2016.02.006</pub-id></citation></ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kumar</surname> <given-names>S. P.</given-names></name> <name><surname>Sriraam</surname> <given-names>N.</given-names></name> <name><surname>Benakop</surname> <given-names>P.</given-names></name> <name><surname>Jinaga</surname> <given-names>B.</given-names></name></person-group> (<year>2010</year>). <article-title>Entropies based detection of epileptic seizures with artificial neural network classifiers</article-title>. <source>Exp. Syst. Appl.</source> <volume>37</volume>, <fpage>3284</fpage>&#x02013;<lpage>3291</lpage>. <pub-id pub-id-type="doi">10.1016/j.eswa.2009.09.051</pub-id></citation></ref>
<ref id="B40">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lee</surname> <given-names>J. A.</given-names></name> <name><surname>Verleysen</surname> <given-names>M.</given-names></name></person-group> (<year>2007</year>). <source>Non-linear Dimensionality Reduction</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Springer Science &#x00026; Business Media</publisher-name>.</citation></ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liao</surname> <given-names>X.</given-names></name> <name><surname>Yao</surname> <given-names>D.</given-names></name> <name><surname>Wu</surname> <given-names>D.</given-names></name> <name><surname>Li</surname> <given-names>C.</given-names></name></person-group> (<year>2007</year>). <article-title>Combining spatial filters for the classification of single-trial EEG in a finger movement task</article-title>. <source>IEEE Trans. Biomed. Eng.</source> <volume>54</volume>, <fpage>821</fpage>&#x02013;<lpage>831</lpage>. <pub-id pub-id-type="doi">10.1109/TBME.2006.889206</pub-id><pub-id pub-id-type="pmid">17518278</pub-id></citation></ref>
<ref id="B42">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>W.</given-names></name> <name><surname>Principe</surname> <given-names>J. C.</given-names></name> <name><surname>Haykin</surname> <given-names>S.</given-names></name></person-group> (<year>2011</year>). <source>Kernel Adaptive Filtering: A Comprehensive Introduction</source>, <volume>Vol. 57</volume>. <publisher-loc>New Jersey, NJ</publisher-loc>: <publisher-name>John Wiley &#x00026; Sons</publisher-name>.</citation></ref>
<ref id="B43">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Martinez-Leon</surname> <given-names>J.-A.</given-names></name> <name><surname>Cano-Izquierdo</surname> <given-names>J. M.</given-names></name> <name><surname>Ibarrola</surname> <given-names>J.</given-names></name></person-group> (<year>2015</year>). <article-title>Feature selection applying statistical and neurofuzzy methods to EEG-based BCI</article-title>. <source>Comput. Intell. Neurosci.</source> <volume>2015</volume>:<fpage>781207</fpage>. <pub-id pub-id-type="doi">10.1155/2015/781207</pub-id><pub-id pub-id-type="pmid">25977685</pub-id></citation></ref>
<ref id="B44">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mart&#x000ED;nez-Vargas</surname> <given-names>J. D.</given-names></name> <name><surname>Godino Llorente</surname> <given-names>J. I.</given-names></name> <name><surname>Castellanos-Dominguez</surname> <given-names>G.</given-names></name></person-group> (<year>2012</year>). <article-title>Time&#x02013;frequency based feature selection for discrimination of non-stationary biosignals</article-title>. <source>EURASIP J. Adv. Signal Proc.</source> <volume>2012</volume>, <fpage>1</fpage>&#x02013;<lpage>18</lpage>. <pub-id pub-id-type="doi">10.1186/1687-6180-2012-219</pub-id></citation></ref>
<ref id="B45">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Naeem</surname> <given-names>M.</given-names></name> <name><surname>Brunner</surname> <given-names>C.</given-names></name> <name><surname>Pfurtscheller</surname> <given-names>G.</given-names></name></person-group> (<year>2009</year>). <article-title>Dimensionality reduction and channel selection of motor imagery electroencephalographic data</article-title>. <source>Comput. Intell. Neurosci.</source> <volume>2009</volume>:<fpage>537504</fpage>. <pub-id pub-id-type="doi">10.1155/2009/537504</pub-id></citation></ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Naghsh-Nilchi</surname> <given-names>A. R.</given-names></name> <name><surname>Aghashahi</surname> <given-names>M.</given-names></name></person-group> (<year>2010</year>). <article-title>Epilepsy seizure detection using eigen-system spectral estimation and Multiple Layer Perceptron neural network</article-title>. <source>Biomed. Signal Proc. Control</source> <volume>5</volume>, <fpage>147</fpage>&#x02013;<lpage>157</lpage>. <pub-id pub-id-type="doi">10.1016/j.bspc.2010.01.004</pub-id></citation></ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nicolas-Alonso</surname> <given-names>L. F.</given-names></name> <name><surname>Gomez-Gil</surname> <given-names>J.</given-names></name></person-group> (<year>2012</year>). <article-title>Brain computer interfaces, a review</article-title>. <source>Sensors</source> <volume>12</volume>, <fpage>1211</fpage>&#x02013;<lpage>1279</lpage>. <pub-id pub-id-type="doi">10.3390/s120201211</pub-id><pub-id pub-id-type="pmid">22438708</pub-id></citation></ref>
<ref id="B48">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pisotta</surname> <given-names>I.</given-names></name> <name><surname>Perruchoud</surname> <given-names>D.</given-names></name> <name><surname>Ionta</surname> <given-names>S.</given-names></name></person-group> (<year>2015</year>). <article-title>Hand-in-hand advances in biomedical engineering and sensorimotor restoration</article-title>. <source>J. Neurosci. Methods</source> <volume>246</volume>, <fpage>22</fpage>&#x02013;<lpage>29</lpage>. <pub-id pub-id-type="doi">10.1016/j.jneumeth.2015.03.003</pub-id><pub-id pub-id-type="pmid">25769276</pub-id></citation></ref>
<ref id="B49">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Polat</surname> <given-names>K.</given-names></name> <name><surname>Gunes</surname> <given-names>S.</given-names></name></person-group> (<year>2007</year>). <article-title>Classification of epileptiform EEG using a hybrid system based on decision tree classifier and fast Fourier transform</article-title>. <source>Appl. Math. Comput.</source> <volume>187</volume>, <fpage>1017</fpage>&#x02013;<lpage>1026</lpage>. <pub-id pub-id-type="doi">10.1016/j.amc.2006.09.022</pub-id></citation></ref>
<ref id="B50">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rodr&#x000ED;guez-Berm&#x000FA;dez</surname> <given-names>G.</given-names></name> <name><surname>Garc&#x000ED;a-Laencina</surname> <given-names>P. J.</given-names></name> <name><surname>Roca-Gonz&#x000E1;lez</surname> <given-names>J.</given-names></name> <name><surname>Roca-Dorda</surname> <given-names>J.</given-names></name></person-group> (<year>2013</year>). <article-title>Efficient feature selection and linear discrimination of EEG signals</article-title>. <source>Neurocomputing</source> <volume>115</volume>, <fpage>161</fpage>&#x02013;<lpage>165</lpage>. <pub-id pub-id-type="doi">10.1016/j.neucom.2013.01.001</pub-id></citation></ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Srinivasan</surname> <given-names>V.</given-names></name> <name><surname>Eswaran</surname> <given-names>C.</given-names></name> <name><surname>Sriraam</surname> <given-names>N.</given-names></name></person-group> (<year>2005</year>). <article-title>Artificial neural network based epileptic detection using time-domain and frequency-domain features</article-title>. <source>J. Med. Syst.</source> <volume>29</volume>, <fpage>647</fpage>&#x02013;<lpage>660</lpage>. <pub-id pub-id-type="doi">10.1007/s10916-005-6133-1</pub-id><pub-id pub-id-type="pmid">16235818</pub-id></citation></ref>
<ref id="B52">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sturm</surname> <given-names>I.</given-names></name> <name><surname>Lapuschkin</surname> <given-names>S.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K.-R.</given-names></name></person-group> (<year>2016</year>). <article-title>Interpretable deep neural networks for single-trial EEG classification</article-title>. <source>J. Neurosci. Methods</source> <volume>274</volume>, <fpage>141</fpage>&#x02013;<lpage>145</lpage>. <pub-id pub-id-type="doi">10.1016/j.jneumeth.2016.10.008</pub-id><pub-id pub-id-type="pmid">27746229</pub-id></citation></ref>
<ref id="B53">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tang</surname> <given-names>Y.</given-names></name> <name><surname>Durand</surname> <given-names>D.</given-names></name></person-group> (<year>2012</year>). <article-title>A tunable support vector machine assembly classifier for epileptic seizure detection</article-title>. <source>Exp. Syst. Appl.</source> <volume>39</volume>, <fpage>3925</fpage>&#x02013;<lpage>3938</lpage>. <pub-id pub-id-type="doi">10.1016/j.eswa.2011.08.088</pub-id><pub-id pub-id-type="pmid">22563146</pub-id></citation></ref>
<ref id="B54">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tzallas</surname> <given-names>A. T.</given-names></name> <name><surname>Tsipouras</surname> <given-names>M. G.</given-names></name> <name><surname>Fotiadis</surname> <given-names>D.</given-names></name></person-group> (<year>2009</year>). <article-title>Epileptic seizure detection in EEGs using time&#x02013;frequency analysis</article-title>. <source>IEEE Trans. Inform. Technol. Biomed.</source> <volume>13</volume>, <fpage>703</fpage>&#x02013;<lpage>710</lpage>. <pub-id pub-id-type="doi">10.1109/TITB.2009.2017939</pub-id><pub-id pub-id-type="pmid">19304486</pub-id></citation></ref>
<ref id="B55">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Vecchiato</surname> <given-names>G.</given-names></name> <name><surname>Maglione</surname> <given-names>A. G.</given-names></name> <name><surname>Babiloni</surname> <given-names>F.</given-names></name></person-group> (<year>2015</year>). <article-title>On the use of cognitive neuroscience in industrial applications by using neuroelectromagnetic recordings</article-title>, in <source>Advances in Cognitive Neurodynamics (IV)</source> (<publisher-loc>Sigtuna</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>31</fpage>&#x02013;<lpage>37</lpage>.</citation></ref>
<ref id="B56">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>J.-J.</given-names></name> <name><surname>Xue</surname> <given-names>F.</given-names></name> <name><surname>Li</surname> <given-names>H.</given-names></name></person-group> (<year>2015</year>). <article-title>Simultaneous channel and feature selection of fused EEG features based on sparse group LASSO</article-title>. <source>Biomed. Res. Int.</source> <volume>2015</volume>:<fpage>703768</fpage>. <pub-id pub-id-type="doi">10.1155/2015/703768</pub-id><pub-id pub-id-type="pmid">25802861</pub-id></citation></ref>
<ref id="B57">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wang</surname> <given-names>Y.</given-names></name> <name><surname>She</surname> <given-names>X.</given-names></name> <name><surname>Liao</surname> <given-names>Y.</given-names></name> <name><surname>Li</surname> <given-names>H.</given-names></name> <name><surname>Zhang</surname> <given-names>Q.</given-names></name> <name><surname>Zhang</surname> <given-names>S.</given-names></name> <etal/></person-group>. (<year>2016</year>). <article-title>Tracking neural modulation depth by dual sequential Monte Carlo estimation on point processes for brain-machine interfaces</article-title>. <source>IEEE Trans. Biomed. Eng.</source> <volume>63</volume>, <fpage>1728</fpage>&#x02013;<lpage>1741</lpage>. <pub-id pub-id-type="doi">10.1109/TBME.2015.2500585</pub-id><pub-id pub-id-type="pmid">26584486</pub-id></citation></ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yao</surname> <given-names>D.</given-names></name></person-group> (<year>2001</year>). <article-title>A method to standardize a reference of scalp EEG recordings to a point at infinity</article-title>. <source>Physiol. Meas.</source> <volume>22</volume>:<fpage>693</fpage>. <pub-id pub-id-type="doi">10.1088/0967-3334/22/4/305</pub-id><pub-id pub-id-type="pmid">11761077</pub-id></citation></ref>
<ref id="B59">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yao</surname> <given-names>D.</given-names></name></person-group> (<year>2017</year>). <article-title>Is the surface potential integral of a dipole in a volume conductor always zero? A cloud over the average reference of EEG and ERP</article-title>. <source>Brain Topogr.</source> <volume>30</volume>, <fpage>161</fpage>&#x02013;<lpage>171</lpage>. <pub-id pub-id-type="doi">10.1007/s10548-016-0543-x</pub-id><pub-id pub-id-type="pmid">28194613</pub-id></citation></ref>
<ref id="B60">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zafer</surname> <given-names>I.</given-names></name> <name><surname>Zumray</surname> <given-names>D.</given-names></name> <name><surname>Tamer</surname> <given-names>D.</given-names></name></person-group> (<year>2011</year>). <article-title>Classification of electroencephalogram signals with combined time and frequency features</article-title>. <source>Exp. Syst. Appl.</source> <volume>38</volume>, <fpage>10499</fpage>&#x02013;<lpage>10505</lpage>. <pub-id pub-id-type="doi">10.1016/j.eswa.2011.02.110</pub-id></citation></ref>
<ref id="B61">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zajacova</surname> <given-names>A.</given-names></name> <name><surname>Huzurbazar</surname> <given-names>S.</given-names></name> <name><surname>Greenwood</surname> <given-names>M.</given-names></name> <name><surname>Nguyen</surname> <given-names>H.</given-names></name></person-group> (<year>2015</year>). <article-title>Long-term BMI trajectories and health in older adults: hierarchical clustering of functional curves</article-title>. <source>J. Aging Health</source> <volume>27</volume>, <fpage>1443</fpage>&#x02013;<lpage>1461</lpage>. <pub-id pub-id-type="doi">10.1177/0898264315584329</pub-id><pub-id pub-id-type="pmid">25953813</pub-id></citation></ref>
<ref id="B62">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zandi</surname> <given-names>A. S.</given-names></name> <name><surname>Javidan</surname> <given-names>M.</given-names></name> <name><surname>Dumont</surname> <given-names>G. A.</given-names></name> <name><surname>Tafreshi</surname> <given-names>R.</given-names></name></person-group> (<year>2010</year>). <article-title>Automated real-time epileptic seizure detection in scalp EEG recordings using an algorithm based on wavelet packet transform</article-title>. <source>IEEE Trans. Biomed. Eng.</source> <volume>57</volume>, <fpage>1639</fpage>&#x02013;<lpage>1651</lpage>. <pub-id pub-id-type="doi">10.1109/TBME.2010.2046417</pub-id><pub-id pub-id-type="pmid">20659825</pub-id></citation></ref>
<ref id="B63">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>H.</given-names></name> <name><surname>Guan</surname> <given-names>C.</given-names></name> <name><surname>Ang</surname> <given-names>K. K.</given-names></name> <name><surname>Wang</surname> <given-names>C.</given-names></name></person-group> (<year>2012</year>). <article-title>BCI competition IV&#x02013;data set I: learning discriminative patterns for self-paced EEG-based motor imagery detection</article-title>. <source>Front. Neurosci.</source> <volume>6</volume>:<fpage>7</fpage>. <pub-id pub-id-type="doi">10.3389/fnins.2012.00007</pub-id><pub-id pub-id-type="pmid">22347153</pub-id></citation></ref>
<ref id="B64">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Ji</surname> <given-names>X.</given-names></name> <name><surname>Zhang</surname> <given-names>S.</given-names></name></person-group> (<year>2016</year>). <article-title>An approach to EEG-based emotion recognition using combined feature extraction method</article-title>. <source>Neurosci. Lett.</source> <volume>633</volume>, <fpage>152</fpage>&#x02013;<lpage>157</lpage>. <pub-id pub-id-type="doi">10.1016/j.neulet.2016.09.037</pub-id><pub-id pub-id-type="pmid">27666975</pub-id></citation></ref>
<ref id="B65">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>Y.</given-names></name> <name><surname>Zhou</surname> <given-names>G.</given-names></name> <name><surname>Jin</surname> <given-names>J.</given-names></name> <name><surname>Wang</surname> <given-names>X.</given-names></name> <name><surname>Cichocki</surname> <given-names>A.</given-names></name></person-group> (<year>2015</year>). <article-title>Optimizing spatial patterns with sparse filter bands for motor-imagery based brain-computer interface</article-title>. <source>J. Neurosci. Methods</source> <volume>255</volume>, <fpage>85</fpage>&#x02013;<lpage>91</lpage>. <pub-id pub-id-type="doi">10.1016/j.jneumeth.2015.08.004</pub-id><pub-id pub-id-type="pmid">26277421</pub-id></citation></ref>
<ref id="B66">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Zimmer</surname> <given-names>V. A.</given-names></name> <name><surname>Lekadir</surname> <given-names>K.</given-names></name> <name><surname>Hoogendoorn</surname> <given-names>C.</given-names></name> <name><surname>Frangi</surname> <given-names>A. F.</given-names></name> <name><surname>Piella</surname> <given-names>G.</given-names></name></person-group> (<year>2015</year>). <article-title>A framework for optimal kernel-based manifold embedding of medical image data</article-title>. <source>Comput. Med. Imaging Graph.</source> <volume>41</volume>, <fpage>93</fpage>&#x02013;<lpage>107</lpage>. <pub-id pub-id-type="doi">10.1016/j.compmedimag.2014.06.001</pub-id><pub-id pub-id-type="pmid">25008538</pub-id></citation></ref>
</ref-list>
<fn-group>
<fn id="fn0001"><p><sup>1</sup><ext-link ext-link-type="uri" xlink:href="http://bbci.de/competition/iv/desc_1.html">http://bbci.de/competition/iv/desc_1.html</ext-link>. Dataset 1 of the BCI Competition IV (2008).</p></fn>
<fn id="fn0002"><p><sup>2</sup><ext-link ext-link-type="uri" xlink:href="https://github.com/andresmarino07utp/EKRA-ES">https://github.com/andresmarino07utp/EKRA-ES</ext-link></p></fn>
</fn-group>
</back>
</article>
