<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Archiving and Interchange DTD v2.3 20070202//EN" "archivearticle.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="methods-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Artif. Intell.</journal-id>
<journal-title>Frontiers in Artificial Intelligence</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Artif. Intell.</abbrev-journal-title>
<issn pub-type="epub">2624-8212</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/frai.2024.1381921</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Artificial Intelligence</subject>
<subj-group>
<subject>Methods</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>A global model-agnostic rule-based XAI method based on Parameterized Event Primitives for time series classifiers</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Mekonnen</surname> <given-names>Ephrem Tibebe</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2640513/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-original-draft/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Longo</surname> <given-names>Luca</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/589949/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/supervision/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Dondio</surname> <given-names>Pierpaolo</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/2830774/overview"/>
<role content-type="https://credit.niso.org/contributor-roles/supervision/"/>
<role content-type="https://credit.niso.org/contributor-roles/writing-review-editing/"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>School of Computer Science, College of Health and Science, Technological University Dublin</institution>, <addr-line>Dublin</addr-line>, <country>Ireland</country></aff>
<aff id="aff2"><sup>2</sup><institution>Artificial Intelligence and Cognitive Load Research Lab, Technological University Dublin</institution>, <addr-line>Dublin</addr-line>, <country>Ireland</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Giorgio Maria Di Nunzio, University of Padua, Italy</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Giovanni Paragliola, National Research Council (CNR), Italy</p>
<p>Anli Ji, Georgia State University, United States</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Ephrem Tibebe Mekonnen <email>D22125038&#x00040;mytudublin.ie</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>20</day>
<month>09</month>
<year>2024</year>
</pub-date>
<pub-date pub-type="collection">
<year>2024</year>
</pub-date>
<volume>7</volume>
<elocation-id>1381921</elocation-id>
<history>
<date date-type="received">
<day>04</day>
<month>02</month>
<year>2024</year>
</date>
<date date-type="accepted">
<day>30</day>
<month>08</month>
<year>2024</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2024 Mekonnen, Longo and Dondio.</copyright-statement>
<copyright-year>2024</copyright-year>
<copyright-holder>Mekonnen, Longo and Dondio</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Time series classification is a challenging research area where machine learning and deep learning techniques have shown remarkable performance. However, these models are often seen as black boxes due to their minimal interpretability. On the one hand, there is a plethora of eXplainable AI (XAI) methods designed to elucidate the functioning of models trained on image and tabular data. On the other hand, adapting these methods to explain deep learning-based time series classifiers may not be straightforward due to the temporal nature of time series data. This research proposes a novel global <italic>post-hoc</italic> explainable method for unearthing the key time steps behind the inferences made by deep learning-based time series classifiers. This novel approach generates a decision tree graph, a specific set of rules, that can be seen as explanations, potentially enhancing interpretability. The methodology involves two major phases: (1) training and evaluating deep-learning-based time series classification models, and (2) extracting parameterized primitive events, such as increasing, decreasing, local max and local min, from each instance of the evaluation set and clustering such events to extract prototypical ones. These prototypical primitive events are then used as input to a decision-tree classifier trained to fit the model predictions of the test set rather than the ground truth data. Experiments were conducted on diverse real-world datasets sourced from the UCR archive, employing metrics such as accuracy, fidelity, robustness, number of nodes, and depth of the extracted rules. The findings indicate that this global <italic>post-hoc</italic> method can improve the global interpretability of complex time series classification models.</p></abstract>
<kwd-group>
<kwd>deep learning</kwd>
<kwd>Explainable Artificial Intelligence</kwd>
<kwd>time series classification</kwd>
<kwd>decision tree</kwd>
<kwd>model agnostic</kwd>
<kwd><italic>post-hoc</italic></kwd>
</kwd-group>
<counts>
<fig-count count="7"/>
<table-count count="4"/>
<equation-count count="2"/>
<ref-count count="32"/>
<page-count count="10"/>
<word-count count="6148"/>
</counts>
<custom-meta-wrap>
<custom-meta>
<meta-name>section-at-acceptance</meta-name>
<meta-value>Machine Learning and Artificial Intelligence</meta-value>
</custom-meta>
</custom-meta-wrap>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1 Introduction</title>
<p>Due to the affordability of sensors, time series data have become prevalent in various domains, including finance (Zhang et al., <xref ref-type="bibr" rid="B31">2019</xref>), healthcare (Liu et al., <xref ref-type="bibr" rid="B8">2022</xref>; Strodthoff et al., <xref ref-type="bibr" rid="B25">2020</xref>), human activity recognition (Mekruksavanich and Jitpattanakul, <xref ref-type="bibr" rid="B11">2021</xref>; Joshi and Abdelfattah, <xref ref-type="bibr" rid="B5">2021</xref>), and environmental monitoring (Shu et al., <xref ref-type="bibr" rid="B21">2019</xref>). Time series classification assigns a class label to a given time series, a critical task in scenarios where sensor or financial data analysis is essential for informed business decisions. Various algorithms have been devised for this task, and deep learning models have shown exceptional effectiveness in computer vision, natural language processing, and time series classification. However, these models are often deemed opaque due to their complex architecture and lack of transparency, which has given rise to research in Explainable Artificial Intelligence (XAI) (Longo et al., <xref ref-type="bibr" rid="B9">2023</xref>). XAI is a growing field of research that (i) develops techniques for providing understandable and transparent explanations of machine learning models and (ii) evaluates and assesses their impact on humans (Theissler et al., <xref ref-type="bibr" rid="B26">2022</xref>; Di Martino and Delmastro, <xref ref-type="bibr" rid="B3">2022</xref>; Vilone and Longo, <xref ref-type="bibr" rid="B28">2023</xref>). Several XAI methods have been proposed for deep learning-based time series classification models to overcome these issues. 
These include XAI methods originally developed for computer vision (Schlegel et al., <xref ref-type="bibr" rid="B19">2019</xref>), such as Local Interpretable Model-agnostic Explanations (LIME) (Ribeiro et al., <xref ref-type="bibr" rid="B17">2016</xref>), Saliency Maps (Simonyan et al., <xref ref-type="bibr" rid="B23">2013</xref>), and Layer-wise Relevance Propagation (LRP) (Bach et al., <xref ref-type="bibr" rid="B1">2015</xref>).</p>
<p>However, adapting existing XAI methods for image and tabular data to time series data presents unique challenges due to the need to account for the temporal nature of the data (Schlegel et al., <xref ref-type="bibr" rid="B19">2019</xref>; Theissler et al., <xref ref-type="bibr" rid="B26">2022</xref>). These methods often produce heatmap-based explanations that are hard to interpret and primarily developer-focused (Rojat et al., <xref ref-type="bibr" rid="B18">2021</xref>; Jeyakumar et al., <xref ref-type="bibr" rid="B4">2020</xref>). Moreover, feature importance methods such as bespoke LIME and SHAP fail to capture temporal dependencies by treating each time step or segment independently.</p>
<p>This research addresses these limitations by offering global rule-based explanations using parameterized event primitives, which represent specific types of events such as increasing or decreasing trends, local maxima, and local minima. These parameterized events effectively capture and convey inherent temporal patterns, making explanations more intuitive and comprehensible (Kadous, <xref ref-type="bibr" rid="B6">1999</xref>). The approach generates a decision tree that provides a set of rules assumed to be more understandable to humans, making it easier for non-experts to comprehend a model&#x00027;s predictions. Decision trees are considered interpretable by design and can provide insights into the relationships between features and the output (Molnar, <xref ref-type="bibr" rid="B12">2020</xref>). Furthermore, they can be easily visualized, facilitating comprehension of inference chains (Vilone and Longo, <xref ref-type="bibr" rid="B28">2023</xref>).</p>
<p>The main contribution of this research is a novel global <italic>post-hoc</italic> XAI method to explain the inference process of deep learning-based time series classification models using a decision tree based on parameterized event primitives.</p>
<p>Finally, we regard this approach as a starting point for further work on generating explanatory descriptions that support the decisions of time series classification models.</p>
<p>The rest of the paper is structured as follows: Section 2 reviews existing XAI methods that have been used to explain deep learning-based time series classifiers. Section 3 outlines the proposed approach. In Section 4, the experimental results are presented and discussed in detail. Finally, Section 5 concludes the article and highlights possible future directions.</p>
</sec>
<sec id="s2">
<title>2 Related work</title>
<p>In recent years, Explainable Artificial Intelligence (XAI) methods have attracted growing interest as a means of addressing the transparency and interpretability challenges posed by complex machine learning models. In particular, two pivotal paradigms within the XAI framework are attributions and attentions (Theissler et al., <xref ref-type="bibr" rid="B26">2022</xref>).</p>
<p>Attribution methods, encompassing techniques such as LIME (Ribeiro et al., <xref ref-type="bibr" rid="B17">2016</xref>), Saliency Maps (Simonyan et al., <xref ref-type="bibr" rid="B23">2013</xref>), SHAP (Lundberg and Lee, <xref ref-type="bibr" rid="B10">2017</xref>), and LRP (Bach et al., <xref ref-type="bibr" rid="B1">2015</xref>), have played a critical role in computer vision for elucidating salient features within input data. These methods have since been applied to time series analysis, as evidenced by the work of Schlegel et al. (<xref ref-type="bibr" rid="B19">2019</xref>). In particular, Neves et al. (<xref ref-type="bibr" rid="B14">2021</xref>) and Sivill and Flach (<xref ref-type="bibr" rid="B24">2022</xref>) adapted LIME for direct application to time series data. Zhou et al. (<xref ref-type="bibr" rid="B32">2021</xref>) improved the interpretability of time series classifiers by enhancing Class Activation Maps (CAM) and Grad-CAM with backpropagation. Siddiqui et al. (<xref ref-type="bibr" rid="B22">2019</xref>) introduced TSViz, a saliency map-based methodology later integrated into TSXplain (Munir et al., <xref ref-type="bibr" rid="B13">2019</xref>) for unearthing the logic behind Deep Neural Networks (DNNs) in time series. These methodologies combine salient regions, instances, and statistical features to produce natural language explanations.</p>
<p>For time series data, Vielhaben et al. (<xref ref-type="bibr" rid="B27">2023</xref>) introduced DFT-LRP, a tailored variant of Layer-wise Relevance Propagation (LRP). The method inserts a virtual inspection layer before the input layer, which transforms the time series data and enables relevance attributions to be propagated through LRP.</p>
<p>Despite the efficacy of attributions, their application to time series data is not without challenges, due to the non-intelligible nature of time series (Schlegel and Keim, <xref ref-type="bibr" rid="B20">2021</xref>). Heat maps, often used in the visualization of attributions, are promising for domain experts, but pose challenges for general users (Jeyakumar et al., <xref ref-type="bibr" rid="B4">2020</xref>). Additionally, the assumption of feature independence inherent in attributions is frequently violated when considering adjacent observations within time series data (Watson, <xref ref-type="bibr" rid="B30">2022</xref>).</p>
<p>Similarly, attention mechanisms, notably exemplified by Karim et al. (<xref ref-type="bibr" rid="B7">2017</xref>), share a challenge in visual interpretation similar to that faced by attribution methods, often relying on heatmaps.</p>
<p>In the midst of the prevailing emphasis on local interpretability in XAI research, particularly in time series data, it is crucial to recognize researchers contributing to global insights in time series classifiers. The work described in Oviedo et al. (<xref ref-type="bibr" rid="B16">2019</xref>) generalizes CAM to encompass all instances within a class, offering an average CAM for comprehensive insight. Moreover, Siddiqui et al. (<xref ref-type="bibr" rid="B22">2019</xref>) focus on clustering filters, while Cho et al. (<xref ref-type="bibr" rid="B2">2020</xref>) concentrate on clustering input sequences, both enriching global understanding through grouping based on activation patterns.</p>
<p>Despite the multitude of Explainable Artificial Intelligence (XAI) methods dedicated to explaining specific instances in time series data, there is a noticeable gap: methods that are not tied to a specific model and that readily provide comprehensive global insights are lacking. Our approach is a global model-agnostic method that explains deep learning-based time series classifiers using a decision tree. It aims to maintain the temporal dependency inherent in time series data while providing explanations in an understandable format.</p>
<p>Our methodology falls within the domain of surrogate-based approaches, as we leverage interpretable models such as decision trees to mimic the inference process of deep learning time series classifiers. The method produces a set of rules or a decision tree graph as an explanation, making it transparent and easy to comprehend. Decision tree-based explanations are intuitive and structured, representing the logic of an ML model as a set of rules that can be easily interpreted and visualized. Therefore, they are considered naturally transparent and intelligible by scholars (Vilone et al., <xref ref-type="bibr" rid="B29">2020</xref>).</p>
</sec>
<sec id="s3">
<title>3 Proposed method</title>
<p>This section introduces a novel model-agnostic <italic>post-hoc</italic> Explainable Artificial Intelligence (XAI) method for deep learning-based time series classifiers. <xref ref-type="fig" rid="F1">Figure 1</xref> illustrates the diagram of our proposed method, which consists of three distinct phases. In what follows, we provide a detailed explanation of the method.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Design of the proposed method: (Phase I) Initial data preprocessing for training and evaluating a deep learning-based time series classifier. (Phase II) Sub-steps include <bold>(a)</bold> Extraction of Parameterized Event Primitives (PEPs) from the test set, encompassing events like increasing, decreasing, flat, local maximum, and local minimum. <bold>(b)</bold> Clustering of PEPs, <bold>(c)</bold> Event attribution by counting events belonging to each cluster using the extracted events and predefined clusters, <bold>(d)</bold> Concatenating each data frame of PEPs produced during the event attribution step, and <bold>(e)</bold> Training and testing of the decision tree using the transformed test set and the model prediction. (Phase III) Evaluation of decision tree rules using objective metrics, including accuracy, fidelity, complexity, and robustness.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-07-1381921-g0001.tif"/>
</fig>
<sec>
<title>3.1 Phase I: training and evaluating</title>
<p>The initial phase of the method involves preparing the data and then training and evaluating the deep learning models that are to be explained.</p>
</sec>
<sec>
<title>3.2 Phase II: transforming the test set</title>
<p>Parameterized Event Primitives (PEPs) are extracted from the test set of the deep learning model, as shown in <xref ref-type="fig" rid="F1">Figure 1</xref>. In this study, an event refers to a specific pattern or behavior that is expected to occur in the domain, and each PEP defines one type of event through a tuple of parameters and a finding function. Extracting PEPs from a time series represents the temporal characteristics of events as parameters, which facilitates learning for interpretable models such as decision trees (Kadous, <xref ref-type="bibr" rid="B6">1999</xref>). The event types considered include increasing and decreasing trends, local maxima, and local minima, which are intuitive and meaningful to users.</p>
<p>The methodology outlined in Kadous (<xref ref-type="bibr" rid="B6">1999</xref>) is implemented with one modification at the event attribution stage (Subsection 3.2.3): the number of events within each cluster (event<sub>cluster_num</sub>) is counted, rather than recording only their presence or absence as a binary value. This refinement improves the performance of the decision tree, markedly so on specific datasets. The accuracy of the decision tree matters because it directly affects fidelity. The following subsections detail each step for transforming the test data to train and evaluate the surrogate decision tree.</p>
<sec>
<title>3.2.1 Extracting Parameterized Event Primitives</title>
<p>In this step, Parameterized Event Primitives (PEPs) are extracted from each time series in the evaluation set. Let a time series be denoted as <italic>x</italic> &#x0003D; <italic>x</italic><sub>1</sub>, <italic>x</italic><sub>2</sub>, &#x02026;, <italic>x</italic><sub><italic>n</italic></sub>, where <italic>x</italic><sub><italic>i</italic></sub> represents the time series value at time <italic>i</italic>. The function that extracts the events takes a series as input and returns a list of extracted events, denoted as <italic>E</italic>. For example, the increasing events <italic>E</italic><sub>inc</sub> can be represented as a set of tuples, each containing the time at which a positive gradient begins (<italic>t</italic><sub>start</sub>), the duration of the increase (dura), and the average gradient value over that duration (<italic>grad</italic><sub>avg</sub>). This can be formally denoted as:</p>
<disp-formula id="E1"><mml:math id="M1"><mml:mrow><mml:msub><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mtext class="textrm" mathvariant="normal">inc</mml:mtext></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mrow><mml:mo>{</mml:mo><mml:mrow><mml:mtable style="text-align:axis;" equalrows="false" columnlines="none" equalcolumns="false" class="array"><mml:mtr><mml:mtd><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mtext class="textrm" mathvariant="normal">start</mml:mtext></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mtext class="textrm" mathvariant="normal">dura</mml:mtext></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mtext class="textrm" mathvariant="normal">grad</mml:mtext></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mtext class="textrm" mathvariant="normal">avg</mml:mtext></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mtext class="textrm" mathvariant="normal">start</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mtext class="textrm" mathvariant="normal">dura</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mtext class="textrm" mathvariant="normal">grad</mml:mtext></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mtext class="textrm" mathvariant="normal">avg</mml:mtext></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:mrow></mml:msub></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo></mml:mtd><mml:mtd><mml:mo>&#x02026;</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:mrow><mml:mo>}</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>
<p><xref ref-type="fig" rid="F2">Figures 2A</xref>, <xref ref-type="fig" rid="F2">B</xref> show examples of extracted events from a single time series. <xref ref-type="fig" rid="F3">Figure 3</xref> shows the average number of extracted events per class for each parameterized event primitive across the evaluation set of the FordA dataset.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Examples of events extracted from a single time series <bold>(A)</bold> increasing and decreasing events <bold>(B)</bold> local max and local min events.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-07-1381921-g0002.tif"/>
</fig>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Average number of extracted events for each parameterized event primitive on the FordA dataset. The <monospace>&#x00027;_ch1&#x00027;</monospace> suffix denotes the channel number; here, it indicates a univariate time series with a single channel.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-07-1381921-g0003.tif"/>
</fig>
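<p>As an illustration of this extraction step, the following Python sketch finds increasing events in a univariate series and returns them as (<italic>t</italic><sub>start</sub>, dura, <italic>grad</italic><sub>avg</sub>) tuples in the form of Equation 1. The function name <monospace>extract_increasing_events</monospace>, the <monospace>min_duration</monospace> parameter, and the use of first differences as the gradient are assumptions made for this sketch; the finding functions used in the experiments may differ.</p>
<preformat><![CDATA[
```python
from typing import List, Sequence, Tuple

Event = Tuple[int, int, float]  # (t_start, dura, grad_avg)


def extract_increasing_events(x: Sequence[float], min_duration: int = 1) -> List[Event]:
    """Return maximal runs of strictly positive first differences.

    Assumption for this sketch: the gradient at step i is x[i + 1] - x[i].
    """
    events: List[Event] = []
    start = None  # index where the current increasing run began
    for i in range(len(x) - 1):
        rising = x[i + 1] > x[i]
        if rising and start is None:
            start = i
        if start is not None and (not rising or i == len(x) - 2):
            # The run ends at i if the series stopped rising, else at i + 1.
            end = i if not rising else i + 1
            dura = end - start
            if dura >= min_duration:
                # Mean of the first differences over the run.
                events.append((start, dura, (x[end] - x[start]) / dura))
            start = None
    return events


series = [0.0, 1.0, 3.0, 2.0, 2.0, 4.0]
events = extract_increasing_events(series)
```
]]></preformat>
<p>Decreasing events could be obtained in the same way by negating the series, while local maxima and minima would need their own finding functions and parameter tuples.</p>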
</sec>
<sec>
<title>3.2.2 Event clustering</title>
<p>The events of each type, denoted as <italic>E</italic>, are flattened across all test set instances so that a clustering algorithm can be applied (for instance, the increasing events of all test set instances are pooled). The KMeans clustering algorithm was used in this experiment, with the silhouette method determining the optimal number of clusters. The optimal number corresponds to the highest average silhouette score, as illustrated in <xref ref-type="fig" rid="F4">Figure 4</xref>.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Optimal number of clusters obtained using the silhouette method for <bold>(A)</bold> increasing events, <bold>(B)</bold> decreasing events, <bold>(C)</bold> local max events, and <bold>(D)</bold> local min events, of the FordA dataset.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-07-1381921-g0004.tif"/>
</fig>
<p>This procedure is repeated for all four types of extracted parameterized events: increasing events (<italic>E</italic><sub>inc</sub>), decreasing events (<italic>E</italic><sub>dec</sub>), local maxima events (<italic>E</italic><sub>max</sub>), and local minima events (<italic>E</italic><sub>min</sub>). <xref ref-type="fig" rid="F5">Figure 5</xref> shows the clusters generated by the clustering algorithm for each type of parameterized event.</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Clusters produced by KMeans for <bold>(A)</bold> increasing events, <bold>(B)</bold> decreasing events, <bold>(C)</bold> local maxima events, and <bold>(D)</bold> local minima events, of the FordA dataset.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-07-1381921-g0005.tif"/>
</fig>
<p>Each cluster, denoted as <italic>C</italic><sub><italic>i, j</italic></sub>, where <italic>i</italic> signifies the type of parameterized event and <italic>j</italic> represents the cluster index, serves as the basic unit for the subsequent event attribution step.</p>
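<p>The selection of the number of clusters can be sketched as follows, using the <monospace>KMeans</monospace> and <monospace>silhouette_score</monospace> functions of scikit-learn. The candidate range <monospace>k_range</monospace>, the random seed, the synthetic events, and the absence of feature scaling are assumptions made for illustration and are not taken from the experiments.</p>
<preformat><![CDATA[
```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score


def cluster_events(events: np.ndarray, k_range=range(2, 6), seed: int = 0):
    """Fit KMeans for each candidate k and keep the fit with the highest
    average silhouette score (k_range and seed are illustrative choices)."""
    best_k, best_score, best_model = None, -1.0, None
    for k in k_range:
        model = KMeans(n_clusters=k, n_init=10, random_state=seed).fit(events)
        score = silhouette_score(events, model.labels_)
        if score > best_score:
            best_k, best_score, best_model = k, score, model
    return best_k, best_model


# Synthetic pooled increasing events: rows are (t_start, dura, grad_avg).
rng = np.random.default_rng(0)
centers = np.array([[10.0, 5.0, 0.2], [100.0, 20.0, 1.0], [300.0, 8.0, 3.0]])
flat_events = np.vstack([c + rng.normal(scale=0.5, size=(40, 3)) for c in centers])

best_k, kmeans = cluster_events(flat_events)
```
]]></preformat>
<p>The same call would be repeated once per event type, giving one fitted clustering for each of the four types.</p>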
</sec>
<sec>
<title>3.2.3 Event attribution</title>
<p>At this step, the extracted events <italic>E</italic> of a given type and the corresponding set of clusters <italic>C</italic><sub><italic>j</italic></sub> are taken as input. The output of this process is a data frame <italic>D</italic>, where instances are represented along rows and clusters along columns. Each cell <italic>D</italic><sub><italic>i, j</italic></sub> denotes the number of events extracted from instance <italic>i</italic> that belong to cluster <italic>j</italic>:</p>
<disp-formula id="E2"><mml:math id="M2"><mml:mrow><mml:msub><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:mi>I</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>E</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mrow><mml:mi>C</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>Here, <italic>k</italic> indexes the events extracted from the <italic>i</italic>th instance of the dataset, <italic>n</italic> is the number of those events (<italic>i</italic>.<italic>e</italic>., <italic>n</italic> &#x0003D; len(<italic>E</italic><sub><italic>i</italic></sub>)), and <italic>I</italic>(&#x000B7;) is the indicator function that equals 1 if the condition inside the parentheses is true and 0 otherwise. <xref ref-type="fig" rid="F6">Figure 6</xref> depicts the average number of events in each cluster for the parameterized events <italic>E</italic>.</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>Number of events belonging to each parameterized event cluster.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-07-1381921-g0006.tif"/>
</fig>
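<p>A minimal sketch of the counting in Equation 2 is given below. It assumes that an event belongs to the cluster whose centroid is nearest in Euclidean distance, which is how a fitted KMeans model assigns points; the function name <monospace>attribute_events</monospace> and the example centroids and events are hypothetical.</p>
<preformat><![CDATA[
```python
from math import dist
from typing import List, Sequence


def attribute_events(
    events_per_instance: List[List[Sequence[float]]],
    centroids: List[Sequence[float]],
) -> List[List[int]]:
    """Build D, where D[i][j] counts the events of instance i in cluster j.

    Assumption for this sketch: an event belongs to the cluster whose
    centroid is nearest in Euclidean distance.
    """
    table = []
    for events in events_per_instance:
        counts = [0] * len(centroids)
        for event in events:
            nearest = min(range(len(centroids)), key=lambda j: dist(event, centroids[j]))
            counts[nearest] += 1
        table.append(counts)
    return table


centroids = [(10.0, 5.0, 0.2), (100.0, 20.0, 1.0)]  # hypothetical cluster centres
events_per_instance = [
    [(12.0, 4.0, 0.3), (9.0, 6.0, 0.1), (98.0, 22.0, 0.9)],  # instance 0
    [(101.0, 19.0, 1.1)],                                    # instance 1
    [],                                                      # instance 2: no events
]
D = attribute_events(events_per_instance, centroids)
```
]]></preformat>
<p>Storing counts rather than a presence flag keeps the distinction between, for example, one and two increasing events in the same cluster, which a binary encoding would collapse.</p>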
</sec>
<sec>
<title>3.2.4 Combination</title>
<p>Following the event attribution step, the data frames corresponding to each type of parameterized event are concatenated to construct the dataset on which the decision tree classifier is trained and tested.</p>
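<p>The combination step can be pictured with the short sketch below, which joins the per-type count tables side by side so that each instance keeps one row. The column-wise layout, the function name <monospace>combine</monospace>, and the column naming pattern are assumptions made for illustration.</p>
<preformat><![CDATA[
```python
from typing import Dict, List, Tuple


def combine(tables: Dict[str, List[List[int]]]) -> Tuple[List[str], List[List[int]]]:
    """Concatenate per-event-type count tables column-wise.

    Column names follow an assumed '<event>_<cluster>' pattern; every table
    must hold one row per instance, in the same instance order.
    """
    n_instances = len(next(iter(tables.values())))
    columns = [f"{name}_{j}" for name, t in tables.items() for j in range(len(t[0]))]
    rows = [
        [count for t in tables.values() for count in t[i]]
        for i in range(n_instances)
    ]
    return columns, rows


tables = {
    "inc": [[2, 1], [0, 1]],  # two clusters of increasing events
    "dec": [[1], [3]],        # one cluster of decreasing events
}
columns, rows = combine(tables)
```
]]></preformat>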
</sec>
<sec>
<title>3.2.5 Train decision tree classifier</title>
<p>After transforming the test set, the next step is to fit the decision tree classifier, using the predictions of the deep learning model as target labels. To do this, we split the transformed data into training and testing sets, with 70% of the data used for training and 30% used for testing.</p>
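<p>The sketch below illustrates this step with scikit-learn. The feature matrix and the stand-in for the black-box predictions are synthetic, and the default tree hyperparameters and the random seed are assumptions; they are not the settings used in the experiments.</p>
<preformat><![CDATA[
```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = np.random.default_rng(0)
# Hypothetical transformed test set: 200 instances x 6 event-cluster counts.
X = rng.integers(0, 10, size=(200, 6))
# Stand-in for the black-box predictions; here a simple rule on two counts.
model_pred = (X[:, 0] > X[:, 3]).astype(int)

# 70% of the transformed data trains the surrogate, 30% is held out.
X_train, X_test, y_train, y_test = train_test_split(
    X, model_pred, test_size=0.3, random_state=0
)
surrogate = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# Agreement with the model predictions on the held-out part, plus tree size.
fidelity = float(np.mean(surrogate.predict(X_test) == y_test))
depth, n_nodes = surrogate.get_depth(), surrogate.tree_.node_count
```
]]></preformat>
<p>Because the targets are the model predictions rather than the ground truth, the held-out agreement measures how faithfully the tree mimics the model, not how well it solves the original task.</p>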
</sec>
</sec>
<sec>
<title>3.3 Phase III: objective evaluation</title>
<p>To objectively and quantitatively assess the interpretability of our method, we selected five metrics: accuracy, fidelity, robustness, depth, and number of nodes. To achieve objectivity, we exclude any human intervention in the evaluation process. Accuracy measures the fraction of correct predictions made by the model, while fidelity evaluates the consistency between the model&#x00027;s decision and the explanation provided by the decision tree. Depth and number of nodes measure the complexity of the decision tree. Robustness measures the XAI method&#x00027;s resilience to minor input changes that do not affect the model predictions. Refer to <xref ref-type="table" rid="T1">Table 1</xref> for an in-depth presentation of the objective evaluation metrics.</p>
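<p>Two of these metrics, fidelity and robustness, can be sketched in a few lines of Python. The toy surrogate <monospace>g</monospace>, the example inputs, and the use of a single constant shift <monospace>delta</monospace> are assumptions made for illustration; the sketch also assumes that the chosen perturbation leaves the predictions of the explained model unchanged.</p>
<preformat><![CDATA[
```python
from typing import Callable, List, Sequence


def fidelity(surrogate_pred: Sequence[int], model_pred: Sequence[int]) -> float:
    """F = a / N: share of instances where the surrogate agrees with the model."""
    agree = sum(s == m for s, m in zip(surrogate_pred, model_pred))
    return agree / len(model_pred)


def robustness(g: Callable[[List[float]], int], xs: List[List[float]], delta: float) -> float:
    """R = (1 / N) * sum_n [g(x_n) == g(x_n + delta)] for a constant shift delta."""
    same = sum(g(x) == g([v + delta for v in x]) for x in xs)
    return same / len(xs)


def g(x: List[float]) -> int:
    """Toy surrogate: predicts class 1 when the values sum to more than 1."""
    return int(sum(x) > 1.0)


xs = [[0.2, 0.3], [0.5, 0.45], [1.0, 1.0], [0.1, 0.1]]
F = fidelity([1, 0, 1, 1], [1, 0, 0, 1])
R = robustness(g, xs, delta=0.05)
```
]]></preformat>
<p>In this toy case only the second input crosses the decision threshold after the shift, so three of the four surrogate predictions are preserved.</p>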
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Objective evaluation metrics for rule-based explanation.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>Metric</bold></th>
<th valign="top" align="center"><bold>Definition</bold></th>
<th valign="top" align="center"><bold>Formula</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Accuracy</td>
<td valign="top" align="center">The proportion of correctly predicted instances (<italic>c</italic>) out of the total instances (<italic>N</italic>).</td>
<td valign="top" align="center"><inline-formula><mml:math id="M3"><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">A</mml:mtext></mml:mstyle><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">c</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">N</mml:mtext></mml:mstyle></mml:mrow></mml:mfrac></mml:math></inline-formula></td>
</tr>
<tr>
<td valign="top" align="left">Fidelity</td>
<td valign="top" align="center">The proportion of input instances on which the surrogate model agrees with the black box model (<italic>a</italic>) out of the total instances (<italic>N</italic>).</td>
<td valign="top" align="center"><inline-formula><mml:math id="M4"><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">F</mml:mtext></mml:mstyle><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">a</mml:mtext></mml:mstyle></mml:mrow><mml:mrow><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">N</mml:mtext></mml:mstyle></mml:mrow></mml:mfrac></mml:math></inline-formula></td>
</tr>
<tr>
<td valign="top" align="left">Complexity</td>
<td valign="top" align="center">The complexity of the generated explanation, measured by the depth and the number of nodes of the decision tree.</td>
<td valign="top" align="center">C &#x0003D; (&#x00023;Depth, &#x00023;Nodes)</td>
</tr>
<tr>
<td valign="top" align="left">Robustness</td>
<td valign="top" align="center">The ability of the method (in our case the surrogate model, <italic>g</italic>(<italic>x</italic><sub><italic>n</italic></sub>)) to withstand small perturbations (&#x003B4;) of the input that do not change the prediction of the model (<italic>f</italic>(<italic>x</italic><sub><italic>n</italic></sub>)).</td>
<td valign="top" align="center"><inline-formula><mml:math id="M5"><mml:mstyle class="text"><mml:mtext class="textrm" mathvariant="normal">R</mml:mtext></mml:mstyle><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>g</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>g</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>x</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:mi>&#x003B4;</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:mfrac></mml:math></inline-formula></td>
</tr></tbody>
</table>
</table-wrap>
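<p>The formulas in <xref ref-type="table" rid="T1">Table 1</xref> can be illustrated with a minimal Python sketch. The function names, the toy surrogate, and the perturbation value are hypothetical and serve only to show how A, F, and R are computed. The sketch assumes that the caller supplies perturbed inputs that leave the prediction of the black box model unchanged, as the definition of robustness requires.</p>
<preformat><![CDATA[
```python
from typing import Callable, Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """A = c / N: fraction of instances whose predicted label equals the true label."""
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


def fidelity(model_pred: Sequence[int], surrogate_pred: Sequence[int]) -> float:
    """F = a / N: fraction of instances on which the surrogate agrees with the model."""
    agree = sum(1 for m, s in zip(model_pred, surrogate_pred) if m == s)
    return agree / len(model_pred)


def robustness(
    g: Callable[[Sequence[float]], int],
    inputs: Sequence[Sequence[float]],
    perturbed: Sequence[Sequence[float]],
) -> float:
    """R = (1 / N) * sum over n of [g(x_n) == g(x_n + delta)]."""
    stable = sum(1 for x, x_pert in zip(inputs, perturbed) if g(x) == g(x_pert))
    return stable / len(inputs)


# Toy surrogate (hypothetical): predicts class 1 when the series sums to a positive value.
def toy_surrogate(series: Sequence[float]) -> int:
    return int(sum(series) > 0)


inputs = [[1.0, 2.0], [-1.0, -2.0], [0.05, 0.0]]
delta = -0.1  # illustrative perturbation added to every time step
perturbed = [[value + delta for value in series] for series in inputs]

acc = accuracy([0, 1, 1, 0], [0, 1, 0, 0])
fid = fidelity([1, 1, 0, 0], [1, 0, 0, 0])
rob = robustness(toy_surrogate, inputs, perturbed)
print(acc, fid, rob)
```
]]></preformat>
<p>In this toy example, two of the three perturbed series keep their surrogate label, so R equals 2/3, while A and F both equal 3/4.</p>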
</sec>
</sec>
<sec id="s4">
<title>4 Experimental settings</title>
<sec>
<title>4.1 Datasets and models</title>
<p>We chose four univariate time series datasets (ECG200, GunPoint, FordA, and FordB) from the 2018 UCR archive to assess the effectiveness of our proposed method. In the ECG200 dataset, each series traces the electrical activity recorded during one heartbeat, and the two classes are normal heartbeat and myocardial infarction. The GunPoint dataset is a widely used benchmark for evaluating the performance of time series classification algorithms. It consists of 200 univariate time series representing hand movement trajectories of one male and one female actor, and the task is to classify each movement as a point gesture or a gun gesture. The FordA and FordB datasets contain time series of engine noise collected during standard operating conditions, and the task is to classify the presence or absence of symptoms; the FordB dataset, however, was gathered in a noisy environment. Refer to <xref ref-type="table" rid="T2">Table 2</xref> for detailed statistics on the datasets.</p>
<p>In terms of class distribution, ECG200 displays a slight imbalance, with Class 0 comprising 67 instances and Class 1 comprising 133 instances, a ratio of &#x0007E;1:2. In contrast, GunPoint is balanced, with 100 instances in each class. FordA and FordB are also approximately balanced: FordA contains 2,527 instances in Class 0 and 2,394 instances in Class 1, while FordB contains 2,261 instances in Class 0 and 2,185 instances in Class 1. The slight imbalance in ECG200 does not significantly affect the analysis. To ensure consistent class distributions across training and test sets, we employed stratified splitting for all datasets. All datasets underwent minimal preprocessing, with batch-wise standardization applied before training using the TSStandardize() function from the tsai library.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Statistics of four datasets used in the experiment.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>Name</bold></th>
<th valign="top" align="center"><bold>Data size</bold></th>
<th valign="top" align="center"><bold>No. classes</bold></th>
<th valign="top" align="center"><bold>Length</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">ECG200</td>
<td valign="top" align="center">200</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">96</td>
</tr>
<tr>
<td valign="top" align="left">GunPoint</td>
<td valign="top" align="center">200</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">150</td>
</tr>
<tr>
<td valign="top" align="left">FordA</td>
<td valign="top" align="center">4,921</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">500</td>
</tr>
<tr>
<td valign="top" align="left">FordB</td>
<td valign="top" align="center">4,446</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">500</td>
</tr></tbody>
</table>
</table-wrap>
<p>We used two difficult-to-interpret architectures in our experimental setup: a standalone Fully Convolutional Network (FCN) and an LSTM combined with a Fully Convolutional Network (LSTM-FCN). These models were constructed using the PyTorch-based tsai library (Oguiza, <xref ref-type="bibr" rid="B15">2023</xref>) with its current default configuration, which for the FCN uses kernel sizes of 7, 5, and 3 for the convolutional layers with 128, 256, and 128 filters, respectively. The FCN architecture comprises three one-dimensional convolutional layers, each followed by batch normalization and ReLU activation, a Global Average Pooling (GAP) layer, and a softmax layer. The LSTM-FCN architecture combines a Long Short-Term Memory (LSTM) block with an FCN block. The fully convolutional block consists of three stacked temporal convolutional blocks with 128, 256, and 128 filters, respectively. The time series input is passed into both the FCN block and the LSTM block. The output of the global pooling layer at the end of the FCN block and the output of the LSTM block are concatenated and passed to a softmax classification layer. Both models were trained and tested on the four selected datasets. The smaller datasets, ECG200 and GunPoint, were partitioned into 60% for training, 15% for validation, and 25% for testing. The larger datasets, FordA and FordB, were partitioned into 70% for training, 15% for validation, and 15% for testing. To prevent overfitting, early stopping was used during training with a patience of 15 and a minimum delta of 0.001. Furthermore, each model was trained 100 times using the Monte Carlo cross-validation technique with random training, validation, and test splits to ensure stable accuracy estimates. Both models performed well, with mean test accuracies between 0.86 and 0.99; the average performance of the models is presented in <xref ref-type="table" rid="T3">Table 3</xref>.</p>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Mean test and validation accuracy with standard deviation for FCN and LSTM-FCN models on four datasets.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>Dataset</bold></th>
<th valign="top" align="center" colspan="2"><bold>FCN</bold></th>
<th valign="top" align="center" colspan="2"><bold>LSTM-FCN</bold></th>
</tr>
</thead>
<tbody>
<tr style="background-color:#919498;color:#ffffff">
<td/>
<td valign="top" align="center"><bold>Test Acc</bold></td>
<td valign="top" align="center"><bold>Valid Acc</bold></td>
<td valign="top" align="center"><bold>Test Acc</bold></td>
<td valign="top" align="center"><bold>Valid Acc</bold></td>
</tr>
<tr>
<td valign="top" align="left">ECG200</td>
<td valign="top" align="center">0.87 &#x000B1; 0.05</td>
<td valign="top" align="center">0.86 &#x000B1; 0.07</td>
<td valign="top" align="center">0.86 &#x000B1; 0.05</td>
<td valign="top" align="center">0.85 &#x000B1; 0.05</td>
</tr>
<tr>
<td valign="top" align="left">GunPoint</td>
<td valign="top" align="center">0.99 &#x000B1; 0.03</td>
<td valign="top" align="center">0.98 &#x000B1; 0.07</td>
<td valign="top" align="center">0.98 &#x000B1; 0.06</td>
<td valign="top" align="center">0.98 &#x000B1; 0.07</td>
</tr>
<tr>
<td valign="top" align="left">FordA</td>
<td valign="top" align="center">0.90 &#x000B1; 0.04</td>
<td valign="top" align="center">0.90 &#x000B1; 0.04</td>
<td valign="top" align="center">0.91 &#x000B1; 0.05</td>
<td valign="top" align="center">0.91 &#x000B1; 0.05</td>
</tr>
<tr>
<td valign="top" align="left">FordB</td>
<td valign="top" align="center">0.88 &#x000B1; 0.03</td>
<td valign="top" align="center">0.89 &#x000B1; 0.04</td>
<td valign="top" align="center">0.86 &#x000B1; 0.04</td>
<td valign="top" align="center">0.86 &#x000B1; 0.04</td>
</tr></tbody>
</table>
</table-wrap>
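<p>The repeated random partitioning described above can be sketched in a few lines of Python. This is a minimal illustration, not the training pipeline used in the study: the helper names <monospace>stratified_split</monospace> and <monospace>monte_carlo_splits</monospace> are hypothetical, the labels only mimic the ECG200 class counts, and model training inside each repetition is omitted.</p>
<preformat><![CDATA[
```python
import random
from collections import defaultdict


def stratified_split(labels, fractions, rng):
    """Split instance indices into len(fractions) parts with similar class proportions."""
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    parts = [[] for _ in fractions]
    for indices in by_class.values():
        rng.shuffle(indices)
        start = 0
        for k, frac in enumerate(fractions):
            # The last part takes the remainder so that no index is dropped.
            is_last = k == len(fractions) - 1
            stop = len(indices) if is_last else start + round(frac * len(indices))
            parts[k].extend(indices[start:stop])
            start = stop
    return parts


def monte_carlo_splits(labels, fractions, n_repeats, seed=0):
    """Yield n_repeats independent random stratified partitions."""
    rng = random.Random(seed)
    for _ in range(n_repeats):
        yield stratified_split(labels, fractions, rng)


# Labels mimicking the ECG200 class counts: 67 instances of Class 0, 133 of Class 1.
labels = [0] * 67 + [1] * 133
splits = list(monte_carlo_splits(labels, fractions=(0.60, 0.15, 0.25), n_repeats=100))
train, valid, test = splits[0]
print(len(train), len(valid), len(test))
```
]]></preformat>
<p>Each repetition would train a fresh model on <monospace>train</monospace>, apply early stopping on <monospace>valid</monospace>, and report accuracy on <monospace>test</monospace>; averaging over the repetitions gives means and standard deviations of the kind reported in <xref ref-type="table" rid="T3">Table 3</xref>.</p>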
</sec>
<sec>
<title>4.2 Transforming the test set</title>
<p>The test set used for evaluating the deep learning models is transformed as explained in Subsection 3.2 and then used to train and test the decision tree classifier, which generates rules as an explanation. Unlike approaches focusing on local explanations for individual instances within the test set, our method provides a holistic understanding of the inference process of the black box model.</p>
<p>In this study, the implemented PEPs include increasing and decreasing events, each described by three parameters: start time (start), duration (duration<sub>event</sub>), and the average value of the gradient (avg_gradient). Local max and local min events are also considered, each described by two parameters: the time of the maximum/minimum (time<sub>max/min</sub>) and the corresponding value (value<sub>max/min</sub>).</p>
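<p>To make the event parameters concrete, the following Python sketch extracts them from a short series. It rests on assumptions that may differ from the definitions used in the study: an increasing (decreasing) event is taken to be a maximal run of strictly positive (negative) first differences, the average gradient is the net change divided by the duration, and local extrema are strict. The function names are hypothetical.</p>
<preformat><![CDATA[
```python
def monotonic_events(series, increasing=True):
    """Return (start, duration, avg_gradient) for each maximal monotonic run."""
    sign = 1 if increasing else -1
    events, start = [], None
    last = len(series) - 1
    for t in range(1, len(series)):
        moving = sign * (series[t] - series[t - 1]) > 0
        if moving and start is None:
            start = t - 1  # the run begins at the previous time step
        if start is not None and (not moving or t == last):
            end = t if moving else t - 1
            duration = end - start
            avg_gradient = (series[end] - series[start]) / duration
            events.append((start, duration, avg_gradient))
            start = None
    return events


def local_extrema(series, maxima=True):
    """Return (time, value) for each strict local maximum (or minimum)."""
    sign = 1 if maxima else -1
    return [
        (t, series[t])
        for t in range(1, len(series) - 1)
        if sign * series[t] > sign * series[t - 1]
        and sign * series[t] > sign * series[t + 1]
    ]


series = [0, 1, 3, 2, 1, 2, 4]
increasing_events = monotonic_events(series, increasing=True)
decreasing_events = monotonic_events(series, increasing=False)
local_maxima = local_extrema(series, maxima=True)
local_minima = local_extrema(series, maxima=False)
print(increasing_events, decreasing_events, local_maxima, local_minima)
```
]]></preformat>
<p>For this series the sketch finds two increasing events, one decreasing event, one local maximum at time 2, and one local minimum at time 4. Tuples of this kind, collected over all series, are what a clustering step would then group per event type.</p>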
</sec>
</sec>
<sec id="s5">
<title>5 Results and discussion</title>
<p>The objective evaluation results for the proposed XAI method are presented in <xref ref-type="table" rid="T4">Table 4</xref> as the mean and standard deviation of each objective evaluation metric. The method was applied to four datasets and two models, FCN and LSTM-FCN, and evaluated using the metrics defined in <xref ref-type="table" rid="T1">Table 1</xref>. <xref ref-type="fig" rid="F7">Figure 7</xref> illustrates the graph derived from a decision tree classifier trained on transformed data, as explained in Section 3.2.</p>
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p>Mean and standard deviation of the objective evaluation of the rule-based explanation.</p></caption>
<table frame="box" rules="all">
<thead>
<tr style="background-color:#919498;color:#ffffff">
<th valign="top" align="left"><bold>Dataset</bold></th>
<th valign="top" align="center" colspan="5"><bold>FCN</bold></th>
<th valign="top" align="center" colspan="5"><bold>LSTM-FCN</bold></th>
</tr>
</thead>
<tbody>
<tr style="background-color:#919498;color:#ffffff">
<td/>
<td valign="top" align="center"><bold>Acc</bold></td>
<td valign="top" align="center"><bold>Fidelity</bold></td>
<td valign="top" align="center"><bold>&#x00023;Depth</bold></td>
<td valign="top" align="center"><bold>&#x00023;Node</bold></td>
<td valign="top" align="center"><bold>Rob</bold>.</td>
<td valign="top" align="center"><bold>Acc</bold></td>
<td valign="top" align="center"><bold>Fidelity</bold></td>
<td valign="top" align="center"><bold>&#x00023;Depth</bold></td>
<td valign="top" align="center"><bold>&#x00023;Node</bold></td>
<td valign="top" align="center"><bold>Rob</bold>.</td>
</tr>
<tr>
<td valign="top" align="left">ECG200</td>
<td valign="top" align="center">0.79 &#x000B1; 0.10</td>
<td valign="top" align="center">0.89 &#x000B1; 0.06</td>
<td valign="top" align="center">3 &#x000B1; 2</td>
<td valign="top" align="center">10 &#x000B1; 6</td>
<td valign="top" align="center">0.78 &#x000B1; 0.12</td>
<td valign="top" align="center">0.80 &#x000B1; 0.12</td>
<td valign="top" align="center">0.89 &#x000B1; 0.06</td>
<td valign="top" align="center">4 &#x000B1; 2</td>
<td valign="top" align="center">10 &#x000B1; 5</td>
<td valign="top" align="center">0.76 &#x000B1; 0.14</td>
</tr>
<tr>
<td valign="top" align="left">GunPoint</td>
<td valign="top" align="center">0.74 &#x000B1; 0.12</td>
<td valign="top" align="center">0.88 &#x000B1; 0.11</td>
<td valign="top" align="center">4 &#x000B1; 2</td>
<td valign="top" align="center">12 &#x000B1; 5</td>
<td valign="top" align="center">0.64 &#x000B1; 0.18</td>
<td valign="top" align="center">0.73 &#x000B1; 0.11</td>
<td valign="top" align="center">0.88 &#x000B1; 0.07</td>
<td valign="top" align="center">4 &#x000B1; 2</td>
<td valign="top" align="center">12 &#x000B1; 5</td>
<td valign="top" align="center">0.64 &#x000B1; 0.17</td>
</tr>
<tr>
<td valign="top" align="left">FordA</td>
<td valign="top" align="center">0.78 &#x000B1; 0.03</td>
<td valign="top" align="center">0.84 &#x000B1; 0.04</td>
<td valign="top" align="center">8 &#x000B1; 3</td>
<td valign="top" align="center">42 &#x000B1; 34</td>
<td valign="top" align="center">0.76 &#x000B1; 0.04</td>
<td valign="top" align="center">0.79 &#x000B1; 0.04</td>
<td valign="top" align="center">0.84 &#x000B1; 0.05</td>
<td valign="top" align="center">8 &#x000B1; 4</td>
<td valign="top" align="center">41 &#x000B1; 38</td>
<td valign="top" align="center">0.77 &#x000B1; 0.05</td>
</tr>
<tr>
<td valign="top" align="left">FordB</td>
<td valign="top" align="center">0.81 &#x000B1; 0.04</td>
<td valign="top" align="center">0.87 &#x000B1; 0.05</td>
<td valign="top" align="center">8 &#x000B1; 4</td>
<td valign="top" align="center">42 &#x000B1; 34</td>
<td valign="top" align="center">0.77 &#x000B1; 0.06</td>
<td valign="top" align="center">0.81 &#x000B1; 0.04</td>
<td valign="top" align="center">0.86 &#x000B1; 0.05</td>
<td valign="top" align="center">7 &#x000B1; 4</td>
<td valign="top" align="center">37 &#x000B1; 33</td>
<td valign="top" align="center">0.79 &#x000B1; 0.08</td>
</tr></tbody>
</table>
</table-wrap>
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p>Visualization of decision tree graph produced by the proposed method applied to ECG data for the FCN model.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="frai-07-1381921-g0007.tif"/>
</fig>
<p>For the FCN model, the accuracy values of the decision tree range from 0.74 to 0.81, reflecting the fraction of instances that the decision tree classifies correctly. Fidelity values, ranging from 0.84 to 0.89, indicate the agreement between the decision tree and the predictions of the deep learning model. The depth and the number of nodes vary from 3 to 8 and from 10 to 42, respectively, indicating the complexity of the decision tree graph or rules. The robustness scores range from 0.64 to 0.78, indicating the stability of the explanation against insignificant data changes that do not affect the model predictions.</p>
<p>For the LSTM-FCN model, the accuracy values range from 0.73 to 0.81. The fidelity values, ranging from 0.84 to 0.89, indicate a high degree of alignment between the rule-based explanations and the predictions of the deep learning model. The decision trees for LSTM-FCN are comparably concise, with a depth ranging from 4 to 8 and a number of nodes ranging from 10 to 41. The robustness scores for LSTM-FCN range from 0.64 to 0.79, similar to those of FCN. For both models, we fine-tuned the decision tree through a post-pruning technique, specifically cost complexity pruning.</p>
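<p>Cost complexity pruning with an automatically chosen alpha can be sketched as follows. This is an illustration under stated assumptions, not the exact procedure of the study: it assumes the scikit-learn implementation of decision trees, uses synthetic data in place of the clustered PEP features, and selects alpha by accuracy on the held-out 30% split, which is only one of several possible selection criteria.</p>
<preformat><![CDATA[
```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in (assumption) for the transformed features and their labels.
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0
)

# Candidate alphas come from the pruning path of the fully grown tree.
full_tree = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)
path = full_tree.cost_complexity_pruning_path(X_train, y_train)

best_tree, best_score = None, -1.0
for alpha in path.ccp_alphas:
    alpha = max(float(alpha), 0.0)  # guard against tiny negative values from rounding
    tree = DecisionTreeClassifier(random_state=0, ccp_alpha=alpha).fit(X_train, y_train)
    score = tree.score(X_test, y_test)
    # ">=" prefers the larger alpha, and hence the smaller tree, when scores tie.
    if score >= best_score:
        best_tree, best_score = tree, score

print(best_tree.get_depth(), best_tree.tree_.node_count, round(best_score, 2))
```
]]></preformat>
<p>The printed depth and node count correspond to the two complexity metrics in <xref ref-type="table" rid="T1">Table 1</xref>. A selection rule that also penalizes depth or node count would trade some accuracy for a more compact tree.</p>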
<p>The rules listed below illustrate the findings of our experiment on the ECG200 dataset. Each rule highlights particular time steps, and the events occurring at those steps, that significantly impact the model prediction. Notably, the decision tree features represent clusters of each Parameterized Event Primitive (PEP), and the nodes in the graph or rules correspond to the centroids of these clusters. For instance, let (<italic>t, v</italic>) represent the centroid obtained from Cluster 1 of local maxima. After post-processing, this centroid can be read as a Local Maximum event at time <italic>t</italic> with a value of <italic>v</italic>. For local maxima and local minima, the centroids consist of the variables time (<italic>t</italic>) and value (<italic>v</italic>). For increasing and decreasing events, the centroids include time (<italic>t</italic>), duration (<italic>d</italic>), and average value (<inline-formula><mml:math id="M6"><mml:mover accent="true"><mml:mrow><mml:mi>v</mml:mi></mml:mrow><mml:mo>&#x00304;</mml:mo></mml:mover></mml:math></inline-formula>). Additionally, if domain experts provide definitions for the conditional part of the rules, we can generate human-readable explanations for better comprehension.</p>
<list list-type="order">
<list-item><p>increases from time 70 to 71 with average value 0.79 &#x02264; 12.5 and decreases from time 23 to 29 with an average value -0.07 &#x02264; 3.5 &#x021D2; Myocardial Infarction</p></list-item>
<list-item><p>increases from time 70 to 71 with average value 0.93 &#x0003E; 12.5 and decreases from time 23 to 29 with an average value -0.07 &#x0003C; 3.5 &#x021D2; Normal Heartbeat</p></list-item>
<list-item><p>increases from time 70 to 71 with average value 0.79 &#x0003E; 12.5 &#x021D2; Normal Heartbeat</p></list-item>
</list>
<p>The objective evaluation results shed light on the effectiveness of our novel <italic>post-hoc</italic> XAI method in explaining the inference process of deep learning-based time series classification models.</p>
<p>The accuracy results, between 0.73 and 0.81 across datasets and models, indicate that the generated decision trees capture much of the behavior of the underlying deep learning models. Fidelity, which measures how well the surrogate model predictions match the models&#x00027; decisions, ranged from 0.84 to 0.89 for both types of models we tested. However, there is still a need for additional metrics to ensure the faithfulness of the generated explanation.</p>
<p>The robustness scores, especially for ECG200, FordA and FordB on both FCN and LSTM-FCN, indicate the resilience of the proposed XAI method in producing consistent and reliable explanations for insignificant changes in the data that do not affect the model prediction.</p>
<p>The decision tree graphs are relatively simple for the smaller datasets, with a low depth and number of nodes. For FordA and FordB, however, the depth and the number of nodes, and especially their standard deviations, are higher. This is primarily attributed to our automatic selection of the optimal alpha value for post-pruning the decision tree. Manual selection of the optimal alpha value, accounting for the number of nodes, depth, and accuracy, could have created a more interpretable decision tree graph.</p>
<p>These findings suggest that the proposed method can generate interpretable explanations using relatively simple decision trees that are easily understandable to users. The core strength of our methodology lies in its ability to avoid time series data segmentation, choosing instead the direct extraction and clustering of parameterized event primitives to provide rule-based global explanations. This approach not only simplifies the feature space but also ensures the faithful representation of temporal relationships within the time series in the resulting explanation model. Despite this, it is crucial to recognize a potential limitation concerning its performance on more complex datasets, especially those with higher dimensionality, such as multivariate time series. In such cases, the resulting decision tree graphs might become more intricate and pose interpretation challenges. However, the proposed method could be extended to address these challenges by incorporating more sophisticated clustering or feature extraction techniques.</p>
</sec>
<sec id="s6">
<title>6 Conclusion and future work</title>
<p>This paper introduced a novel model-agnostic XAI method for deep learning-based time series classification models. The proposed method utilizes a decision tree graph to show the crucial time steps in the model prediction. The study evaluated the explanation generated by this approach using various objective metrics such as accuracy, fidelity, depth, number of nodes and robustness. The findings of this research provide a strong foundation for developing more transparent and interpretable XAI methods for state-of-the-art deep learning models in the future. Our experiments suggest that the explanation becomes more interpretable with a reduced depth and number of nodes. Moving forward, we plan to validate this method on complex and multivariate time series datasets and conduct a human-centered evaluation of the explanations generated by this method in comparison to existing XAI methods for time series.</p>
</sec>
</body>
<back>
<sec sec-type="data-availability" id="s7">
<title>Data availability statement</title>
<p>The original contributions presented in the study are publicly available. This data can be found here: <ext-link ext-link-type="uri" xlink:href="https://www.timeseriesclassification.com/">https://www.timeseriesclassification.com/</ext-link>.</p>
</sec>
<sec sec-type="author-contributions" id="s8">
<title>Author contributions</title>
<p>EM: Writing &#x02013; original draft, Writing &#x02013; review &#x00026; editing. LL: Supervision, Writing &#x02013; review &#x00026; editing. PD: Supervision, Writing &#x02013; review &#x00026; editing.</p>
</sec>
<sec sec-type="funding-information" id="s9">
<title>Funding</title>
<p>The author(s) declare financial support was received for the research, authorship, and/or publication of this article. The work described in this manuscript is part of a doctoral research project funded by the Technological University Dublin.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted without any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
<p>The author(s) declared that they were an editorial board member of Frontiers, at the time of submission. This had no impact on the peer review process and the final decision.</p>
</sec>
<sec sec-type="disclaimer" id="s10">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec><sec sec-type="supplementary-material" id="s11">
<title>Supplementary material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/frai.2024.1381921/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/frai.2024.1381921/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Data_Sheet_1.pdf" id="SM1" mimetype="application/pdf" xmlns:xlink="http://www.w3.org/1999/xlink"/></sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bach</surname> <given-names>S.</given-names></name> <name><surname>Binder</surname> <given-names>A.</given-names></name> <name><surname>Montavon</surname> <given-names>G.</given-names></name> <name><surname>Klauschen</surname> <given-names>F.</given-names></name> <name><surname>M&#x000FC;ller</surname> <given-names>K.-R.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name></person-group> (<year>2015</year>). <article-title>On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation</article-title>. <source>PLoS ONE</source> <volume>10</volume>:<fpage>e0130140</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0130140</pub-id><pub-id pub-id-type="pmid">26161953</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cho</surname> <given-names>S.</given-names></name> <name><surname>Lee</surname> <given-names>G.</given-names></name> <name><surname>Chang</surname> <given-names>W.</given-names></name> <name><surname>Choi</surname> <given-names>J.</given-names></name></person-group> (<year>2020</year>). <article-title>Interpretation of deep temporal representations by selective visualization of internally activated nodes</article-title>. <source>arXiv preprint arXiv:2004.12538</source>. <pub-id pub-id-type="doi">10.48550/arXiv.2004.12538</pub-id></citation>
</ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Di Martino</surname> <given-names>F.</given-names></name> <name><surname>Delmastro</surname> <given-names>F.</given-names></name></person-group> (<year>2022</year>). <article-title>Explainable AI for clinical and remote health applications: a survey on tabular and time series data</article-title>. <source>Artif. Intell. Rev</source>. <volume>3</volume>, <fpage>1</fpage>&#x02013;<lpage>55</lpage>. <pub-id pub-id-type="doi">10.1007/s10462-022-10304-3</pub-id><pub-id pub-id-type="pmid">36320613</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Jeyakumar</surname> <given-names>J. V.</given-names></name> <name><surname>Noor</surname> <given-names>J.</given-names></name> <name><surname>Cheng</surname> <given-names>Y.-H.</given-names></name> <name><surname>Garcia</surname> <given-names>L.</given-names></name> <name><surname>Srivastava</surname> <given-names>M.</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;How can I explain this to you? An empirical study of deep neural network explanation methods,&#x0201D;</article-title> in <source>Proceedings of the 34th International Conference on Neural Information Processing Systems</source> (<publisher-loc>Vancouver, BC</publisher-loc>: <publisher-name>Curran Associates Inc</publisher-name>). <pub-id pub-id-type="doi">10.5555/3495724.3496078</pub-id></citation>
</ref>
<ref id="B5">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Joshi</surname> <given-names>S.</given-names></name> <name><surname>Abdelfattah</surname> <given-names>E.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Deep neural networks for time series classification in human activity recognition,&#x0201D;</article-title> in <source>2021 IEEE 12th Annual Information Technology, Electronics and Mobile Communication Conference (IEMCON)</source> (<publisher-loc>Vancouver, BC</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>559</fpage>&#x02013;<lpage>566</lpage>.</citation>
</ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kadous</surname> <given-names>M. W.</given-names></name></person-group> (<year>1999</year>). <article-title>Learning comprehensible descriptions of multivariate time series</article-title>. <source>ICML</source>, <fpage>454</fpage>&#x02013;<lpage>463</lpage>.</citation>
</ref>
<ref id="B7">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Karim</surname> <given-names>F.</given-names></name> <name><surname>Majumdar</surname> <given-names>S.</given-names></name> <name><surname>Darabi</surname> <given-names>H.</given-names></name> <name><surname>Chen</surname> <given-names>S.</given-names></name></person-group> (<year>2017</year>). <article-title>LSTM fully convolutional networks for time series classification</article-title>. <source>IEEE Access</source> <volume>6</volume>, <fpage>1662</fpage>&#x02013;<lpage>1669</lpage>. <pub-id pub-id-type="doi">10.1109/ACCESS.2017.2779939</pub-id></citation>
</ref>
<ref id="B8">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liu</surname> <given-names>P.</given-names></name> <name><surname>Sun</surname> <given-names>X.</given-names></name> <name><surname>Han</surname> <given-names>Y.</given-names></name> <name><surname>He</surname> <given-names>Z.</given-names></name> <name><surname>Zhang</surname> <given-names>W.</given-names></name> <name><surname>Wu</surname> <given-names>C.</given-names></name></person-group> (<year>2022</year>). <article-title>Arrhythmia classification of LSTM autoencoder based on time series anomaly detection</article-title>. <source>Biomed. Sign. Process. Contr</source>. <volume>71</volume>:<fpage>103228</fpage>. <pub-id pub-id-type="doi">10.1016/j.bspc.2021.103228</pub-id><pub-id pub-id-type="pmid">36574391</pub-id></citation></ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Longo</surname> <given-names>L.</given-names></name> <name><surname>Brcic</surname> <given-names>M.</given-names></name> <name><surname>Cabitza</surname> <given-names>F.</given-names></name> <name><surname>Choi</surname> <given-names>J.</given-names></name> <name><surname>Confalonieri</surname> <given-names>R.</given-names></name> <name><surname>Del Ser</surname> <given-names>J.</given-names></name> <etal/></person-group>. (<year>2023</year>). <article-title>Explainable artificial intelligence (XAI) 2.0: a manifesto of open challenges and interdisciplinary research directions</article-title>. <source>arXiv preprint arXiv:2310.19775</source>. <pub-id pub-id-type="doi">10.48550/arXiv.2310.19775</pub-id></citation>
</ref>
<ref id="B10">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lundberg</surname> <given-names>S. M.</given-names></name> <name><surname>Lee</surname> <given-names>S.-I.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;A unified approach to interpreting model predictions,&#x0201D;</article-title> in <source>Proceedings of the 31st International Conference on Neural Information Processing Systems</source> (<publisher-loc>Long Beach, CA</publisher-loc>: <publisher-name>Curran Associates Inc.</publisher-name>), <fpage>4768</fpage>&#x02013;<lpage>4777</lpage>. <pub-id pub-id-type="doi">10.5555/3295222.3295230</pub-id></citation>
</ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mekruksavanich</surname> <given-names>S.</given-names></name> <name><surname>Jitpattanakul</surname> <given-names>A.</given-names></name></person-group> (<year>2021</year>). <article-title>LSTM networks using smartphone data for sensor-based human activity recognition in smart homes</article-title>. <source>Sensors</source> <volume>21</volume>:<fpage>1636</fpage>. <pub-id pub-id-type="doi">10.3390/s21051636</pub-id><pub-id pub-id-type="pmid">33652697</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Molnar</surname> <given-names>C.</given-names></name></person-group> (<year>2020</year>). <source>Interpretable Machine Learning</source>. Lulu Press.</citation>
</ref>
<ref id="B13">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Munir</surname> <given-names>M.</given-names></name> <name><surname>Siddiqui</surname> <given-names>S. A.</given-names></name> <name><surname>K&#x000FC;sters</surname> <given-names>F.</given-names></name> <name><surname>Mercier</surname> <given-names>D.</given-names></name> <name><surname>Dengel</surname> <given-names>A.</given-names></name> <name><surname>Ahmed</surname> <given-names>S.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;TSXplain: demystification of DNN decisions for time-series using natural language and statistical features,&#x0201D;</article-title> in <source>Artificial Neural Networks and Machine Learning&#x02013;ICANN 2019: Workshop and Special Sessions: 28th International Conference on Artificial Neural Networks, Munich, Germany, September 17&#x02013;19, 2019, Proceedings 28</source> (<publisher-loc>Berlin</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>426</fpage>&#x02013;<lpage>439</lpage>.</citation>
</ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Neves</surname> <given-names>I.</given-names></name> <name><surname>Folgado</surname> <given-names>D.</given-names></name> <name><surname>Santos</surname> <given-names>S.</given-names></name> <name><surname>Barandas</surname> <given-names>M.</given-names></name> <name><surname>Campagner</surname> <given-names>A.</given-names></name> <name><surname>Ronzio</surname> <given-names>L.</given-names></name> <etal/></person-group>. (<year>2021</year>). <article-title>Interpretable heartbeat classification using local model-agnostic explanations on ECGs</article-title>. <source>Comput. Biol. Med</source>. <volume>133</volume>:<fpage>104393</fpage>. <pub-id pub-id-type="doi">10.1016/j.compbiomed.2021.104393</pub-id><pub-id pub-id-type="pmid">33915362</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="web"><person-group person-group-type="author"><name><surname>Oguiza</surname> <given-names>I.</given-names></name></person-group> (<year>2023</year>). <source>tsai-A State-of-the-Art Deep Learning Library for Time Series and Sequential Data</source>. Github. Available at: <ext-link ext-link-type="uri" xlink:href="https://github.com/timeseriesAI/tsai">https://github.com/timeseriesAI/tsai</ext-link></citation>
</ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Oviedo</surname> <given-names>F.</given-names></name> <name><surname>Ren</surname> <given-names>Z.</given-names></name> <name><surname>Sun</surname> <given-names>S.</given-names></name> <name><surname>Settens</surname> <given-names>C.</given-names></name> <name><surname>Liu</surname> <given-names>Z.</given-names></name> <name><surname>Hartono</surname> <given-names>N. T. P.</given-names></name> <etal/></person-group>. (<year>2019</year>). <article-title>Fast and interpretable classification of small x-ray diffraction datasets using data augmentation and deep neural networks</article-title>. <source>NPJ Comput. Mater</source>. <volume>5</volume>:<fpage>60</fpage>. <pub-id pub-id-type="doi">10.1038/s41524-019-0196-x</pub-id></citation>
</ref>
<ref id="B17">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Ribeiro</surname> <given-names>M. T.</given-names></name> <name><surname>Singh</surname> <given-names>S.</given-names></name> <name><surname>Guestrin</surname> <given-names>C.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Why should I trust you? Explaining the predictions of any classifier,&#x0201D;</article-title> in <source>Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining</source> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name>), <fpage>1135</fpage>&#x02013;<lpage>1144</lpage>.</citation>
</ref>
<ref id="B18">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rojat</surname> <given-names>T.</given-names></name> <name><surname>Puget</surname> <given-names>R.</given-names></name> <name><surname>Filliat</surname> <given-names>D.</given-names></name> <name><surname>Del Ser</surname> <given-names>J.</given-names></name> <name><surname>Gelin</surname> <given-names>R.</given-names></name> <name><surname>D&#x000ED;az-Rodr&#x000ED;guez</surname> <given-names>N.</given-names></name></person-group> (<year>2021</year>). <article-title>Explainable artificial intelligence (XAI) on timeseries data: a survey</article-title>. <source>arXiv [Preprint]</source>. arXiv:2104.00950.</citation>
</ref>
<ref id="B19">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Schlegel</surname> <given-names>U.</given-names></name> <name><surname>Arnout</surname> <given-names>H.</given-names></name> <name><surname>El-Assady</surname> <given-names>M.</given-names></name> <name><surname>Oelke</surname> <given-names>D.</given-names></name> <name><surname>Keim</surname> <given-names>D. A.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Towards a rigorous evaluation of XAI methods on time series,&#x0201D;</article-title> in <source>2019 IEEE/CVF International Conference on Computer Vision Workshop (ICCVW)</source> (<publisher-loc>Seoul</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>4197</fpage>&#x02013;<lpage>4201</lpage>.</citation>
</ref>
<ref id="B20">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Schlegel</surname> <given-names>U.</given-names></name> <name><surname>Keim</surname> <given-names>D. A.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Time series model attribution visualizations as explanations,&#x0201D;</article-title> in <source>2021 IEEE Workshop on TRust and EXpertise in Visual Analytics (TREX)</source> (<publisher-loc>Seoul</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>27</fpage>&#x02013;<lpage>31</lpage>.</citation>
</ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shu</surname> <given-names>T.</given-names></name> <name><surname>Chen</surname> <given-names>J.</given-names></name> <name><surname>Bhargava</surname> <given-names>V. K.</given-names></name> <name><surname>de Silva</surname> <given-names>C. W.</given-names></name></person-group> (<year>2019</year>). <article-title>An energy-efficient dual prediction scheme using LMS filter and LSTM in wireless sensor networks for environment monitoring</article-title>. <source>IEEE Internet Things J</source>. <volume>6</volume>, <fpage>6736</fpage>&#x02013;<lpage>6747</lpage>. <pub-id pub-id-type="doi">10.1109/JIOT.2019.2911295</pub-id></citation>
</ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Siddiqui</surname> <given-names>S. A.</given-names></name> <name><surname>Mercier</surname> <given-names>D.</given-names></name> <name><surname>Munir</surname> <given-names>M.</given-names></name> <name><surname>Dengel</surname> <given-names>A.</given-names></name> <name><surname>Ahmed</surname> <given-names>S.</given-names></name></person-group> (<year>2019</year>). <article-title>TSViz: demystification of deep learning models for time-series analysis</article-title>. <source>IEEE Access</source> <volume>7</volume>, <fpage>67027</fpage>&#x02013;<lpage>67040</lpage>. <pub-id pub-id-type="doi">10.1109/ACCESS.2019.2912823</pub-id></citation>
</ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Simonyan</surname> <given-names>K.</given-names></name> <name><surname>Vedaldi</surname> <given-names>A.</given-names></name> <name><surname>Zisserman</surname> <given-names>A.</given-names></name></person-group> (<year>2013</year>). <article-title>Deep inside convolutional networks: visualising image classification models and saliency maps</article-title>. <source>arXiv preprint arXiv:1312.6034</source>. <pub-id pub-id-type="doi">10.48550/arXiv.1312.6034</pub-id></citation>
</ref>
<ref id="B24">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Sivill</surname> <given-names>T.</given-names></name> <name><surname>Flach</surname> <given-names>P.</given-names></name></person-group> (<year>2022</year>). <article-title>&#x0201C;LIMESegment: meaningful, realistic time series explanations,&#x0201D;</article-title> in <source>International Conference on Artificial Intelligence and Statistics</source> (<publisher-loc>PMLR</publisher-loc>), <fpage>3418</fpage>&#x02013;<lpage>3433</lpage>.</citation>
</ref>
<ref id="B25">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Strodthoff</surname> <given-names>N.</given-names></name> <name><surname>Wagner</surname> <given-names>P.</given-names></name> <name><surname>Schaeffter</surname> <given-names>T.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name></person-group> (<year>2020</year>). <article-title>Deep learning for ECG analysis: benchmarks and insights from PTB-XL</article-title>. <source>IEEE J. Biomed. Health Informat</source>. <volume>25</volume>, <fpage>1519</fpage>&#x02013;<lpage>1528</lpage>. <pub-id pub-id-type="doi">10.1109/JBHI.2020.3022989</pub-id><pub-id pub-id-type="pmid">32903191</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Theissler</surname> <given-names>A.</given-names></name> <name><surname>Spinnato</surname> <given-names>F.</given-names></name> <name><surname>Schlegel</surname> <given-names>U.</given-names></name> <name><surname>Guidotti</surname> <given-names>R.</given-names></name></person-group> (<year>2022</year>). <article-title>Explainable AI for time series classification: a review, taxonomy and research directions</article-title>. <source>IEEE Access</source> <volume>2022</volume>:<fpage>3207765</fpage>. <pub-id pub-id-type="doi">10.1109/ACCESS.2022.3207765</pub-id></citation>
</ref>
<ref id="B27">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vielhaben</surname> <given-names>J.</given-names></name> <name><surname>Lapuschkin</surname> <given-names>S.</given-names></name> <name><surname>Montavon</surname> <given-names>G.</given-names></name> <name><surname>Samek</surname> <given-names>W.</given-names></name></person-group> (<year>2023</year>). <article-title>Explainable AI for time series via virtual inspection layers</article-title>. <source>arXiv preprint arXiv:2303.06365</source>. <pub-id pub-id-type="doi">10.2139/ssrn.4399242</pub-id></citation>
</ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Vilone</surname> <given-names>G.</given-names></name> <name><surname>Longo</surname> <given-names>L.</given-names></name></person-group> (<year>2023</year>). <article-title>&#x0201C;Development of a human-centred psychometric test for the evaluation of explanations produced by XAI methods,&#x0201D;</article-title> in <source>Explainable Artificial Intelligence</source>, ed. L. Longo (Cham: Springer Nature Switzerland), <fpage>205</fpage>&#x02013;<lpage>232</lpage>.</citation>
</ref>
<ref id="B29">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Vilone</surname> <given-names>G.</given-names></name> <name><surname>Rizzo</surname> <given-names>L.</given-names></name> <name><surname>Longo</surname> <given-names>L.</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;A comparative analysis of rule-based, model-agnostic methods for explainable artificial intelligence,&#x0201D;</article-title> in <source>Proceedings of The 28th Irish Conference on Artificial Intelligence and Cognitive Science, Dublin, Republic of Ireland, volume 2771 of CEUR Workshop Proceedings</source> (<publisher-loc>Aachen</publisher-loc>: <publisher-name>CEUR-WS.org</publisher-name>), <fpage>85</fpage>&#x02013;<lpage>96</lpage>.</citation>
</ref>
<ref id="B30">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Watson</surname> <given-names>D. S.</given-names></name></person-group> (<year>2022</year>). <article-title>Conceptual challenges for interpretable machine learning</article-title>. <source>Synthese</source> <volume>200</volume>:<fpage>65</fpage>. <pub-id pub-id-type="doi">10.1007/s11229-022-03485-5</pub-id></citation>
</ref>
<ref id="B31">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zhang</surname> <given-names>X.</given-names></name> <name><surname>Liang</surname> <given-names>X.</given-names></name> <name><surname>Zhiyuli</surname> <given-names>A.</given-names></name> <name><surname>Zhang</surname> <given-names>S.</given-names></name> <name><surname>Xu</surname> <given-names>R.</given-names></name> <name><surname>Wu</surname> <given-names>B.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;AT-LSTM: an attention-based LSTM model for financial time series prediction,&#x0201D;</article-title> in <source>IOP Conference Series: Materials Science and Engineering, volume 569</source> (<publisher-loc>Bristol</publisher-loc>: <publisher-name>IOP Publishing</publisher-name>), e052037.</citation>
</ref>
<ref id="B32">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Zhou</surname> <given-names>L.</given-names></name> <name><surname>Ma</surname> <given-names>C.</given-names></name> <name><surname>Shi</surname> <given-names>X.</given-names></name> <name><surname>Zhang</surname> <given-names>D.</given-names></name> <name><surname>Li</surname> <given-names>W.</given-names></name> <name><surname>Wu</surname> <given-names>L.</given-names></name></person-group> (<year>2021</year>). <article-title>&#x0201C;Salience-CAM: visual explanations from convolutional neural networks via salience score,&#x0201D;</article-title> in <source>2021 International Joint Conference on Neural Networks (IJCNN)</source> (<publisher-loc>Shenzhen</publisher-loc>: <publisher-name>IEEE</publisher-name>), <fpage>1</fpage>&#x02013;<lpage>8</lpage>.</citation>
</ref>
</ref-list>
</back>
</article>