<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Appl. Math. Stat.</journal-id>
<journal-title>Frontiers in Applied Mathematics and Statistics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Appl. Math. Stat.</abbrev-journal-title>
<issn pub-type="epub">2297-4687</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fams.2022.1076083</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Applied Mathematics and Statistics</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Quantifying uncertainty of machine learning methods for loss given default</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Nagl</surname> <given-names>Matthias</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/2093432/overview"/>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Nagl</surname> <given-names>Maximilian</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1976081/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>R&#x000F6;sch</surname> <given-names>Daniel</given-names></name>
</contrib>
</contrib-group>
<aff><institution>Chair of Statistics and Risk Management, Universit&#x000E4;t Regensburg</institution>, <addr-line>Regensburg</addr-line>, <country>Germany</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Paolo Giudici, University of Pavia, Italy</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Paola Cerchiello, University of Pavia, Italy; Arianna Agosto, University of Pavia, Italy</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Maximilian Nagl <email>maximilian.nagl&#x00040;ur.de</email></corresp>
<fn fn-type="other" id="fn001"><p>This article was submitted to Mathematical Finance, a section of the journal Frontiers in Applied Mathematics and Statistics</p></fn></author-notes>
<pub-date pub-type="epub">
<day>15</day>
<month>12</month>
<year>2022</year>
</pub-date>
<pub-date pub-type="collection">
<year>2022</year>
</pub-date>
<volume>8</volume>
<elocation-id>1076083</elocation-id>
<history>
<date date-type="received">
<day>21</day>
<month>10</month>
<year>2022</year>
</date>
<date date-type="accepted">
<day>18</day>
<month>11</month>
<year>2022</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2022 Nagl, Nagl and R&#x000F6;sch.</copyright-statement>
<copyright-year>2022</copyright-year>
<copyright-holder>Nagl, Nagl and R&#x000F6;sch</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license></permissions>
<abstract>
<p>Machine learning has increasingly found its way into the credit risk literature. When applied to forecasting credit risk parameters, the approaches have been found to outperform standard statistical models. The quantification of prediction uncertainty is typically not analyzed in the machine learning credit risk setting. However, this is vital to the interests of risk managers and regulators alike as its quantification increases the transparency and stability in risk management and reporting tasks. We fill this gap by applying the novel approach of deep evidential regression to loss given defaults (LGDs). We evaluate aleatoric and epistemic uncertainty for LGD estimation techniques and apply explainable artificial intelligence (XAI) methods to analyze the main drivers. We find that aleatoric uncertainty is considerably larger than epistemic uncertainty. Hence, the majority of uncertainty in LGD estimates appears to be irreducible as it stems from the data itself.</p></abstract>
<kwd-group>
<kwd>machine learning</kwd>
<kwd>explainable artificial intelligence (XAI)</kwd>
<kwd>credit risk</kwd>
<kwd>uncertainty</kwd>
<kwd>loss given default</kwd>
</kwd-group>
<counts>
<fig-count count="7"/>
<table-count count="7"/>
<equation-count count="14"/>
<ref-count count="72"/>
<page-count count="13"/>
<word-count count="8959"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>Financial institutions play a central role in the stability of the financial sector. They act as intermediaries to support the supply of money and lending as well as the transfer of risk between entities. However, this exposes financial institutions to several types of risk, including credit risk. Credit risk accounts for the largest share, roughly 84% of the risk-weighted assets of 131 major EU banks as of June 2021 [<xref ref-type="bibr" rid="B1">1</xref>]. The expected loss (EL) due to credit risk is composed of three parameters: Probability of Default (PD), Loss Given Default (LGD), and Exposure at Default (EAD). PD is defined as the probability that a borrower will fail to meet its agreed obligations. LGD is defined as the relative fraction of the outstanding amount that is lost. Finally, EAD is defined as the outstanding amount at the time of default.</p>
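The composition of expected loss from the three parameters above can be sketched in a few lines; the function name and all numbers are purely illustrative, not taken from the article:

```python
# Illustrative sketch: expected loss (EL) as the product of the three
# credit risk parameters PD, LGD, and EAD. All inputs are hypothetical.

def expected_loss(pd_: float, lgd: float, ead: float) -> float:
    """EL = PD * LGD * EAD, with PD and LGD as fractions and EAD in currency units."""
    return pd_ * lgd * ead

# A loan with a 2% default probability, 60% loss given default,
# and an exposure of 1,000,000 at default: roughly 12,000 expected loss.
el = expected_loss(0.02, 0.60, 1_000_000)
```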
<p>This article focuses on LGD as this risk parameter is important for financial institutions not only from a risk management perspective but also for pricing credit risky assets. Financial institutions can use their own models to calculate an estimate for the LGD. This estimate is subsequently used to determine the interest on the loan/bond and the capital requirement for the financial institution itself, see, e.g., [<xref ref-type="bibr" rid="B2">2</xref>&#x02013;<xref ref-type="bibr" rid="B6">6</xref>]. Depending on the defaulted asset, we can divide the LGD further into market-based and workout LGD. The former refers to publicly traded instruments like bonds and is commonly defined as one minus the ratio of the market price 30 days after default to the outstanding amount at the time of default. The latter refers to bank loans and is determined by accumulating the discounted payments received by creditors during the default resolution process. In this article, we use a record of nearly three decades of market-based LGDs gathered from Moody&#x00027;s Default and Recovery Database, covering January 1990 until December 2019. Recent literature using a shorter history of this data documents that machine learning models outperform standard statistical methods due to their ability to account for non-linear relationships between drivers and LGDs, see, e.g., [<xref ref-type="bibr" rid="B7">7</xref>&#x02013;<xref ref-type="bibr" rid="B9">9</xref>]. Fraisse and Laporte [<xref ref-type="bibr" rid="B10">10</xref>] show that allowing for non-linearity can be beneficial in many risk management applications and can lead to a better estimation of the capital requirements for banks. Therefore, using machine learning models can increase the precision of central credit risk parameters and, as a consequence, has the potential to yield more adequate capital requirements for banks.</p>
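The market-based LGD definition used here is easy to state in code; the function and variable names below are our own, not Moody's field names:

```python
# Sketch of the market-based LGD definition: one minus the market price
# 30 days after default relative to the outstanding amount at default.

def market_based_lgd(price_30d_after_default: float,
                     outstanding_at_default: float) -> float:
    """Return the market-based LGD as a fraction in [0, 1]."""
    return 1.0 - price_30d_after_default / outstanding_at_default

# A bond trading at 35 per 100 of face value one month after default
# implies an LGD of 65%.
lgd = market_based_lgd(35.0, 100.0)
```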
<p>There is a large body of literature using advanced statistical methods for LGDs. These include beta regression, factorial response models, local logit regressions, mixture regression, and quantile regression, among many others, see, e.g., [<xref ref-type="bibr" rid="B2">2</xref>&#x02013;<xref ref-type="bibr" rid="B4">4</xref>, <xref ref-type="bibr" rid="B9">9</xref>, <xref ref-type="bibr" rid="B11">11</xref>&#x02013;<xref ref-type="bibr" rid="B18">18</xref>]. Owing to increased computational power and methodological progress in academia, machine learning models have been applied to LGDs with increasing frequency<xref ref-type="fn" rid="fn0001"><sup>1</sup></xref>. Early studies by Matuszyk et al. [<xref ref-type="bibr" rid="B28">28</xref>] and Bastos [<xref ref-type="bibr" rid="B12">12</xref>] employ tree-based methods. Moreover, several studies provide benchmark exercises using various machine learning methods, see, e.g., [<xref ref-type="bibr" rid="B13">13</xref>&#x02013;<xref ref-type="bibr" rid="B15">15</xref>]. Bellotti et al. [<xref ref-type="bibr" rid="B5">5</xref>] and Kaposty et al. [<xref ref-type="bibr" rid="B29">29</xref>] update previous benchmark studies with new data and algorithms. Nazemi et al. [<xref ref-type="bibr" rid="B30">30</xref>] find text-based variables to be important drivers of market-based LGDs. Furthermore, evidence that spatial dependence plays a key role in peer-to-peer lending LGD estimation can be found in Calabrese and Zanin [<xref ref-type="bibr" rid="B31">31</xref>]. By combining statistical and machine learning models, Sigrist and Hirnschall [<xref ref-type="bibr" rid="B32">32</xref>] and Kellner et al. [<xref ref-type="bibr" rid="B6">6</xref>] show that the benefits of both worlds can be captured.</p>
<p>An important aspect, to which the machine learning LGD literature has not yet paid attention, is the associated uncertainty around estimates and predictions<xref ref-type="fn" rid="fn0002"><sup>2</sup></xref>. Commonly, we can define two types of uncertainty: aleatoric and epistemic [<xref ref-type="bibr" rid="B33">33</xref>]. Following Gawlikowski et al. [<xref ref-type="bibr" rid="B34">34</xref>], aleatoric uncertainty is the uncertainty in the data itself that cannot be reduced and is therefore also known as irreducible or data uncertainty. In classical statistics, this type of uncertainty is, for example, represented by &#x003F5; in the linear regression framework. Epistemic uncertainty refers to the uncertainty of a model due to the (limited) sample size. This uncertainty can be reduced by increasing the sample size on which the model is trained and is therefore also known as reducible or model uncertainty [<xref ref-type="bibr" rid="B34">34</xref>]. In a linear regression setting, epistemic uncertainty is accounted for by the standard errors of the beta coefficients. Given a larger sample size, the standard errors should decrease. Recently, the literature on uncertainty estimation has grown rapidly, as outlined in the survey article by Gawlikowski et al. [<xref ref-type="bibr" rid="B34">34</xref>].</p>
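The linear regression analogy above can be made concrete with a small simulation, which is our own toy illustration and not from the article: the residual variance estimates the aleatoric (irreducible) part, while the standard error of the slope is epistemic and shrinks as the sample grows.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_slope(n: int, sigma: float = 1.0):
    """Fit OLS on simulated data y = 2x + eps and return
    (residual variance, standard error of the slope)."""
    x = rng.normal(size=n)
    y = 2.0 * x + rng.normal(scale=sigma, size=n)     # true slope 2, noise sd 1
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    s2 = resid @ resid / (n - 2)                      # aleatoric: residual variance
    se_slope = np.sqrt(s2 * np.linalg.inv(X.T @ X)[1, 1])  # epistemic: SE of slope
    return s2, se_slope

s2_small, se_small = fit_slope(100)
s2_large, se_large = fit_slope(100_000)
# se_large is far smaller than se_small (more data reduces epistemic
# uncertainty), while both residual variances stay near the true noise
# variance of 1 (aleatoric uncertainty is irreducible).
```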
<p>A first intuitive way to quantify uncertainty is the Bayesian approach, which is also common in classical statistics. However, Bayesian neural networks are computationally expensive and do not scale easily to complex neural network architectures containing many parameters. Therefore, other researchers aim at approximating Bayesian inference/prediction for neural networks. Blundell et al. [<xref ref-type="bibr" rid="B35">35</xref>] introduce a backpropagation-compatible algorithm to learn probability distributions of weights instead of only point estimates. They call their approach &#x0201C;Bayes by Backprop.&#x0201D; Rather than applying Bayesian principles at the time of training, another strand of literature tries to approximate the posterior distribution only at the time of prediction. Gal and Ghahramani [<xref ref-type="bibr" rid="B36">36</xref>] introduce a concept called Monte Carlo Dropout, which applies a random dropout layer at the time of prediction to estimate uncertainty. Another variant of this framework is called Monte Carlo DropConnect by Mobiny et al. [<xref ref-type="bibr" rid="B37">37</xref>]. This variant uses a generalization of dropout layers, called DropConnect layers, where the dropping is applied directly to each weight rather than to each output unit. The DropConnect approach has outperformed Dropout in many applications and data sets, see, e.g., [<xref ref-type="bibr" rid="B37">37</xref>]. Another strategy is to use so-called hypernetworks [<xref ref-type="bibr" rid="B38">38</xref>]. A hypernetwork is a neural network that produces the parameters of another neural network (the so-called primary network) from random noise input. Finally, the hyper and primary neural networks together form a single model that can easily be trained by backpropagation. 
A further strand of literature applies an ensemble of methods and uses their information to approximate uncertainty, see, e.g., [<xref ref-type="bibr" rid="B39">39</xref>&#x02013;<xref ref-type="bibr" rid="B41">41</xref>]. However, these approaches are computationally more expensive than Dropout or DropConnect-related approaches. Yet another strand of literature aims at predicting the types of uncertainty directly within the neural network structure. One of these approaches is deep evidential regression by Amini et al. [<xref ref-type="bibr" rid="B42">42</xref>], with extensions by Meinert et al. [<xref ref-type="bibr" rid="B43">43</xref>], which learns the parameters of a so-called evidential distribution. The estimated parameters of the evidential distribution can be plugged into analytical formulas for epistemic and aleatoric uncertainty, so uncertainty is quantified in a fast and traceable way without any additional computational burden after training. Given these advantages, this article relies on the deep evidential regression framework.</p>
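Monte Carlo Dropout, one of the prediction-time approximations discussed above, can be sketched without any deep learning framework. The tiny two-layer network below with random, untrained weights is purely illustrative; the key point is that dropout stays active at prediction time and the spread of repeated stochastic forward passes serves as the uncertainty estimate.

```python
import numpy as np

rng = np.random.default_rng(42)

# Illustrative untrained two-layer network: 10 inputs, 32 hidden units, 1 output.
W1, b1 = rng.normal(size=(10, 32)), np.zeros(32)
W2, b2 = rng.normal(size=(32, 1)), np.zeros(1)

def forward(x: np.ndarray, p_drop: float = 0.2) -> np.ndarray:
    """One stochastic forward pass with dropout kept ON (MC Dropout)."""
    h = np.maximum(x @ W1 + b1, 0.0)            # ReLU hidden layer
    mask = rng.random(h.shape) >= p_drop        # dropout remains active at test time
    h = h * mask / (1.0 - p_drop)               # inverted-dropout scaling
    return h @ W2 + b2

x = rng.normal(size=(1, 10))
draws = np.array([forward(x) for _ in range(200)])  # 200 stochastic passes
mean_pred = draws.mean()                            # point prediction
uncertainty = draws.std()                           # predictive spread as uncertainty
```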
<p>We contribute to the literature in two important ways. First, this article applies an uncertainty estimation framework to machine learning LGD estimation and prediction. We observe that deep evidential regression provides a sound and fast framework to quantify both aleatoric and epistemic uncertainty. This is important with respect to regulatory concerns: not only is explainability required by regulators, but quantifying the uncertainty surrounding model predictions may also be a fruitful step toward the acceptance of machine learning algorithms in regulatory contexts. Second, this article analyzes the ratio between aleatoric and epistemic uncertainty and finds that aleatoric uncertainty is much larger than epistemic uncertainty. This implies that the largest share of uncertainty comes from the data itself and, thus, cannot be reduced. Epistemic uncertainty, i.e., model uncertainty, plays only a minor role. This may explain why advanced methods may outperform simpler ones, yet the estimation and prediction of LGDs remain very challenging.</p>
<p>The remainder of this article is structured as follows. The data is presented in Section 2, while the methodology is described in Section 3. Our empirical results are discussed in Section 4, and Section 5 concludes.</p>
</sec>
<sec id="s2">
<title>2. Data</title>
<p>To analyze bond loss given defaults, we use Moody&#x00027;s Default and Recovery Database (Moody&#x00027;s DRD). This data contains information on the market-based LGD, the default type, and various other characteristics of 1,999 US bonds from January 1990 until December 2019<xref ref-type="fn" rid="fn0003"><sup>3</sup></xref>. We use bond characteristics such as the coupon rate, the maturity, the seniority of the bond, and an additional variable indicating whether the bond is backed by guarantees beyond the bond issuer&#x00027;s assets. Furthermore, we include a binary variable that determines whether the issuer&#x00027;s industrial sector belongs to the FIRE (finance, insurance, or real estate) sector. To control for differences due to the reason of default, we also include the default type in our analysis. In addition, we add the S&#x00026;P 500 return to control for the macroeconomic environment. Consistent with Gambetti et al. [<xref ref-type="bibr" rid="B4">4</xref>], we calculate the US default rates directly from Moody&#x00027;s DRD. To control for withdrawal effects, we use the number of defaults occurring in a given month divided by the number of firms followed in the same period. Since we are interested in the uncertainty in the LGD estimation, we include uncertainty variables. To incorporate financial uncertainty, we use the financial uncertainty index by Jurado et al. [<xref ref-type="bibr" rid="B44">44</xref>] and Ludvigson et al. [<xref ref-type="bibr" rid="B45">45</xref>], which is publicly available on their website. Finally, we include the news-based economic policy uncertainty index provided by Baker et al. [<xref ref-type="bibr" rid="B46">46</xref>], which is also accessible on their website. To preserve predictive properties, we lag all macroeconomic variables and uncertainty indices by one quarter, similar to Olson et al. [<xref ref-type="bibr" rid="B8">8</xref>].</p>
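The two data construction steps described above, the monthly default rate (defaults divided by firms followed) and the one-quarter lag of the macroeconomic and uncertainty variables, can be sketched with pandas. The column names and toy values are hypothetical, not Moody's DRD field names:

```python
import pandas as pd

# Hypothetical monthly panel with a defaults count, the number of firms
# followed, and one macroeconomic series (column names are our own).
monthly = pd.DataFrame(
    {"defaults": [3, 5, 2, 4],
     "firms_followed": [900, 910, 905, 915],
     "sp500_return": [0.01, -0.02, 0.03, 0.00]},
    index=pd.period_range("2019-01", periods=4, freq="M"),
)

# Default rate: defaults in a month divided by firms followed in that month.
monthly["default_rate"] = monthly["defaults"] / monthly["firms_followed"]

# Lag macro/uncertainty variables by one quarter (three months) so the
# model uses only information available before the default.
monthly["sp500_return_lag"] = monthly["sp500_return"].shift(3)
```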
<p>Our dependent variable shows a mode at 90%, as illustrated in <xref ref-type="fig" rid="F1">Figure 1</xref>. This is consistent with Gambetti et al. [<xref ref-type="bibr" rid="B4">4</xref>], who analyze recovery rates. The average LGD is 64.29%, as shown in <xref ref-type="table" rid="T1">Table 1</xref>, with a standard deviation of 27.59%. The sample also covers nearly the whole range of market-based LGDs, with a minimum of 0.5% and a maximum of 99.99%.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Histogram of LGDs.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-08-1076083-g0001.tif"/>
</fig>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Descriptive statistics of LGDs across the whole sample.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th/>
<th valign="top" align="center"><bold><italic>N</italic></bold></th>
<th valign="top" align="center"><bold>Min</bold>.</th>
<th valign="top" align="center"><bold>Median</bold></th>
<th valign="top" align="center"><bold>Mean</bold></th>
<th valign="top" align="center"><bold>Max</bold></th>
<th valign="top" align="center"><bold>St.Dev</bold>.</th>
<th valign="top" align="center"><bold>Skewness</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">LGD</td>
<td valign="top" align="center">1999</td>
<td valign="top" align="center">0.50</td>
<td valign="top" align="center">73.00</td>
<td valign="top" align="center">64.29</td>
<td valign="top" align="center">99.99</td>
<td valign="top" align="center">27.59</td>
<td valign="top" align="center">&#x02013;0.59</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>All displayed values except the sample size and skewness are expressed as percentages.</p>
</table-wrap-foot>
</table-wrap>
<p><xref ref-type="table" rid="T2">Table 2</xref> lists the variables and data types. In total, we use six bond-related variables, two macroeconomic, and two uncertainty-related variables. The categorical bond-related variables act as control variables for differences in the bond structure.</p>
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p>Selected variables for the network.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Variable</bold></th>
<th valign="top" align="left"><bold>Variable type</bold></th>
<th valign="top" align="left"><bold>Data type</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Coupon rate</td>
<td valign="top" align="left">Bond</td>
<td valign="top" align="left">Continuous</td>
</tr>
<tr>
<td valign="top" align="left">Maturity</td>
<td valign="top" align="left">Bond</td>
<td valign="top" align="left">Continuous</td>
</tr>
<tr>
<td valign="top" align="left">Seniority</td>
<td valign="top" align="left">Bond</td>
<td valign="top" align="left">Categorical</td>
</tr>
<tr>
<td valign="top" align="left">Default type</td>
<td valign="top" align="left">Bond</td>
<td valign="top" align="left">Categorical</td>
</tr>
<tr>
<td valign="top" align="left">Backed guarantee</td>
<td valign="top" align="left">Bond</td>
<td valign="top" align="left">Binary</td>
</tr>
<tr>
<td valign="top" align="left">Industry type</td>
<td valign="top" align="left">Bond</td>
<td valign="top" align="left">Binary</td>
</tr>
<tr>
<td valign="top" align="left">S&#x00026;P 500</td>
<td valign="top" align="left">Macroeconomic</td>
<td valign="top" align="left">Continuous</td>
</tr>
<tr>
<td valign="top" align="left">Default rate</td>
<td valign="top" align="left">Macroeconomic</td>
<td valign="top" align="left">Continuous</td>
</tr>
<tr>
<td valign="top" align="left">Financial uncertainty</td>
<td valign="top" align="left">Uncertainty</td>
<td valign="top" align="left">Continuous</td>
</tr>
<tr>
<td valign="top" align="left">News-based EPU</td>
<td valign="top" align="left">Uncertainty</td>
<td valign="top" align="left">Continuous</td>
</tr>
</tbody>
</table>
</table-wrap>
<p><xref ref-type="table" rid="T3">Table 3</xref> shows the correlations between macroeconomic and uncertainty variables. The correlation is moderate to strong across the variables. This must be taken into account when interpreting the effects of the variables. The only exception is the financial uncertainty index and the default rate, which have a very weak correlation.</p>
<table-wrap position="float" id="T3">
<label>Table 3</label>
<caption><p>Upper triangle of the correlation matrix of macroeconomic and uncertainty features.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th/>
<th valign="top" align="center"><bold>S&#x00026;P500</bold></th>
<th valign="top" align="center"><bold>Default rate</bold></th>
<th valign="top" align="center"><bold>Fin. unc</bold>.</th>
<th valign="top" align="center"><bold>News-based EPU</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">S&#x00026;P500</td>
<td valign="top" align="center">100.00</td>
<td valign="top" align="center">&#x02013;65.18</td>
<td valign="top" align="center">&#x02013;41.69</td>
<td valign="top" align="center">&#x02013;65.21</td>
</tr>
<tr>
<td valign="top" align="left">Default rate</td>
<td/>
<td valign="top" align="center">100.00</td>
<td valign="top" align="center">5.25</td>
<td valign="top" align="center">43.85</td>
</tr>
<tr>
<td valign="top" align="left">Fin. unc.</td>
<td/>
<td/>
<td valign="top" align="center">100.00</td>
<td valign="top" align="center">51.88</td>
</tr>
<tr>
<td valign="top" align="left">News-based EPU</td>
<td/>
<td/>
<td/>
<td valign="top" align="center">100.00</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>All displayed values are expressed as percentages.</p>
</table-wrap-foot>
</table-wrap>
<p><xref ref-type="table" rid="T4">Table 4</xref> shows descriptive statistics of LGDs by the seniority of the bond. Each subcategory covers the whole range of LGDs, while the mean and the median of Senior Secured bonds are comparably low. In addition, Senior Secured bonds show almost no skewness, while the skewness of Senior Unsecured bonds is moderate. The skewness of Senior Subordinated and Subordinated bonds is more negative and fairly similar. Comparing the descriptive statistics across seniorities, we observe that the locations of the distributions differ, but the dispersion within each category remains considerably large. This may be a first indication of large (data) uncertainty.</p>
<table-wrap position="float" id="T4">
<label>Table 4</label>
<caption><p>Descriptive statistics of LGDs according to the seniority of the defaulted bond.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th/>
<th valign="top" align="center"><bold><italic>N</italic></bold></th>
<th valign="top" align="center"><bold>Min</bold>.</th>
<th valign="top" align="center"><bold>Median</bold></th>
<th valign="top" align="center"><bold>Mean</bold></th>
<th valign="top" align="center"><bold>Max</bold></th>
<th valign="top" align="center"><bold>St.Dev</bold>.</th>
<th valign="top" align="center"><bold>Skewness</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Senior secured</td>
<td valign="top" align="center">180</td>
<td valign="top" align="center">0.50</td>
<td valign="top" align="center">49.75</td>
<td valign="top" align="center">50.48</td>
<td valign="top" align="center">99.25</td>
<td valign="top" align="center">28.87</td>
<td valign="top" align="center">&#x02013;0.02</td>
</tr>
<tr>
<td valign="top" align="left">Senior unsecured</td>
<td valign="top" align="center">1,305</td>
<td valign="top" align="center">0.50</td>
<td valign="top" align="center">72.50</td>
<td valign="top" align="center">63.37</td>
<td valign="top" align="center">99.97</td>
<td valign="top" align="center">27.93</td>
<td valign="top" align="center">&#x02013;0.53</td>
</tr>
<tr>
<td valign="top" align="left">Senior subordinated</td>
<td valign="top" align="center">353</td>
<td valign="top" align="center">0.50</td>
<td valign="top" align="center">79.0</td>
<td valign="top" align="center">72.07</td>
<td valign="top" align="center">99.99</td>
<td valign="top" align="center">23.97</td>
<td valign="top" align="center">&#x02013;0.99</td>
</tr>
<tr>
<td valign="top" align="left">Subordinated</td>
<td valign="top" align="center">161</td>
<td valign="top" align="center">0.87</td>
<td valign="top" align="center">74.0</td>
<td valign="top" align="center">70.17</td>
<td valign="top" align="center">99.87</td>
<td valign="top" align="center">23.74</td>
<td valign="top" align="center">&#x02013;0.90</td>
</tr>
</tbody>
</table>
</table-wrap>
<p><xref ref-type="table" rid="T5">Table 5</xref> categorizes the LGDs by their default type, which alters some aspects of the overall picture. Compared to <xref ref-type="table" rid="T1">Table 1</xref>, the categories Distressed Exchange and Others have lower mean and median LGDs and positive skewness. The biggest difference between these two categories is that Distressed Exchange has a lower standard deviation. Missed Interest Payment and Prepackaged Chapter 11 show descriptive statistics similar to the whole sample in <xref ref-type="table" rid="T1">Table 1</xref>. The last category, Chapter 11, has an even higher mean and median LGD and the most pronounced negative skewness.</p>
<table-wrap position="float" id="T5">
<label>Table 5</label>
<caption><p>Descriptive statistics of LGDs according to the default type.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th/>
<th valign="top" align="center"><bold><italic>N</italic></bold></th>
<th valign="top" align="center"><bold>Min</bold>.</th>
<th valign="top" align="center"><bold>Median</bold></th>
<th valign="top" align="center"><bold>Mean</bold></th>
<th valign="top" align="center"><bold>Max</bold></th>
<th valign="top" align="center"><bold>St.Dev</bold>.</th>
<th valign="top" align="center"><bold>Skewness</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Chapter 11</td>
<td valign="top" align="center">705</td>
<td valign="top" align="center">0.75</td>
<td valign="top" align="center">85.00</td>
<td valign="top" align="center">73.48</td>
<td valign="top" align="center">99.99</td>
<td valign="top" align="center">25.46</td>
<td valign="top" align="center">&#x02013;1.25</td>
</tr>
<tr>
<td valign="top" align="left">Distressed exchange</td>
<td valign="top" align="center">322</td>
<td valign="top" align="center">0.50</td>
<td valign="top" align="center">40.25</td>
<td valign="top" align="center">44.51</td>
<td valign="top" align="center">94.87</td>
<td valign="top" align="center">24.43</td>
<td valign="top" align="center">0.18</td>
</tr>
<tr>
<td valign="top" align="left">Missed interest payment</td>
<td valign="top" align="center">677</td>
<td valign="top" align="center">1.00</td>
<td valign="top" align="center">73.50</td>
<td valign="top" align="center">66.79</td>
<td valign="top" align="center">99.99</td>
<td valign="top" align="center">23.98</td>
<td valign="top" align="center">&#x02013;0.69</td>
</tr>
<tr>
<td valign="top" align="left">Others</td>
<td valign="top" align="center">161</td>
<td valign="top" align="center">1.00</td>
<td valign="top" align="center">47.00</td>
<td valign="top" align="center">51.22</td>
<td valign="top" align="center">99.75</td>
<td valign="top" align="center">31.38</td>
<td valign="top" align="center">0.20</td>
</tr>
<tr>
<td valign="top" align="left">Prepackaged chapter 11</td>
<td valign="top" align="center">134</td>
<td valign="top" align="center">0.50</td>
<td valign="top" align="center">76.88</td>
<td valign="top" align="center">66.58</td>
<td valign="top" align="center">99.64</td>
<td valign="top" align="center">28.64</td>
<td valign="top" align="center">&#x02013;0.64</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec sec-type="methods" id="s3">
<title>3. Methods</title>
<p>To model the uncertainty of LGDs, we use the deep evidential regression framework of Amini et al. [<xref ref-type="bibr" rid="B42">42</xref>]. This method quantifies uncertainty in regression tasks, estimating both the epistemic and the aleatoric component. One way to model aleatoric uncertainty in the regression case is to train a neural network with weights <italic>w</italic> based on the negative log-likelihood of the normal distribution, and thus perform maximum likelihood optimization. The objective function for each observation follows Amini et al. [<xref ref-type="bibr" rid="B42">42</xref>]:</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M1"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>L</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>2</mml:mn><mml:mi>&#x003C0;</mml:mi><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mfrac><mml:mrow><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003BC;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>y</italic><sub><italic>i</italic></sub> is the <italic>i</italic>-th LGD observation of the sample with size <italic>N</italic>, and &#x003BC;<sub><italic>i</italic></sub> and <inline-formula><mml:math id="M2"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> are the mean and the variance of the assumed normal distribution for observation <italic>i</italic>. Since &#x003BC;<sub><italic>i</italic></sub> and <inline-formula><mml:math id="M3"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> are unknown, they can be modeled in a probabilistic manner by assuming they follow prior distributions <italic>q</italic>(&#x003BC;<sub><italic>i</italic></sub>) and <inline-formula><mml:math id="M4"><mml:mi>q</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>. Following Amini et al. [<xref ref-type="bibr" rid="B42">42</xref>], for &#x003BC;<sub><italic>i</italic></sub> a normal distribution and for <inline-formula><mml:math id="M5"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> an inverse gamma distribution is chosen:</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="M6"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>&#x003BC;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:msubsup><mml:mrow><mml:mi>&#x003BD;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E3"><label>(3)</label><mml:math id="M7"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup><mml:mo>&#x0007E;</mml:mo><mml:msup><mml:mrow><mml:mtext>&#x00393;</mml:mtext></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>With &#x003B3;<sub><italic>i</italic></sub> &#x02208; &#x0211D;, &#x003BD;<sub><italic>i</italic></sub> &#x0003E; 0, &#x003B1;<sub><italic>i</italic></sub> &#x0003E; 1 and &#x003B2;<sub><italic>i</italic></sub> &#x0003E; 0. Factorizing the joint prior distribution <inline-formula><mml:math id="M8"><mml:mi>q</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003BC;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>q</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003BC;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mi>q</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> results in a normal inverse gamma distribution:</p>
<disp-formula id="E4"><label>(4)</label><mml:math id="M10"><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mtable columnalign='left'><mml:mtr columnalign='left'><mml:mtd columnalign='left'><mml:mrow><mml:mi>p</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msubsup><mml:mi>&#x003C3;</mml:mi><mml:mi>i</mml:mi><mml:mn>2</mml:mn></mml:msubsup><mml:mo stretchy='false'>&#x0007C;</mml:mo><mml:msub><mml:mi>&#x003B3;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x003BD;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x003B1;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mi>&#x003B2;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msubsup><mml:mi>&#x003B2;</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:msub><mml:mi>&#x003B1;</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msubsup><mml:msqrt><mml:mrow><mml:msub><mml:mi>&#x003BD;</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msqrt></mml:mrow><mml:mrow><mml:mtext>&#x00393;</mml:mtext><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003B1;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:msqrt><mml:mrow><mml:mn>2</mml:mn><mml:mi>&#x003C0;</mml:mi><mml:msubsup><mml:mi>&#x003C3;</mml:mi><mml:mi>i</mml:mi><mml:mn>2</mml:mn></mml:msubsup></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac><mml:msup><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mrow><mml:msubsup><mml:mi>&#x003C3;</mml:mi><mml:mi>i</mml:mi><mml:mn>2</mml:mn></mml:msubsup></mml:mrow></mml:mfrac><mml:mo 
stretchy='false'>)</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mi>&#x003B1;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtable columnalign='left'><mml:mtr columnalign='left'><mml:mtd columnalign='left'><mml:mrow><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mi>e</mml:mi><mml:mi>x</mml:mi><mml:mi>p</mml:mi><mml:mo stretchy='false'>&#x0007B;</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mfrac><mml:mrow><mml:mn>2</mml:mn><mml:msub><mml:mi>&#x003B2;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mi>&#x003BD;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:msup><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003B3;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003BC;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msup></mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:msubsup><mml:mi>&#x003C3;</mml:mi><mml:mi>i</mml:mi><mml:mn>2</mml:mn></mml:msubsup></mml:mrow></mml:mfrac><mml:mo stretchy='false'>&#x0007D;</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>This normal inverse gamma distribution can be interpreted in terms of virtual observations, which describe the total evidence &#x003A6;<sub><italic>i</italic></sub>. Contrary to Amini et al. [<xref ref-type="bibr" rid="B42">42</xref>], we adopt the definition of the total evidence suggested in Meinert et al. [<xref ref-type="bibr" rid="B43">43</xref>], &#x003A6;<sub><italic>i</italic></sub> &#x0003D; &#x003BD;<sub><italic>i</italic></sub> &#x0002B; 2&#x003B1;<sub><italic>i</italic></sub>, because, as derived in Meinert et al. [<xref ref-type="bibr" rid="B47">47</xref>], the parameters &#x003BD;<sub><italic>i</italic></sub> and 2&#x003B1;<sub><italic>i</italic></sub> of the conjugate normal inverse gamma prior distribution can be interpreted as the virtual observations from which &#x003BC;<sub><italic>i</italic></sub> and <inline-formula><mml:math id="M11"><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:math></inline-formula> are estimated. As a result, the total evidence is the sum of these two terms. Choosing the normal inverse gamma distribution as the prior yields an analytical solution for the marginal likelihood, or model evidence, if the data follow a normal distribution [<xref ref-type="bibr" rid="B42">42</xref>, <xref ref-type="bibr" rid="B43">43</xref>]. The marginal likelihood then follows a Student-t distribution:</p>
<disp-formula id="E5"><label>(5)</label><mml:math id="M12"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>p</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003BD;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mi>S</mml:mi><mml:mi>t</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>y</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>;</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003BD;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003BD;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mo>,</mml:mo><mml:mn>2</mml:mn><mml:msub><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>The marginal likelihood represents the likelihood of obtaining observation <italic>y</italic><sub><italic>i</italic></sub> given the parameters of the prior distribution, in this case, &#x003B3;<sub><italic>i</italic></sub>, &#x003BD;<sub><italic>i</italic></sub>, &#x003B1;<sub><italic>i</italic></sub>, and &#x003B2;<sub><italic>i</italic></sub>. Maximizing the marginal likelihood therefore maximizes the model fit, which can be achieved by minimizing the negative log likelihood of <italic>p</italic>(<italic>y</italic><sub><italic>i</italic></sub>|&#x003B3;<sub><italic>i</italic></sub>, &#x003BD;<sub><italic>i</italic></sub>, &#x003B1;<sub><italic>i</italic></sub>, &#x003B2;<sub><italic>i</italic></sub>). Due to the conjugate setting of normally distributed data with a normal inverse gamma prior distribution, the marginal likelihood can be calculated in closed form [<xref ref-type="bibr" rid="B42">42</xref>]:</p>
<disp-formula id="E6"><label>(6)</label><mml:math id="M13"><mml:mrow><mml:mtable columnalign='left'><mml:mtr columnalign='left'><mml:mtd columnalign='left'><mml:mtable columnalign='left'><mml:mtr><mml:mtd><mml:mtable columnalign='left'><mml:mtr columnalign='left'><mml:mtd columnalign='left'><mml:mrow><mml:msubsup><mml:mi>L</mml:mi><mml:mi>i</mml:mi><mml:mrow><mml:mi>N</mml:mi><mml:mi>L</mml:mi><mml:mi>L</mml:mi></mml:mrow></mml:msubsup><mml:mo stretchy='false'>(</mml:mo><mml:mi>w</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mfrac><mml:mi>&#x003C0;</mml:mi><mml:mrow><mml:msub><mml:mi>&#x003BD;</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mfrac><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003B1;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mtext>&#x003A9;</mml:mtext><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x0002B;</mml:mo><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003B1;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtable columnalign='left'><mml:mtr columnalign='left'><mml:mtd columnalign='left'><mml:mrow><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:msup><mml:mrow><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003B3;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo 
stretchy='false'>)</mml:mo></mml:mrow><mml:mn>2</mml:mn></mml:msup><mml:msub><mml:mi>&#x003BD;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mtext>&#x003A9;</mml:mtext><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo><mml:mo>&#x0002B;</mml:mo><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mfrac><mml:mrow><mml:mtext>&#x00393;</mml:mtext><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003B1;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo stretchy='false'>)</mml:mo></mml:mrow><mml:mrow><mml:mtext>&#x00393;</mml:mtext><mml:mo stretchy='false'>(</mml:mo><mml:msub><mml:mi>&#x003B1;</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:mfrac><mml:mn>1</mml:mn><mml:mn>2</mml:mn></mml:mfrac><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mfrac><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:math></disp-formula>
<p>Here, &#x003A9;<sub><italic>i</italic></sub> &#x0003D; 2&#x003B2;<sub><italic>i</italic></sub>(1 &#x0002B; &#x003BD;<sub><italic>i</italic></sub>) and &#x00393;(.) denotes the gamma function. This closed-form expression makes deep evidential regression networks fast to compute. To obtain accurate estimates of the aleatoric and epistemic uncertainty, the loss function has to be regularized. Contrary to the original formulation of Amini et al. [<xref ref-type="bibr" rid="B42">42</xref>], Meinert et al. [<xref ref-type="bibr" rid="B43">43</xref>] suggest a different regularization term, because with the original formulation the regularized likelihood is insufficient to identify the parameters of the marginal likelihood. We therefore follow the approach of Meinert et al. [<xref ref-type="bibr" rid="B43">43</xref>] and use the adjusted regularization term. This adjustment scales the residuals by the width of the Student-t distribution in Eq. (5), <italic>w</italic><sub><italic>S</italic><sub><italic>t</italic></sub><sub><italic>i</italic></sub></sub>, so that the gradients of &#x003A6;<sub><italic>i</italic></sub>, and therefore of &#x003BD;<sub><italic>i</italic></sub>, do not become very large in noisy regions:</p>
<disp-formula id="E7"><label>(7)</label><mml:math id="M14"><mml:mrow><mml:mtable columnalign='left'><mml:mtr columnalign='left'><mml:mtd columnalign='left'><mml:mrow><mml:msubsup><mml:mi>L</mml:mi><mml:mi>i</mml:mi><mml:mi>R</mml:mi></mml:msubsup><mml:mo stretchy='false'>(</mml:mo><mml:mi>w</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo>|</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mi>y</mml:mi><mml:mi>i</mml:mi></mml:msub><mml:mo>&#x02212;</mml:mo><mml:msub><mml:mi>&#x003B3;</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow><mml:mrow><mml:msub><mml:mi>w</mml:mi><mml:mrow><mml:mi>S</mml:mi><mml:msub><mml:mi>t</mml:mi><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mrow><mml:mo>|</mml:mo></mml:mrow></mml:mrow><mml:mi>p</mml:mi></mml:msup><mml:msub><mml:mtext>&#x003A6;</mml:mtext><mml:mi>i</mml:mi></mml:msub></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mrow></mml:math></disp-formula>
<p>Here, <italic>p</italic> controls the influence of the residuals on the regularization. The loss function for the neural network can therefore be calculated as:</p>
<disp-formula id="E8"><label>(8)</label><mml:math id="M15"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msubsup><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mi>L</mml:mi><mml:mi>L</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mtext>&#x003BB;</mml:mtext><mml:msubsup><mml:mrow><mml:mi>L</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>R</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Here, &#x003BB; is a hyperparameter determining the strength of the regularization in Eq. (7). Since &#x003BB; and <italic>p</italic> have to be fixed in advance, the network has four output neurons, one for each parameter of the marginal likelihood in Eq. (5). These parameters can be used to quantify uncertainty. Due to the close connection between the Student-t and the normal distribution, <italic>w</italic><sub><italic>S</italic><sub><italic>t</italic></sub><sub><italic>i</italic></sub></sub> can be used as an approximation of the aleatoric uncertainty [<xref ref-type="bibr" rid="B43">43</xref>]. Following Meinert et al. [<xref ref-type="bibr" rid="B43">43</xref>], the epistemic and aleatoric uncertainty can be derived as follows:</p>
<disp-formula id="E9"><label>(9)</label><mml:math id="M16"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>u</mml:mi></mml:mrow><mml:mrow><mml:mi>a</mml:mi><mml:msub><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msub></mml:mtd><mml:mtd><mml:mo>&#x02261;</mml:mo><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>S</mml:mi><mml:msub><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msqrt><mml:mrow><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003BD;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:msub><mml:mrow><mml:mi>&#x003BD;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mrow></mml:msqrt></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E10"><label>(10)</label><mml:math id="M17"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>u</mml:mi></mml:mrow><mml:mrow><mml:mi>e</mml:mi><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msub></mml:mtd><mml:mtd><mml:mo>&#x02261;</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msqrt><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003BD;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msqrt></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
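<p>As an illustration, the loss in Eqs. (6)&#x02013;(8) and the uncertainty measures in Eqs. (9) and (10) can be computed directly from the four network outputs &#x003B3;<sub><italic>i</italic></sub>, &#x003BD;<sub><italic>i</italic></sub>, &#x003B1;<sub><italic>i</italic></sub>, and &#x003B2;<sub><italic>i</italic></sub>. The following Python sketch illustrates the formulas only; the function names are ours, and the defaults &#x003BB; = 0.001 and <italic>p</italic> = 2 follow the choices reported in Section 4.1, not the authors' actual implementation:</p>

```python
import math

def evidential_loss(y, gamma, nu, alpha, beta, lam=0.001, p=2.0):
    """Per-observation evidential loss, Eqs. (6)-(8): closed-form negative
    log marginal likelihood plus the regularization of Meinert et al."""
    omega = 2.0 * beta * (1.0 + nu)
    # Eq. (6): negative log likelihood of the Student-t marginal in Eq. (5)
    nll = (0.5 * math.log(math.pi / nu)
           - alpha * math.log(omega)
           + (alpha + 0.5) * math.log((y - gamma) ** 2 * nu + omega)
           + math.lgamma(alpha) - math.lgamma(alpha + 0.5))
    # Eq. (9): width of the Student-t, used to scale the residual in Eq. (7)
    w_st = math.sqrt(beta * (1.0 + nu) / (alpha * nu))
    phi = nu + 2.0 * alpha                    # total evidence
    reg = abs((y - gamma) / w_st) ** p * phi  # Eq. (7)
    return nll + lam * reg                    # Eq. (8)

def uncertainties(nu, alpha, beta):
    """Aleatoric (Eq. 9) and epistemic (Eq. 10) uncertainty per observation."""
    aleatoric = math.sqrt(beta * (1.0 + nu) / (alpha * nu))
    epistemic = 1.0 / math.sqrt(nu)
    return aleatoric, epistemic
```

<p>In practice, this per-observation loss would be averaged over a mini-batch and minimized by gradient descent with respect to the network weights producing the four outputs.</p>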
<p>By employing this approach, we assume that our dependent variable, <italic>y</italic>, follows a normal distribution. LGDs are commonly bounded in the interval between zero and one, which covers only part of the support of the normal distribution. Hence, predicted values may fall outside this range. However, the normality assumption is very common in LGD research, as OLS regression is frequently used as the main method or at least as a benchmark to other methods, see, e.g., [<xref ref-type="bibr" rid="B5">5</xref>, <xref ref-type="bibr" rid="B11">11</xref>, <xref ref-type="bibr" rid="B13">13</xref>&#x02013;<xref ref-type="bibr" rid="B15">15</xref>, <xref ref-type="bibr" rid="B17">17</xref>, <xref ref-type="bibr" rid="B29">29</xref>, <xref ref-type="bibr" rid="B48">48</xref>&#x02013;<xref ref-type="bibr" rid="B53">53</xref>]. Anticipating the results in Section 4, we will see that the predicted values for almost all bonds lie in the interval between zero and one; thus, our approach produces reasonable estimates. Furthermore, the deep evidential regression approach requires these assumptions to obtain a closed-form solution. For other distributional assumptions, e.g., a beta distribution for the LGD, no closed-form marginal likelihood is known, and using such an assumption would eliminate the advantages of this approach.</p>
<p>To unveil the relationships modeled by the neural network, we use Accumulated Local Effect (ALE) plots by Apley and Zhu [<xref ref-type="bibr" rid="B54">54</xref>]. ALE plots visualize the average effect of the independent variables on the prediction. Another advantage of ALE plots over other explainable artificial intelligence (XAI) methods is that they are unbiased and fast to compute. As mentioned in Section 2, there is a moderate to high correlation between macroeconomic and uncertainty-related variables. Therefore, the XAI method has to be robust to correlations, which is a further advantage of ALE plots. For an independent variable <inline-formula><mml:math id="M18"><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula>, the total range of observed values is divided into <italic>K</italic> buckets. This is accomplished by defining <italic>Z</italic><sub><italic>j,k</italic></sub> as the <inline-formula><mml:math id="M19"><mml:mfrac><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>K</mml:mi></mml:mrow></mml:mfrac></mml:math></inline-formula> quantile of the empirical distribution. Therefore, <italic>Z</italic><sub><italic>j</italic>, 0</sub> is the minimum and <italic>Z</italic><sub><italic>j,K</italic></sub> the maximum value of <italic>X</italic><sub><italic>j</italic></sub>. Following this approach, <italic>S</italic><sub><italic>j,k</italic></sub> can be defined as the set of values within the left-open interval from <italic>Z</italic><sub><italic>j, k</italic>&#x02212;1</sub> to <italic>Z</italic><sub><italic>j,k</italic></sub>, with <italic>n</italic><sub><italic>j,k</italic></sub> as the number of observations in <italic>S</italic><sub><italic>j,k</italic></sub>. Let <italic>k</italic>(<italic>X</italic><sub><italic>j</italic></sub>) be an index that returns the bucket for a value of <italic>X</italic><sub><italic>j</italic></sub>; the (uncentered) accumulated local effect can then be formalized as:</p>
<disp-formula id="E11"><label>(11)</label><mml:math id="M20"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi><mml:mi>L</mml:mi><mml:mi>E</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:munderover></mml:mstyle><mml:msup><mml:mrow><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mstyle displaystyle="true"><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>&#x02208;</mml:mo><mml:msub><mml:mrow><mml:mi>S</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munder></mml:mstyle><mml:mrow><mml:mo>[</mml:mo><mml:mrow><mml:mi>f</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Z</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mo>\</mml:mo><mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:mi>f</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>Z</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mo>\</mml:mo><mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo>]</mml:mo></mml:mrow><mml:mo>.</mml:mo></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p><inline-formula><mml:math id="M21"><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mo>\</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02208;</mml:mo><mml:msup><mml:mrow><mml:mi>&#x0211D;</mml:mi></mml:mrow><mml:mrow><mml:mi>N</mml:mi><mml:mo>&#x000D7;</mml:mo><mml:mo stretchy="false">(</mml:mo><mml:mi>P</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:msup></mml:math></inline-formula> denotes the matrix of the <italic>P</italic> variables without variable <italic>j</italic>, and <italic>f</italic>(.) describes the neural network&#x00027;s output before the last transformation. The minuend in the square brackets denotes the prediction of <italic>f</italic>(.) if observation <italic>i</italic> is replaced with <italic>Z</italic><sub><italic>j,k</italic></sub>, and the subtrahend represents the prediction with <italic>Z</italic><sub><italic>j, k</italic>&#x02212;1</sub> instead. The differences are summed over every observation in <italic>S</italic><sub><italic>j,k</italic></sub>. This is done for each bucket <italic>k</italic>; <italic>g</italic><sub><italic>ALE</italic></sub>(<italic>X</italic><sub><italic>j</italic></sub>) is therefore the sum of the inner sums, each weighted by the inverse of the number of observations in its bucket. To obtain the centered accumulated local effect with a mean effect of zero for <italic>X</italic><sub><italic>j</italic></sub>, <italic>g</italic><sub><italic>ALE</italic></sub>(<italic>X</italic><sub><italic>j</italic></sub>) is centered as follows:</p>
<disp-formula id="E12"><label>(12)</label><mml:math id="M22"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mtext>&#x00398;</mml:mtext></mml:mrow><mml:mrow><mml:mi>A</mml:mi><mml:mi>L</mml:mi><mml:mi>E</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi><mml:mi>L</mml:mi><mml:mi>E</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msup><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:msup><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:msub><mml:mrow><mml:mi>g</mml:mi></mml:mrow><mml:mrow><mml:mi>A</mml:mi><mml:mi>L</mml:mi><mml:mi>E</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>X</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Because of this centering, the y-axis of an ALE plot describes the main effect of <italic>X</italic><sub><italic>j</italic></sub> at a certain point in comparison to the average predicted value.</p>
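<p>For concreteness, Eqs. (11) and (12) can be sketched in a few lines of Python. The function below (our naming, built on numpy) assumes a model <italic>f</italic> mapping an N &#x000D7; P array to N predictions; it is an illustrative sketch, not the implementation used in this article:</p>

```python
import numpy as np

def ale_plot_values(f, X, j, K=10):
    """Centered ALE of feature j for model f on data X, Eqs. (11) and (12)."""
    x = X[:, j]
    # bucket edges Z_{j,0..K}: empirical k/K quantiles (min to max)
    z = np.quantile(x, np.linspace(0.0, 1.0, K + 1))
    # k(X_j): bucket index of each observation, left-open intervals
    k_idx = np.clip(np.searchsorted(z[1:-1], x, side="left"), 0, K - 1)
    effects = np.zeros(K)
    for k in range(K):
        in_bucket = X[k_idx == k]
        if len(in_bucket) == 0:
            continue
        upper, lower = in_bucket.copy(), in_bucket.copy()
        upper[:, j] = z[k + 1]  # replace the observation's value with Z_{j,k}
        lower[:, j] = z[k]      # ... and with Z_{j,k-1}
        # inner sum of Eq. (11), weighted by 1/n_{j,k}
        effects[k] = np.mean(f(upper) - f(lower))
    g_ale = np.cumsum(effects)            # uncentered ALE, Eq. (11)
    return g_ale - np.mean(g_ale[k_idx])  # centered ALE, Eq. (12)
```

<p>For a linear model, the resulting values grow linearly in <italic>X</italic><sub><italic>j</italic></sub>, which is a useful sanity check before applying the function to a neural network.</p>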
<p>Several other XAI methods exist to open up the black box of machine learning methods. The aim of our article is to investigate non-linear relationships between features and LGD estimates; we therefore decide to use graphical methods. These include partial dependence plots (PDP) by Friedman [<xref ref-type="bibr" rid="B55">55</xref>] for global explanations and individual conditional expectation (ICE) plots by Goldstein et al. [<xref ref-type="bibr" rid="B56">56</xref>] for local explanations. However, the former in particular can suffer from biased results if features are correlated, which is frequently the case for the macroeconomic variables used in our article. We therefore use ALE plots by Apley and Zhu [<xref ref-type="bibr" rid="B54">54</xref>], because they are fast to compute and resolve the problem of correlated features present in our data. Moving beyond graphical methods, there are several other alternatives, such as LIME by Ribeiro et al. [<xref ref-type="bibr" rid="B57">57</xref>] or SHAP by Lundberg and Lee [<xref ref-type="bibr" rid="B58">58</xref>]. However, these methods cannot visualize the potential non-linear relationship between features and LGD estimates. Furthermore, both approaches are known to be problematic if features are correlated and are in some cases unstable, see, e.g., [<xref ref-type="bibr" rid="B59">59</xref>, <xref ref-type="bibr" rid="B60">60</xref>]. ALE plots, in contrast, are well suited for correlated features.</p>
<p>In credit risk, these methods are frequently applied in the recent literature. For example, Bellotti et al. [<xref ref-type="bibr" rid="B5">5</xref>] use ALE plots focusing on workout LGDs. Bastos and Matos [<xref ref-type="bibr" rid="B7">7</xref>] compare several XAI methods, including ALE plots as well as Shapley values. Similarly, Bussmann et al. [<xref ref-type="bibr" rid="B61">61</xref>] use SHAP to explain predictions of the probability of default in fintech markets. Barbaglia et al. [<xref ref-type="bibr" rid="B25">25</xref>] use ALE plots to determine the drivers of mortgage probabilities of default in Europe. In related fields, such as cyber risk management or financial risk management in general, the application of XAI methods is becoming more widespread as well, see, e.g., [<xref ref-type="bibr" rid="B62">62</xref>&#x02013;<xref ref-type="bibr" rid="B65">65</xref>].</p>
</sec>
<sec sec-type="results" id="s4">
<title>4. Results</title>
<sec>
<title>4.1. Learning strategy</title>
<p>We use the deep evidential regression framework for LGD estimation to analyze predictions as well as aleatoric and epistemic uncertainty. Our data set contains 1,999 observations from 1990 to 2019. To evaluate the neural network on data from years not seen during training, the observations from 2018 to 2019 are reserved as out-of-time data. The remaining data from 1990 to 2017 are split randomly at an 80:20 ratio: the 20% fraction is preserved as out-of-sample data to compare model performance on unseen data with the same structure, while the 80% fraction serves as training data, used to train the model and validate the hyperparameters. Next, the continuous variables of the training data are standardized to zero mean and unit variance. This scaling is applied to the out-of-sample as well as the out-of-time data with the scaling parameters of the training data. The categorical variables are one-hot encoded and one category is dropped: for seniority, Senior Unsecured, and for the default type, Chapter 11 are dropped and thus act as reference categories. For the guarantee variable and the industry type, we use the positive category as reference. The last preprocessing step scales the LGD values by a factor of 100, such that the LGDs can be interpreted as percentages, which also enhances computational stability.</p>
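The preprocessing steps above can be sketched as follows. This is a minimal illustration with a hypothetical toy frame; the column names and values are ours, not from the paper's data set, and a production pipeline would additionally align dummy columns between the splits.

```python
import pandas as pd

# Hypothetical toy data standing in for the bond data set (names are ours).
df = pd.DataFrame({
    "coupon_rate": [5.0, 7.5, 10.0, 6.0],
    "seniority":   ["Senior Secured", "Senior Unsecured", "Senior Unsecured", "Subordinated"],
    "lgd":         [0.35, 0.60, 0.80, 0.55],
})
train, oos = df.iloc[:3].copy(), df.iloc[3:].copy()

# Standardize continuous features with the *training* mean/std only,
# then reuse those parameters on the out-of-sample (and out-of-time) data.
mu, sd = train["coupon_rate"].mean(), train["coupon_rate"].std()
train["coupon_rate"] = (train["coupon_rate"] - mu) / sd
oos["coupon_rate"] = (oos["coupon_rate"] - mu) / sd

# One-hot encode categoricals and drop one reference category
# (the paper drops, e.g., Senior Unsecured for seniority).
train = pd.get_dummies(train, columns=["seniority"])
train = train.drop(columns=["seniority_Senior Unsecured"])

# Scale LGDs by a factor of 100 so they read as percentages.
train["lgd"] = train["lgd"] * 100
```

Keeping the scaler parameters from the training split avoids information leaking from the evaluation data into the model.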
<p>After the preprocessing, hyperparameters for the neural network and the loss function have to be chosen<xref ref-type="fn" rid="fn0004"><sup>4</sup></xref>. The parameter <italic>p</italic> of Eq. (7) is set to 2 to strengthen the effect of the residuals on the regularization, see, e.g., [<xref ref-type="bibr" rid="B43">43</xref>]. The parameter &#x003BB; is set to 0.001; the analysis is also performed with &#x003BB; &#x0003D; 0.01 and &#x003BB; &#x0003D; 0.0001, but the differences are negligible. The most commonly used hyperparameters of a neural network are the learning rate, the number of layers, and the number of neurons. To avoid overfitting, we include dropout layers, whose dropout rate must also be tuned. We use random search to obtain 200 different model constellations and validate them using 5-fold cross-validation. For the random search, we assume a discrete or continuous distribution for each hyperparameter. <xref ref-type="table" rid="T6">Table 6</xref> displays the distributions of the hyperparameters of the neural network. The dropout rate, for example, is a continuous value, usually in the interval between 0 (no regularization) and 0.5 (strongest regularization); we therefore draw it from a continuous uniform distribution. Furthermore, 20% of the data from the iterating training folds are used for early stopping to avoid overfitting. Each of the five cross-validation iterations is repeated five times to reduce the effect of random weight initialization, and the results are averaged. The best model is the one with the smallest mean RMSE across the five hold-out folds of the cross-validation. To determine the number of neurons, we use an approach similar to Kellner et al. [<xref ref-type="bibr" rid="B6">6</xref>]: the baseline is (32, 16) neurons with a maximum of two hidden layers, and a multiplier scales this baseline number of neurons<xref ref-type="fn" rid="fn0005"><sup>5</sup></xref>. ReLU is chosen as the activation function for all hidden units, and the network is optimized via Adam. To ensure that &#x003BD;<sub><italic>i</italic></sub>, &#x003B1;<sub><italic>i</italic></sub>, and &#x003B2;<sub><italic>i</italic></sub> stay within the desired intervals, their output neurons are activated by the softplus function, whereby 1 is added to the activated neuron of &#x003B1;<sub><italic>i</italic></sub>.</p>
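The output-layer activation described above can be sketched in a few lines. This is a NumPy sketch under our own naming (the raw pre-activation neurons z are assumptions of this illustration): &#x003BD; and &#x003B2; must be positive, so softplus is applied, and &#x003B1; must exceed 1, so 1 is added after the softplus; &#x003B3; stays unconstrained.

```python
import numpy as np

def softplus(x):
    # Numerically stable softplus: log(1 + exp(x)).
    return np.log1p(np.exp(-np.abs(x))) + np.maximum(x, 0.0)

def evidential_head(z):
    """Map the four raw output neurons (z_gamma, z_nu, z_alpha, z_beta)
    to the evidential parameters: gamma unconstrained, nu > 0, beta > 0,
    and alpha > 1 via the extra +1 on the activated neuron."""
    z_gamma, z_nu, z_alpha, z_beta = z
    return z_gamma, softplus(z_nu), softplus(z_alpha) + 1.0, softplus(z_beta)

gamma, nu, alpha, beta = evidential_head(np.array([0.4, -1.0, 0.0, 2.0]))
```

The stable softplus form avoids overflow for large positive inputs, which matters once the network's pre-activations grow during training.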
<table-wrap position="float" id="T6">
<label>Table 6</label>
<caption><p>Setup and final values of the hyperparameter search.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Parameter</bold></th>
<th valign="top" align="center"><bold>Distribution</bold></th>
<th valign="top" align="center"><bold>Final parameter</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Learning rate</td>
<td valign="top" align="center"><italic>U</italic><sup><italic>c</italic></sup>&#x0007E;[0.0001, 0.01]</td>
<td valign="top" align="center">0.0029</td>
</tr>
<tr>
<td valign="top" align="left">Dropout rate</td>
<td valign="top" align="center"><italic>U</italic><sup><italic>c</italic></sup>&#x0007E;[0.0, 0.50]</td>
<td valign="top" align="center">0.4309</td>
</tr>
<tr>
<td valign="top" align="left">Hidden layer</td>
<td valign="top" align="center"><italic>U</italic><sup><italic>d</italic></sup>&#x0007E;[1, 2]</td>
<td valign="top" align="center">2</td>
</tr>
<tr>
<td valign="top" align="left">Multiple</td>
<td valign="top" align="center"><italic>U</italic><sup><italic>d</italic></sup>&#x0007E;[1, 4]</td>
<td valign="top" align="center">4</td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>The table shows the ranges for the hyperparameter search. <italic>U</italic><sup><italic>c</italic></sup> corresponds to the continuous uniform distribution, <italic>U</italic><sup><italic>d</italic></sup> corresponds to the discrete uniform distribution.</p>
</table-wrap-foot>
</table-wrap>
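One draw of the random search can be sketched directly from the distributions in Table 6. This is an illustrative sketch (the dictionary keys are our own naming); each of the 200 constellations would then be scored by 5-fold cross-validation as described above.

```python
import random

random.seed(0)

def sample_config():
    """Draw one hyperparameter constellation following Table 6:
    continuous uniforms for the learning and dropout rates,
    discrete uniforms for the hidden layers and the multiplier."""
    cfg = {
        "learning_rate": random.uniform(1e-4, 1e-2),
        "dropout_rate":  random.uniform(0.0, 0.5),
        "hidden_layers": random.randint(1, 2),
        "multiplier":    random.randint(1, 4),
    }
    # Baseline neurons (32, 16), cut to the sampled depth and scaled by
    # the multiplier, e.g. multiplier 4 with two layers gives (128, 64).
    cfg["neurons"] = tuple(cfg["multiplier"] * n for n in (32, 16)[: cfg["hidden_layers"]])
    return cfg

configs = [sample_config() for _ in range(200)]
```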
<p>The constellation in column three (final parameter) of <xref ref-type="table" rid="T6">Table 6</xref> is used to form the final network. For that, the network is trained on the training data, 20% of which is used for early stopping. Afterward, the trained network is evaluated on the out-of-sample and on the out-of-time data. This procedure is repeated 25 times. <xref ref-type="table" rid="T7">Table 7</xref> reports the average values, summarizing the evaluation on the different data sets and comparing it across models. Since the loss function in Eq. (8) depends on &#x003BB; and <italic>p</italic>, changes in those parameters result in a loss of comparability.</p>
<table-wrap position="float" id="T7">
<label>Table 7</label>
<caption><p>Evaluation metrics.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Data set</bold></th>
<th valign="top" align="left"><bold>Method</bold></th>
<th valign="top" align="center"><bold>Evidential loss</bold></th>
<th valign="top" align="center"><bold>RMSE</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Training</td>
<td valign="top" align="left">Evidential neural network</td>
<td valign="top" align="center">4.1879</td>
<td valign="top" align="center"><underline>0.1813</underline></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Neural network</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center"><bold>0.1427</bold></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Linear regression</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">0.2088</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Transformed linear regression</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">0.2142</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Fractional response regression</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">0.2231</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Beta regression</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">0.2306</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td valign="top" align="left">Out of sample</td>
<td valign="top" align="left">Evidential neural network</td>
<td valign="top" align="center">4.2574</td>
<td valign="top" align="center"><underline>0.1964</underline></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Neural network</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center"><bold>0.1742</bold></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Linear regression</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">0.2091</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Transformed linear regression</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">0.2100</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Fractional response regression</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">0.2183</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Beta regression</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">0.2328</td>
</tr>
<tr>
<td/>
<td/>
<td/>
<td/>
</tr>
<tr>
<td valign="top" align="left">Out of time</td>
<td valign="top" align="left">Evidential neural network</td>
<td valign="top" align="center">6.1888</td>
<td valign="top" align="center">0.4241</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Neural network</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center"><bold>0.3695</bold></td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Linear regression</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">0.4499</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Transformed linear regression</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">0.4674</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Fractional response regression</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center">0.4488</td>
</tr>
<tr>
<td/>
<td valign="top" align="left">Beta regression</td>
<td valign="top" align="center">-</td>
<td valign="top" align="center"><underline>0.3961</underline></td>
</tr>
</tbody>
</table>
<table-wrap-foot>
<p>For the calculation of the RMSE, the observed LGDs and the predicted LGDs, &#x003B3;, are rescaled to the original interval from zero to one by dividing them by 100, making the RMSE comparable to the literature. The smallest RMSE per data set is printed in bold and the second best is underlined.</p>
</table-wrap-foot>
</table-wrap>
<p><xref ref-type="table" rid="T7">Table 7</xref> compares the neural network from the deep evidential framework to a neural network trained on the mean squared error and to common methods in the literature. These include the linear regression, the transformed linear regression, the beta regression, and the fractional response regression, see, e.g., [<xref ref-type="bibr" rid="B5">5</xref>, <xref ref-type="bibr" rid="B14">14</xref>]. For the transformed linear regression, the LGDs are logit-transformed before a linear regression is fitted, and the predictions of this regression are transformed back to their original scale using the sigmoid function. Each model is trained on the same training data. For the neural network trained with the mean squared error, the same random search and cross-validation approach with early stopping is used<xref ref-type="fn" rid="fn0006"><sup>6</sup></xref>. Since the evidential neural network is the only model with the marginal likelihood as an objective function, the evidential loss can only be computed for this model. To compare the evidential neural network with the other models, we evaluate all models using the root mean squared error. Note that computing the root mean squared error requires only one parameter, &#x003B3;, since this parameter represents the prediction in terms of the LGD. From <xref ref-type="table" rid="T7">Table 7</xref>, we can see that the neural networks perform best on the training and the out-of-sample data. For the out-of-time data, the beta regression scores second best after the neural network trained with the mean squared error, but its difference from the evidential neural network appears only in the third decimal place.</p>
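The evaluation metric of Table 7 can be written compactly. A small sketch (the function name is ours): only the location parameter &#x003B3; enters, and both observed and predicted LGDs are divided by 100 back to the unit interval before the RMSE is taken.

```python
import numpy as np

def rmse_on_original_scale(lgd_true_pct, gamma_pct):
    """RMSE as reported in Table 7: observed LGDs and the predicted
    location parameter gamma, both on the percentage scale, are divided
    by 100 to return to the original [0, 1] interval."""
    y = np.asarray(lgd_true_pct, dtype=float) / 100.0
    yhat = np.asarray(gamma_pct, dtype=float) / 100.0
    return float(np.sqrt(np.mean((y - yhat) ** 2)))
```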
</sec>
<sec>
<title>4.2. Aleatoric and epistemic uncertainty in predictions</title>
<p>The deep evidential regression framework allows us to directly calculate the aleatoric and epistemic uncertainty for every prediction of our neural network. <xref ref-type="fig" rid="F2">Figure 2</xref> shows both types of uncertainty for our estimation sample. The x-axis shows the observation number for the predictions sorted in ascending order; the ordered LGDs are on the y-axis. The dark gray band around the ordered predictions is calculated by adding/subtracting the values of Eqs. (9) and (10) to/from our predictions. The light gray band is obtained by adding/subtracting two times these values. In the following, we call this &#x0201C;applying one or two standard errors of uncertainty&#x0201D; to our predictions. The gray dots show the actual observed, i.e., true, LGD realizations.</p>
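The bands can be computed directly from the predicted evidential parameters. A sketch under a stated assumption: Eqs. (9) and (10) are not reproduced here, so we use the standard deep evidential regression moments of Amini et al., aleatoric variance E[&#x003C3;&#x000B2;] = &#x003B2;/(&#x003B1; &#x02212; 1) and epistemic variance Var[&#x003BC;] = &#x003B2;/(&#x003BD;(&#x003B1; &#x02212; 1)); whether these coincide exactly with the paper's equations is an assumption of this sketch.

```python
import numpy as np

def uncertainty_bands(gamma, nu, alpha, beta, k=2.0):
    """Bands of +/- k standard errors around the prediction gamma,
    assuming the Amini et al. moments: aleatoric std = sqrt(beta/(alpha-1)),
    epistemic std = sqrt(beta/(nu*(alpha-1))). Requires alpha > 1."""
    aleatoric = np.sqrt(beta / (alpha - 1.0))
    epistemic = np.sqrt(beta / (nu * (alpha - 1.0)))
    return ((gamma - k * aleatoric, gamma + k * aleatoric),
            (gamma - k * epistemic, gamma + k * epistemic))

# Two standard errors around a predicted LGD of 50 (percentage scale).
ale_band, epi_band = uncertainty_bands(50.0, nu=4.0, alpha=2.0, beta=1.0, k=2.0)
```

Note that the epistemic band is always the narrower one, since its variance carries the extra factor 1/&#x003BD;, matching the visual ordering in Figure 2.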
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Uncertainty estimation in sample. <bold>(A)</bold> Aleatoric uncertainty. <bold>(B)</bold> Epistemic uncertainty.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-08-1076083-g0002.tif"/>
</fig>
<p>Comparing the two plots of <xref ref-type="fig" rid="F2">Figure 2</xref>, we observe that the aleatoric uncertainty covers a much larger range around our predictions than the epistemic uncertainty. Almost all true LGD realizations lie within two standard errors of aleatoric uncertainty. Hence, the irreducible error, or data uncertainty, has the largest share of the total uncertainty. Recall that market-based LGDs reflect market expectations, as they are calculated as 1 minus the traded market price 30 days after default. The variation of the data therefore also depends on market expectations, which are notoriously difficult to estimate and to a large extent not predictable. Thus, it is reasonable that the aleatoric uncertainty is the main driver of the overall uncertainty. In contrast, the epistemic uncertainty, i.e., the model uncertainty, is considerably lower. This may be attributed to our database, which covers nearly three decades including several recessions and upturns. Hence, we cover LGDs at many different points of the business cycle and across many industries and default reasons, so the data might be representative of the data-generating process of market-based LGDs. Consequently, the uncertainty due to limited sample size is relatively small in our application.</p>
<p>As we model all parameters of the evidential distribution dependent on the input features, we can also predict uncertainty for out-of-sample and out-of-time predictions. Comparing <xref ref-type="fig" rid="F2">Figures 2</xref>, <xref ref-type="fig" rid="F3">3</xref>, one might have expected the epistemic uncertainty to increase due to the lower sample size and the use of unseen data. However, the functional relation of the epistemic uncertainty is calibrated on the estimation sample and transferred via prediction onto the out-of-sample data. Hence, if the feature values do not differ dramatically, the predicted uncertainty is similar. Only if we observe new realizations of our features in unexpected (untrained) value ranges should the uncertainty prediction deviate strongly. Thus, the uncertainty prediction may also serve as a qualitative check for structural changes.</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Uncertainty estimation out-of-sample. <bold>(A)</bold> Aleatoric uncertainty. <bold>(B)</bold> Epistemic uncertainty.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-08-1076083-g0003.tif"/>
</fig>
<p>Structural changes in LGD estimation are primarily due to changes over time. This is one reason why some researchers argue for validating forecasting methods especially on out-of-time data sets, see, e.g., [<xref ref-type="bibr" rid="B3">3</xref>, <xref ref-type="bibr" rid="B8">8</xref>]. In our application, there is no qualitative sign of structural breaks via diverging uncertainty estimates in 2018 and 2019. Comparing <xref ref-type="fig" rid="F4">Figure 4</xref> with the former two, we observe a similar pattern. This might have been expected, as the out-of-time period is not known for specific crises or special circumstances.</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Uncertainty estimation out-of-time. <bold>(A)</bold> Aleatoric uncertainty. <bold>(B)</bold> Epistemic uncertainty.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-08-1076083-g0004.tif"/>
</fig>
<p>Comparing the epistemic uncertainty across all three figures, we observe that the uncertainty bands become narrower as the predicted LGD values increase. This implies that the neural network becomes more confident in predicting larger LGDs. Comparing this pattern with the histogram in <xref ref-type="fig" rid="F1">Figure 1</xref>, one explanation might be the considerably larger number of observations on the right-hand side. As we observe more large LGDs in our sample, the epistemic uncertainty in this area decreases.</p>
</sec>
<sec>
<title>4.3. Explaining LGD predictions</title>
<p>In this subsection, we take a deep dive into the drivers of the mean LGD predictions. As outlined in Section 3, we use ALE plots to visualize the impact of our continuous features, choosing <italic>K</italic> &#x0003D; 10 buckets for all ALE plots. Overall, we consider three different sets of drivers: bond-specific variables, drivers that reflect the overall macroeconomic developments, and, following Gambetti et al. [<xref ref-type="bibr" rid="B4">4</xref>], uncertainty-related variables. Evaluating the feature effects is important to validate that the inner mechanics of the uncertainty-aware neural network coincide with economic intuition. This is of major concern if financial institutions intend to use this framework for their capital requirement calculation. The requirement of explaining employed models is documented in many publications of regulatory authorities, see, e.g., [<xref ref-type="bibr" rid="B66">66</xref>&#x02013;<xref ref-type="bibr" rid="B69">69</xref>].</p>
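The ALE computation can be sketched compactly, assuming the first-order procedure of Apley et al.: split the feature's range into <italic>K</italic> quantile buckets, average the prediction differences obtained by moving each observation in a bucket to the bucket's edges, accumulate these local effects, and center the result. The function below is our own compact sketch, not the authors' implementation.

```python
import numpy as np

def ale_1d(predict, X, j, K=10):
    """First-order ALE for feature j of the (n, p) array X.
    `predict` maps an (n, p) array to n predictions.
    Returns the upper bucket edges and the centered ALE values."""
    x = X[:, j]
    edges = np.quantile(x, np.linspace(0.0, 1.0, K + 1))  # quantile bucket edges
    effects, counts = np.zeros(K), np.zeros(K)
    for k in range(K):
        if k == 0:
            in_bucket = (x >= edges[0]) & (x <= edges[1])
        else:
            in_bucket = (x > edges[k]) & (x <= edges[k + 1])
        if not in_bucket.any():
            continue
        lo, hi = X[in_bucket].copy(), X[in_bucket].copy()
        lo[:, j], hi[:, j] = edges[k], edges[k + 1]
        # Local effect: average prediction change across the bucket.
        effects[k] = np.mean(predict(hi) - predict(lo))
        counts[k] = in_bucket.sum()
    ale = np.cumsum(effects)
    # Center so the count-weighted mean effect is zero.
    ale = ale - np.average(ale, weights=np.maximum(counts, 1))
    return edges[1:], ale

# Sanity check on a linear model: the ALE of feature 0 should be linear
# with slope 2, even though feature 1 also moves the prediction.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 2))
grid, ale = ale_1d(lambda A: 2 * A[:, 0] + A[:, 1], X, j=0, K=10)
```

Because only within-bucket prediction differences are averaged, correlated features do not bias the estimate in the way they do for partial dependence plots.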
<p><xref ref-type="fig" rid="F5">Figure 5</xref> shows the feature effects of bond-related drivers. The x-axis shows the feature value range, including a rug plot to visualize the distribution of the feature; the y-axis shows the effect of the driver on the LGD prediction. On the left-hand side of <xref ref-type="fig" rid="F5">Figure 5</xref>, we observe a negative effect of the coupon rate up to a value of roughly 8%. This negative relation seems plausible, as higher coupon rates may also imply higher reflows during the resolution of the bond and thus decrease the loss given default. The relation becomes positive after 8%, which may be explained by the fact that a higher coupon rate also implies higher risk, making the potential reflow more uncertain. Maturity has an almost linear and positive relation with the predicted LGD values. In general, the increase in LGD with longer maturity is explained by sell-side pressure from institutional investors, which usually hold bonds with a longer maturity, see, e.g., [<xref ref-type="bibr" rid="B48">48</xref>]. These relations were also confirmed by Gambetti et al. [<xref ref-type="bibr" rid="B4">4</xref>], who find that bond-related variables have a significant impact on the mean market-based LGD.</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Bond-related drivers. <bold>(A)</bold> Coupon rate. <bold>(B)</bold> Maturity.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-08-1076083-g0005.tif"/>
</fig>
<p>With regard to features that describe the macroeconomic environment, <xref ref-type="fig" rid="F6">Figure 6</xref> shows their effect on the LGD prediction. The default rate is one of the best-known drivers of market-based LGDs and is used in various studies, see, e.g., [<xref ref-type="bibr" rid="B3">3</xref>, <xref ref-type="bibr" rid="B4">4</xref>, <xref ref-type="bibr" rid="B30">30</xref>, <xref ref-type="bibr" rid="B52">52</xref>]. Its increasing shape reflects the observation that LGDs tend to be higher in recession and crisis periods than in normal periods. This empirical fact also paves the way for generating so-called downturn estimates, which should reflect this crisis behavior. These downturn estimates are also included in the calculation of the capital requirements for financial institutions, see, e.g., [<xref ref-type="bibr" rid="B70">70</xref>, <xref ref-type="bibr" rid="B71">71</xref>]; for downturn estimates of EAD, see Betz et al. [<xref ref-type="bibr" rid="B72">72</xref>]. Similarly, we observe a negative relation between predicted LGDs and S&#x00026;P 500 returns, implying that LGDs increase if the returns become negative. Interestingly, positive returns have little impact on LGD predictions, which again reinforces the downturn character of LGDs.</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>Macroeconomy-related drivers. <bold>(A)</bold> Default rate. <bold>(B)</bold> S&#x00026;P 500 return.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-08-1076083-g0006.tif"/>
</fig>
<p>Consistent with Gambetti et al. [<xref ref-type="bibr" rid="B4">4</xref>], who were the first to document the importance of uncertainty-related variables in the estimation of LGDs, we include two frequently used drivers, shown in <xref ref-type="fig" rid="F7">Figure 7</xref>: financial uncertainty, proposed by Jurado et al. [<xref ref-type="bibr" rid="B44">44</xref>], and the news-based EPU index by Baker et al. [<xref ref-type="bibr" rid="B46">46</xref>], which capture uncertainty based on fundamental financial values and news articles, respectively. Both show a rather flat profile over the low to middle part of their value ranges. However, there is a clear positive impact on LGDs when the uncertainty indices reach high levels. Again, this reinforces the crisis behavior of market-based LGDs. The importance of uncertainty-related variables is also confirmed by Sopitpongstorn et al. [<xref ref-type="bibr" rid="B9">9</xref>], who find a significant impact as well. In a similar vein, Nazemi et al. [<xref ref-type="bibr" rid="B30">30</xref>] use news-text-based measures for predicting market-based LGDs, underlining their importance. To summarize, recent literature suggests that uncertainty-related variables should be used to capture expectations about the economic environment in the model framework.</p>
<fig id="F7" position="float">
<label>Figure 7</label>
<caption><p>Uncertainty-related drivers. <bold>(A)</bold> Financial uncertainty. <bold>(B)</bold> News-based EPU.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fams-08-1076083-g0007.tif"/>
</fig>
</sec>
</sec>
<sec sec-type="conclusions" id="s5">
<title>5. Conclusion</title>
<p>Uncertainty estimation has become an active research domain in statistics and machine learning. However, there is a lack of uncertainty quantification when applying machine learning to credit risk. This article investigates a recently published approach called Deep Evidential Regression by Amini et al. [<xref ref-type="bibr" rid="B42">42</xref>] and its extension by Meinert et al. [<xref ref-type="bibr" rid="B43">43</xref>]. This uncertainty framework has several advantages. First, it is easy to implement, as one only has to change the loss function of the (deep) neural network and slightly adjust the output layer. Second, the predicted parameters of the adjusted network can easily be turned into predictions of the mean, the aleatoric uncertainty, and the epistemic uncertainty. There is virtually no additional computational burden to calculate predictions and their accompanying uncertainty. Third, the overall computational expense is much lower compared to approaches like Bayesian neural networks, ensemble methods, and bootstrapping. Furthermore, deep evidential regression belongs to a small class of frameworks which allow a direct, analytical disentangling of aleatoric and epistemic uncertainty. With these advantages, this framework may also be suitable for applications in financial institutions, accompanying the use of explainable artificial intelligence methods with a quantification of aleatoric and epistemic uncertainty. Moreover, it is possible to include other variables, such as firm-specific financial risk factors, or to focus on non-listed companies. Further applications may include the prediction of risk premiums in asset pricing or the forecasting of real estate sale prices. Moreover, in other areas where predictions are critical, such as health care, the quantification of prediction uncertainty may allow a broader application of machine learning methods.</p>
<p>This article uses almost 30 years of bond data to investigate the suitability of deep evidential regression for the challenging task of estimating market-based LGDs. The performance of the uncertainty-aware neural network is comparable to earlier literature, and thus we do not see a large trade-off between accuracy and uncertainty quantification. This article documents a novel finding regarding the ratio of aleatoric to epistemic uncertainty. Our results suggest that aleatoric uncertainty is the main driver of the overall uncertainty in LGD estimation. As this type is commonly known as the irreducible error, this gives rise to the conjecture that LGD estimation is notoriously difficult due to the high amount of data uncertainty. On the other hand, epistemic uncertainty, which can be reduced or even eliminated with enough data, plays only a minor role. Hence, the advantage of more complex and advanced methods, like machine learning, may be limited. However, this may not hold for all LGD data sets or if we look at parts or parameters of the distribution other than the mean. Therefore, we do not argue that our results should be generalized to all aspects of LGDs, but see them as a first important step toward investigating the relation of aleatoric and epistemic uncertainty. Overall, understanding the determinants of both uncertainties can be key to a deeper understanding of the underlying process of market-based LGDs and is thus certainly a fruitful path for future research.</p>
</sec>
<sec sec-type="data-availability" id="s6">
<title>Data availability statement</title>
<p>The data analyzed in this study is subject to the following licenses/restrictions: We do not have permission to provide the data. Requests to access these datasets should be directed to <email>maximilian.nagl&#x00040;ur.de</email>.</p>
</sec>
<sec sec-type="author-contributions" id="s7">
<title>Author contributions</title>
<p>MatN: conceptualization, methodology, software, formal analysis, and writing&#x02014;original draft. MaxN: conceptualization, methodology, data curation, software, formal analysis, and writing&#x02014;original draft. DR: conceptualization, methodology, resources, formal analysis, and writing&#x02014;review and editing. All authors contributed to the article and approved the submitted version.</p>
</sec>
</body>
<back>
<sec sec-type="funding-information" id="s8">
<title>Funding</title>
<p>The publication was supported by the funding program Open Access Publishing (DFG).</p>
</sec>
<ack><p>We would like to thank the referees for their comments and suggestions that have substantially improved the article.</p>
</ack>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s9">
<title>Publisher&#x00027;s note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<fn-group>
<fn id="fn0001"><p><sup>1</sup>Furthermore, several studies use machine learning to estimate PDs, see, e.g., [<xref ref-type="bibr" rid="B19">19</xref>&#x02013;<xref ref-type="bibr" rid="B23">23</xref>]. Concerning mortgage probability of default, see, e.g., [<xref ref-type="bibr" rid="B24">24</xref>&#x02013;<xref ref-type="bibr" rid="B27">27</xref>]. Overall, there is a consensus that machine learning methods outperform linear logit regression.</p></fn>
<fn id="fn0002"><p><sup>2</sup>Gambetti et al. [<xref ref-type="bibr" rid="B4">4</xref>] use an extended version of the beta regression to model the mean and precision of market-based LGDs. This can be interpreted as focusing on the aleatoric uncertainty. However, the literature using machine learning algorithms lacks uncertainty estimation concerning LGD estimates.</p></fn>
<fn id="fn0003"><p><sup>3</sup>In the original sample with 2,205 bonds, there are 206 bonds with similar LGDs and the same issuer. Since we want to analyze the uncertainty of bonds and not of issuers, we exclude those observations from the data set. However, including these bonds reveals that the uncertainty around their values is considerably smaller, which might have been expected.</p></fn>
<fn id="fn0004"><p><sup>4</sup>Amini et al. [<xref ref-type="bibr" rid="B42">42</xref>] provide a python implementation for their paper at <ext-link ext-link-type="uri" xlink:href="https://github.com/aamini/evidential-deep-learning">https://github.com/aamini/evidential-deep-learning</ext-link>.</p></fn>
<fn id="fn0005"><p><sup>5</sup>For example, if we sample a multiplier of 4 in a two hidden layer network, we have (128, 64).</p></fn>
<fn id="fn0006"><p><sup>6</sup>The final parameters of the neural network trained with a mean squared error are very similar in terms of dropout rate (0.4397) and identical for the multiple and the number of hidden layers. The final learning rate (0.0004) is lower than that of the evidential neural network.</p></fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="B1">
<label>1.</label>
<citation citation-type="web"><person-group person-group-type="author"><collab>European Banking Authority</collab></person-group>. <source>Risk Assessment of the European Banking System</source>. (<year>2021</year>). Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.eba.europa.eu/risk-analysis-and-data/risk-assessment-reports">https://www.eba.europa.eu/risk-analysis-and-data/risk-assessment-reports</ext-link></citation>
</ref>
<ref id="B2">
<label>2.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Altman</surname> <given-names>EI</given-names></name> <name><surname>Kalotay</surname> <given-names>EA</given-names></name></person-group>. <article-title>Ultimate recovery mixtures</article-title>. <source>J Bank Finance</source>. (<year>2014</year>) <volume>40</volume>:<fpage>116</fpage>&#x02013;<lpage>29</lpage>. <pub-id pub-id-type="doi">10.1016/j.jbankfin.2013.11.021</pub-id></citation>
</ref>
<ref id="B3">
<label>3.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kalotay</surname> <given-names>EA</given-names></name> <name><surname>Altman</surname> <given-names>EI</given-names></name></person-group>. <article-title>Intertemporal forecasts of defaulted bond recoveries and portfolio losses</article-title>. <source>Rev Finance</source>. (<year>2017</year>) <volume>21</volume>:<fpage>433</fpage>&#x02013;<lpage>63</lpage>. <pub-id pub-id-type="doi">10.1093/rof/rfw028</pub-id></citation>
</ref>
<ref id="B4">
<label>4.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gambetti</surname> <given-names>P</given-names></name> <name><surname>Gauthier</surname> <given-names>G</given-names></name> <name><surname>Vrins</surname> <given-names>F</given-names></name></person-group>. <article-title>Recovery rates: uncertainty certainly matters</article-title>. <source>J Bank Finance</source>. (<year>2019</year>) <volume>106</volume>:<fpage>371</fpage>&#x02013;<lpage>83</lpage>. <pub-id pub-id-type="doi">10.1016/j.jbankfin.2019.07.010</pub-id></citation>
</ref>
<ref id="B5">
<label>5.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bellotti</surname> <given-names>A</given-names></name> <name><surname>Brigo</surname> <given-names>D</given-names></name> <name><surname>Gambetti</surname> <given-names>P</given-names></name> <name><surname>Vrins</surname> <given-names>F</given-names></name></person-group>. <article-title>Forecasting recovery rates on non-performing loans with machine learning</article-title>. <source>Int J Forecast</source>. (<year>2021</year>) <volume>37</volume>:<fpage>428</fpage>&#x02013;<lpage>44</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijforecast.2020.06.009</pub-id></citation>
</ref>
<ref id="B6">
<label>6.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kellner</surname> <given-names>R</given-names></name> <name><surname>Nagl</surname> <given-names>M</given-names></name> <name><surname>R&#x000F6;sch</surname> <given-names>D</given-names></name></person-group>. <article-title>Opening the black box&#x02013;Quantile neural networks for loss given default prediction</article-title>. <source>J Bank Finance</source>. (<year>2022</year>) <volume>134</volume>:<fpage>106334</fpage>. <pub-id pub-id-type="doi">10.1016/j.jbankfin.2021.106334</pub-id></citation>
</ref>
<ref id="B7">
<label>7.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bastos</surname> <given-names>JA</given-names></name> <name><surname>Matos</surname> <given-names>SM</given-names></name></person-group>. <article-title>Explainable models of credit losses</article-title>. <source>Eur J Oper Res</source>. (<year>2021</year>) <volume>301</volume>:<fpage>386</fpage>&#x02013;<lpage>94</lpage>. <pub-id pub-id-type="doi">10.1016/j.ejor.2021.11.009</pub-id></citation>
</ref>
<ref id="B8">
<label>8.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Olson</surname> <given-names>LM</given-names></name> <name><surname>Qi</surname> <given-names>M</given-names></name> <name><surname>Zhang</surname> <given-names>X</given-names></name> <name><surname>Zhao</surname> <given-names>X</given-names></name></person-group>. <article-title>Machine learning loss given default for corporate debt</article-title>. <source>J Empir Finance</source>. (<year>2021</year>) <volume>64</volume>:<fpage>144</fpage>&#x02013;<lpage>59</lpage>. <pub-id pub-id-type="doi">10.1016/j.jempfin.2021.08.009</pub-id></citation>
</ref>
<ref id="B9">
<label>9.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sopitpongstorn</surname> <given-names>N</given-names></name> <name><surname>Silvapulle</surname> <given-names>P</given-names></name> <name><surname>Gao</surname> <given-names>J</given-names></name> <name><surname>Fenech</surname> <given-names>JP</given-names></name></person-group>. <article-title>Local logit regression for loan recovery rate</article-title>. <source>J Bank Finance</source>. (<year>2021</year>) <volume>126</volume>:<fpage>106093</fpage>. <pub-id pub-id-type="doi">10.1016/j.jbankfin.2021.106093</pub-id></citation>
</ref>
<ref id="B10">
<label>10.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fraisse</surname> <given-names>H</given-names></name> <name><surname>Laporte</surname> <given-names>M</given-names></name></person-group>. <article-title>Return on investment on artificial intelligence: the case of bank capital requirement</article-title>. <source>J Bank Finance</source>. (<year>2022</year>) <volume>2022</volume>:<fpage>106401</fpage>. <pub-id pub-id-type="doi">10.1016/j.jbankfin.2022.106401</pub-id></citation>
</ref>
<ref id="B11">
<label>11.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Qi</surname> <given-names>M</given-names></name> <name><surname>Yang</surname> <given-names>X</given-names></name></person-group>. <article-title>Loss given default of high loan-to-value residential mortgages</article-title>. <source>J Bank Finance</source>. (<year>2009</year>) <volume>33</volume>:<fpage>788</fpage>&#x02013;<lpage>99</lpage>. <pub-id pub-id-type="doi">10.1016/j.jbankfin.2008.09.010</pub-id></citation>
</ref>
<ref id="B12">
<label>12.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bastos</surname> <given-names>JA</given-names></name></person-group>. <article-title>Forecasting bank loans loss-given-default</article-title>. <source>J Bank Finance</source>. (<year>2010</year>) <volume>34</volume>:<fpage>2510</fpage>&#x02013;<lpage>17</lpage>. <pub-id pub-id-type="doi">10.1016/j.jbankfin.2010.04.011</pub-id></citation>
</ref>
<ref id="B13">
<label>13.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bellotti</surname> <given-names>T</given-names></name> <name><surname>Crook</surname> <given-names>J</given-names></name></person-group>. <article-title>Loss given default models incorporating macroeconomic variables for credit cards</article-title>. <source>Int J Forecast</source>. (<year>2012</year>) <volume>28</volume>:<fpage>171</fpage>&#x02013;<lpage>82</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijforecast.2010.08.005</pub-id></citation>
</ref>
<ref id="B14">
<label>14.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Loterman</surname> <given-names>G</given-names></name> <name><surname>Brown</surname> <given-names>I</given-names></name> <name><surname>Martens</surname> <given-names>D</given-names></name> <name><surname>Mues</surname> <given-names>C</given-names></name> <name><surname>Baesens</surname> <given-names>B</given-names></name></person-group>. <article-title>Benchmarking regression algorithms for loss given default modeling</article-title>. <source>Int J Forecast</source>. (<year>2012</year>) <volume>28</volume>:<fpage>161</fpage>&#x02013;<lpage>70</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijforecast.2011.01.006</pub-id></citation>
</ref>
<ref id="B15">
<label>15.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Qi</surname> <given-names>M</given-names></name> <name><surname>Zhao</surname> <given-names>X</given-names></name></person-group>. <article-title>Comparison of modeling methods for loss given default</article-title>. <source>J Bank Finance</source>. (<year>2012</year>) <volume>35</volume>:<fpage>2842</fpage>&#x02013;<lpage>55</lpage>. <pub-id pub-id-type="doi">10.1016/j.jbankfin.2011.03.011</pub-id></citation>
</ref>
<ref id="B16">
<label>16.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tong</surname> <given-names>ENC</given-names></name> <name><surname>Mues</surname> <given-names>C</given-names></name> <name><surname>Thomas</surname> <given-names>L</given-names></name></person-group>. <article-title>A zero-adjusted gamma model for mortgage loan loss given default</article-title>. <source>Int J Forecast</source>. (<year>2013</year>) <volume>29</volume>:<fpage>548</fpage>&#x02013;<lpage>62</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijforecast.2013.03.003</pub-id></citation>
</ref>
<ref id="B17">
<label>17.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kr&#x000FC;ger</surname> <given-names>S</given-names></name> <name><surname>R&#x000F6;sch</surname> <given-names>D</given-names></name></person-group>. <article-title>Downturn LGD modeling using quantile regression</article-title>. <source>J Bank Finance</source>. (<year>2017</year>) <volume>79</volume>:<fpage>42</fpage>&#x02013;<lpage>56</lpage>. <pub-id pub-id-type="doi">10.1016/j.jbankfin.2017.03.001</pub-id></citation>
</ref>
<ref id="B18">
<label>18.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tomarchio</surname> <given-names>SD</given-names></name> <name><surname>Punzo</surname> <given-names>A</given-names></name></person-group>. <article-title>Modelling the loss given default distribution via a family of zero-and-one inflated mixture models</article-title>. <source>J R Stat Soc</source>. (<year>2019</year>) <volume>182</volume>:<fpage>1247</fpage>&#x02013;<lpage>66</lpage>. <pub-id pub-id-type="doi">10.1111/rssa.12466</pub-id></citation>
</ref>
<ref id="B19">
<label>19.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>Y</given-names></name> <name><surname>Chen</surname> <given-names>W</given-names></name></person-group>. <article-title>Entropy method of constructing a combined model for improving loan default prediction: a case study in China</article-title>. <source>J Operat Res Soc</source>. (<year>2021</year>) <volume>72</volume>:<fpage>1099</fpage>&#x02013;<lpage>109</lpage>. <pub-id pub-id-type="doi">10.1080/01605682.2019.1702905</pub-id></citation>
</ref>
<ref id="B20">
<label>20.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Petropoulos</surname> <given-names>A</given-names></name> <name><surname>Siakoulis</surname> <given-names>V</given-names></name> <name><surname>Stavroulakis</surname> <given-names>E</given-names></name> <name><surname>Vlachogiannakis</surname> <given-names>NE</given-names></name></person-group>. <article-title>Predicting bank insolvencies using machine learning techniques</article-title>. <source>Int J Forecast</source>. (<year>2020</year>) <volume>36</volume>:<fpage>1092</fpage>&#x02013;<lpage>113</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijforecast.2019.11.005</pub-id></citation>
</ref>
<ref id="B21">
<label>21.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Luo</surname> <given-names>J</given-names></name> <name><surname>Yan</surname> <given-names>X</given-names></name> <name><surname>Tian</surname> <given-names>Y</given-names></name></person-group>. <article-title>Unsupervised quadratic surface support vector machine with application to credit risk assessment</article-title>. <source>Eur J Oper Res</source>. (<year>2020</year>) <volume>280</volume>:<fpage>1008</fpage>&#x02013;<lpage>17</lpage>. <pub-id pub-id-type="doi">10.1016/j.ejor.2019.08.010</pub-id></citation>
</ref>
<ref id="B22">
<label>22.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gunnarsson</surname> <given-names>BR</given-names></name> <name><surname>vanden Broucke</surname> <given-names>S</given-names></name> <name><surname>Baesens</surname> <given-names>B</given-names></name> <name><surname>&#x000D3;skarsd&#x000F3;ttir</surname> <given-names>M</given-names></name> <name><surname>Lemahieu</surname> <given-names>W</given-names></name></person-group>. <article-title>Deep learning for credit scoring: do or don&#x00027;t?</article-title> <source>Eur J Oper Res</source>. (<year>2021</year>) <volume>295</volume>:<fpage>292</fpage>&#x02013;<lpage>305</lpage>. <pub-id pub-id-type="doi">10.1016/j.ejor.2021.03.006</pub-id></citation>
</ref>
<ref id="B23">
<label>23.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dumitrescu</surname> <given-names>E</given-names></name> <name><surname>Hu&#x000E9;</surname> <given-names>S</given-names></name> <name><surname>Hurlin</surname> <given-names>C</given-names></name> <name><surname>Tokpavi</surname> <given-names>S</given-names></name></person-group>. <article-title>Machine learning for credit scoring: improving logistic regression with non-linear decision-tree effects</article-title>. <source>Eur J Oper Res</source>. (<year>2022</year>) <volume>297</volume>:<fpage>1178</fpage>&#x02013;<lpage>92</lpage>. <pub-id pub-id-type="doi">10.1016/j.ejor.2021.06.053</pub-id></citation>
</ref>
<ref id="B24">
<label>24.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kvamme</surname> <given-names>H</given-names></name> <name><surname>Sellereite</surname> <given-names>N</given-names></name> <name><surname>Aas</surname> <given-names>K</given-names></name> <name><surname>Sjursen</surname> <given-names>S</given-names></name></person-group>. <article-title>Predicting mortgage default using convolutional neural networks</article-title>. <source>Expert Syst Appl</source>. (<year>2018</year>) <volume>102</volume>:<fpage>207</fpage>&#x02013;<lpage>17</lpage>. <pub-id pub-id-type="doi">10.1016/j.eswa.2018.02.029</pub-id></citation>
</ref>
<ref id="B25">
<label>25.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barbaglia</surname> <given-names>L</given-names></name> <name><surname>Manzan</surname> <given-names>S</given-names></name> <name><surname>Tosetti</surname> <given-names>E</given-names></name></person-group>. <article-title>Forecasting loan default in Europe with machine learning</article-title>. <source>J Financial Economet</source>. (<year>2021</year>) nbab010. <pub-id pub-id-type="doi">10.1093/jjfinec/nbab010</pub-id></citation>
</ref>
<ref id="B26">
<label>26.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sadhwani</surname> <given-names>A</given-names></name> <name><surname>Giesecke</surname> <given-names>K</given-names></name> <name><surname>Sirignano</surname> <given-names>J</given-names></name></person-group>. <article-title>Deep learning for mortgage risk</article-title>. <source>J Financial Economet</source>. (<year>2021</year>) <volume>19</volume>:<fpage>313</fpage>&#x02013;<lpage>68</lpage>. <pub-id pub-id-type="doi">10.1093/jjfinec/nbaa025</pub-id></citation>
</ref>
<ref id="B27">
<label>27.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>S</given-names></name> <name><surname>Guo</surname> <given-names>Z</given-names></name> <name><surname>Zhao</surname> <given-names>X</given-names></name></person-group>. <article-title>Predicting mortgage early delinquency with machine learning methods</article-title>. <source>Eur J Oper Res</source>. (<year>2021</year>) <volume>290</volume>:<fpage>358</fpage>&#x02013;<lpage>72</lpage>. <pub-id pub-id-type="doi">10.1016/j.ejor.2020.07.058</pub-id></citation>
</ref>
<ref id="B28">
<label>28.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Matuszyk</surname> <given-names>A</given-names></name> <name><surname>Mues</surname> <given-names>C</given-names></name> <name><surname>Thomas</surname> <given-names>LC</given-names></name></person-group>. <article-title>Modelling LGD for unsecured personal loans: decision tree approach</article-title>. <source>J Operat Res Soc</source>. (<year>2010</year>) <volume>61</volume>:<fpage>393</fpage>&#x02013;<lpage>8</lpage>. <pub-id pub-id-type="doi">10.1057/jors.2009.67</pub-id></citation>
</ref>
<ref id="B29">
<label>29.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kaposty</surname> <given-names>F</given-names></name> <name><surname>Kriebel</surname> <given-names>J</given-names></name> <name><surname>L&#x000F6;derbusch</surname> <given-names>M</given-names></name></person-group>. <article-title>Predicting loss given default in leasing: a closer look at models and variable selection</article-title>. <source>Int J Forecast</source>. (<year>2020</year>) <volume>36</volume>:<fpage>248</fpage>&#x02013;<lpage>66</lpage>. <pub-id pub-id-type="doi">10.1016/j.ijforecast.2019.05.009</pub-id></citation>
</ref>
<ref id="B30">
<label>30.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nazemi</surname> <given-names>A</given-names></name> <name><surname>Baumann</surname> <given-names>F</given-names></name> <name><surname>Fabozzi</surname> <given-names>FJ</given-names></name></person-group>. <article-title>Intertemporal defaulted bond recoveries prediction via machine learning</article-title>. <source>Eur J Oper Res</source>. (<year>2021</year>) <pub-id pub-id-type="doi">10.1016/j.ejor.2021.06.047</pub-id></citation>
</ref>
<ref id="B31">
<label>31.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Calabrese</surname> <given-names>R</given-names></name> <name><surname>Zanin</surname> <given-names>L</given-names></name></person-group>. <article-title>Modelling spatial dependence for loss given default in peer-to-peer lending</article-title>. <source>Expert Syst Appl</source>. (<year>2022</year>) <volume>192</volume>:<fpage>116295</fpage>. <pub-id pub-id-type="doi">10.1016/j.eswa.2021.116295</pub-id></citation>
</ref>
<ref id="B32">
<label>32.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sigrist</surname> <given-names>F</given-names></name> <name><surname>Hirnschall</surname> <given-names>C</given-names></name></person-group>. <article-title>Grabit: Gradient tree-boosted Tobit models for default prediction</article-title>. <source>J Bank Finance</source>. (<year>2019</year>) <volume>102</volume>:<fpage>177</fpage>&#x02013;<lpage>92</lpage>. <pub-id pub-id-type="doi">10.1016/j.jbankfin.2019.03.004</pub-id></citation>
</ref>
<ref id="B33">
<label>33.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Der Kiureghian</surname> <given-names>A</given-names></name> <name><surname>Ditlevsen</surname> <given-names>O</given-names></name></person-group>. <article-title>Aleatory or epistemic? Does it matter?</article-title> <source>Struct Safety</source>. (<year>2009</year>) <volume>31</volume>:<fpage>105</fpage>&#x02013;<lpage>12</lpage>. <pub-id pub-id-type="doi">10.1016/j.strusafe.2008.06.020</pub-id></citation>
</ref>
<ref id="B34">
<label>34.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gawlikowski</surname> <given-names>J</given-names></name> <name><surname>Tassi</surname> <given-names>CRN</given-names></name> <name><surname>Ali</surname> <given-names>M</given-names></name> <name><surname>Lee</surname> <given-names>J</given-names></name> <name><surname>Humt</surname> <given-names>M</given-names></name> <name><surname>Feng</surname> <given-names>J</given-names></name> <etal/></person-group>. <article-title>A survey of uncertainty in deep neural networks</article-title>. <source>arXiv preprint arXiv:2107.03342</source>. (<year>2022</year>) <pub-id pub-id-type="doi">10.48550/arXiv.2107.03342</pub-id></citation>
</ref>
<ref id="B35">
<label>35.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Blundell</surname> <given-names>C</given-names></name> <name><surname>Cornebise</surname> <given-names>J</given-names></name> <name><surname>Kavukcuoglu</surname> <given-names>K</given-names></name> <name><surname>Wierstra</surname> <given-names>D</given-names></name></person-group>. <article-title>Weight uncertainty in neural network</article-title>. In: <source>International Conference on Machine Learning</source>. (<publisher-loc>Lille</publisher-loc>: <publisher-name>PMLR</publisher-name>). (<year>2015</year>). p. <fpage>1613</fpage>&#x02013;<lpage>22</lpage>.</citation>
</ref>
<ref id="B36">
<label>36.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Gal</surname> <given-names>Y</given-names></name> <name><surname>Ghahramani</surname> <given-names>Z</given-names></name></person-group>. <article-title>Dropout as a Bayesian approximation: representing model uncertainty in deep learning</article-title>. In: <source>International Conference on Machine Learning</source>. (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>PMLR</publisher-name>). (<year>2016</year>). p. <fpage>1050</fpage>&#x02013;<lpage>9</lpage>.</citation>
</ref>
<ref id="B37">
<label>37.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mobiny</surname> <given-names>A</given-names></name> <name><surname>Yuan</surname> <given-names>P</given-names></name> <name><surname>Moulik</surname> <given-names>SK</given-names></name> <name><surname>Garg</surname> <given-names>N</given-names></name> <name><surname>Wu</surname> <given-names>CC</given-names></name> <name><surname>Van Nguyen</surname> <given-names>H</given-names></name></person-group>. <article-title>DropConnect is effective in modeling uncertainty of Bayesian deep networks</article-title>. <source>Sci Rep</source>. (<year>2021</year>) <volume>11</volume>:<fpage>1</fpage>&#x02013;<lpage>14</lpage>. <pub-id pub-id-type="doi">10.1038/s41598-021-84854-x</pub-id><pub-id pub-id-type="pmid">33750847</pub-id></citation>
</ref>
<ref id="B38">
<label>38.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Krueger</surname> <given-names>D</given-names></name> <name><surname>Huang</surname> <given-names>CW</given-names></name> <name><surname>Islam</surname> <given-names>R</given-names></name> <name><surname>Turner</surname> <given-names>R</given-names></name> <name><surname>Lacoste</surname> <given-names>A</given-names></name> <name><surname>Courville</surname> <given-names>A</given-names></name></person-group>. <article-title>Bayesian hypernetworks</article-title>. <source>arXiv preprint arXiv:1710.04759</source>. (<year>2017</year>) <pub-id pub-id-type="doi">10.48550/arXiv.1710.04759</pub-id></citation>
</ref>
<ref id="B39">
<label>39.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Lakshminarayanan</surname> <given-names>B</given-names></name> <name><surname>Pritzel</surname> <given-names>A</given-names></name> <name><surname>Blundell</surname> <given-names>C</given-names></name></person-group>. <article-title>Simple and scalable predictive uncertainty estimation using deep ensembles</article-title>. In: <source>Advances in Neural Information Processing Systems. Vol. 30</source>. (<publisher-loc>Long Beach, CA</publisher-loc>: <publisher-name>Curran Associates, Inc</publisher-name>.). (<year>2017</year>).</citation>
</ref>
<ref id="B40">
<label>40.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Valdenegro-Toro</surname> <given-names>M</given-names></name></person-group>. <article-title>Deep sub-ensembles for fast uncertainty estimation in image classification</article-title>. <source>arXiv preprint arXiv:1910.08168</source>. (<year>2019</year>) <pub-id pub-id-type="doi">10.48550/arXiv.1910.08168</pub-id></citation>
</ref>
<ref id="B41">
<label>41.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wen</surname> <given-names>Y</given-names></name> <name><surname>Tran</surname> <given-names>D</given-names></name> <name><surname>Ba</surname> <given-names>J</given-names></name></person-group>. <article-title>BatchEnsemble: an alternative approach to efficient ensemble and lifelong learning</article-title>. <source>arXiv preprint arXiv:2002.06715</source>. (<year>2020</year>) <pub-id pub-id-type="doi">10.48550/arXiv.2002.06715</pub-id></citation>
</ref>
<ref id="B42">
<label>42.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Amini</surname> <given-names>A</given-names></name> <name><surname>Schwarting</surname> <given-names>W</given-names></name> <name><surname>Soleimany</surname> <given-names>A</given-names></name> <name><surname>Rus</surname> <given-names>D</given-names></name></person-group>. <article-title>Deep evidential regression</article-title>. In: <source>Advances in Neural Information Processing Systems. Vol. 33</source>. <publisher-name>Curran Associates, Inc</publisher-name>. (<year>2020</year>). p. <fpage>14927</fpage>&#x02013;<lpage>37</lpage>.</citation>
</ref>
<ref id="B43">
<label>43.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meinert</surname> <given-names>N</given-names></name> <name><surname>Gawlikowski</surname> <given-names>J</given-names></name> <name><surname>Lavin</surname> <given-names>A</given-names></name></person-group>. <article-title>The unreasonable effectiveness of deep evidential regression</article-title>. <source>arXiv preprint arXiv:2205.10060</source>. (<year>2022</year>) <pub-id pub-id-type="doi">10.48550/arXiv.2205.10060</pub-id></citation>
</ref>
<ref id="B44">
<label>44.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jurado</surname> <given-names>K</given-names></name> <name><surname>Ludvigson</surname> <given-names>SC</given-names></name> <name><surname>Ng</surname> <given-names>S</given-names></name></person-group>. <article-title>Measuring uncertainty</article-title>. <source>Am Econ Rev</source>. (<year>2015</year>) <volume>105</volume>:<fpage>1177</fpage>&#x02013;<lpage>216</lpage>. <pub-id pub-id-type="doi">10.1257/aer.20131193</pub-id></citation>
</ref>
<ref id="B45">
<label>45.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ludvigson</surname> <given-names>SC</given-names></name> <name><surname>Ma</surname> <given-names>S</given-names></name> <name><surname>Ng</surname> <given-names>S</given-names></name></person-group>. <article-title>Uncertainty and business cycles: exogenous impulse or endogenous response?</article-title> <source>Am Econ J Macroecon</source>. (<year>2021</year>) <volume>13</volume>:<fpage>369</fpage>&#x02013;<lpage>410</lpage>. <pub-id pub-id-type="doi">10.1257/mac.20190171</pub-id></citation>
</ref>
<ref id="B46">
<label>46.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Baker</surname> <given-names>SR</given-names></name> <name><surname>Bloom</surname> <given-names>N</given-names></name> <name><surname>Davis</surname> <given-names>SJ</given-names></name></person-group>. <article-title>Measuring economic policy uncertainty</article-title>. <source>Q J Econ</source>. (<year>2016</year>) <volume>131</volume>:<fpage>1593</fpage>&#x02013;<lpage>636</lpage>. <pub-id pub-id-type="doi">10.1093/qje/qjw024</pub-id></citation>
</ref>
<ref id="B47">
<label>47.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meinert</surname> <given-names>N</given-names></name> <name><surname>Lavin</surname> <given-names>A</given-names></name></person-group>. <article-title>Multivariate deep evidential regression</article-title>. <source>arXiv preprint arXiv:2104.06135</source>. (<year>2022</year>). <pub-id pub-id-type="doi">10.48550/arXiv.2104.06135</pub-id></citation>
</ref>
<ref id="B48">
<label>48.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jankowitsch</surname> <given-names>R</given-names></name> <name><surname>Nagler</surname> <given-names>F</given-names></name> <name><surname>Subrahmanyam</surname> <given-names>MG</given-names></name></person-group>. <article-title>The determinants of recovery rates in the US corporate bond market</article-title>. <source>J Financ Econ</source>. (<year>2014</year>) <volume>114</volume>:<fpage>155</fpage>&#x02013;<lpage>77</lpage>. <pub-id pub-id-type="doi">10.1016/j.jfineco.2014.06.001</pub-id></citation>
</ref>
<ref id="B49">
<label>49.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tobback</surname> <given-names>E</given-names></name> <name><surname>Martens</surname> <given-names>D</given-names></name> <name><surname>Gestel</surname> <given-names>TV</given-names></name> <name><surname>Baesens</surname> <given-names>B</given-names></name></person-group>. <article-title>Forecasting Loss Given Default models: impact of account characteristics and the macroeconomic state</article-title>. <source>J Operat Res Soc</source>. (<year>2014</year>) <volume>65</volume>:<fpage>376</fpage>&#x02013;<lpage>92</lpage>. <pub-id pub-id-type="doi">10.1057/jors.2013.158</pub-id></citation>
</ref>
<ref id="B50">
<label>50.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nazemi</surname> <given-names>A</given-names></name> <name><surname>Fatemi Pour</surname> <given-names>F</given-names></name> <name><surname>Heidenreich</surname> <given-names>K</given-names></name> <name><surname>Fabozzi</surname> <given-names>FJ</given-names></name></person-group>. <article-title>Fuzzy decision fusion approach for loss-given-default modeling</article-title>. <source>Eur J Oper Res</source>. (<year>2017</year>) <volume>262</volume>:<fpage>780</fpage>&#x02013;<lpage>91</lpage>. <pub-id pub-id-type="doi">10.1016/j.ejor.2017.04.008</pub-id></citation>
</ref>
<ref id="B51">
<label>51.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Miller</surname> <given-names>P</given-names></name> <name><surname>T&#x000F6;ws</surname> <given-names>E</given-names></name></person-group>. <article-title>Loss given default adjusted workout processes for leases</article-title>. <source>J Bank Finance</source>. (<year>2018</year>) <volume>91</volume>:<fpage>189</fpage>&#x02013;<lpage>201</lpage>. <pub-id pub-id-type="doi">10.1016/j.jbankfin.2017.01.020</pub-id></citation>
</ref>
<ref id="B52">
<label>52.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nazemi</surname> <given-names>A</given-names></name> <name><surname>Heidenreich</surname> <given-names>K</given-names></name> <name><surname>Fabozzi</surname> <given-names>FJ</given-names></name></person-group>. <article-title>Improving corporate bond recovery rate prediction using multi-factor support vector regressions</article-title>. <source>Eur J Oper Res</source>. (<year>2018</year>) <volume>271</volume>:<fpage>664</fpage>&#x02013;<lpage>75</lpage>. <pub-id pub-id-type="doi">10.1016/j.ejor.2018.05.024</pub-id></citation>
</ref>
<ref id="B53">
<label>53.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Starosta</surname> <given-names>W</given-names></name></person-group>. <article-title>Loss given default decomposition using mixture distributions of in-default events</article-title>. <source>Eur J Oper Res</source>. (<year>2021</year>) <volume>292</volume>:<fpage>1187</fpage>&#x02013;<lpage>99</lpage>. <pub-id pub-id-type="doi">10.1016/j.ejor.2020.11.034</pub-id></citation>
</ref>
<ref id="B54">
<label>54.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Apley</surname> <given-names>DW</given-names></name> <name><surname>Zhu</surname> <given-names>J</given-names></name></person-group>. <article-title>Visualizing the effects of predictor variables in black box supervised learning models</article-title>. <source>J R Stat Soc B</source>. (<year>2020</year>) <volume>82</volume>:<fpage>1059</fpage>&#x02013;<lpage>86</lpage>. <pub-id pub-id-type="doi">10.1111/rssb.12377</pub-id></citation>
</ref>
<ref id="B55">
<label>55.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Friedman</surname> <given-names>JH</given-names></name></person-group>. <article-title>Greedy function approximation: a gradient boosting machine</article-title>. <source>Ann Stat</source>. (<year>2001</year>) <volume>29</volume>:<fpage>1189</fpage>&#x02013;<lpage>232</lpage>. <pub-id pub-id-type="doi">10.1214/aos/1013203451</pub-id></citation>
</ref>
<ref id="B56">
<label>56.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Goldstein</surname> <given-names>A</given-names></name> <name><surname>Kapelner</surname> <given-names>A</given-names></name> <name><surname>Bleich</surname> <given-names>J</given-names></name> <name><surname>Pitkin</surname> <given-names>E</given-names></name></person-group>. <article-title>Peeking inside the black box: visualizing statistical learning with plots of individual conditional expectation</article-title>. <source>J Comput Graph Stat</source>. (<year>2015</year>) <volume>24</volume>:<fpage>44</fpage>&#x02013;<lpage>65</lpage>. <pub-id pub-id-type="doi">10.1080/10618600.2014.907095</pub-id></citation>
</ref>
<ref id="B57">
<label>57.</label>
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Ribeiro</surname> <given-names>MT</given-names></name> <name><surname>Singh</surname> <given-names>S</given-names></name> <name><surname>Guestrin</surname> <given-names>C</given-names></name></person-group>. <article-title>&#x0201C;Why Should I Trust You?&#x0201D;: explaining the predictions of any classifier</article-title>. In: <source>Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD &#x00027;16</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Association for Computing Machinery</publisher-name> (<year>2016</year>). p. <fpage>1135</fpage>&#x02013;<lpage>44</lpage>.</citation>
</ref>
<ref id="B58">
<label>58.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lundberg</surname> <given-names>S</given-names></name> <name><surname>Lee</surname> <given-names>SI</given-names></name></person-group>. <article-title>A unified approach to interpreting model predictions</article-title>. <source>arXiv preprint arXiv:1705.07874 [cs, stat]</source>. (<year>2017</year>).</citation>
</ref>
<ref id="B59">
<label>59.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alvarez-Melis</surname> <given-names>D</given-names></name> <name><surname>Jaakkola</surname> <given-names>TS</given-names></name></person-group>. <article-title>On the robustness of interpretability methods</article-title>. <source>arXiv preprint arXiv:1806.08049</source>. (<year>2018</year>) <pub-id pub-id-type="doi">10.48550/arXiv.1806.08049</pub-id></citation>
</ref>
<ref id="B60">
<label>60.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Visani</surname> <given-names>G</given-names></name> <name><surname>Bagli</surname> <given-names>E</given-names></name> <name><surname>Chesani</surname> <given-names>F</given-names></name> <name><surname>Poluzzi</surname> <given-names>A</given-names></name> <name><surname>Capuzzo</surname> <given-names>D</given-names></name></person-group>. <article-title>Statistical stability indices for LIME: obtaining reliable explanations for machine learning models</article-title>. <source>J Operat Res Soc</source>. (<year>2022</year>) <volume>73</volume>:<fpage>91</fpage>&#x02013;<lpage>101</lpage>. <pub-id pub-id-type="doi">10.1080/01605682.2020.1865846</pub-id></citation>
</ref>
<ref id="B61">
<label>61.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bussmann</surname> <given-names>N</given-names></name> <name><surname>Giudici</surname> <given-names>P</given-names></name> <name><surname>Marinelli</surname> <given-names>D</given-names></name> <name><surname>Papenbrock</surname> <given-names>J</given-names></name></person-group>. <article-title>Explainable AI in fintech risk management</article-title>. <source>Front Artif Intell</source>. (<year>2020</year>) <volume>3</volume>:<fpage>26</fpage>. <pub-id pub-id-type="doi">10.3389/frai.2020.00026</pub-id><pub-id pub-id-type="pmid">33733145</pub-id></citation>
</ref>
<ref id="B62">
<label>62.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Giudici</surname> <given-names>P</given-names></name> <name><surname>Raffinetti</surname> <given-names>E</given-names></name></person-group>. <article-title>Explainable AI methods in cyber risk management</article-title>. <source>Qual Reliab Eng Int</source>. (<year>2022</year>) <volume>38</volume>:<fpage>1318</fpage>&#x02013;<lpage>26</lpage>. <pub-id pub-id-type="doi">10.1002/qre.2939</pub-id></citation>
</ref>
<ref id="B63">
<label>63.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Babaei</surname> <given-names>G</given-names></name> <name><surname>Giudici</surname> <given-names>P</given-names></name> <name><surname>Raffinetti</surname> <given-names>E</given-names></name></person-group>. <article-title>Explainable artificial intelligence for crypto asset allocation</article-title>. <source>Finance Res Lett</source>. (<year>2022</year>) <volume>47</volume>:<fpage>102941</fpage>. <pub-id pub-id-type="doi">10.1016/j.frl.2022.102941</pub-id></citation>
</ref>
<ref id="B64">
<label>64.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bussmann</surname> <given-names>N</given-names></name> <name><surname>Giudici</surname> <given-names>P</given-names></name> <name><surname>Marinelli</surname> <given-names>D</given-names></name> <name><surname>Papenbrock</surname> <given-names>J</given-names></name></person-group>. <article-title>Explainable machine learning in credit risk management</article-title>. <source>Comput Econ</source>. (<year>2021</year>) <volume>57</volume>:<fpage>203</fpage>&#x02013;<lpage>16</lpage>. <pub-id pub-id-type="doi">10.1007/s10614-020-10042-0</pub-id><pub-id pub-id-type="pmid">35295866</pub-id></citation>
</ref>
<ref id="B65">
<label>65.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Giudici</surname> <given-names>P</given-names></name> <name><surname>Raffinetti</surname> <given-names>E</given-names></name></person-group>. <article-title>Shapley-Lorenz eXplainable artificial intelligence</article-title>. <source>Expert Syst Appl</source>. (<year>2021</year>) <volume>167</volume>:<fpage>114104</fpage>. <pub-id pub-id-type="doi">10.1016/j.eswa.2020.114104</pub-id></citation>
</ref>
<ref id="B66">
<label>66.</label>
<citation citation-type="web"><person-group person-group-type="author"><collab>Bank of Canada</collab></person-group>. <source>Financial System Survey</source>. Bank of Canada (<year>2018</year>). Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.bankofcanada.ca/2018/11/financial-system-survey-highlights/">https://www.bankofcanada.ca/2018/11/financial-system-survey-highlights/</ext-link></citation>
</ref>
<ref id="B67">
<label>67.</label>
<citation citation-type="web"><person-group person-group-type="author"><collab>Bank of England</collab></person-group>. <source>Machine Learning in UK Financial Services</source>. Bank of England and Financial Conduct Authority (<year>2019</year>). Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.bankofengland.co.uk/report/2019/machine-learning-in-uk-financial-services">https://www.bankofengland.co.uk/report/2019/machine-learning-in-uk-financial-services</ext-link></citation>
</ref>
<ref id="B68">
<label>68.</label>
<citation citation-type="web"><person-group person-group-type="author"><collab>Basel Committee on Banking Supervision</collab></person-group>. <article-title>High-level summary: BCBS SIG industry workshop on the governance and oversight of artificial intelligence and machine learning in financial services</article-title> (<year>2019</year>). Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.bis.org/bcbs/events/191003_sig_tokyo.htm">https://www.bis.org/bcbs/events/191003_sig_tokyo.htm</ext-link></citation>
</ref>
<ref id="B69">
<label>69.</label>
<citation citation-type="web"><person-group person-group-type="author"><collab>Deutsche Bundesbank</collab></person-group>. <source>The Use of Artificial Intelligence and Machine Learning in the Financial Sector</source>. (<year>2020</year>). Available online at: <ext-link ext-link-type="uri" xlink:href="https://www.bundesbank.de/resource/blob/598256/d7d26167bceb18ee7c0c296902e42162/mL/2020-11-policy-dp-aiml-data.pdf">https://www.bundesbank.de/resource/blob/598256/d7d26167bceb18ee7c0c296902e42162/mL/2020-11-policy-dp-aiml-data.pdf</ext-link></citation>
</ref>
<ref id="B70">
<label>70.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Calabrese</surname> <given-names>R</given-names></name></person-group>. <article-title>Downturn loss given default: mixture distribution estimation</article-title>. <source>Eur J Oper Res</source>. (<year>2014</year>) <volume>237</volume>:<fpage>271</fpage>&#x02013;<lpage>77</lpage>. <pub-id pub-id-type="doi">10.1016/j.ejor.2014.01.043</pub-id></citation>
</ref>
<ref id="B71">
<label>71.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Betz</surname> <given-names>J</given-names></name> <name><surname>Kellner</surname> <given-names>R</given-names></name> <name><surname>R&#x000F6;sch</surname> <given-names>D</given-names></name></person-group>. <article-title>Systematic effects among loss given defaults and their implications on downturn estimation</article-title>. <source>Eur J Oper Res</source>. (<year>2018</year>) <volume>271</volume>:<fpage>1113</fpage>&#x02013;<lpage>44</lpage>. <pub-id pub-id-type="doi">10.1016/j.ejor.2018.05.059</pub-id></citation>
</ref>
<ref id="B72">
<label>72.</label>
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Betz</surname> <given-names>J</given-names></name> <name><surname>Nagl</surname> <given-names>M</given-names></name> <name><surname>R&#x000F6;sch</surname> <given-names>D</given-names></name></person-group>. <article-title>Credit line exposure at default modelling using Bayesian mixed effect quantile regression</article-title>. <source>J R Stat Soc A</source>. (<year>2022</year>) <pub-id pub-id-type="doi">10.1111/rssa.12855</pub-id></citation>
</ref>
</ref-list>
</back>
</article> 