<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Neuroinform.</journal-id>
<journal-title>Frontiers in Neuroinformatics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Neuroinform.</abbrev-journal-title>
<issn pub-type="epub">1662-5196</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fninf.2021.738342</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroinformatics</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Providing Evidence for the Null Hypothesis in Functional Magnetic Resonance Imaging Using Group-Level Bayesian Inference</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Masharipov</surname> <given-names>Ruslan</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/393925/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Knyazeva</surname> <given-names>Irina</given-names></name>
</contrib>
<contrib contrib-type="author">
<name><surname>Nikolaev</surname> <given-names>Yaroslav</given-names></name>
</contrib>
<contrib contrib-type="author">
<name><surname>Korotkov</surname> <given-names>Alexander</given-names></name>
<uri xlink:href="http://loop.frontiersin.org/people/400536/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Didur</surname> <given-names>Michael</given-names></name>
</contrib>
<contrib contrib-type="author">
<name><surname>Cherednichenko</surname> <given-names>Denis</given-names></name>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Kireev</surname> <given-names>Maxim</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/45145/overview"/>
</contrib>
</contrib-group>
<aff><institution>N. P. Bechtereva Institute of the Human Brain, Russian Academy of Sciences</institution>, <addr-line>Saint Petersburg</addr-line>, <country>Russia</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Ting Zhao, Janelia Research Campus, United States</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Marc De Kamps, University of Leeds, United Kingdom; Yikang Liu, United Imaging Intelligence, United States</p></fn>
<corresp id="c001">&#x002A;Correspondence: Maxim Kireev, <email>kireev@ihb.spb.ru</email></corresp>
</author-notes>
<pub-date pub-type="epub">
<day>02</day>
<month>12</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>15</volume>
<elocation-id>738342</elocation-id>
<history>
<date date-type="received">
<day>08</day>
<month>07</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>05</day>
<month>11</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x00A9; 2021 Masharipov, Knyazeva, Nikolaev, Korotkov, Didur, Cherednichenko and Kireev.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Masharipov, Knyazeva, Nikolaev, Korotkov, Didur, Cherednichenko and Kireev</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract>
<p>Classical null hypothesis significance testing is limited to the rejection of the point-null hypothesis; it does not allow the interpretation of non-significant results. This leads to a bias against the null hypothesis. Herein, we discuss statistical approaches to &#x2018;null effect&#x2019; assessment, focusing on Bayesian parameter inference (BPI). Although Bayesian methods have been theoretically elaborated and implemented in common neuroimaging software packages, they are not widely used for &#x2018;null effect&#x2019; assessment. BPI considers the posterior probability of finding the effect within or outside the region of practical equivalence to the null value. Using a single decision rule, it can find both &#x2018;activated/deactivated&#x2019; and &#x2018;not activated&#x2019; voxels or indicate that the obtained data are not sufficient. It also allows the data to be evaluated as the sample size increases, so that the experiment can be stopped once the obtained data are sufficient to make a confident inference. To demonstrate the advantages of using BPI for fMRI data group analysis, we compare it with classical null hypothesis significance testing on empirical data. We also use simulated data to show how BPI performs under different effect sizes, noise levels, noise distributions and sample sizes. Finally, we consider the problem of defining the region of practical equivalence for BPI and discuss possible applications of BPI in fMRI studies. To facilitate &#x2018;null effect&#x2019; assessment for fMRI practitioners, we provide a Statistical Parametric Mapping 12-based toolbox for Bayesian inference.</p>
</abstract>
<kwd-group>
<kwd>null results</kwd>
<kwd>fMRI</kwd>
<kwd>Bayesian analyses</kwd>
<kwd>human brain</kwd>
<kwd>statistical parametric mapping</kwd>
</kwd-group>
<contract-sponsor id="cn001">Russian Science Foundation<named-content content-type="fundref-id">10.13039/501100006769</named-content></contract-sponsor>
<contract-sponsor id="cn002">Ministry of Education and Science of the Russian Federation<named-content content-type="fundref-id">10.13039/501100003443</named-content></contract-sponsor>
<counts>
<fig-count count="19"/>
<table-count count="1"/>
<equation-count count="16"/>
<ref-count count="154"/>
<page-count count="31"/>
<word-count count="23144"/>
</counts>
</article-meta>
</front>
<body>
<sec id="S1" sec-type="intro">
<title>Introduction</title>
<p>In the neuroimaging field, it is common practice to identify statistically significant differences in local brain activity using the general linear model approach for mass-univariate null hypothesis significance testing (NHST) (<xref ref-type="bibr" rid="B46">Friston et al., 1994</xref>). NHST considers the probability of obtaining the observed data, or more extreme data, given that the null hypothesis of no difference is true. A <italic>p</italic>-value of 0.01 means that, if the null hypothesis were true, on average one out of 100 &#x2018;hypothetical&#x2019; replications of the experiment would yield a difference at least as large as the one observed. We conventionally regard this as unlikely and therefore &#x2018;reject the null&#x2019;; that is, NHST employs &#x2018;proof by contradiction&#x2019; (<xref ref-type="bibr" rid="B21">Cohen, 1994</xref>). Conversely, when the <italic>p</italic>-value is large, it is tempting to &#x2018;accept the null.&#x2019; However, the absence of evidence is not evidence of absence (<xref ref-type="bibr" rid="B4">Altman and Bland, 1995</xref>). Using NHST, we can only state that we have &#x2018;failed to reject the null.&#x2019; Therefore, in the classical NHST framework, the question of how to interpret non-significant results remains open.</p>
<p>The most pervasive misinterpretation of non-significant results is that they provide evidence for the null hypothesis that there is no difference, or &#x2018;no effect&#x2019; (<xref ref-type="bibr" rid="B102">Nickerson, 2000</xref>; <xref ref-type="bibr" rid="B59">Greenland et al., 2016</xref>; <xref ref-type="bibr" rid="B147">Wasserstein and Lazar, 2016</xref>). In fact, non-significant results can be obtained in two cases (<xref ref-type="bibr" rid="B30">Dienes, 2014</xref>): (1) the data are insufficient to distinguish the alternative from the null hypothesis, or (2) an effect is indeed null or trivial. To date, the extent to which the problem of making &#x2018;no effect&#x2019; conclusions from non-significant results has affected the field of neuroimaging remains unclear, particularly in functional magnetic resonance imaging (fMRI) studies<sup><xref ref-type="fn" rid="footnote1">1</xref></sup>. Regarding other fields of science such as psychology, neuropsychology, and biology, it was found that in 38&#x2013;72% of surveyed articles, the null hypothesis was accepted based on non-significant results only (<xref ref-type="bibr" rid="B38">Finch et al., 2001</xref>; <xref ref-type="bibr" rid="B124">Schatz et al., 2005</xref>; <xref ref-type="bibr" rid="B37">Fidler et al., 2006</xref>; <xref ref-type="bibr" rid="B64">Hoekstra et al., 2006</xref>; <xref ref-type="bibr" rid="B2">Aczel et al., 2018</xref>).</p>
<p>Not mentioning non-significant results at all is another problem. Firstly, some authors may consider non-significant results disappointing or not worth publishing. Secondly, papers with non-significant results are less likely to be published. This publishing bias is also known as the &#x2018;file-drawer problem&#x2019; (<xref ref-type="bibr" rid="B118">Rosenthal, 1979</xref>; <xref ref-type="bibr" rid="B68">Ioannidis et al., 2014</xref>; <xref ref-type="bibr" rid="B29">de Winter and Dodou, 2015</xref>; for evidence in fMRI studies, see <xref ref-type="bibr" rid="B71">Jennings and Van Horn, 2012</xref>; <xref ref-type="bibr" rid="B1">Acar et al., 2018</xref>; <xref ref-type="bibr" rid="B28">David et al., 2018</xref>; <xref ref-type="bibr" rid="B123">Samartsidis et al., 2020</xref>). Prejudice against the null hypothesis systematically biases our knowledge of true effects (<xref ref-type="bibr" rid="B60">Greenwald, 1975</xref>).</p>
<p>This problem is further compounded by the fact that NHST is usually based on the point-null hypothesis, that is, the hypothesis that the effect is <italic>exactly</italic> zero. However, the probability thereof is zero (<xref ref-type="bibr" rid="B90">Meehl, 1967</xref>; <xref ref-type="bibr" rid="B41">Friston et al., 2002a</xref>). This means that studies with a sufficiently large sample size will find statistically significant differences even when the effect is trivial or has no <italic>practical</italic> significance (<xref ref-type="bibr" rid="B19">Cohen, 1965</xref>, <xref ref-type="bibr" rid="B21">1994</xref>; <xref ref-type="bibr" rid="B130">Serlin and Lapsley, 1985</xref>; <xref ref-type="bibr" rid="B76">Kirk, 1996</xref>).</p>
<p>Having the means to assess non-significant results would mitigate these problems. To this end, two main alternatives are available. Firstly, there are frequentist approaches that shift from point-null to interval-null hypothesis testing, for example, equivalence testing based on the two one-sided tests (TOST) procedure (<xref ref-type="bibr" rid="B128">Schuirmann, 1987</xref>; <xref ref-type="bibr" rid="B148">Wellek, 2010</xref>). Secondly, there are Bayesian approaches that are based on posterior parameter distributions (<xref ref-type="bibr" rid="B85">Lindley, 1965</xref>; <xref ref-type="bibr" rid="B60">Greenwald, 1975</xref>; <xref ref-type="bibr" rid="B78">Kruschke, 2010</xref>) and Bayes factors (<xref ref-type="bibr" rid="B70">Jeffreys, 1939/1948</xref>; <xref ref-type="bibr" rid="B75">Kass and Raftery, 1995</xref>; <xref ref-type="bibr" rid="B120">Rouder et al., 2009</xref>). The advantage of frequentist approaches is that they do not require a substantial paradigm shift (<xref ref-type="bibr" rid="B82">Lakens, 2017</xref>; <xref ref-type="bibr" rid="B14">Campbell and Gustafson, 2018</xref>). However, it has been argued that Bayesian approaches may be more natural and straightforward than frequentist approaches (<xref ref-type="bibr" rid="B32">Edwards et al., 1963</xref>; <xref ref-type="bibr" rid="B87">Lindley, 1975</xref>; <xref ref-type="bibr" rid="B41">Friston et al., 2002a</xref>; <xref ref-type="bibr" rid="B143">Wagenmakers, 2007</xref>; <xref ref-type="bibr" rid="B120">Rouder et al., 2009</xref>; <xref ref-type="bibr" rid="B30">Dienes, 2014</xref>; <xref ref-type="bibr" rid="B80">Kruschke and Liddell, 2017b</xref>). It has long been noted that we tend to perceive lower <italic>p</italic>-values as stronger evidence for the alternative hypothesis, and higher <italic>p</italic>-values as evidence for the null, i.e., the &#x2018;inverse probability&#x2019; fallacy, as it is referred to by <xref ref-type="bibr" rid="B21">Cohen (1994)</xref>. 
This is what we obtain in Bayesian approaches by calculating posterior probabilities. Instead of considering infinite &#x2018;hypothetical&#x2019; replications and employing probabilistic &#x2018;proof by contradiction,&#x2019; Bayesian approaches directly provide evidence for the null and alternative hypotheses given the data, updating our prior beliefs in light of new relevant information. Bayesian inference allows us to &#x2018;reject and accept&#x2019; the null hypothesis on an equal footing. Moreover, it allows us to talk about &#x2018;low confidence,&#x2019; indicating the need to either accumulate more data or revise the study design (see <xref ref-type="fig" rid="F1">Figure 1</xref>).</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption><p>Possible results for the same data, obtained using classical NHST and Bayesian parameter inference. Classical NHST detects only areas with a statistically significant difference (&#x2018;number one&#x2019;). Bayesian parameter inference based on the logarithm of posterior probability odds (<italic>LPO</italic>) provides us with additional information that is not available in classical NHST: (1) it provides relative evidence for the null (<italic>H</italic><sub>0</sub>) and alternative (<italic>H</italic><sub>1</sub>) hypotheses, (2) it detects areas with a trivial effect size (&#x2018;number zero&#x2019;), (3) it indicates &#x2018;low confidence&#x2019; areas surrounding the &#x2018;number one&#x2019; and &#x2018;number zero.&#x2019; To create this conceptual illustration, we generated 100 images consisting of 50 &#x00D7; 50 voxels, smoothed with a Gaussian kernel of 2 voxels full width at half maximum (FWHM). Data were drawn from normal distributions with different means, <italic>m</italic>, and standard deviations, SD. For the &#x2018;number one,&#x2019; <italic>m</italic> = 0.1, SD = 0.37. For the &#x2018;number zero,&#x2019; <italic>m</italic> = 0, SD = 0.6. For the &#x2018;low confidence&#x2019; area, <italic>m</italic> = 0.01, SD = 0.37. <italic>LPOs</italic> were calculated using an effect size threshold of 0.02. The code to recreate the illustration is available online at <ext-link ext-link-type="uri" xlink:href="https://github.com/Masharipov/BPI_2021/tree/main/conceptual_illustration">https://github.com/Masharipov/BPI_2021/tree/main/conceptual_illustration</ext-link>.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g001.tif"/>
</fig>
<p>Despite the importance of this issue, and the high level of theoretical elaboration and implementation of Bayesian methods in common neuroimaging software programs, for example, Statistical Parametric Mapping 12 (SPM12) and FMRIB&#x2019;s Software Library (FSL), to date, only a few fMRI studies have used Bayesian inference to assess &#x2018;null effects&#x2019; (for example, see subject-level analysis in <xref ref-type="bibr" rid="B89">Magerkurth et al., 2015</xref>, group-level analysis in <xref ref-type="bibr" rid="B27">Dandolo and Schwabe, 2019</xref>; <xref ref-type="bibr" rid="B36">Feng et al., 2019</xref>). Therefore, this study is intended to introduce fMRI practitioners to the methods for assessing &#x2018;null effects.&#x2019; In particular, we focus on Bayesian parameter inference (<xref ref-type="bibr" rid="B44">Friston and Penny, 2003</xref>; <xref ref-type="bibr" rid="B105">Penny and Ridgway, 2013</xref>), as implemented in SPM12. Although Bayesian methods have been described elsewhere, the distinguishing feature of this study is that we aim to demonstrate the practical application of Bayesian inference to the assessment of &#x2018;null effects,&#x2019; and reemphasize its contributions over and above those of classical NHST. We deliberately avoid mathematical details, which can be found elsewhere (<xref ref-type="bibr" rid="B49">Genovese, 2000</xref>; <xref ref-type="bibr" rid="B41">Friston et al., 2002a</xref>,<xref ref-type="bibr" rid="B43">2007</xref>; <xref ref-type="bibr" rid="B44">Friston and Penny, 2003</xref>; <xref ref-type="bibr" rid="B104">Penny et al., 2003</xref>, <xref ref-type="bibr" rid="B106">2005</xref>, <xref ref-type="bibr" rid="B103">2007</xref>; <xref ref-type="bibr" rid="B105">Penny and Ridgway, 2013</xref>; <xref ref-type="bibr" rid="B152">Woolrich et al., 2004</xref>). 
Firstly, we briefly review the frequentist and Bayesian approaches for the assessment of &#x2018;null effects.&#x2019; Next, we compare classical NHST and Bayesian parameter inference using the Human Connectome Project (HCP) and the UCLA Consortium for Neuropsychiatric Phenomics datasets, focusing on group-level analysis. We then consider the choice of the effect size threshold for Bayesian parameter inference and estimate the typical effect sizes in different fMRI task designs. To demonstrate how the common sources of variability in empirical data influence NHST and Bayesian parameter inference, we examine their behavior for different sample sizes and spatial smoothing. We also use simulated data to assess BPI performance under different effect sizes, noise levels, noise distributions and sample sizes. Finally, we discuss practical research and clinical applications of Bayesian inference.</p>
</sec>
<sec id="S2">
<title>Theory</title>
<p>In this section, we briefly describe the classical NHST framework and review statistical methods which can be used to assess the &#x2018;null effect.&#x2019; We also consider two historical trends in statistical analysis: the shift from point-null hypothesis testing to interval estimation and interval-null hypothesis testing (<xref ref-type="bibr" rid="B99">Murphy and Myors, 2004</xref>; <xref ref-type="bibr" rid="B148">Wellek, 2010</xref>; <xref ref-type="bibr" rid="B26">Cumming, 2013</xref>), and the shift from frequentist to Bayesian approaches (<xref ref-type="bibr" rid="B80">Kruschke and Liddell, 2017b</xref>).</p>
<sec id="S2.SS1">
<title>Classical Null Hypothesis Significance Testing Framework</title>
<p>Most task-based fMRI studies rely on the general linear model approach (<xref ref-type="bibr" rid="B46">Friston et al., 1994</xref>; <xref ref-type="bibr" rid="B112">Poline and Brett, 2012</xref>). It provides a simple way to separate blood-oxygen-level-dependent (BOLD) signals associated with particular task conditions from nuisance signals and residual noise when analyzing single-subject data (subject-level analysis). At the same time, it allows us to analyze mean BOLD signals within one group of subjects or between different groups (group-level analysis). Firstly, we must specify a general linear model and estimate its parameters:</p>
<disp-formula id="S2.E1"><label>(1)</label><mml:math id="M1"><mml:mrow><mml:mi>Y</mml:mi><mml:mo>=</mml:mo><mml:mrow><mml:mrow><mml:mi>X</mml:mi><mml:mi mathvariant="normal">&#x03B2;</mml:mi></mml:mrow><mml:mo>+</mml:mo><mml:mpadded width="+5pt"><mml:mi mathvariant="normal">&#x03B5;</mml:mi></mml:mpadded></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>where <italic>Y</italic> are the data (further, <italic>D</italic>), <italic>X</italic> is the design matrix, which includes regressors of interest and nuisance regressors, &#x03B2; are the model parameters (&#x2018;beta values&#x2019;), and &#x03B5; is residual noise or error, which is assumed to have a zero-mean normal distribution. At the subject level of analysis, the data are BOLD-signals. At the group level, the data are linear contrasts of parameters estimated at the subject level, which typically reflect individual subject amplitudes of BOLD responses evoked in particular task conditions. In turn, the parameters of the group-level general linear model reflect the group mean BOLD responses evoked in particular task conditions and groups of subjects. The linear contrast of these parameters, &#x03B8; = <italic>c</italic>&#x03B2;, represents the experimental effect of interest (hereinafter &#x2018;<italic>the effect</italic>&#x2019;), expressed as the difference between conditions or groups of subjects.</p>
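<p>As a minimal sketch of the group-level model described above, the following pure-Python example estimates the parameters of a two-column design by ordinary least squares and forms the contrast &#x03B8; = <italic>c</italic>&#x03B2;. The data values, the two-group design and the function names are hypothetical and serve only as an illustration; this is not the estimation routine used in SPM12.</p>
<preformat preformat-type="code"><![CDATA[
```python
"""Toy group-level general linear model Y = X*beta + error, fitted by ordinary least squares."""


def fit_two_column_glm(design, data):
    """Return OLS estimates for a design matrix with exactly two columns."""
    # Entries of the 2x2 matrix X'X and of the vector X'Y (normal equations).
    a = sum(row[0] * row[0] for row in design)
    b = sum(row[0] * row[1] for row in design)
    d = sum(row[1] * row[1] for row in design)
    u = sum(row[0] * y for row, y in zip(design, data))
    w = sum(row[1] * y for row, y in zip(design, data))
    det = a * d - b * b
    if det == 0:
        raise ValueError("design matrix columns are collinear")
    # beta = inverse(X'X) * X'Y, written out for the 2x2 case.
    return [(d * u - b * w) / det, (a * w - b * u) / det]


def contrast(weights, beta):
    """Linear contrast of the parameters, theta = c * beta."""
    return sum(c * b for c, b in zip(weights, beta))


# Hypothetical subject-level contrast values for two groups of three subjects.
data = [0.5, 0.7, 0.6, 0.2, 0.3, 0.1]
# One indicator column per group, so each parameter is a group mean.
design = [[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]]

beta = fit_two_column_glm(design, data)
theta = contrast([1, -1], beta)  # difference between the two group means
residuals = [y - contrast(row, beta) for row, y in zip(design, data)]

print("beta:", beta)
print("theta:", theta)
```
]]></preformat>
<p>With indicator columns, each estimated parameter is simply a group mean, and the contrast weights (1, &#x2013;1) express the effect as the difference between groups.</p>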
<p>Next, we test the effect against the point-null hypothesis, <italic>H</italic><sub>0</sub>: &#x03B8; = <italic>&#x03B3;</italic> (usually, &#x03B8; = 0). To do this, we use test statistics that summarize the data in a single value, for example, the t-value. For the one-sample case, the t-value is the ratio of the discrepancy of the estimated effect from the hypothetical null value to its standard error. Finally, we calculate the probability of obtaining the observed t-value or a more extreme value, given that the null hypothesis is true (<italic>p</italic>-value). This is also commonly formulated as the probability of obtaining the observed data or more extreme data, given that the null hypothesis is true (<xref ref-type="bibr" rid="B21">Cohen, 1994</xref>). It can be simply written as a conditional probability <italic>P</italic>(<italic>D</italic>+|<italic>H</italic><sub>0</sub>), where &#x2018;<italic>D</italic> +&#x2019; denotes the observed data or more extreme data which can be obtained in infinite &#x2018;hypothetical&#x2019; replications under the null (<xref ref-type="bibr" rid="B125">Schneider, 2014</xref>, <xref ref-type="bibr" rid="B126">2018</xref>). If this probability is lower than some conventional threshold, or alpha level (for example, &#x03B1; = 0.05), then we can &#x2018;reject the null hypothesis&#x2019; and state that we found a statistically significant effect. When this procedure is repeated for a massive number of voxels, it is referred to as &#x2018;mass-univariate analysis.&#x2019; However, if we consider <italic>m</italic> = 100 000 voxels with no true effect and repeat significance testing for each voxel at &#x03B1; = 0.05, we would expect to obtain 5000 false rejections of the null hypothesis (false positives). 
To control the number of false positives, we must reduce the alpha level for each significance test by applying the multiple comparison correction (<xref ref-type="bibr" rid="B50">Genovese et al., 2002</xref>; <xref ref-type="bibr" rid="B100">Nichols and Hayasaka, 2003</xref>; <xref ref-type="bibr" rid="B101">Nichols, 2012</xref>).</p>
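<p>The arithmetic behind the t-value and the multiple comparison problem can be sketched as follows. The subject values are hypothetical, the <italic>p</italic>-value uses a normal approximation to the t-distribution (an assumption that is adequate only for large samples), and the Bonferroni correction is shown as just one example of a multiple comparison correction.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math
from statistics import NormalDist, mean, stdev


def one_sample_t(values, null_value=0.0):
    """t-value: discrepancy of the estimated effect from the null value over its standard error."""
    standard_error = stdev(values) / math.sqrt(len(values))
    return (mean(values) - null_value) / standard_error


def approx_upper_tail_p(t_value):
    """Normal approximation to the one-sided p-value (assumption: large sample)."""
    return 1.0 - NormalDist().cdf(t_value)


# Hypothetical contrast values from five subjects at a single voxel.
t_value = one_sample_t([0.2, 0.4, 0.1, 0.5, 0.3])
p_approx = approx_upper_tail_p(t_value)

# Mass-univariate testing: m voxels with no true effect, each tested at alpha.
m = 100_000
alpha = 0.05
expected_false_positives = m * alpha            # uncorrected threshold
bonferroni_alpha = alpha / m                    # per-voxel alpha after Bonferroni correction
expected_after_bonferroni = m * bonferroni_alpha

print(f"t = {t_value:.3f}, approximate one-sided p = {p_approx:.2e}")
print(f"expected false positives without correction: {expected_false_positives:.0f}")
print(f"per-voxel alpha with Bonferroni correction: {bonferroni_alpha:.1e}")
```
]]></preformat>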
<p>To date, the classical NHST has been the most widely used statistical inference method in neuroscience, psychology, and biomedicine (<xref ref-type="bibr" rid="B140">Szucs and Ioannidis, 2017</xref>, <xref ref-type="bibr" rid="B139">2020</xref>; <xref ref-type="bibr" rid="B69">Ioannidis, 2019</xref>). It is often criticized for the use of the point-null hypothesis (<xref ref-type="bibr" rid="B90">Meehl, 1967</xref>), also known as the &#x2018;nil null&#x2019; (<xref ref-type="bibr" rid="B21">Cohen, 1994</xref>) or &#x2018;sharp null&#x2019; hypothesis (<xref ref-type="bibr" rid="B32">Edwards et al., 1963</xref>). It has been argued that the point-null hypothesis could be appropriate only in hard sciences such as physics, but it is always false in soft sciences; this problem is sometimes known as Meehl&#x2019;s paradox (<xref ref-type="bibr" rid="B90">Meehl, 1967</xref>, <xref ref-type="bibr" rid="B91">1978</xref>; <xref ref-type="bibr" rid="B130">Serlin and Lapsley, 1985</xref>, <xref ref-type="bibr" rid="B131">1993</xref>; <xref ref-type="bibr" rid="B21">Cohen, 1994</xref>; <xref ref-type="bibr" rid="B76">Kirk, 1996</xref>). In the case of fMRI research, we face complex brain activity which is influenced by numerous psychophysiological factors. This means that with a large amount of data, we find a statistically significant effect in all voxels for any linear contrast (<xref ref-type="bibr" rid="B41">Friston et al., 2002a</xref>). For example, <xref ref-type="bibr" rid="B53">Gonzalez-Castillo et al. (2012)</xref> showed a statistically significant difference between simple visual stimulation and rest in over 95% of the brain when averaging single-subject data from 100 runs (approximately 8 h of scanning), which consisted of five blocks of stimulation (20 s of visual stimulation, 40 s of rest). 
Approximately half of the brain areas showed statistically significant positive effects or &#x2018;activations,&#x2019; whereas the other half showed statistically significant negative effects or &#x2018;deactivations.&#x2019;</p>
<p>Whole-brain &#x2018;activations/deactivations&#x2019; can also be found when analyzing large datasets such as the HCP (<italic>N</italic> &#x003E; 1000) or UK Biobank (<italic>N</italic> &#x003E; 10 000) datasets. For example, <xref ref-type="bibr" rid="B134">Smith and Nichols (2018)</xref> showed significant positive and negative effects for the emotion processing task (&#x2018;Emotional faces vs. Shapes&#x2019; contrast) in 81% of voxels using data from UK Biobank (<italic>N</italic> = 12 600) and conservative Bonferroni multiple comparison correction. When we increase the sample size, the effect estimate does not change much, but the standard error in the denominator of the t-value becomes increasingly smaller, so that negligible effects become statistically significant. Thus, the classical NHST ignores the magnitude of the effect. Attempts to overcome this problem led to the proposal of making a distinction between &#x2018;statistical significance&#x2019; and &#x2018;material significance&#x2019; (<xref ref-type="bibr" rid="B63">Hodges and Lehmann, 1954</xref>) or &#x2018;practical significance&#x2019; (<xref ref-type="bibr" rid="B19">Cohen, 1965</xref>; <xref ref-type="bibr" rid="B76">Kirk, 1996</xref>). That is, we can test whether the effect size is larger or smaller than some practically meaningful value using interval-null hypothesis testing (<xref ref-type="bibr" rid="B41">Friston et al., 2002a</xref>,<xref ref-type="bibr" rid="B42">b</xref>; <xref ref-type="bibr" rid="B40">Friston, 2013</xref>). In this case, we use the terms &#x2018;activations&#x2019; and &#x2018;deactivations&#x2019; for those voxels that show a practically significant positive or negative effect.</p>
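<p>The dependence of the t-value on the sample size can be illustrated with a short calculation. The effect estimate, standard deviation and sample sizes below are hypothetical; the sketch simply holds the effect and its variability fixed and lets only <italic>N</italic> vary.</p>
<preformat preformat-type="code"><![CDATA[
```python
import math


def t_value(effect, sd, n):
    """One-sample t-value for a fixed effect estimate and standard deviation."""
    standard_error = sd / math.sqrt(n)
    return effect / standard_error


# Hypothetical values: a very small effect relative to between-subject variability.
effect, sd = 0.01, 0.4
sample_sizes = [25, 100, 1600, 10_000]
t_values = [t_value(effect, sd, n) for n in sample_sizes]

for n, t in zip(sample_sizes, t_values):
    print(f"N = {n:6d}  standard error = {sd / math.sqrt(n):.4f}  t = {t:.2f}")
```
]]></preformat>
<p>Because the standard error shrinks in proportion to the square root of <italic>N</italic>, quadrupling the sample size doubles the t-value even though the effect estimate itself is unchanged.</p>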
</sec>
<sec id="S2.SS2">
<title>Frequentist Approach to Interval-Null Hypothesis Testing</title>
<p>Interval-null hypothesis testing is widely used in medicine and biology (<xref ref-type="bibr" rid="B92">Meyners, 2012</xref>). Consider, for example, a pharmacological study designed to compare a new treatment with an old treatment that has already shown its effectiveness. Let &#x03B2;<sub><italic>new</italic></sub> be the mean effect on brain activity of the new treatment and &#x03B2;<sub><italic>old</italic></sub> the mean effect of the old treatment. Then, &#x03B8; = (&#x03B2;<sub><italic>new</italic></sub> &#x2013; &#x03B2;<sub><italic>old</italic></sub>) is the relative effect of the new treatment. The practical significance is defined by the effect size (ES) threshold <italic>&#x03B3;</italic>. If a larger effect on brain activity is preferable, then we can test whether there is a practically meaningful difference in a positive direction (<italic>H</italic><sub>1</sub>: &#x03B8; &#x003E; <italic>&#x03B3;</italic> vs. <italic>H</italic><sub>0</sub>: &#x03B8; &#x2264; <italic>&#x03B3;</italic>). This procedure is known as the <italic>superiority test</italic> (see <xref ref-type="fig" rid="F2">Figure 2A</xref>). We can also test whether the effect of the new treatment is no worse (that is, not practically smaller) than the effect of the old treatment (<italic>H</italic><sub>1</sub>: &#x03B8; &#x003E; &#x2013;<italic>&#x03B3;</italic> vs. <italic>H</italic><sub>0</sub>: &#x03B8; &#x2264; &#x2013;<italic>&#x03B3;</italic>). This procedure is sometimes known as the <italic>non-inferiority test</italic> (see <xref ref-type="fig" rid="F2">Figure 2B</xref>). If a smaller effect on brain activity is preferable, we can use the superiority or non-inferiority test in the opposite direction (see <xref ref-type="fig" rid="F2">Figures 2C,D</xref>). 
The combination of these two superiority tests allows us to find a practically meaningful difference in both directions (<italic>H</italic><sub>1</sub>: &#x03B8; &#x003E; <italic>&#x03B3;</italic> and &#x03B8; &#x003C; &#x2013;<italic>&#x03B3;</italic> vs. <italic>H</italic><sub>0</sub>: &#x2013;<italic>&#x03B3;</italic> &#x2264; &#x03B8; &#x2264; <italic>&#x03B3;</italic>), that is, the <italic>minimum-effect test</italic> (see <xref ref-type="fig" rid="F2">Figure 2E</xref>). The combination of the two non-inferiority tests allows us to reject the hypothesis of practically meaningful differences in any direction (<italic>H</italic><sub>1</sub>: &#x2013;<italic>&#x03B3;</italic> &#x2264; &#x03B8; &#x2264; <italic>&#x03B3;</italic> vs. <italic>H</italic><sub>0</sub>: &#x03B8; &#x003E; <italic>&#x03B3;</italic> and &#x03B8; &#x003C; &#x2013;<italic>&#x03B3;</italic>). This is the most widely used approach to <italic>equivalence testing</italic>, known as the <italic>two one-sided tests</italic> (TOST) procedure (see <xref ref-type="fig" rid="F2">Figure 2F</xref>). For more details on the superiority and minimum-effect tests, see <xref ref-type="bibr" rid="B130">Serlin and Lapsley (1985</xref>, <xref ref-type="bibr" rid="B131">1993)</xref> and <xref ref-type="bibr" rid="B98">Murphy and Myors (1999</xref>, <xref ref-type="bibr" rid="B99">2004)</xref>. For more details on the non-inferiority test and TOST procedure, see <xref ref-type="bibr" rid="B128">Schuirmann (1987)</xref>, <xref ref-type="bibr" rid="B116">Rogers et al. (1993)</xref>, <xref ref-type="bibr" rid="B148">Wellek (2010)</xref>, <xref ref-type="bibr" rid="B92">Meyners (2012)</xref>, and <xref ref-type="bibr" rid="B82">Lakens (2017)</xref>.</p>
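<p>The one-sided tests described above can be sketched as follows. This is an illustration under stated assumptions: the effect estimate, its standard error and the threshold &#x03B3; are hypothetical, and a normal approximation replaces the t-distribution for simplicity. The minimum-effect test is omitted because the way alpha is split between its two directions is not specified here.</p>
<preformat preformat-type="code"><![CDATA[
```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def p_greater(estimate, se, bound):
    """One-sided p-value for H1: theta > bound (normal approximation)."""
    return 1.0 - STANDARD_NORMAL.cdf((estimate - bound) / se)


def p_less(estimate, se, bound):
    """One-sided p-value for H1: theta < bound (normal approximation)."""
    return STANDARD_NORMAL.cdf((estimate - bound) / se)


def interval_null_tests(estimate, se, gamma):
    """p-values for the one-sided interval-null tests and for the TOST procedure."""
    return {
        "superiority_positive": p_greater(estimate, se, gamma),       # H1: theta > gamma
        "non_inferiority_positive": p_greater(estimate, se, -gamma),  # H1: theta > -gamma
        "superiority_negative": p_less(estimate, se, -gamma),         # H1: theta < -gamma
        "non_inferiority_negative": p_less(estimate, se, gamma),      # H1: theta < gamma
        # TOST: both non-inferiority tests must reject, so report the larger p-value.
        "tost_equivalence": max(
            p_greater(estimate, se, -gamma), p_less(estimate, se, gamma)
        ),
    }


# Hypothetical effect estimate, standard error and effect size threshold.
results = interval_null_tests(estimate=0.01, se=0.02, gamma=0.1)
for name, p in results.items():
    print(f"{name:26s} p = {p:.3g}")
```
]]></preformat>
<p>For this hypothetical small, precisely estimated effect, the TOST <italic>p</italic>-value is small while the superiority <italic>p</italic>-value is close to one, so the data support practical equivalence rather than a practically meaningful positive effect.</p>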
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption><p>The alternative (<italic>H</italic><sub>1</sub>) and null (<italic>H</italic><sub>0</sub>) hypotheses for different types of interval-null hypothesis tests. <bold>(A,B)</bold> One-sided tests in the positive direction (&#x2018;the larger is better&#x2019;). <bold>(C,D)</bold> One-sided tests in the negative direction (&#x2018;the smaller is better&#x2019;). <bold>(E)</bold> Combination of both superiority tests. <bold>(F)</bold> Combination of both non-inferiority tests.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g002.tif"/>
</fig>
<p>The interval [&#x2013;<italic>&#x03B3;; &#x03B3;</italic>] defines trivially small effect sizes that we consider to be equivalent to the &#x2018;null effect&#x2019; for practical purposes. This interval is also known as the &#x2018;equivalence interval&#x2019; (<xref ref-type="bibr" rid="B128">Schuirmann, 1987</xref>) or &#x2018;region of practical equivalence (ROPE)&#x2019; (<xref ref-type="bibr" rid="B79">Kruschke, 2011</xref>). The TOST procedure, in contrast to classical NHST, allows us to assess the &#x2018;null effects.&#x2019; If we reject the null hypothesis of a practically meaningful difference, we can conclude that the effect is trivially small. The TOST procedure can also be intuitively related to frequentist interval estimates, known as confidence intervals (&#x2018;confidence interval approach,&#x2019; <xref ref-type="bibr" rid="B150">Westlake, 1972</xref>; <xref ref-type="bibr" rid="B128">Schuirmann, 1987</xref>). Confidence intervals reflect the uncertainty in the point estimate of a parameter, as defined by its standard error. A confidence level of (1 &#x2013; &#x03B1;) means that among infinite &#x2018;hypothetical&#x2019; replications, 100(1 &#x2013; &#x03B1;)% of the confidence intervals will contain the true effect. Because the TOST procedure uses two one-sided tests, each with an alpha level of &#x03B1;, it is operationally identical to checking whether the 100(1 &#x2013; 2&#x03B1;)% confidence interval falls entirely within the ROPE.</p>
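<p>The operational equivalence between the TOST procedure and the confidence interval check can be verified numerically. The sketch below uses a normal approximation and hypothetical estimates, standard errors and threshold; under these assumptions, both decision rules rely on the same quantile and therefore agree.</p>
<preformat preformat-type="code"><![CDATA[
```python
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()


def tost_p(estimate, se, gamma):
    """TOST p-value: the larger of the two one-sided p-values (normal approximation)."""
    p_above_lower = 1.0 - STANDARD_NORMAL.cdf((estimate + gamma) / se)  # H1: theta > -gamma
    p_below_upper = STANDARD_NORMAL.cdf((estimate - gamma) / se)        # H1: theta < gamma
    return max(p_above_lower, p_below_upper)


def confidence_interval(estimate, se, level):
    """Two-sided confidence interval at the given level (normal approximation)."""
    z = STANDARD_NORMAL.inv_cdf(0.5 + level / 2.0)
    return estimate - z * se, estimate + z * se


def equivalent_by_interval(estimate, se, gamma, alpha):
    """True if the (1 - 2*alpha) confidence interval lies entirely inside the ROPE."""
    lower, upper = confidence_interval(estimate, se, 1.0 - 2.0 * alpha)
    return -gamma < lower and upper < gamma


alpha, gamma = 0.05, 0.1
# Hypothetical (estimate, standard error) pairs.
cases = [(0.01, 0.02), (0.08, 0.02), (0.0, 0.1), (-0.05, 0.02)]

by_tost = [tost_p(est, se, gamma) < alpha for est, se in cases]
by_interval = [equivalent_by_interval(est, se, gamma, alpha) for est, se in cases]

for (est, se), a, b in zip(cases, by_tost, by_interval):
    print(f"estimate = {est:+.2f}, se = {se:.2f}: TOST = {a}, interval check = {b}")
```
]]></preformat>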
<p>In fMRI studies, interval-null hypothesis testing can be used for more than comparing the effects of different treatments. For example, we can apply superiority tests in the positive and negative directions to detect &#x2018;activated&#x2019; and &#x2018;deactivated&#x2019; voxels and additionally apply the TOST procedure to detect &#x2018;not activated&#x2019; voxels. However, even though we can resolve Meehl&#x2019;s paradox and assess the &#x2018;null effects&#x2019; by switching from point-null to interval-null hypothesis testing within the frequentist approach, this approach still has fundamental philosophical and practical difficulties, which can be effectively addressed using Bayesian statistics.</p>
</sec>
<sec id="S2.SS3">
<title>Difficulties of the Frequentist Approach</title>
<p>The pitfalls of the frequentist approach have been actively discussed by statisticians and researchers for decades. Here, we briefly mention a few of the main problems associated with the frequentist approach.</p>
<p>(1) NHST is a hybrid of Fisher&#x2019;s approach that focuses on the <italic>p</italic>-value (thought to be a measure of evidence against the null hypothesis), and Neyman-Pearson&#x2019;s approach that focuses on controlling false positives with the alpha level while maximizing true positives in long-run replications. These two approaches are argued to be incompatible and have given rise to several misinterpretations among researchers, for example, confusing the meaning of <italic>p</italic>-values and alpha levels (<xref ref-type="bibr" rid="B32">Edwards et al., 1963</xref>; <xref ref-type="bibr" rid="B51">Gigerenzer, 1993</xref>; <xref ref-type="bibr" rid="B55">Goodman, 1993</xref>; <xref ref-type="bibr" rid="B122">Royall, 1997</xref>; <xref ref-type="bibr" rid="B38">Finch et al., 2001</xref>; <xref ref-type="bibr" rid="B9">Berger, 2003</xref>; <xref ref-type="bibr" rid="B66">Hubbard and Bayarri, 2003</xref>; <xref ref-type="bibr" rid="B141">Turkheimer et al., 2004</xref>; <xref ref-type="bibr" rid="B125">Schneider, 2014</xref>; <xref ref-type="bibr" rid="B107">Perezgonzalez, 2015</xref>; <xref ref-type="bibr" rid="B140">Szucs and Ioannidis, 2017</xref>; <xref ref-type="bibr" rid="B58">Greenland, 2019</xref>).</p>
<p>(2) The logical structure of NHST is the same as that of &#x2018;proof by contradiction&#x2019; or &#x2018;indirect proof,&#x2019; which becomes formally invalid when applied to probabilistic statements (<xref ref-type="bibr" rid="B113">Pollard and Richardson, 1987</xref>; <xref ref-type="bibr" rid="B21">Cohen, 1994</xref>; <xref ref-type="bibr" rid="B35">Falk and Greenbaum, 1995</xref>; <xref ref-type="bibr" rid="B102">Nickerson, 2000</xref>; <xref ref-type="bibr" rid="B135">Sober, 2008</xref>; <xref ref-type="bibr" rid="B125">Schneider, 2014</xref>, <xref ref-type="bibr" rid="B126">2018</xref>; <xref ref-type="bibr" rid="B146">Wagenmakers et al., 2017</xref>; but see <xref ref-type="bibr" rid="B62">Hagen, 1997</xref>). Valid &#x2018;proof by contradiction&#x2019; can be expressed in syllogistic form as: (1) &#x2018;If A, then B&#x2019; (Premise N<underline>o</underline> 1), (2) &#x2018;Not B&#x2019; (Premise N<underline>o</underline> 2), (3) &#x2018;Therefore not A&#x2019; (Conclusion). Probabilistic &#x2018;proof by contradiction&#x2019; in relation to NHST can be formulated as: (1) &#x2018;If <italic>H</italic><sub>0</sub> is true, then <italic>D</italic>&#x002B; is highly unlikely,&#x2019; (2) &#x2018;<italic>D</italic>&#x002B; was obtained,&#x2019; (3) &#x2018;Therefore <italic>H</italic><sub>0</sub> is highly unlikely.&#x2019; This problem is also referred to as the &#x2018;illusion of probabilistic proof by contradiction&#x2019; (<xref ref-type="bibr" rid="B35">Falk and Greenbaum, 1995</xref>). 
To illustrate the fallacy of such logic, consider the following example from <xref ref-type="bibr" rid="B113">Pollard and Richardson (1987)</xref>: (1) &#x2018;If a person is an American (<italic>H</italic><sub>0</sub>), then he is most probably not a member of Congress,&#x2019; (2) &#x2018;The person is a member of Congress,&#x2019; (3) &#x2018;Therefore the person is most probably not an American.&#x2019; Based on this, one &#x2018;rejects the null&#x2019; and makes an obviously wrong inference, as only American citizens can be members of Congress. At the same time, using Bayesian statistics, we can show that the null hypothesis (&#x2018;the person is an American&#x2019;) is true (see the Bayesian solution of the &#x2018;Congress example&#x2019; in the <xref ref-type="supplementary-material" rid="DS1">Supplementary Materials</xref>). The &#x2018;illusion of probabilistic proof by contradiction&#x2019; leads to widespread confusion between the probability of obtaining the data, or more extreme data, under the null <italic>P</italic>(<italic>D</italic>+|<italic>H</italic><sub>0</sub>) and the probability of the null given the data <italic>P</italic>(<italic>H</italic><sub>0</sub>|<italic>D</italic>) (<xref ref-type="bibr" rid="B113">Pollard and Richardson, 1987</xref>; <xref ref-type="bibr" rid="B51">Gigerenzer, 1993</xref>; <xref ref-type="bibr" rid="B21">Cohen, 1994</xref>; <xref ref-type="bibr" rid="B35">Falk and Greenbaum, 1995</xref>; <xref ref-type="bibr" rid="B102">Nickerson, 2000</xref>; <xref ref-type="bibr" rid="B38">Finch et al., 2001</xref>; <xref ref-type="bibr" rid="B64">Hoekstra et al., 2006</xref>; <xref ref-type="bibr" rid="B54">Goodman, 2008</xref>; <xref ref-type="bibr" rid="B59">Greenland et al., 2016</xref>; <xref ref-type="bibr" rid="B147">Wasserstein and Lazar, 2016</xref>; <xref ref-type="bibr" rid="B5">Amrhein et al., 2017</xref>). The latter is a posterior probability calculated based on Bayes&#x2019; rule. 
The fact that researchers usually treat the <italic>p</italic>-value as a continuous measure of evidence (the Fisherian interpretation) only exacerbates this problem. &#x2018;The lower the <italic>p</italic>-value, the stronger the evidence against the null&#x2019; statement can be erroneously transformed to statements such as &#x2018;the lower the <italic>p</italic>-value, the stronger the evidence for the alternative&#x2019; or &#x2018;the higher the <italic>p</italic>-value, the stronger the evidence for the null.&#x2019; NHST can only provide evidence <italic>against</italic>, but never <italic>for</italic>, a hypothesis. In contrast, posterior probability provides direct evidence for a hypothesis; hence, it has a simple intuitive interpretation.</p>
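<p>The &#x2018;Congress example&#x2019; can be worked through numerically with Bayes&#x2019; rule. The sketch below is not the solution given in the Supplementary Materials; the prior and likelihood values are rough round numbers assumed purely for illustration, and the function name is hypothetical.</p>

```python
def posterior_null(prior_h0, lik_h0, lik_h1):
    """Posterior probability of H0 for two mutually exclusive hypotheses."""
    prior_h1 = 1.0 - prior_h0
    evidence = lik_h0 * prior_h0 + lik_h1 * prior_h1  # P(D)
    return lik_h0 * prior_h0 / evidence


# Assumed values, for illustration only.
prior_american = 0.04                      # rough share of the world population
p_congress_given_american = 535 / 330e6    # tiny, as stated in premise (1)
p_congress_given_other = 0.0               # only American citizens can serve

posterior = posterior_null(
    prior_american, p_congress_given_american, p_congress_given_other
)
print(posterior)
```

<p>Although the data are very improbable under the null, they are impossible under the alternative, so the posterior probability of the null equals one whatever prior is assumed (as long as it is not zero).</p>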
<p>(3) The <italic>p</italic>-value is not a plausible measure of evidence (<xref ref-type="bibr" rid="B10">Berger and Berry, 1988</xref>; <xref ref-type="bibr" rid="B11">Berger and Sellke, 1987</xref>; <xref ref-type="bibr" rid="B22">Cornfield, 1966</xref>; <xref ref-type="bibr" rid="B55">Goodman, 1993</xref>; <xref ref-type="bibr" rid="B67">Hubbard and Lindsay, 2008</xref>; <xref ref-type="bibr" rid="B72">Johansson, 2011</xref>; <xref ref-type="bibr" rid="B121">Royall, 1986</xref>; <xref ref-type="bibr" rid="B143">Wagenmakers, 2007</xref>; <xref ref-type="bibr" rid="B144">Wagenmakers et al., 2008</xref>, <xref ref-type="bibr" rid="B146">2017</xref>; <xref ref-type="bibr" rid="B147">Wasserstein and Lazar, 2016</xref>; but see <xref ref-type="bibr" rid="B58">Greenland, 2019</xref>). The frequentist approach considers infinite &#x2018;hypothetical&#x2019; replications of the experiment (sampling distribution); that is, the <italic>p</italic>-value depends on unobserved (&#x2018;more extreme&#x2019;) data. One of the most prominent theorists of Bayesian statistics, Harold Jeffreys, put it as follows: &#x2018;<italic>What the use of P implies, therefore, is that a hypothesis that may be true may be rejected because it has not predicted observable results that have not occurred</italic>&#x2019; (<xref ref-type="bibr" rid="B70">Jeffreys, 1939/1948</xref>, p. 357). In turn, the sampling distribution depends on the researcher&#x2019;s intentions. These intentions may include different kinds of <italic>multiplicities</italic>, such as multiple comparisons, two-sided comparisons, secondary analyses, subgroup analyses, exploratory analyses, preliminary analyses, and interim analyses of sequentially obtained data with optional stopping (<xref ref-type="bibr" rid="B56">Gopalan and Berry, 1998</xref>). Two researchers with different intentions may obtain different <italic>p</italic>-values based on the same dataset. The problem is that these intentions are usually unknown. 
When null findings are considered disappointing, it is tempting to increase the sample size until one obtains a statistically significant result. However, a statistically significant result may arise when the null is, in fact, true, which can be shown by Bayesian statistics. That is, the <italic>p</italic>-value usually exaggerates evidence against the null hypothesis. The discrepancy that may arise between frequentist and Bayesian inference is also known as the Jeffreys&#x2013;Lindley paradox (<xref ref-type="bibr" rid="B70">Jeffreys, 1939/1948</xref>; <xref ref-type="bibr" rid="B86">Lindley, 1957</xref>). In addition, it is argued that a consistent measure of evidence should not depend on the sample size (<xref ref-type="bibr" rid="B22">Cornfield, 1966</xref>). However, identical <italic>p</italic>-values provide different evidence against the null hypothesis for small and large sample sizes (<xref ref-type="bibr" rid="B143">Wagenmakers, 2007</xref>). In contrast, evidence provided by posterior probabilities and Bayes factors depends only on the exact observed data and the prior, and does not depend on the testing or stopping intentions or the sample size (<xref ref-type="bibr" rid="B143">Wagenmakers, 2007</xref>; <xref ref-type="bibr" rid="B80">Kruschke and Liddell, 2017b</xref>).</p>
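<p>The Jeffreys&#x2013;Lindley discrepancy can be illustrated with a textbook model. The sketch below assumes a point null (effect equal to zero), data with unit variance, and a Gaussian prior with standard deviation <italic>tau</italic> on the effect under the alternative; these choices and the function name are assumptions made for illustration and are not part of the analysis described in this paper. It holds the z-statistic fixed at 1.96 (a two-sided <italic>p</italic>-value of about 0.05) and varies only the sample size.</p>

```python
import math


def bf01_fixed_z(z, n, tau=1.0):
    """Bayes factor in favor of a point null, given a z-statistic and sample size.

    Assumes unit-variance data and the prior theta ~ N(0, tau^2) under H1,
    so the sample mean is N(0, 1/n) under H0 and N(0, tau^2 + 1/n) under H1.
    """
    r = n * tau ** 2
    return math.sqrt(1.0 + r) * math.exp(-0.5 * z ** 2 * r / (1.0 + r))


z = 1.96  # the same 'just significant' result at every sample size
for n in (10, 100, 1000, 10000):
    print(n, round(bf01_fixed_z(z, n), 2))
```

<p>Under these assumptions, the Bayes factor favors the alternative for small samples and increasingly favors the null as the sample size grows, although the <italic>p</italic>-value is identical in every case.</p>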
<p>(4) Although frequentist interval estimates (<xref ref-type="bibr" rid="B20">Cohen, 1990</xref>, <xref ref-type="bibr" rid="B21">1994</xref>; <xref ref-type="bibr" rid="B26">Cumming, 2013</xref>) and interval-based hypothesis testing (<xref ref-type="bibr" rid="B99">Murphy and Myors, 2004</xref>; <xref ref-type="bibr" rid="B148">Wellek, 2010</xref>; <xref ref-type="bibr" rid="B92">Meyners, 2012</xref>; <xref ref-type="bibr" rid="B82">Lakens, 2017</xref>) help to mitigate the abovementioned pitfalls in data interpretation, they are still subject to some of the same types of problems as the <italic>p</italic>-values and classic NHST (<xref ref-type="bibr" rid="B23">Cortina and Dunlap, 1997</xref>; <xref ref-type="bibr" rid="B102">Nickerson, 2000</xref>; <xref ref-type="bibr" rid="B8">Belia et al., 2005</xref>; <xref ref-type="bibr" rid="B144">Wagenmakers et al., 2008</xref>; <xref ref-type="bibr" rid="B65">Hoekstra et al., 2014</xref>; <xref ref-type="bibr" rid="B93">Morey et al., 2015</xref>; <xref ref-type="bibr" rid="B59">Greenland et al., 2016</xref>; <xref ref-type="bibr" rid="B81">Kruschke and Liddell, 2017a</xref>). Confidence intervals also depend on unobserved data and the intentions of the researcher. Moreover, the meaning of confidence intervals seems counterintuitive to many researchers. For example, one of the most common misinterpretations of the (1 &#x2013; &#x03B1;)% confidence interval is that the probability of finding an effect within the confidence interval is (1 &#x2013; &#x03B1;)%. In fact, this interpretation is correct only for a Bayesian interval estimate known as a <italic>credible</italic> interval.</p>
<p>Nevertheless, we would like to emphasize that we do not advocate abandoning the frequentist approach. Correctly interpreted frequentist interval-based hypothesis testing with <italic>a priori</italic> power analysis defining the sample size and proper multiplicity adjustments often leads to conclusions similar to those of Bayesian inference (<xref ref-type="bibr" rid="B83">Lakens et al., 2018</xref>). However, it may be logically and practically difficult to carry out an appropriate power analysis and make multiplicity adjustments (<xref ref-type="bibr" rid="B13">Berry and Hochberg, 1999</xref>; <xref ref-type="bibr" rid="B24">Cramer et al., 2015</xref>; <xref ref-type="bibr" rid="B137">Streiner, 2015</xref>; <xref ref-type="bibr" rid="B127">Sch&#x00F6;nbrodt et al., 2017</xref>; <xref ref-type="bibr" rid="B133">Sj&#x00F6;lander and Vansteelandt, 2019</xref>). These procedures may be even more complicated in fMRI research than in psychological or social studies (see discussion on power analysis in <xref ref-type="bibr" rid="B97">Mumford and Nichols, 2008</xref>; <xref ref-type="bibr" rid="B74">Joyce and Hayasaka, 2012</xref>; <xref ref-type="bibr" rid="B96">Mumford, 2012</xref>; <xref ref-type="bibr" rid="B25">Cremers et al., 2017</xref>; <xref ref-type="bibr" rid="B110">Poldrack et al., 2017</xref>; multiple comparisons in <xref ref-type="bibr" rid="B100">Nichols and Hayasaka, 2003</xref>; <xref ref-type="bibr" rid="B101">Nichols, 2012</xref>; <xref ref-type="bibr" rid="B34">Eklund et al., 2016</xref>; and other types of multiplicities in <xref ref-type="bibr" rid="B141">Turkheimer et al., 2004</xref>; <xref ref-type="bibr" rid="B15">Chen et al., 2018</xref>, <xref ref-type="bibr" rid="B18">2019</xref>, <xref ref-type="bibr" rid="B17">2020</xref>; <xref ref-type="bibr" rid="B3">Alberton et al., 2020</xref>). 
For example, at the beginning of a long-term study, one may want to check whether stimulus onset timings are precisely synchronized with fMRI data collection and perform a preliminary analysis on the first five subjects. The question of whether the researcher must make an adjustment for this technical check when reporting the results for the final sample becomes important in the frequentist approach. Such preliminary analyses (or other forms of interim analyses) are generally not considered a source of concern in Bayesian inference because posterior probabilities do not depend on the sampling plan (for discussion, see <xref ref-type="bibr" rid="B12">Berry, 1988</xref>; <xref ref-type="bibr" rid="B10">Berger and Berry, 1988</xref>; <xref ref-type="bibr" rid="B32">Edwards et al., 1963</xref>; <xref ref-type="bibr" rid="B143">Wagenmakers, 2007</xref>; <xref ref-type="bibr" rid="B80">Kruschke and Liddell, 2017b</xref>; <xref ref-type="bibr" rid="B119">Rouder, 2014</xref>; <xref ref-type="bibr" rid="B127">Sch&#x00F6;nbrodt et al., 2017</xref>). Or, for example, one may want to find both &#x2018;activated/deactivated&#x2019; and &#x2018;not activated&#x2019; brain areas and use two superiority tests in combination with the TOST procedure. It is not trivial to make appropriate multiplicity adjustments in this case. In contrast, Bayesian inference suggests a single decision rule without the need for additional adjustments. Moreover, to our knowledge, practical implementations of superiority tests and the TOST procedure in common software for fMRI data analysis do not yet exist. At the same time, Bayesian analysis has already been implemented in SPM12<sup><xref ref-type="fn" rid="footnote2">2</xref></sup> and is easily accessible to end-users. It consists of two steps: Bayesian parameter estimation and Bayesian inference. In general, it is not necessary to use Bayesian analysis at the subject level of analysis to apply it at the group level. 
One can combine computationally less demanding frequentist parameter estimation for single subjects with Bayesian estimation and inference at the group level. In the next sections, we consider the group-level Bayesian analysis implemented in SPM12.</p>
</sec>
<sec id="S2.SS4">
<title>Bayesian Parameter Estimation</title>
<p>Bayesian statistics is based on Bayes&#x2019; rule:</p>
<disp-formula id="S2.E2"><label>(2)</label><mml:math id="M2"><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mpadded width="-1.7pt"><mml:mi>H</mml:mi></mml:mpadded><mml:mo rspace="0.8pt">|</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mpadded width="+5pt"><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mpadded width="-1.7pt"><mml:mi>D</mml:mi></mml:mpadded><mml:mo rspace="0.8pt">|</mml:mo><mml:mi>H</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>H</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mpadded></mml:mrow></mml:math></disp-formula>
<p>where <italic>P</italic>(<italic>H|D</italic>) is the probability of the hypothesis given the obtained data or posterior probability. <italic>P</italic>(<italic>D</italic>|<italic>H</italic>) is the probability of obtaining the <italic>exact</italic> data given the hypothesis or the likelihood (notice the difference from <italic>P</italic>(<italic>D</italic>+|<italic>H</italic>), which includes <italic>more extreme</italic> data). <italic>P</italic>(<italic>H</italic>) is the prior probability of the hypothesis (our knowledge of the hypothesis before we obtain the data). <italic>P</italic>(<italic>D</italic>) is a normalizing constant ensuring that the sum of posterior probabilities over all possible hypotheses equals one (marginal likelihood). In the case of mutually exclusive hypotheses, the denominator of Bayes&#x2019; rule is the sum, over all possible hypotheses, of the probability of obtaining the data under each hypothesis multiplied by the prior probability of that hypothesis. For example, if we consider two mutually exclusive hypotheses <italic>H</italic><sub>0</sub> and <italic>H</italic><sub>1</sub>, then <italic>P</italic>(<italic>D</italic>) = <italic>P</italic>(<italic>D</italic>|<italic>H</italic><sub>0</sub>) <italic>P</italic>(<italic>H</italic><sub>0</sub>) + <italic>P</italic>(<italic>D</italic>|<italic>H</italic><sub>1</sub>)<italic>P</italic>(<italic>H</italic><sub>1</sub>) and <italic>P</italic>(<italic>H</italic><sub>0</sub><italic>|D</italic>) + <italic>P</italic>(<italic>H</italic><sub>1</sub><italic>|D</italic>) = 1. When we consider continuous hypotheses, the denominator is obtained by integrating over all hypotheses (parameter spaces). For relatively simple models, these integrals can be solved analytically. However, for more complex models, the integrals become analytically intractable. 
In this case, there are two main approaches to obtain the posterior probability: (1) use computationally demanding numerical integration (Markov chain Monte Carlo methods); (2) use less accurate but computationally efficient analytical approximations to the posterior distribution (e.g., Expectation Maximization or Variational Bayes techniques). A description of these procedures is beyond the scope of this paper and can be found elsewhere (for their implementations in fMRI analysis, see <xref ref-type="bibr" rid="B49">Genovese, 2000</xref>; <xref ref-type="bibr" rid="B41">Friston et al., 2002a</xref>,<xref ref-type="bibr" rid="B43">2007</xref>; <xref ref-type="bibr" rid="B44">Friston and Penny, 2003</xref>; <xref ref-type="bibr" rid="B104">Penny et al., 2003</xref>, <xref ref-type="bibr" rid="B106">2005</xref>, <xref ref-type="bibr" rid="B103">2007</xref>; <xref ref-type="bibr" rid="B105">Penny and Ridgway, 2013</xref>; <xref ref-type="bibr" rid="B152">Woolrich et al., 2004</xref>).</p>
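<p>For a one-parameter model, the role of the normalizing integral can be seen by comparing a brute-force numerical solution with the closed-form one. The sketch below is a toy illustration, not one of the methods cited above: it assumes a single observed effect estimate with a known standard error, a zero-mean Gaussian prior, and a simple grid sum in place of Markov chain Monte Carlo. All function names and numbers are hypothetical.</p>

```python
import math


def normal_pdf(x, mean, sd):
    """Density of a Gaussian distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2.0 * math.pi))


def posterior_mean_grid(y, se, prior_sd, half_width=10.0, step=0.001):
    """Posterior mean of theta obtained by summing over a grid of theta values."""
    n_points = int(round(2.0 * half_width / step)) + 1
    evidence = 0.0   # approximates P(D), the integral of likelihood times prior
    weighted = 0.0   # approximates the integral of theta times likelihood times prior
    for i in range(n_points):
        theta = -half_width + i * step
        joint = normal_pdf(y, theta, se) * normal_pdf(theta, 0.0, prior_sd)
        evidence += joint * step
        weighted += theta * joint * step
    return weighted / evidence


def posterior_mean_analytic(y, se, prior_sd):
    """Closed-form posterior mean for a Gaussian likelihood and zero-mean Gaussian prior."""
    return y * prior_sd ** 2 / (prior_sd ** 2 + se ** 2)


print(posterior_mean_grid(1.2, 0.5, 1.0), posterior_mean_analytic(1.2, 0.5, 1.0))
```

<p>The grid approach scales poorly with the number of parameters, which is why sampling or analytical approximations are used for realistic models.</p>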
<p>In verbal form, Bayes&#x2019; rule can be expressed as:</p>
<disp-formula id="S2.Ex1"><mml:math id="M3"><mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi></mml:mrow><mml:mo>&#x221D;</mml:mo><mml:mrow><mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mi>i</mml:mi><mml:mi>k</mml:mi><mml:mi>e</mml:mi><mml:mi>l</mml:mi><mml:mi>i</mml:mi><mml:mi>h</mml:mi><mml:mi>o</mml:mi><mml:mi>o</mml:mi><mml:mi>d</mml:mi></mml:mrow><mml:mo>&#x00D7;</mml:mo><mml:mi>P</mml:mi></mml:mrow><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>This means that we can update our prior beliefs about the hypothesis based on the obtained data.</p>
<p>One of the main difficulties in using Bayesian statistics, in addition to the computational complexity, is the choice of appropriate prior assumptions. The prior can be chosen based on theoretical arguments or from independent experimental data (full Bayes approach). At the same time, if the data are organized hierarchically, which is the case for neuroimaging data, priors can be specified based on the obtained data itself using an empirical Bayes approach. The lower level of the hierarchy corresponds to the experimental effects at any given voxel, and the higher level of the hierarchy comprises the effect over all voxels. Thus, the variance of the experimental effect over all voxels can be used as the prior variance of the effect at any given voxel. This approach is known as the parametric empirical Bayes (PEB) with the &#x2018;global shrinkage&#x2019; prior (<xref ref-type="bibr" rid="B44">Friston and Penny, 2003</xref>). The prior variance is estimated from the data under the assumption that the prior probability density corresponds to a Gaussian distribution with zero mean. In other words, a global experimental effect is assumed to be absent. An increase in local activity can be detected in some brain areas; a decrease can be found in others, but the total change in neural metabolism in the whole brain is approximately zero. This is a reasonable physiological assumption because studies of brain energy metabolism have shown that the global metabolism is &#x2018;remarkably constant despite widely varying mental and motoric activity&#x2019; (<xref ref-type="bibr" rid="B114">Raichle and Gusnard, 2002</xref>), and &#x2018;the changes in the global measurements of blood flow and metabolism&#x2019; are &#x2018;too small to be measured&#x2019; by functional imaging techniques such as PET and fMRI (<xref ref-type="bibr" rid="B61">Gusnard and Raichle, 2001</xref>).</p>
<p>Now, we can rewrite Bayes&#x2019; rule (eq. 2) for the effect &#x03B8; = <italic>c</italic>&#x03B2;:</p>
<disp-formula id="S2.E3"><label>(3)</label><mml:math id="M4"><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>|</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mpadded width="+5pt"><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>|</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mpadded></mml:mrow></mml:math></disp-formula>
<p>In the process of Bayesian updating with the &#x2018;global shrinkage&#x2019; prior, the effect estimate &#x2018;shrinks&#x2019; toward zero. The greater the uncertainty of the effect estimate (variability) in a particular voxel, the less confidence in this estimate, and the more it shrinks (see <xref ref-type="fig" rid="F3">Figure 3</xref>).</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption><p>Schematic of Bayesian updating with the &#x2018;global shrinkage&#x2019; prior.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g003.tif"/>
</fig>
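<p>The shrinkage effect can be sketched for a single voxel as follows. This is a minimal illustration that assumes a Gaussian likelihood, a zero-mean Gaussian prior, and a prior standard deviation that is already known; it does not reproduce how SPM12 estimates the prior variance from the data, and the function name and numbers are hypothetical.</p>

```python
def shrink(estimate, se, prior_sd):
    """Posterior mean and sd for a zero-mean Gaussian prior and Gaussian likelihood."""
    prior_var, noise_var = prior_sd ** 2, se ** 2
    # Weight given to the data: close to 1 for precise estimates, close to 0 for noisy ones.
    weight = prior_var / (prior_var + noise_var)
    post_mean = weight * estimate
    post_sd = (weight * noise_var) ** 0.5
    return post_mean, post_sd


# The same raw estimate in a precise voxel and in a noisy voxel.
precise_mean, precise_sd = shrink(1.0, se=0.2, prior_sd=0.5)
noisy_mean, noisy_sd = shrink(1.0, se=1.0, prior_sd=0.5)
print(precise_mean, noisy_mean)
```

<p>Under these assumptions, the noisy voxel is pulled much closer to zero than the precise one, even though both start from the same raw estimate.</p>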
<p>The assumption of a Gaussian prior, likelihood, and posterior substantially reduces computational demands for Bayesian analysis. However, the normality assumption can be violated for empirical data. For example, violations can be observed in the presence of outliers, particularly with small sample sizes or unbalanced designs, which diminishes the validity of the statistical analysis. This problem is not specific to Bayesian analysis but is inherent to all group-level analyses that assume a normal distribution of the effect. Nevertheless, in fMRI studies, the most common approach is to use the Gaussian general linear models (<xref ref-type="bibr" rid="B112">Poline and Brett, 2012</xref>), which have been shown to be robust against violations of the normality assumption (<xref ref-type="bibr" rid="B77">Knief and Forstmeier, 2021</xref>). Still, we need to ensure that these assumptions are not violated substantially. If they are, one can use Bayesian estimation based on non-Gaussian distributions. In this work, we consider Bayesian estimation with the Gaussian &#x2018;global shrinkage&#x2019; prior implemented in SPM12.</p>
<p>After Bayesian parameter estimation, we can apply one of the two main types of Bayesian inference (<xref ref-type="bibr" rid="B105">Penny and Ridgway, 2013</xref>): <italic>Bayesian parameter inference (BPI)</italic> or <italic>Bayesian model inference (BMI)</italic>. BPI is also known as Bayesian parameter estimation (<xref ref-type="bibr" rid="B80">Kruschke and Liddell, 2017b</xref>). However, we deliberately separate these two terms, as they correspond to two different steps of data analysis in SPM12. BMI is also known as Bayesian model comparison, Bayesian model selection, or Bayesian hypothesis testing (<xref ref-type="bibr" rid="B80">Kruschke and Liddell, 2017b</xref>). We chose the term BMI as it is consonant with the term BPI.</p>
</sec>
<sec id="S2.SS5">
<title>Bayesian Parameter Inference</title>
<p>The BPI is based on the posterior probability of finding the effect within or outside the ROPE. Let effects larger than the ES threshold <italic>&#x03B3;</italic> be &#x2018;activations,&#x2019; those smaller than &#x2013;<italic>&#x03B3;</italic> be &#x2018;deactivations,&#x2019; and those falling within the ROPE [&#x2013;<italic>&#x03B3;</italic>; <italic>&#x03B3;</italic>] be &#x2018;no activations.&#x2019; Then, we can classify voxels as &#x2018;activated,&#x2019; &#x2018;deactivated,&#x2019; or &#x2018;not activated&#x2019; if:</p>
<disp-formula id="S2.E4"><label>(4.1)</label><mml:math id="M5"><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>&#x003E;</mml:mo><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo>|</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2265;</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>h</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></disp-formula>
<disp-formula id="S2.E5"><label>(4.2)</label><mml:math id="M6"><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>d</mml:mi><mml:mi>e</mml:mi><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>&#x003C;</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo>|</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2265;</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>h</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></disp-formula>
<disp-formula id="S2.E6"><label>(4.3)</label><mml:math id="M7"><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo>|</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>&#x2265;</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>t</mml:mi><mml:mi>h</mml:mi><mml:mi>r</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:math></disp-formula>
<p>where <italic>P</italic><sub><italic>thr</italic></sub> is the posterior probability threshold (usually <italic>P</italic><sub><italic>thr</italic></sub> = 95%). Note that <italic>P</italic><sub><italic>act</italic></sub> + <italic>P</italic><sub><italic>deact</italic></sub> + <italic>P</italic><sub><italic>null</italic></sub> = 1.</p>
<p>If none of the above criteria are satisfied, the data in a particular voxel are insufficient to distinguish voxels that are &#x2018;activated/deactivated&#x2019; from those that are &#x2018;not activated.&#x2019; Hereinafter, we refer to them as &#x2018;low confidence&#x2019; voxels (<xref ref-type="bibr" rid="B89">Magerkurth et al., 2015</xref>). This decision rule is also known as the &#x2018;ROPE-only&#x2019; rule (<xref ref-type="bibr" rid="B81">Kruschke and Liddell, 2017a</xref>), see also <xref ref-type="bibr" rid="B60">Greenwald (1975)</xref>; <xref ref-type="bibr" rid="B148">Wellek (2010)</xref>, <xref ref-type="bibr" rid="B84">Liao et al. (2019)</xref>. To the best of our knowledge, the application of this decision rule to neuroimaging data was pioneered by <xref ref-type="bibr" rid="B41">Friston et al. (2002a</xref>; <xref ref-type="bibr" rid="B42">2002b</xref>; <xref ref-type="bibr" rid="B44">Friston and Penny, 2003)</xref>. For convenience and visualization purposes, we can use the natural logarithm of the posterior probability odds (LPO), for example:</p>
<disp-formula id="S2.E7"><label>(5)</label><mml:math id="M8"><mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mi>P</mml:mi><mml:msub><mml:mi>O</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mi>l</mml:mi><mml:mi>n</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>d</mml:mi><mml:mi>e</mml:mi><mml:mi>a</mml:mi><mml:mi>c</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mi>l</mml:mi><mml:mi>n</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msub><mml:mi>P</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mo rspace="7.5pt">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>This allows us to more effectively discriminate voxels with a posterior probability close to unity (<xref ref-type="bibr" rid="B105">Penny and Ridgway, 2013</xref>). <italic>LPO</italic><sub><italic>null</italic></sub> &#x003E; 3 corresponds approximately to <italic>P</italic><sub><italic>null</italic></sub> &#x003E; 95% (the exact value at 95% is ln(19) &#x2248; 2.94). In addition, <italic>LPO</italic> also allows us to identify the connection between BPI and BMI. The maps of the <italic>LPO</italic> are termed posterior probability maps (PPMs) in SPM12.</p>
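<p>The &#x2018;ROPE-only&#x2019; decision rule and the log posterior odds can be sketched for a single voxel as follows. The sketch assumes that the posterior distribution of the effect is Gaussian with a known mean and standard deviation; it is not the SPM12 implementation, and the function names and example values are hypothetical.</p>

```python
import math
from statistics import NormalDist


def bpi_classify(post_mean, post_sd, gamma, p_thr=0.95):
    """Classify a voxel from its Gaussian posterior using the ROPE [-gamma, gamma]."""
    posterior = NormalDist(post_mean, post_sd)
    p_deact = posterior.cdf(-gamma)         # P(theta below -gamma | D)
    p_act = 1.0 - posterior.cdf(gamma)      # P(theta above +gamma | D)
    p_null = 1.0 - p_act - p_deact          # P(theta inside the ROPE | D)
    if p_act >= p_thr:
        label = "activated"
    elif p_deact >= p_thr:
        label = "deactivated"
    elif p_null >= p_thr:
        label = "not activated"
    else:
        label = "low confidence"
    return label, p_null


def log_posterior_odds(p):
    """Natural logarithm of the posterior odds p / (1 - p); undefined for p equal to 0 or 1."""
    return math.log(p / (1.0 - p))


print(bpi_classify(0.0, 0.1, 0.5)[0])
print(bpi_classify(2.0, 0.3, 0.5)[0])
print(bpi_classify(0.4, 0.3, 0.5)[0])
print(log_posterior_odds(0.95))
```

<p>In practice, posterior probabilities that are numerically equal to one need special handling before the odds are computed, because the denominator becomes zero.</p>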
<p>Another possible decision rule considers the overlap between ROPE and the 95% highest density interval (HDI). HDI is a type of credible interval (Bayesian analog of the confidence interval), which contains only the effects with the highest posterior probability density. If the HDI falls entirely inside the ROPE, we can classify voxels as &#x2018;not activated.&#x2019; In contrast, if the HDI lies completely outside the ROPE, we can classify voxels as either &#x2018;activated&#x2019; or &#x2018;deactivated.&#x2019; If the HDI overlaps with the ROPE, we cannot make a confident decision (we can consider them to be &#x2018;low confidence&#x2019; voxels). This decision rule is known as the &#x2018;HDI+ROPE&#x2019; rule (<xref ref-type="bibr" rid="B81">Kruschke and Liddell, 2017a</xref>). It is more conservative than the &#x2018;ROPE-only&#x2019; rule because it does not consider the effects from the low-density tails of the posterior probability distribution. Differences between the &#x2018;HDI+ROPE&#x2019; rule and the &#x2018;ROPE-only&#x2019; rule are most evident for strongly skewed distributions. In such cases, the ROPE may contain more than 95% of the posterior probability distribution, but the 95% HDI may still extend beyond the ROPE. In the case of a Gaussian posterior probability distribution, both decision rules should produce similar results. The &#x2018;HDI+ROPE&#x2019; rule is advocated by <xref ref-type="bibr" rid="B81">Kruschke and Liddell (2017a)</xref> and the &#x2018;ROPE-only&#x2019; rule is preferred by <xref ref-type="bibr" rid="B41">Friston et al. (2002a</xref>; <xref ref-type="bibr" rid="B42">2002b</xref>; <xref ref-type="bibr" rid="B44">Friston and Penny, 2003)</xref>, <xref ref-type="bibr" rid="B148">Wellek (2010)</xref>; <xref ref-type="bibr" rid="B84">Liao et al. (2019)</xref>. These decision rules are illustrated in <xref ref-type="fig" rid="F4">Figure 4</xref>.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption><p>Possible variants of the posterior probability distributions of the effect &#x03B8; = <italic>c</italic>&#x03B2; in <bold>(A)</bold> &#x2018;activated&#x2019; voxels, <bold>(B)</bold> &#x2018;deactivated&#x2019; voxels, <bold>(C)</bold> &#x2018;not activated&#x2019; voxels, and <bold>(D)</bold> &#x2018;low confidence&#x2019; voxels. The &#x2018;ROPE-only&#x2019; rule considers only the colored parts of the distributions. The &#x2018;HDI+ROPE&#x2019; rule considers the overlap between the ROPE and the 95% HDI.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g004.tif"/>
</fig>
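<p>The two decision rules described above can be sketched for a single voxel as follows. This is a minimal illustration that assumes a Gaussian posterior summarized by its mean and standard deviation, for which the 95% HDI coincides with the central 95% interval; the function name, argument names, and example values are hypothetical and do not reproduce the toolbox code.</p>

```python
from statistics import NormalDist


def classify_voxel(post_mean, post_sd, gamma, p_thr=0.95):
    """Classify one voxel under both decision rules (Gaussian posterior assumed)."""
    post = NormalDist(post_mean, post_sd)
    p_act = 1.0 - post.cdf(gamma)      # posterior mass above +gamma
    p_deact = post.cdf(-gamma)         # posterior mass below -gamma
    p_null = 1.0 - p_act - p_deact     # posterior mass inside the ROPE

    # 'ROPE-only' rule: compare posterior masses with the probability threshold.
    if p_act > p_thr:
        rope_only = "activated"
    elif p_deact > p_thr:
        rope_only = "deactivated"
    elif p_null > p_thr:
        rope_only = "not activated"
    else:
        rope_only = "low confidence"

    # 'HDI+ROPE' rule: compare the HDI limits with the ROPE limits.
    tail = (1.0 - p_thr) / 2.0
    hdi_lo, hdi_hi = post.inv_cdf(tail), post.inv_cdf(1.0 - tail)
    if hdi_lo > gamma:
        hdi_rope = "activated"
    elif -gamma > hdi_hi:
        hdi_rope = "deactivated"
    elif hdi_lo >= -gamma and gamma >= hdi_hi:
        hdi_rope = "not activated"
    else:
        hdi_rope = "low confidence"
    return rope_only, hdi_rope
```

<p>For instance, with a hypothetical posterior mean of 0.02%, posterior SD of 0.045%, and <italic>&#x03B3;</italic> = 0.1%, about 96% of the posterior mass lies inside the ROPE, yet the upper HDI limit exceeds <italic>&#x03B3;</italic>; the sketch therefore labels the voxel &#x2018;not activated&#x2019; under the &#x2018;ROPE-only&#x2019; rule and &#x2018;low confidence&#x2019; under the &#x2018;HDI+ROPE&#x2019; rule.</p>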
</sec>
<sec id="S2.SS6">
<title>Bayesian Model Inference</title>
<p>With BPI, we consider the posterior probabilities of the linear contrast of parameters &#x03B8; = <italic>c</italic>&#x03B2;. Instead, we can consider models using BMI.</p>
<p>Let <italic>H</italic><sub><italic>alt</italic></sub> and <italic>H</italic><sub><italic>null</italic></sub> be two non-overlapping hypotheses represented by models <italic>M</italic><sub><italic>alt</italic></sub> and <italic>M</italic><sub><italic>null</italic></sub>. These models are defined by two parameter spaces: (1) <italic>M</italic><sub><italic>alt</italic></sub>: &#x03B8; &#x003E; <italic>&#x03B3;</italic> or &#x03B8; &#x003C; &#x2013;<italic>&#x03B3;</italic>, and (2) <italic>M</italic><sub><italic>null</italic></sub>: &#x2013;<italic>&#x03B3;</italic> &#x2264; &#x03B8; &#x2264; <italic>&#x03B3;</italic>.</p>
<p>Now, we can rewrite Bayes&#x2019; rule (eq. 2) for <italic>M</italic><sub><italic>alt</italic></sub> and <italic>M</italic><sub><italic>null</italic></sub>:</p>
<disp-formula id="S2.E8"><label>(6.1)</label><mml:math id="M9"><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula>
<disp-formula id="S2.E9"><label>(6.2)</label><mml:math id="M10"><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow></mml:math></disp-formula>
<p>If we divide equation (6.1) by (6.2), <italic>P</italic>(<italic>D</italic>) is canceled out, and we obtain:</p>
<disp-formula id="S2.E10"><label>(7)</label><mml:math id="M11"><mml:mrow><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo><mml:mi>D</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mrow><mml:mpadded width="+5pt"><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mpadded><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>In verbal form equation (7) can be expressed as:</p>
<p><italic>Posterior Odds = Bayes Factor</italic> &#x00D7; <italic>Prior Odds</italic></p>
<p>The Bayes factor (<italic>BF</italic>) is a multiplier that converts prior model probability odds to posterior model probability odds. It indicates the relative evidence for one model against another. For example, if <inline-formula><mml:math id="INEQ1"><mml:mrow><mml:mrow><mml:mi>B</mml:mi><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi>D</mml:mi><mml:mo>|</mml:mo><mml:msub><mml:mi>M</mml:mi><mml:mrow><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo>=</mml:mo><mml:mn>2</mml:mn></mml:mrow></mml:math></inline-formula>, then the observed data are twice as likely under the null model as under the alternative.</p>
<p>A connection exists between BPI (eq. 3&#x2013;5) and BMI (eq. 7) (see <xref ref-type="bibr" rid="B94">Morey and Rouder, 2011</xref>; <xref ref-type="bibr" rid="B84">Liao et al., 2019</xref>):</p>
<disp-formula id="S2.E11"><label>(8)</label><mml:math id="M12"><mml:mrow><mml:mrow><mml:mi>B</mml:mi><mml:msub><mml:mi>F</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo stretchy="false">|</mml:mo><mml:mi>D</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo stretchy="false">|</mml:mo><mml:mi>D</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo>)</mml:mo></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mo>-</mml:mo><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>&#x2264;</mml:mo><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo 
rspace="7.5pt">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>or, in verbal form:</p>
<disp-formula id="S2.Ex2"><mml:math id="M13"><mml:mrow><mml:mrow><mml:mi>B</mml:mi><mml:mi>F</mml:mi><mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>R</mml:mi><mml:mi>O</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mi>R</mml:mi><mml:mi>O</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mi>o</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mpadded width="+5pt"><mml:mi>r</mml:mi></mml:mpadded><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>&#x2209;</mml:mo><mml:mi>R</mml:mi><mml:mi>O</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>&#x2209;</mml:mo><mml:mi>R</mml:mi><mml:mi>O</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mi>R</mml:mi><mml:mi>O</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mo 
stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>For convenience, <italic>BF</italic> may also be expressed in the form of a natural logarithm:</p>
<disp-formula id="S2.E12"><label>(9)</label><mml:math id="M14"><mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mi>B</mml:mi><mml:mi>F</mml:mi><mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>R</mml:mi><mml:mi>O</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mi>P</mml:mi><mml:mpadded width="+5pt"><mml:msub><mml:mi>O</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:mpadded></mml:mrow><mml:mo>+</mml:mo><mml:mrow><mml:mi>l</mml:mi><mml:mpadded width="+5pt"><mml:mi>n</mml:mi></mml:mpadded><mml:mrow><mml:mo>(</mml:mo><mml:mfrac><mml:mrow><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>&#x2209;</mml:mo><mml:mi>R</mml:mi><mml:mi>O</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:mi>r</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mi mathvariant="normal">&#x03B8;</mml:mi><mml:mo>&#x2208;</mml:mo><mml:mi>R</mml:mi><mml:mi>O</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:mfrac><mml:mo rspace="7.5pt">)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mrow></mml:math></disp-formula>
<disp-formula id="S2.E13"><label>(10)</label><mml:math id="M15"><mml:mrow><mml:mrow><mml:mi>L</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mi>B</mml:mi><mml:mi>F</mml:mi><mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>R</mml:mi><mml:mi>O</mml:mi><mml:mi>P</mml:mi><mml:mi>E</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x221D;</mml:mo><mml:mrow><mml:mi>L</mml:mi><mml:mi>P</mml:mi><mml:mpadded width="+5pt"><mml:msub><mml:mi>O</mml:mi><mml:mrow><mml:mi>n</mml:mi><mml:mi>u</mml:mi><mml:mi>l</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:mpadded></mml:mrow></mml:mrow></mml:math></disp-formula>
<p>The calculation of <italic>BF</italic> may be computationally challenging, as it requires integration over the parameter space. However, if the ROPE has zero width (point-null hypothesis), then the <italic>BF</italic> has an analytical solution known as the Savage&#x2013;Dickey ratio (SDR) (<xref ref-type="bibr" rid="B145">Wagenmakers et al., 2010</xref>; <xref ref-type="bibr" rid="B45">Friston and Penny, 2011</xref>; <xref ref-type="bibr" rid="B117">Rosa et al., 2012</xref>; <xref ref-type="bibr" rid="B105">Penny and Ridgway, 2013</xref>). <italic>BF(SDR)</italic><sub><italic>null</italic></sub> is calculated by dividing the posterior probability density by the prior probability density at &#x03B8; = 0. The interpretation of <italic>BF(SDR)</italic><sub><italic>null</italic></sub> is simple: if the effect size is less likely to equal zero after obtaining the data than before, then <italic>BF(SDR)</italic><sub><italic>null</italic></sub> &#x003C; 1; that is, we have more evidence for <italic>M</italic><sub><italic>alt</italic></sub>. See the schematic illustration of BMI based on interval-null and point-null hypotheses and its relation to BPI in <xref ref-type="fig" rid="F5">Figure 5</xref>.</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption><p>Schematic of <italic>BFs</italic> used in BMI and their relation to <italic>LPO</italic> used in BPI. <bold>(A)</bold> BPI based on the &#x2018;ROPE-only&#x2019; decision rule. <bold>(B)</bold> <italic>BF(ROPE)</italic> is related to the areas under the posterior and prior probability density functions inside and outside the ROPE. <bold>(C)</bold> <italic>BF(SDR)</italic> is the ratio of the posterior to the prior probability density at &#x03B8; = 0. <italic>LPOs</italic> and <italic>BFs</italic> provide relative evidence for the null and alternative hypotheses.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g005.tif"/>
</fig>
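<p>For illustration, equations (8) and (9) and the Savage&#x2013;Dickey ratio can be evaluated in closed form when both the prior and the posterior of &#x03B8; are Gaussian. The following sketch makes that assumption; the function names and example values are hypothetical, and the SDR is taken as the posterior density divided by the prior density at &#x03B8; = 0, so that values below one favor the alternative model.</p>

```python
import math
from statistics import NormalDist


def rope_mass(dist, gamma):
    """Probability mass of a distribution inside the ROPE [-gamma, gamma]."""
    return dist.cdf(gamma) - dist.cdf(-gamma)


def log_bf_rope_null(prior, post, gamma):
    """Eq. 9: LogBF(ROPE)_null = LPO_null + ln(Prior outside ROPE / Prior inside ROPE)."""
    p_post = rope_mass(post, gamma)
    p_prior = rope_mass(prior, gamma)
    # Note: posterior mass of exactly 0 or 1 (far tails) would make the log undefined.
    lpo_null = math.log(p_post / (1.0 - p_post))
    return lpo_null + math.log((1.0 - p_prior) / p_prior)


def bf_sdr_null(prior, post):
    """Savage-Dickey ratio: posterior density over prior density at theta = 0."""
    return post.pdf(0.0) / prior.pdf(0.0)
```

<p>With a hypothetical prior SD of 0.2% and a posterior centered on zero with SD of 0.04%, the posterior density at zero is five times the prior density, so the SDR equals 5 in favor of the null model; the same posterior with <italic>&#x03B3;</italic> = 0.1% also gives a positive <italic>LogBF(ROPE)</italic><sub><italic>null</italic></sub>.</p>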
</sec>
<sec id="S2.SS7">
<title>Relations Between Frequentist and Bayesian Approaches</title>
<p>Now we can point out the conceptual links between the frequentist and Bayesian approaches.</p>
<p>(1) <bold>Parameter estimation</bold>: When we have no prior information, that is, all parameter values are <italic>a priori</italic> equally probable (&#x2018;flat&#x2019; prior), the PEB estimation reduces to the frequentist parameter estimation (maximum likelihood estimation; <xref ref-type="bibr" rid="B41">Friston et al., 2002a</xref>).</p>
<p>(2) <bold>Multiplicity adjustments</bold>: One of the major concerns in frequentist inference is the multiplicity problem. In general, after the Bayesian parameter estimation, it is not necessary to classify any voxel as &#x2018;activated/deactivated&#x2019; or &#x2018;not activated.&#x2019; If we consider <italic>unthresholded</italic> maps of posterior probabilities, <italic>LPOs</italic>, or <italic>LogBFs</italic>, the multiple comparisons problem does not arise (<xref ref-type="bibr" rid="B44">Friston and Penny, 2003</xref>). However, if we apply a decision rule to classify voxels, we should control for wrong decisions across multiple comparisons (<xref ref-type="bibr" rid="B153">Woolrich et al., 2009</xref>; see also possible loss functions in <xref ref-type="bibr" rid="B95">Muller et al., 2006</xref>; <xref ref-type="bibr" rid="B81">Kruschke and Liddell, 2017a</xref>). The advantage of PEB with the &#x2018;global shrinkage&#x2019; prior is that it automatically accounts for multiple comparisons without the need for <italic>ad hoc</italic> multiplicity adjustments (<xref ref-type="bibr" rid="B12">Berry, 1988</xref>; <xref ref-type="bibr" rid="B44">Friston and Penny, 2003</xref>; <xref ref-type="bibr" rid="B48">Gelman et al., 2012</xref>). The frequentist approach processes every voxel independently, whereas the PEB algorithm considers joint information from all voxels. 
Frequentist inference uncorrected for multiple independent comparisons is prone to label noise-driven, random extremes as &#x2018;statistically significant.&#x2019; Bayesian analysis specifies that extreme values are unlikely <italic>a priori</italic>, and thus they shrink toward a common mean (<xref ref-type="bibr" rid="B88">Lindley, 1990</xref>; <xref ref-type="bibr" rid="B149">Westfall et al., 1997</xref>; <xref ref-type="bibr" rid="B13">Berry and Hochberg, 1999</xref>; <xref ref-type="bibr" rid="B41">Friston et al., 2002a</xref>,<xref ref-type="bibr" rid="B42">b</xref>; <xref ref-type="bibr" rid="B48">Gelman et al., 2012</xref>; <xref ref-type="bibr" rid="B80">Kruschke and Liddell, 2017b</xref>). If we consider <italic>thresholded</italic> maps of posterior probabilities, for example, <italic>P</italic><sub><italic>act</italic></sub> &#x003E; 95%, then as many as 5% of &#x2018;activated&#x2019; voxels could be falsely labeled as such. This is conceptually similar to the false discovery rate (FDR) correction (<xref ref-type="bibr" rid="B13">Berry and Hochberg, 1999</xref>; <xref ref-type="bibr" rid="B42">Friston et al., 2002b</xref>; <xref ref-type="bibr" rid="B44">Friston and Penny, 2003</xref>; <xref ref-type="bibr" rid="B136">Storey, 2003</xref>; <xref ref-type="bibr" rid="B95">Muller et al., 2006</xref>; <xref ref-type="bibr" rid="B129">Schwartzman et al., 2009</xref>). In practice, BPI with <italic>&#x03B3;</italic> = 0 should produce results similar (in terms of the number of &#x2018;activated/deactivated&#x2019; voxels) to those of classical NHST with FDR correction. If we increase the ES threshold, fewer voxels will be classified as &#x2018;activated/deactivated,&#x2019; and at some <italic>&#x03B3;</italic> value, BPI will produce results similar to those of the more conservative family-wise error (FWE) correction<sup><xref ref-type="fn" rid="footnote3">3</xref></sup>.</p>
<p>(3) <bold>Interval-based hypothesis testing</bold>: Frequentist interval-based hypothesis testing is conceptually connected with BPI, particularly with the &#x2018;HDI+ROPE&#x2019; decision rule. The former considers the intersection between the ROPE and the confidence interval. The latter considers the intersection between the ROPE and the HDI (a credible interval).</p>
<p>(4) <bold>BPI and BMI</bold>: BMI based on <italic>BF(ROPE)</italic> is conceptually linked to BPI based on the &#x2018;ROPE-only&#x2019; decision rule. The interval-based Bayes factor <italic>BF(ROPE)</italic> is proportional to the posterior probability odds. When ROPE is infinitesimally narrow, <italic>BF</italic> can be approximated using the <italic>SDR</italic>. Note that even though <italic>BF(SDR)</italic> is based on the point-null hypothesis, it can still provide evidence for the null hypothesis, in contrast to BPI with <italic>&#x03B3;</italic> = 0. However, <italic>BF(SDR)</italic> in PEB settings has not yet been tested using empirical fMRI data. Because the point-null hypothesis is always false (<xref ref-type="bibr" rid="B90">Meehl, 1967</xref>), BPI and <italic>BF(ROPE)</italic> may be preferred over <italic>BF(SDR)</italic>.</p>
</sec>
<sec id="S2.SS8">
<title>Definition of the Effect Size Threshold</title>
<p>The main difficulty in applying interval-based methods is the choice of the ES threshold <italic>&#x03B3;</italic>. To date, only a few studies have been devoted to determining the minimal relevant effect size. One of them suggested a method to objectively define <italic>&#x03B3;</italic> at the subject level of analysis, which was calibrated by clinical experts and may be implemented for pre-surgical planning (<xref ref-type="bibr" rid="B89">Magerkurth et al., 2015</xref>). However, the problem of choosing the ES threshold <italic>&#x03B3;</italic> for the group-level Bayesian analysis remains unresolved.</p>
<p>There are several ways to define the ES threshold. First, we can conduct a pilot study to determine the expected effect sizes. Second, we can use data from the literature to determine the typical effect sizes for the condition of interest. Third, we can use the default ES thresholds that are commonly accepted in the field. One of the first ES thresholds proposed in the neuroimaging literature was <italic>&#x03B3;</italic> = 0.1% (<xref ref-type="bibr" rid="B42">Friston et al., 2002b</xref>). This is the default ES threshold for the subject-level BPI in SPM12. For the group-level BPI, the default ES threshold is one prior standard deviation of the effect, <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub> (<xref ref-type="bibr" rid="B44">Friston and Penny, 2003</xref>). Fourth, <italic>&#x03B3;</italic> can be selected to maximize the similarity of the activation patterns revealed by classical NHST and Bayesian inference. This would allow us to reanalyze the data using Bayesian inference, reproduce activation patterns similar to those previously obtained with classical inference, and additionally detect the &#x2018;not activated&#x2019; and &#x2018;low confidence&#x2019; voxels. Lastly, we can consider the posterior probabilities at multiple ES thresholds or compute the ROPE maps (see below).</p>
<p>The ES threshold can be expressed in unstandardized (raw &#x03B2; values or percent signal change) or standardized values (for example, Cohen&#x2019;s d). Raw &#x03B2; values calculated by SPM12 at the first level of analysis represent the BOLD signal in arbitrary units. However, they can be scaled to a more meaningful unit, the BOLD percent signal change (PSC) (<xref ref-type="bibr" rid="B111">Poldrack et al., 2011</xref>; <xref ref-type="bibr" rid="B16">Chen et al., 2017</xref>). Both unstandardized and standardized values have advantages and disadvantages. There are different ways to scale &#x03B2; values to PSC (<xref ref-type="bibr" rid="B108">Pernet, 2014</xref>; <xref ref-type="bibr" rid="B16">Chen et al., 2017</xref>), which complicates the comparison of results across studies. Standardized values represent the effect size in standard deviation units, which supposedly facilitates the comparison of results between different experiments. However, standardized values are less stable between measurements and less interpretable (<xref ref-type="bibr" rid="B6">Baguley, 2009</xref>; <xref ref-type="bibr" rid="B16">Chen et al., 2017</xref>). Moreover, Cohen&#x2019;s d is closely related to the t-value (for the one-sample case, <inline-formula><mml:math id="INEQ3"><mml:mrow><mml:mi>d</mml:mi><mml:mo rspace="7.5pt">=</mml:mo><mml:mrow><mml:mi>t</mml:mi><mml:mo>/</mml:mo><mml:msqrt><mml:mi>N</mml:mi></mml:msqrt></mml:mrow></mml:mrow></mml:math></inline-formula>) and may share some drawbacks with t-values. <xref ref-type="bibr" rid="B115">Reimold et al. (2005)</xref> showed that spatial smoothing has a nonlinear effect on voxel variance. Consequently, using t-values or Cohen&#x2019;s d for inference in neuroimaging may lead to spatially inaccurate results (a spatial shift of local maxima in t-maps or Cohen&#x2019;s d maps compared to PSC maps). In this study, we focused on PSCs.</p>
<p>It is also important to note that effect sizes (both BOLD PSC and Cohen&#x2019;s d) depend on the MRI scanner (e.g., field strength, coil sensitivity), acquisition parameters (e.g., echo time, spin echo vs. gradient echo sequences), and field inhomogeneity (<xref ref-type="bibr" rid="B142">Uludag et al., 2009</xref>). For example, the effect sizes may be underestimated in brain areas near air&#x2013;tissue interfaces because of field inhomogeneities. This fact further complicates the selection of the ES threshold. However, this does not mean that we should ignore the effect size and return to the point-null hypothesis. One may choose different ES thresholds for different regions of interest, scanners, or acquisition parameters.</p>
</sec>
</sec>
<sec id="S3">
<title>Methods</title>
<sec id="S3.SS1">
<title>Datasets</title>
<p>Seven block-design tasks were considered from the HCP dataset, including working memory, gambling, motor, language, social cognition, relation processing, and emotion processing tasks (<xref ref-type="bibr" rid="B7">Barch et al., 2013</xref>). Two event-related tasks, the stop-signal and task-switching tasks, were considered from the UCLA dataset (<xref ref-type="bibr" rid="B109">Poldrack et al., 2016</xref>). The length, conditions, and number of scans of the tasks are provided in the <xref ref-type="supplementary-material" rid="DS1">Supplementary Materials</xref> (<xref ref-type="supplementary-material" rid="DS1">Supplementary Table 1</xref>). A subset of 100 unrelated subjects (S1200 release) was selected from the HCP dataset (54 females, 46 males, mean age = 29.1 &#x00B1; 3.7 years) for assessment. A total of 115 subjects from the UCLA dataset were included in the analysis (55 females, 60 males, mean age = 31.7 &#x00B1; 8.9 years) after removing subjects with no data for the stop-signal task, subjects with a high level (&#x003E;15%) of errors in the Go-trials, and subjects whose raw data were reported to be problematic (<xref ref-type="bibr" rid="B57">Gorgolewski et al., 2017</xref>). See the fMRI acquisition parameters in the <xref ref-type="supplementary-material" rid="DS1">Supplementary Materials</xref>, Par. 1.</p>
</sec>
<sec id="S3.SS2">
<title>Preprocessing</title>
<p>The minimal preprocessing pipelines for the HCP and UCLA datasets were described by <xref ref-type="bibr" rid="B52">Glasser et al. (2013)</xref> and <xref ref-type="bibr" rid="B57">Gorgolewski et al. (2017)</xref>, respectively. Spatial smoothing was applied to the preprocessed images with a 4 mm full width at half maximum (FWHM) Gaussian smoothing kernel. Additionally, to compare the extent to which the performance of classical NHST and BPI depended on the smoothing, we applied 8 mm FWHM smoothing to the emotion processing task. Spatial smoothing was performed using SPM12. The results are reported for the 4 mm FWHM smoothing filter, unless otherwise specified.</p>
</sec>
<sec id="S3.SS3">
<title>Parameter Estimation</title>
<p>Frequentist parameter estimation was applied at the subject level of analysis. A detailed description of the general linear models for each task design is available in the <xref ref-type="supplementary-material" rid="DS1">Supplementary Materials</xref>, Par. 2. Fixation blocks and null events were not modeled explicitly in any of the tasks. Twenty-four head motion regressors were included in each subject-level model (six head motion parameters, the same six parameters one time point before, and the 12 corresponding squared terms) to minimize head motion artifacts (<xref ref-type="bibr" rid="B47">Friston et al., 1996</xref>). Raw &#x03B2; values were converted to PSC relative to the mean whole-brain &#x2018;baseline&#x2019; signal (<xref ref-type="supplementary-material" rid="DS1">Supplementary Materials</xref>, Par. 3). The linear contrasts of the &#x03B2; values were calculated to describe the effects of interest &#x03B8; = <italic>c</italic>&#x03B2; in different tasks. The sum of positive terms in the contrast vector, <italic>c</italic>, is equal to one. The list of contrasts calculated in the current study to explore typical effect sizes is presented in <xref ref-type="supplementary-material" rid="DS1">Supplementary Table 1</xref>. At the group level of analysis, the Bayesian parameter estimation with the &#x2018;global shrinkage&#x2019; prior was applied using SPM12 (v6906). We performed a one-sample test on the linear contrasts created at the subject level of analysis.</p>
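<p>The construction of the 24 head motion regressors can be sketched as follows. This is only an illustration of the expansion described above, not the code used in the study: the function name is hypothetical, and setting the lagged values of the first scan to zero is an assumption of the sketch.</p>

```python
def expand_motion_regressors(motion):
    """Expand T rows of 6 motion parameters into T rows of 24 regressors."""
    expanded = []
    previous = [0.0] * 6  # no scan precedes the first one; zeros are an assumption
    for row in motion:
        current = [float(value) for value in row]
        base = current + previous  # 6 current values followed by 6 lagged values
        expanded.append(base + [value * value for value in base])  # plus 12 squares
        previous = current
    return expanded
```

<p>Each output row contains the six realignment parameters, the six parameters of the preceding scan, and the squares of those 12 values, in that order.</p>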
</sec>
<sec id="S3.SS4">
<title>Classical Null Hypothesis Significance Testing and Bayesian Parameter Inference</title>
<p>Classical inference was performed using voxel-wise FWE correction with &#x03B1; = 0.05. This is the default SPM threshold and is known to be conservative and to guarantee protection from false positives (<xref ref-type="bibr" rid="B34">Eklund et al., 2016</xref>). Although voxel-wise FWE correction may be too conservative for small sample sizes, it is recommended when large sample sizes are available (<xref ref-type="bibr" rid="B151">Woo et al., 2014</xref>).</p>
<p>Bayesian parameter inference, as accessible via the SPM12 GUI, only allows the user to declare voxels &#x2018;activated&#x2019; or &#x2018;deactivated.&#x2019; The classification of voxels as being either &#x2018;not activated&#x2019; or &#x2018;low confidence&#x2019; requires the posterior mean and variance. At the group level of analysis, SPM12 does not save the posterior variance image. However, the posterior variance can be reconstructed from the image of the noise hyperparameter using a first-order Taylor series approximation (<xref ref-type="bibr" rid="B105">Penny and Ridgway, 2013</xref>). Therefore, in the current study, BPI was performed using an SPM12-based toolbox developed for this purpose<sup><xref ref-type="fn" rid="footnote4">4</xref></sup>. For the &#x2018;ROPE-only&#x2019; rule, the posterior probability threshold was <italic>P</italic><sub><italic>thr</italic></sub> = 95% (<italic>LPO</italic> &#x003E; 3). For the &#x2018;HDI+ROPE&#x2019; rule, we used the 95% HDI.</p>
<p>We compared the number of &#x2018;activated&#x2019; voxels (as a percentage of the total number of voxels) detected by Bayesian and classical inference. We also compared the number of &#x2018;activated,&#x2019; &#x2018;deactivated,&#x2019; and &#x2018;not activated&#x2019; voxels detected using BPI with the &#x2018;ROPE-only&#x2019; and &#x2018;HDI+ROPE&#x2019; decision rules and different ES thresholds. To estimate the influence of the sample size on the results, all the above-mentioned analyses were performed with samples of different sizes: 5 to 100 subjects from the HCP dataset (the emotion processing task, &#x2018;Emotion &#x003E; Shape&#x2019; contrast) and 5 to 115 subjects from the UCLA dataset (the stop signal task, &#x2018;Correct Stop &#x003E; Go&#x2019; contrast), in steps of 5 subjects. Ten random groups were sampled for each step.</p>
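<p>The subsampling scheme can be sketched as below. The text above does not specify how the random groups were drawn, so this illustration assumes that each group is sampled without replacement from the full subject list and that different groups may share subjects; the function name and the seed are hypothetical.</p>

```python
import random


def sample_groups(subject_ids, step=5, n_groups=10, seed=0):
    """Map each sample size to n_groups random subsets of the subject list."""
    rng = random.Random(seed)  # fixed seed so that the sketch is reproducible
    groups = {}
    for size in range(step, len(subject_ids) + 1, step):
        # Subjects are unique within a group; groups may overlap with each other.
        groups[size] = [rng.sample(subject_ids, size) for _ in range(n_groups)]
    return groups
```

<p>For 100 subjects, this yields sample sizes of 5, 10, &#x2026;, 100 with ten groups each; for 115 subjects, the largest sample size is 115.</p>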
</sec>
<sec id="S3.SS5">
<title>Effect Size Thresholds</title>
<p>We considered three ES thresholds: firstly, the default ES threshold for the subject-level <italic>&#x03B3;</italic> = 0.1% (BOLD PSC); secondly, the default ES threshold for the group-level <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub>; thirdly, the <italic>&#x03B3;</italic>(<italic>Dice</italic><sub><italic>max</italic></sub>) threshold, which ensures maximum similarity of the activation patterns revealed by classical NHST and BPI. The similarity was assessed using the Dice coefficient:</p>
<disp-formula id="S3.E14"><label>(11)</label><mml:math id="M16"><mml:mrow><mml:mrow><mml:mi>D</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi><mml:mi>e</mml:mi><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mo>=</mml:mo><mml:mpadded width="+5pt"><mml:mfrac><mml:mrow><mml:mrow><mml:mn>2</mml:mn><mml:mo>&#x002A;</mml:mo><mml:msub><mml:mi>V</mml:mi><mml:mrow><mml:mi>o</mml:mi><mml:mi>v</mml:mi><mml:mi>e</mml:mi><mml:mi>r</mml:mi><mml:mi>l</mml:mi><mml:mi>a</mml:mi><mml:mi>p</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:msub><mml:mi>V</mml:mi><mml:mrow><mml:mi>c</mml:mi><mml:mi>l</mml:mi><mml:mi>a</mml:mi><mml:mi>s</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>c</mml:mi></mml:mrow></mml:msub><mml:mo>+</mml:mo><mml:mrow><mml:msub><mml:mi>V</mml:mi><mml:mrow><mml:mi>b</mml:mi><mml:mi>a</mml:mi><mml:mi>y</mml:mi><mml:mi>e</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>a</mml:mi><mml:mi>n</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo>(</mml:mo><mml:mi mathvariant="normal">&#x03B3;</mml:mi><mml:mo>)</mml:mo></mml:mrow></mml:mrow></mml:mrow></mml:mfrac></mml:mpadded></mml:mrow></mml:math></disp-formula>
<p>where <italic>V<sub>classic</sub></italic> is the number of &#x2018;activated&#x2019; voxels detected using classical NHST, <italic>V</italic><sub><italic>bayesian</italic></sub>(<italic>&#x03B3;</italic>) is the number of &#x2018;activated&#x2019; voxels detected using BPI with the ES threshold <italic>&#x03B3;</italic>, and <italic>V</italic><sub><italic>overlap</italic></sub>(<italic>&#x03B3;</italic>) is the number of &#x2018;activated&#x2019; voxels detected by both methods. A Dice coefficient of 0 indicates no overlap between the patterns, and 1 indicates complete overlap. Dice coefficients were calculated for <italic>&#x03B3;</italic> ranging from 0 to 0.4% in steps of 0.001%.</p>
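<p>As an illustration of the Dice-based threshold search, the following Python sketch computes the Dice coefficient for two binary voxel masks and scans a grid of <italic>&#x03B3;</italic> values. It assumes Gaussian voxel-wise posteriors and the &#x2018;ROPE-only&#x2019; rule for &#x2018;activated&#x2019; voxels; the function names and the convention of returning 0 for two empty masks are choices made for this example only.</p>
<preformat><![CDATA[
```python
from statistics import NormalDist


def dice(mask_a, mask_b):
    """Dice coefficient of two equally long sequences of booleans."""
    v_a = sum(bool(a) for a in mask_a)
    v_b = sum(bool(b) for b in mask_b)
    v_overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    if v_a + v_b == 0:
        return 0.0  # convention used in this sketch when both masks are empty
    return 2.0 * v_overlap / (v_a + v_b)


def gamma_at_max_dice(classic_mask, post_means, post_sds, gammas, p_thr=0.95):
    """Return (gamma, Dice) for the first grid value that maximizes Dice."""
    best_gamma, best_dice = None, -1.0
    for gamma in gammas:
        # 'activated' under the ROPE-only rule: P(effect > gamma) >= p_thr
        bayes_mask = [
            1.0 - NormalDist(m, s).cdf(gamma) >= p_thr
            for m, s in zip(post_means, post_sds)
        ]
        d = dice(classic_mask, bayes_mask)
        if d > best_dice:
            best_gamma, best_dice = gamma, d
    return best_gamma, best_dice


# Grid from 0 to 0.4 (in % PSC) in steps of 0.001, as described in the text
GAMMA_GRID = [i * 0.001 for i in range(401)]
```
]]></preformat>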
</sec>
<sec id="S3.SS6">
<title>Estimation of Typical Effect Sizes</title>
<p>In the current study, we aimed to provide a reference set of typical effect sizes for different task designs (block and event-related) and different contrasts (&#x2018;task-condition &#x003E; control-condition&#x2019; and &#x2018;task-condition &#x003E; baseline&#x2019;) in a set of <italic>a priori</italic> defined regions of interest (ROIs). Effect sizes were expressed in PSC and Cohen&#x2019;s d. ROI masks were defined using anatomical and <italic>a priori</italic> functional masks. For more details, see <xref ref-type="supplementary-material" rid="DS1">Supplementary Materials</xref>, Par. 4.</p>
</sec>
<sec id="S3.SS7">
<title>Evaluating Bayesian Parameter Inference on Contrasts With No Expected Practically Significant Difference</title>
<p>Bayesian parameter inference should be able to detect the &#x2018;null effect&#x2019; in the majority of voxels when comparing samples with no expected practically significant difference. Examples include two groups of healthy adult subjects performing the same task or two sessions with the same task instructions. To test this, we used fMRI data from the emotion processing task. To emulate two &#x2018;similar&#x2019; <italic>independent</italic> samples, the contrasts (&#x2018;Emotion &#x003E; Shape&#x2019;) of 100 healthy adult subjects were randomly divided into two groups of 50 subjects. A two-sample test comparing &#x2018;Group #1&#x2019; and &#x2018;Group #2&#x2019; was performed with the assumption of unequal variances between the groups (SPM12 default option). To emulate &#x2018;similar&#x2019; <italic>dependent</italic> samples, we randomly assigned the &#x2018;Emotion &#x003E; Shape&#x2019; contrasts from the right-to-left (RL) and left-to-right (LR) phase encoding sessions to the &#x2018;Session #1&#x2019; and &#x2018;Session #2&#x2019; samples, so that each sample consisted of 50 contrasts from the RL session and 50 from the LR session. A paired test comparing &#x2018;Session #1&#x2019; and &#x2018;Session #2&#x2019; was equivalent to a one-sample test on 50 &#x2018;RL &#x003E; LR session&#x2019; and 50 &#x2018;LR &#x003E; RL session&#x2019; contrasts.</p>
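<p>The two randomization schemes can be sketched in Python as follows. This is an illustrative sketch only: the function names, the fixed seed, and the use of scalar values in place of contrast images are assumptions made for the example and do not reproduce the scripts used in the study.</p>
<preformat><![CDATA[
```python
import random


def split_into_two_groups(subject_ids, seed=0):
    """Randomly split subjects into two equally sized independent groups."""
    rng = random.Random(seed)
    shuffled = list(subject_ids)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


def randomized_paired_differences(rl_values, lr_values, seed=0):
    """Per-subject session differences with a randomized direction.

    Half of the subjects contribute 'RL - LR' and the other half 'LR - RL',
    which mirrors a paired test on two samples that each mix both sessions.
    """
    n = len(rl_values)
    rng = random.Random(seed)
    rl_first = [True] * (n // 2) + [False] * (n - n // 2)
    rng.shuffle(rl_first)
    return [
        (rl - lr) if first else (lr - rl)
        for rl, lr, first in zip(rl_values, lr_values, rl_first)
    ]
```
]]></preformat>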
</sec>
<sec id="S3.SS8">
<title>Normality Check</title>
<p>To check for violations of the normality assumption, we performed the Shapiro-Wilk test (<xref ref-type="bibr" rid="B132">Shapiro and Wilk, 1965</xref>) for each voxel for one block-design task (&#x2018;Emotion &#x003E; Shape&#x2019; contrast) and one event-related task (&#x2018;Correct Stop &#x003E; Go&#x2019; contrast). We reported the number of voxels that were significantly non-Gaussian (using &#x03B1; = 0.001 uncorrected for multiple comparisons and &#x03B1; = 0.05 with Bonferroni correction). We also calculated the median kurtosis and skewness across voxels. Kurtosis is a measure of the heaviness of the tails of a distribution. Skewness is a measure of the asymmetry of a distribution.</p>
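<p>For reference, skewness and kurtosis can be computed from central moments as in the Python sketch below. The sketch uses the simple (biased) moment estimators and the non-excess definition of kurtosis, under which a normal distribution has a kurtosis of 3; which estimator was used in the study is not specified here, so this choice is an assumption. The Shapiro-Wilk test itself is not implemented in the sketch, and the helper names are hypothetical.</p>
<preformat><![CDATA[
```python
def central_moment(values, order):
    """Biased sample central moment of the given order."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** order for v in values) / n


def skewness(values):
    """Moment-based skewness; 0 for a symmetric sample."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5


def kurtosis(values):
    """Moment-based (non-excess) kurtosis; 3 for a normal distribution."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under Bonferroni correction."""
    return alpha / n_tests
```
]]></preformat>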
</sec>
<sec id="S3.SS9">
<title>Simulations</title>
<p>The main limitation of using empirical data to assess the performance of statistical methods lies in the lack of knowledge of the ground truth. Therefore, we performed group-level simulations to better understand how the sample size and effect size threshold affect BPI results given different known effect sizes and noise levels. Simulations also allowed us to assess the robustness of BPI to violations of the normality assumption. We generated the parameter maps (contrast images) in a manner similar to <xref ref-type="bibr" rid="B100">Nichols and Hayasaka (2003), Schwartzman et al. (2009)</xref>, and <xref ref-type="bibr" rid="B25">Cremers et al. (2017)</xref>. Each contrast image consisted of &#x2018;activated&#x2019; and &#x2018;deactivated&#x2019; voxels and &#x2018;trivial&#x2019; background voxels surrounding them. Locations of &#x2018;activated&#x2019; and &#x2018;deactivated&#x2019; voxels were specified based on the NeuroSynth meta-analysis results (<xref ref-type="bibr" rid="B154">Yarkoni et al., 2011</xref>) obtained using the search terms &#x2018;task&#x2019; and &#x2018;default,&#x2019; respectively (association test, &#x03B1; = 0.01 with FDR correction). Data were drawn from the Pearson system distribution (<xref ref-type="bibr" rid="B73">Johnson et al., 1994</xref>) with kurtosis, <italic>Ku</italic> = 2.2, 3, 7 and skewness, <italic>Sk</italic> = &#x2212;0.7, 0, 0.7. The normal distribution corresponds to <italic>Ku</italic> = 3 and <italic>Sk</italic> = 0. Other combinations of <italic>Ku</italic> and <italic>Sk</italic> resulted in four-parameter beta distributions. The mean effect in practically significant (&#x2018;activated&#x2019; and &#x2018;deactivated&#x2019;) voxels was &#x03B8; = &#x00B1; 0.1, 0.2, 0.3%. For practically non-significant or &#x2018;trivial&#x2019; voxels, the mean effect was &#x03B8; = &#x00B1; 0.04%, which can be considered equivalent to the null value for practical purposes (&#x2018;not activated&#x2019; voxels). 
Noise standard deviation was <italic>SD</italic> = 0.2, 0.3, 0.4%. The mean effect size and noise were consistent with those observed in the empirical data (see <xref ref-type="supplementary-material" rid="DS1">Supplementary Tables 11</xref>&#x2013;<xref ref-type="supplementary-material" rid="DS1">19</xref>). Contrast-to-noise ratio was varied from 0.25 to 1.5. For each combination of the Pearson system distribution parameters, we generated 1000 images.</p>
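<p>The stated range of the contrast-to-noise ratio follows from the simulated effect sizes and noise levels. The short Python sketch below assumes that the ratio is defined as the absolute mean effect divided by the noise standard deviation; the function name is hypothetical.</p>
<preformat><![CDATA[
```python
from itertools import product

# Simulation settings taken from the text (both in % PSC)
MEAN_EFFECTS = [0.1, 0.2, 0.3]
NOISE_SDS = [0.2, 0.3, 0.4]


def cnr_grid(mean_effects, noise_sds):
    """Map each (effect, SD) pair to its contrast-to-noise ratio."""
    return {
        (theta, sd): abs(theta) / sd
        for theta, sd in product(mean_effects, noise_sds)
    }


grid = cnr_grid(MEAN_EFFECTS, NOISE_SDS)
cnr_min = min(grid.values())  # smallest effect with the largest noise
cnr_max = max(grid.values())  # largest effect with the smallest noise
```
]]></preformat>
<p>Under this definition, the smallest ratio is 0.1/0.4 = 0.25 and the largest is 0.3/0.2 = 1.5, in agreement with the range given above.</p>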
<p>To evaluate sample size dependencies, we randomly drew subsamples of images from the full sample (<italic>N</italic> = 1000), with subsample sizes ranging from <italic>N</italic> = 10 to 100 (in steps of 10) and from <italic>N</italic> = 150 to 500 (in steps of 50). This procedure was repeated ten times for each step. The analysis was limited to a single axial slice (<italic>z</italic> = 36 mm) containing 579 &#x2018;activated&#x2019; voxels, 500 &#x2018;deactivated&#x2019; voxels and 3067 &#x2018;trivial&#x2019; or &#x2018;not activated&#x2019; voxels. For classical NHST and BPI, we calculated the number of &#x2018;activated&#x2019; voxels in relation to the total number of voxels. For BPI, we additionally calculated:</p>
<list list-type="simple">
<list-item>
<label>(1)</label>
<p>Correct decision rate. The ratio of the number of correctly classified &#x2018;activated,&#x2019; &#x2018;deactivated,&#x2019; and &#x2018;not activated&#x2019; voxels to their true number (cf. &#x2018;hit rate&#x2019; in detection theory or &#x2018;true positive rate&#x2019; in the frequentist framework).</p>
</list-item>
<list-item>
<label>(2)</label>
<p>Incorrect decision rate. The ratio of the number of voxels incorrectly classified as &#x2018;activated,&#x2019; &#x2018;deactivated,&#x2019; and &#x2018;not activated&#x2019; to the true number of voxels not belonging to the &#x2018;activated,&#x2019; &#x2018;deactivated,&#x2019; and &#x2018;not activated&#x2019; categories, respectively (cf. &#x2018;false alarm rate&#x2019; in detection theory or &#x2018;false positive rate&#x2019; in the frequentist framework).</p>
</list-item>
<list-item>
<label>(3)</label>
<p>Low confidence decision rate. The ratio of the number of &#x2018;low confidence&#x2019; voxels to the total number of voxels.</p>
</list-item>
</list>
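<p>The three rates defined above can be sketched in Python as follows. The sketch assumes that the true and assigned categories are available as per-voxel string labels; the function name and the label strings are illustrative and are not taken from the simulation code.</p>
<preformat><![CDATA[
```python
CATEGORIES = ("activated", "deactivated", "not activated")


def decision_rates(true_labels, assigned_labels):
    """Return per-category correct/incorrect rates and the low confidence rate."""
    if len(true_labels) != len(assigned_labels) or not true_labels:
        raise ValueError("label sequences must be non-empty and equally long")
    total = len(true_labels)
    pairs = list(zip(true_labels, assigned_labels))
    rates = {}
    for cat in CATEGORIES:
        n_true = sum(t == cat for t, _ in pairs)    # voxels truly in cat
        n_other = total - n_true                    # voxels truly not in cat
        hits = sum(t == cat and a == cat for t, a in pairs)
        false_alarms = sum(t != cat and a == cat for t, a in pairs)
        rates[cat] = {
            "correct": hits / n_true if n_true else float("nan"),
            "incorrect": false_alarms / n_other if n_other else float("nan"),
        }
    low_confidence = sum(a == "low confidence" for _, a in pairs) / total
    return rates, low_confidence
```
]]></preformat>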
<p>The code for the simulations is available online<sup><xref ref-type="fn" rid="footnote5">5</xref></sup>.</p>
</sec>
</sec>
<sec id="S4" sec-type="results">
<title>Results</title>
<sec id="S4.SS1">
<title>Results for Contrasts With No Expected Practically Significant Difference</title>
<p>Classical NHST did not show a significant difference between &#x2018;Group #1&#x2019; and &#x2018;Group #2&#x2019; (see <xref ref-type="supplementary-material" rid="DS1">Supplementary Figure 1</xref>). BPI with the &#x2018;ROPE-only&#x2019; decision rule and default ES threshold <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub> = 0.190% classified 83.4% of voxels as having &#x2018;no difference,&#x2019; i.e., as voxels in which the null hypothesis was accepted (see <xref ref-type="supplementary-material" rid="DS1">Supplementary Figure 1</xref>). The &#x2018;HDI+ROPE&#x2019; rule classified 76.2% of voxels as having &#x2018;no difference.&#x2019;</p>
<p>Classical NHST did not reveal a significant difference between &#x2018;Session #1&#x2019; and &#x2018;Session #2&#x2019; (see <xref ref-type="supplementary-material" rid="DS1">Supplementary Figure 2</xref>). The <italic>prior SD</italic><sub>&#x03B8;</sub> was 0.005%. In this case, using the default ES threshold <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub> did not allow the detection of any &#x2018;no difference&#x2019; voxels, because the ROPE was unreasonably narrow. The &#x2018;null effect&#x2019; was detected in all voxels beginning with a <italic>&#x03B3;</italic> = 0.013% threshold using the &#x2018;ROPE-only&#x2019; and &#x2018;HDI+ROPE&#x2019; decision rules (see <xref ref-type="supplementary-material" rid="DS1">Supplementary Figure 2</xref>).</p>
<p>In this way, when comparing two &#x2018;similar&#x2019; <italic>independent</italic> samples (two groups of healthy subjects performing the same task), BPI with the default group-level threshold (<italic>one prior SD</italic><sub>&#x03B8;</sub>) allowed us to correctly label voxels as having &#x2018;no difference&#x2019; for the majority of the voxels of the brain. However, when comparing two &#x2018;similar&#x2019; <italic>dependent</italic> samples (two sessions from the same task), the <italic>one prior SD</italic><sub>&#x03B8;</sub> threshold became inadequately small.</p>
<p>Therefore, the default <italic>one prior SD</italic><sub>&#x03B8;</sub> threshold is not suitable when the difference between <italic>dependent</italic> conditions is very small (paired sample test or one-sample test). In such cases, one can use an <italic>a priori</italic> defined ES threshold based on previously reported effect sizes or provide an ES threshold at which most of the voxels can be labeled as having &#x2018;no difference,&#x2019; allowing the critical reader to decide whether this speaks in favor of the absence of differences.</p>
</sec>
<sec id="S4.SS2">
<title>Comparison of Classical Null Hypothesis Significance Testing and Bayesian Parameter Inference Results</title>
<p>Generally, classical NHST with voxel-wise FWE correction and BPI with the &#x2018;ROPE-only&#x2019; decision rule and default group-level ES threshold <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub> revealed similar (de)activation patterns in all considered contrasts (see <xref ref-type="fig" rid="F6">Figure 6</xref>, <xref ref-type="table" rid="T1">Table 1</xref>, and <xref ref-type="supplementary-material" rid="DS1">Supplementary Tables 2</xref>&#x2013;<xref ref-type="supplementary-material" rid="DS1">10</xref>). The median ES threshold based on <italic>Dice</italic><sub><italic>max</italic></sub> and median default group-level ES threshold across all considered contrasts were close in magnitude to the default subject-level ES threshold <italic>&#x03B3;</italic> = 0.1%: <italic>&#x03B3;</italic>(<italic>Dice</italic><sub><italic>max</italic></sub>) = 0.118% and <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub> = 0.142%. The median <italic>Dice</italic><sub><italic>max</italic></sub> across all the considered contrasts reached 0.904. At the same time, BPI allowed us to classify &#x2018;non-significant&#x2019; voxels as &#x2018;not activated&#x2019; or &#x2018;low confidence.&#x2019; As can be seen in <xref ref-type="fig" rid="F6">Figure 6</xref>, areas of &#x2018;not activated&#x2019; voxels surround clusters of &#x2018;activated/deactivated&#x2019; voxels, and the two are separated by areas of &#x2018;low confidence&#x2019; voxels.</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption><p>Examples of results obtained with classical NHST and BPI. Four contrasts were chosen for illustration purposes (two event-related and two block-design tasks). Classical NHST was implemented using voxel-wise FWE correction (&#x03B1; = 0.05). BPI was implemented using the &#x2018;ROPE-only&#x2019; decision rule, <italic>P</italic><sub><italic>thr</italic></sub> = 95% (<italic>LPO</italic> &#x003E; 3) and <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub>. Axial slice <italic>z</italic> = 18 mm (MNI152 standard space).</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g006.tif"/>
</fig>
<table-wrap position="float" id="T1">
<label>TABLE 1</label>
<caption><p>Maximum Dice coefficient and corresponding effect size thresholds for each task.</p></caption>
<table cellspacing="5" cellpadding="5" frame="hsides" rules="groups">
<thead>
<tr>
<td valign="top" align="left">Contrast, &#x03B8;</td>
<td valign="top" align="center"><italic>Prior SD<sub>&#x03B8;</sub>, %</italic></td>
<td valign="top" align="center" colspan="2">&#x2018;ROPE-only&#x2019; decision rule<hr/></td>
<td valign="top" align="center" colspan="2">&#x2018;HDI+ROPE&#x2019; decision rule<hr/></td>
</tr>
<tr>
<td/>
<td/>
<td valign="top" align="center"><italic>&#x03B3;</italic>(<italic>Dice</italic><sub><italic>max</italic></sub>), <italic>%</italic></td>
<td valign="top" align="center"><italic>Dice</italic><sub><italic>max</italic></sub></td>
<td valign="top" align="center"><italic>&#x03B3;</italic>(<italic>Dice</italic><sub><italic>max</italic></sub>), <italic>%</italic></td>
<td valign="top" align="center"><italic>Dice</italic><sub><italic>max</italic></sub></td>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left" colspan="6"><bold>Emotion processing</bold></td>
</tr>
<tr>
<td valign="top" align="left">Emotion &#x003E; Shape</td>
<td valign="top" align="center">0.135</td>
<td valign="top" align="center">0.116</td>
<td valign="top" align="center">0.904</td>
<td valign="top" align="center">0.104</td>
<td valign="top" align="center">0.912</td>
</tr>
<tr>
<td valign="top" align="left" colspan="6"><bold>Working memory</bold></td>
</tr>
<tr>
<td valign="top" align="left">2-back &#x003E; baseline</td>
<td valign="top" align="center">0.325</td>
<td valign="top" align="center">0.136</td>
<td valign="top" align="center">0.925</td>
<td valign="top" align="center">0.125</td>
<td valign="top" align="center">0.932</td>
</tr>
<tr>
<td valign="top" align="left">2-back &#x003E; 0-back</td>
<td valign="top" align="center">0.089</td>
<td valign="top" align="center">0.095</td>
<td valign="top" align="center">0.891</td>
<td valign="top" align="center">0.089</td>
<td valign="top" align="center">0.903</td>
</tr>
<tr>
<td valign="top" align="left" colspan="6"><bold>Language</bold></td>
</tr>
<tr>
<td valign="top" align="left">Story &#x003E; Math</td>
<td valign="top" align="center">0.255</td>
<td valign="top" align="center">0.119</td>
<td valign="top" align="center">0.896</td>
<td valign="top" align="center">0.108</td>
<td valign="top" align="center">0.904</td>
</tr>
<tr>
<td valign="top" align="left" colspan="6"><bold>Motor</bold></td>
</tr>
<tr>
<td valign="top" align="left">Left finger &#x003E; baseline</td>
<td valign="top" align="center">0.149</td>
<td valign="top" align="center">0.148</td>
<td valign="top" align="center">0.897</td>
<td valign="top" align="center">0.135</td>
<td valign="top" align="center">0.907</td>
</tr>
<tr>
<td valign="top" align="left">Right finger &#x003E; baseline</td>
<td valign="top" align="center">0.171</td>
<td valign="top" align="center">0.160</td>
<td valign="top" align="center">0.886</td>
<td valign="top" align="center">0.144</td>
<td valign="top" align="center">0.897</td>
</tr>
<tr>
<td valign="top" align="left">Tongue &#x003E; baseline</td>
<td valign="top" align="center">0.268</td>
<td valign="top" align="center">0.205</td>
<td valign="top" align="center">0.904</td>
<td valign="top" align="center">0.181</td>
<td valign="top" align="center">0.913</td>
</tr>
<tr>
<td valign="top" align="left" colspan="6"><bold>Gambling</bold></td>
</tr>
<tr>
<td valign="top" align="left">Reward &#x003E; baseline</td>
<td valign="top" align="center">0.254</td>
<td valign="top" align="center">0.132</td>
<td valign="top" align="center">0.917</td>
<td valign="top" align="center">0.122</td>
<td valign="top" align="center">0.924</td>
</tr>
<tr>
<td valign="top" align="left">Loss &#x003E; baseline</td>
<td valign="top" align="center">0.249</td>
<td valign="top" align="center">0.134</td>
<td valign="top" align="center">0.918</td>
<td valign="top" align="center">0.118</td>
<td valign="top" align="center">0.925</td>
</tr>
<tr>
<td valign="top" align="left">Reward &#x003E; Loss</td>
<td valign="top" align="center">0.032</td>
<td valign="top" align="center">0.044</td>
<td valign="top" align="center">0.894</td>
<td valign="top" align="center">0.037</td>
<td valign="top" align="center">0.886</td>
</tr>
<tr>
<td valign="top" align="left" colspan="6"><bold>Social cognition</bold></td>
</tr>
<tr>
<td valign="top" align="left">Social &#x003E; baseline</td>
<td valign="top" align="center">0.325</td>
<td valign="top" align="center">0.139</td>
<td valign="top" align="center">0.939</td>
<td valign="top" align="center">0.124</td>
<td valign="top" align="center">0.944</td>
</tr>
<tr>
<td valign="top" align="left">Social &#x003E; Random</td>
<td valign="top" align="center">0.104</td>
<td valign="top" align="center">0.114</td>
<td valign="top" align="center">0.896</td>
<td valign="top" align="center">0.104</td>
<td valign="top" align="center">0.907</td>
</tr>
<tr>
<td valign="top" align="left" colspan="6"><bold>Relational processing</bold></td>
</tr>
<tr>
<td valign="top" align="left">Relational &#x003E; baseline</td>
<td valign="top" align="center">0.390</td>
<td valign="top" align="center">0.154</td>
<td valign="top" align="center">0.935</td>
<td valign="top" align="center">0.143</td>
<td valign="top" align="center">0.940</td>
</tr>
<tr>
<td valign="top" align="left">Relational &#x003E; Match</td>
<td valign="top" align="center">0.051</td>
<td valign="top" align="center">0.073</td>
<td valign="top" align="center">0.892</td>
<td valign="top" align="center">0.066</td>
<td valign="top" align="center">0.894</td>
</tr>
<tr>
<td valign="top" align="left" colspan="6"><bold>Stop-signal task</bold></td>
</tr>
<tr>
<td valign="top" align="left">Correct Stop &#x003E; baseline</td>
<td valign="top" align="center">0.069</td>
<td valign="top" align="center">0.066</td>
<td valign="top" align="center">0.895</td>
<td valign="top" align="center">0.061</td>
<td valign="top" align="center">0.906</td>
</tr>
<tr>
<td valign="top" align="left">Correct Stop &#x003E; Go</td>
<td valign="top" align="center">0.064</td>
<td valign="top" align="center">0.052</td>
<td valign="top" align="center">0.906</td>
<td valign="top" align="center">0.047</td>
<td valign="top" align="center">0.917</td>
</tr>
<tr>
<td valign="top" align="left" colspan="6"><bold>Task-switching</bold></td>
</tr>
<tr>
<td valign="top" align="left">Switch &#x003E; baseline</td>
<td valign="top" align="center">0.133</td>
<td valign="top" align="center">0.075</td>
<td valign="top" align="center">0.907</td>
<td valign="top" align="center">0.067</td>
<td valign="top" align="center">0.916</td>
</tr>
<tr>
<td valign="top" align="left">Switch &#x003E; No switch</td>
<td valign="top" align="center">0.030</td>
<td valign="top" align="center">0.037</td>
<td valign="top" align="center">0.924</td>
<td valign="top" align="center">0.033</td>
<td valign="top" align="center">0.925</td>
</tr>
<tr>
<td valign="top" align="left" colspan="6"><bold>Summary</bold></td>
</tr>
<tr>
<td valign="top" align="left">Median</td>
<td valign="top" align="center">0.142</td>
<td valign="top" align="center">0.118</td>
<td valign="top" align="center">0.904</td>
<td valign="top" align="center">0.106</td>
<td valign="top" align="center">0.913</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>As expected, compared with the &#x2018;HDI+ROPE&#x2019; rule, using the &#x2018;ROPE-only&#x2019; rule slightly increases the number of &#x2018;activated/deactivated&#x2019; and &#x2018;not activated&#x2019; voxels (see <xref ref-type="table" rid="T1">Table 1</xref> and <xref ref-type="supplementary-material" rid="DS1">Supplementary Tables 2</xref>&#x2013;<xref ref-type="supplementary-material" rid="DS1">10</xref>). The &#x2018;HDI+ROPE&#x2019; rule labeled more voxels as &#x2018;low confidence.&#x2019;</p>
</sec>
<sec id="S4.SS3">
<title>Comparison of Bayesian Parameter Inference Results With Different Effect Size Thresholds</title>
<p>Here, we focus on the &#x2018;ROPE-only&#x2019; rule. We first consider the results for the emotion processing task and then consider other tasks. Using the default single-subject ES threshold <italic>&#x03B3;</italic> = 0.1% for the emotion processing task (&#x2018;Emotion &#x003E; Shape&#x2019; contrast), 58.8% of all voxels can be classified as &#x2018;not activated,&#x2019; 30.8% as &#x2018;low confidence,&#x2019; and 10.1% as &#x2018;activated&#x2019; (see <xref ref-type="fig" rid="F7">Figure 7</xref> and <xref ref-type="supplementary-material" rid="DS1">Supplementary Table 2</xref>). The default group-level ES threshold <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub> = 0.135% allowed us to classify 75.0% of all voxels as &#x2018;not activated,&#x2019; 17.5% as &#x2018;low confidence,&#x2019; and 7.4% as &#x2018;activated&#x2019; (see <xref ref-type="fig" rid="F7">Figure 7</xref> and <xref ref-type="supplementary-material" rid="DS1">Supplementary Table 2</xref>). With both thresholds, the detection of &#x2018;activated&#x2019; voxels was comparable to that of classical NHST. The maximum overlap between the &#x2018;activation&#x2019; patterns revealed by classical NHST and BPI was observed at <italic>&#x03B3;(Dice<sub><italic>max</italic></sub>)</italic> = 0.116% (see <xref ref-type="fig" rid="F8">Figure 8</xref> and <xref ref-type="table" rid="T1">Table 1</xref>).</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption><p>Number of voxels classified into the four categories depending on the ES threshold <italic>&#x03B3;</italic>. The results for the emotion processing task (&#x2018;Emotion &#x003E; Shape&#x2019; contrast) are presented for illustration. L AMY, left amygdala.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g007.tif"/>
</fig>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption><p>Dependence of the Dice coefficient on the ES threshold <italic>&#x03B3;</italic>. Results for the emotion processing task (&#x2018;Emotion &#x003E; Shape&#x2019; contrast). The red lines denote <italic>&#x03B3;(Dice<sub><italic>max</italic></sub>)</italic>. L AMY, left amygdala.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g008.tif"/>
</fig>
<p>For the &#x2018;2-back &#x003E; 0-back,&#x2019; &#x2018;Left Finger &#x003E; baseline,&#x2019; &#x2018;Right Finger &#x003E; baseline,&#x2019; and &#x2018;Social &#x003E; Random&#x2019; contrasts, the three ES thresholds that were considered&#x2014;0.1%, <italic>one prior SD</italic><sub>&#x03B8;</sub>, <italic>&#x03B3;(Dice<sub><italic>max</italic></sub>)</italic>&#x2014;produced similar results (see <xref ref-type="table" rid="T1">Table 1</xref> and <xref ref-type="supplementary-material" rid="DS1">Supplementary Tables 3</xref>, <xref ref-type="supplementary-material" rid="DS1">5</xref>, <xref ref-type="supplementary-material" rid="DS1">7</xref>). For the event-related stop-signal task (&#x2018;Correct Stop &#x003E; baseline&#x2019; and &#x2018;Correct Stop &#x003E; Go&#x2019; contrasts), <italic>one prior SD</italic><sub>&#x03B8;</sub> and <italic>&#x03B3;(Dice<sub><italic>max</italic></sub>)</italic> were close in terms of their values but smaller than 0.1% (see <xref ref-type="table" rid="T1">Table 1</xref>). Block designs tend to evoke higher BOLD PSC than event-related designs; therefore, a lower <italic>prior SD</italic><sub>&#x03B8;</sub> should be expected for event-related designs and higher <italic>prior SD</italic><sub>&#x03B8;</sub> for block designs. Within a single design, in contrasts such as &#x2018;task-condition &#x003E; baseline,&#x2019; higher BOLD PSC and <italic>prior SD</italic><sub>&#x03B8;</sub> would be expected than in contrasts in which the experimental conditions are compared directly. For example, the contrast &#x2018;2-back &#x003E; baseline&#x2019; has <italic>prior SD</italic><sub>&#x03B8;</sub> = 0.325% and contrast &#x2018;2-back &#x003E; 0-back&#x2019; has <italic>prior SD</italic><sub>&#x03B8;</sub> = 0.089%.</p>
<p>As previously noted, some contrasts did not elicit robust activations: &#x2018;Reward &#x003E; Loss,&#x2019; &#x2018;Relational &#x003E; Match&#x2019; (<xref ref-type="bibr" rid="B7">Barch et al., 2013</xref>), and &#x2018;Switch &#x003E; No switch&#x2019; (<xref ref-type="bibr" rid="B57">Gorgolewski et al., 2017</xref>). The corresponding <italic>&#x03B3;(Dice<sub><italic>max</italic></sub>)</italic> thresholds were 0.044, 0.073, and 0.037% (see <xref ref-type="table" rid="T1">Table 1</xref> and <xref ref-type="supplementary-material" rid="DS1">Supplementary Tables 6</xref>, <xref ref-type="supplementary-material" rid="DS1">8</xref>, <xref ref-type="supplementary-material" rid="DS1">10</xref>). The corresponding <italic>prior SD</italic><sub>&#x03B8;</sub> values were 0.032, 0.051, and 0.030%. Correspondingly, BPI with the <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub> threshold classified 0, 18.4, and 42.2% of voxels as &#x2018;not activated.&#x2019; This demonstrates that when we compare conditions with similar neural activity and minor differences, it becomes more difficult to separate &#x2018;activations/deactivations&#x2019; from the &#x2018;null effects&#x2019; using the <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub> threshold.</p>
</sec>
<sec id="S4.SS4">
<title>Typical Effect Sizes in Functional Magnetic Resonance Imaging Studies</title>
<p>A complete list of effect sizes (BOLD PSC and Cohen&#x2019;s d) estimated for different tasks and <italic>a priori</italic> defined ROIs is presented in the <xref ref-type="supplementary-material" rid="DS1">Supplementary Materials</xref> (<xref ref-type="supplementary-material" rid="DS1">Supplementary Tables 11</xref>&#x2013;<xref ref-type="supplementary-material" rid="DS1">19</xref>). Here, we focus only on the BOLD PSC. The violin plots for some of these are shown in <xref ref-type="fig" rid="F9">Figure 9</xref>.</p>
<fig id="F9" position="float">
<label>FIGURE 9</label>
<caption><p>Typical BOLD PSC in fMRI studies. The box plots inside the violins represent the first and third quartiles, and the black circles represent median values. Contrasts from the same task are indicated in one color. L/R, left/right; AMY, amygdala; V1, primary visual cortex; DLPFC, dorsolateral prefrontal cortex; BA, Brodmann area; STG, superior temporal gyrus; A1, primary auditory cortex; NAc, nucleus accumbens; IPL, inferior parietal lobule; IFG/FO, inferior frontal gyrus/frontal operculum.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g009.tif"/>
</fig>
<p>For example, the median BOLD PSC in the left amygdala ROI, one of the key brain areas for emotional processing, was 0.263%, which is approximately twice as large as <italic>one prior SD</italic><sub>&#x03B8;</sub> (see <xref ref-type="fig" rid="F7">Figure 7</xref>). Thus, using this PSC as the ES threshold in future studies may cause the ROPE to become too wide compared to the effect sizes typical for tasks with such designs. Therefore, such a threshold can be used to detect large and highly localized effects. However, it may fail to detect small but widely distributed effects previously described for HCP data (<xref ref-type="bibr" rid="B25">Cremers et al., 2017</xref>).</p>
<p>In general, median PSCs within ROIs were up to 1% for block designs and 0.5% for event-related designs. The maximum PSCs reached 2.5% and were usually observed in the primary visual cortex (V1) for visual tasks comparing experimental conditions with baseline activity. For &#x2018;moderate&#x2019; physiological effects, the PSC was in the range 0.1&#x2013;0.2%; for example, for the &#x2018;2-back &#x003E; 0-back&#x2019; contrast, the median PSC in the right dorsolateral prefrontal cortex (R DLPFC in <xref ref-type="fig" rid="F9">Figure 9</xref>) was 0.137%. Likewise, for the &#x2018;Social &#x003E; Random&#x2019; contrast, the median PSC in the right inferior parietal lobule (R IPL) was 0.137%, and for the &#x2018;Correct Stop &#x003E; Go&#x2019; contrast, the median PSC in the right inferior frontal gyrus/frontal operculum (R IFG/FO) was 0.120%. For &#x2018;stronger&#x2019; physiological effects, the PSC was in the range 0.2&#x2013;0.3%; for example, for the &#x2018;Emotion &#x003E; Shape&#x2019; contrast, the median PSC in the left amygdala was 0.263%, and for the &#x2018;Story &#x003E; Math&#x2019; contrast, the median PSC in the left Brodmann area 45 (Broca&#x2019;s area) was 0.269%. For motor activity, for example in the &#x2018;Right Finger &#x003E; baseline&#x2019; contrast, the median PSC was 0.239% in the left precentral gyrus, 0.362% in the left postcentral gyrus, 0.290% in the left putamen, and 0.401% in the right cerebellum. For the contrasts that did not elicit robust activations (<xref ref-type="bibr" rid="B7">Barch et al., 2013</xref>), the PSC was approximately 0.05&#x2013;0.1%; for example, for the &#x2018;Reward &#x003E; Loss&#x2019; contrast, the median PSC in the left nucleus accumbens was 0.043%, and for the &#x2018;Relational &#x003E; Match&#x2019; contrast, the median PSC in the left dorsolateral prefrontal cortex was 0.062%.</p>
</sec>
<sec id="S4.SS5">
<title>Region of Practical Equivalence Maps</title>
<p>We considered BPI with two consecutive thresholding steps: (1) calculate the <italic>LPOs</italic> (or PPMs) with a selected ES threshold <italic>&#x03B3;</italic>, (2) apply the posterior probability threshold <italic>p</italic><sub><italic>th</italic></sub> = 95% or consider the overlap between the 95% HDI and ROPE. We can <italic>reverse the thresholding sequence</italic> and calculate <italic>the ROPE maps</italic>.</p>
<p>For the &#x2018;activated/deactivated&#x2019; voxels, the ROPE map contains the maximum ES thresholds that allow voxels to be classified as &#x2018;activated/deactivated&#x2019; based on the &#x2018;ROPE-only&#x2019; or &#x2018;HDI+ROPE&#x2019; decision rules. For the &#x2018;not activated&#x2019; voxels, the map contains the minimum effect size thresholds that allow voxels to be classified as &#x2018;not activated.&#x2019;</p>
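<p>As a minimal sketch of the two thresholding steps, the &#x2018;ROPE-only&#x2019; rule can be written as follows under the assumption, made here only for illustration, that the posterior distribution of the effect in a voxel is Gaussian. The function names and the example posterior SD of 0.03% are hypothetical and are not taken from the toolbox.</p>
<preformat><![CDATA[
```python
from math import log
from statistics import NormalDist


def log_posterior_odds(p):
    """Natural-log posterior odds for a posterior probability p (0 < p < 1)."""
    return log(p / (1.0 - p))


def rope_only_decision(post_mean, post_sd, gamma, p_thr=0.95):
    """Classify one voxel from an assumed Gaussian posterior N(post_mean, post_sd**2).

    gamma is the ES threshold (ROPE radius); the ROPE is [-gamma, gamma].
    """
    posterior = NormalDist(post_mean, post_sd)
    p_deactivated = posterior.cdf(-gamma)        # P(theta < -gamma)
    p_activated = 1.0 - posterior.cdf(gamma)     # P(theta > gamma)
    p_null = 1.0 - p_activated - p_deactivated   # P(-gamma <= theta <= gamma)
    if p_activated >= p_thr:
        return "activated"
    if p_deactivated >= p_thr:
        return "deactivated"
    if p_null >= p_thr:
        return "not activated"
    return "low confidence"


if __name__ == "__main__":
    gamma = 0.135  # ES threshold in % PSC (1 prior SD for 'Emotion > Shape')
    # Hypothetical posterior means with a hypothetical posterior SD of 0.03%.
    for mean in (0.263, 0.0, 0.135, -0.3):
        print(mean, rope_only_decision(mean, 0.03, gamma))
    print(round(log_posterior_odds(0.95), 2))
```
]]></preformat>
<p>With a posterior probability threshold of 95%, the natural-log posterior odds are ln(0.95/0.05) &#x2248; 2.94, which is close to the <italic>LPO</italic> &#x003E; 3 value quoted in the figure captions.</p>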
<p>The procedure for calculating the ROPE map can be performed as follows. Let us consider a gradual increase in the ROPE radius (i.e., the half-width of ROPE or the ES threshold <italic>&#x03B3;</italic>) from zero to the maximum effect size in the observed volume. (1) For voxels in which PSC is close to zero, at a certain ROPE radius, the posterior probability of finding the effect within the ROPE becomes higher than 95%. This radius is indicated on the ROPE map for &#x2018;not activated&#x2019; voxels. (2) For voxels in which the PSC deviates from zero, at a certain ROPE radius, the posterior probability of finding the effect outside the ROPE becomes lower than 95%. This radius is indicated on the ROPE map for &#x2018;activated/deactivated&#x2019; voxels. The same maps can be calculated for the &#x2018;HDI+ROPE&#x2019; decision rule.</p>
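<p>The ROPE radii described above can be sketched for a single voxel as follows, again assuming a Gaussian posterior for illustration. Instead of the gradual sweep of the radius, this sketch uses the posterior quantile (for &#x2018;activated/deactivated&#x2019; voxels) and a bisection search (for &#x2018;not activated&#x2019; voxels); the function names, the search bound <monospace>upper</monospace> and the example values are hypothetical.</p>
<preformat><![CDATA[
```python
from statistics import NormalDist


def max_radius_activated(post_mean, post_sd, p_thr=0.95):
    """Largest gamma with P(theta > gamma) >= p_thr, or None if it is not positive."""
    gamma = NormalDist(post_mean, post_sd).inv_cdf(1.0 - p_thr)
    return gamma if gamma > 0 else None


def max_radius_deactivated(post_mean, post_sd, p_thr=0.95):
    """Largest gamma with P(theta < -gamma) >= p_thr, or None if it is not positive."""
    gamma = -NormalDist(post_mean, post_sd).inv_cdf(p_thr)
    return gamma if gamma > 0 else None


def min_radius_not_activated(post_mean, post_sd, p_thr=0.95, upper=10.0, tol=1e-9):
    """Smallest gamma with P(-gamma <= theta <= gamma) >= p_thr, found by bisection.

    Returns None if even the radius `upper` does not reach p_thr.
    """
    posterior = NormalDist(post_mean, post_sd)

    def p_inside(gamma):
        return posterior.cdf(gamma) - posterior.cdf(-gamma)

    if p_inside(upper) < p_thr:
        return None
    low, high = 0.0, upper
    # p_inside grows monotonically with gamma, so bisection is sufficient.
    while high - low > tol:
        mid = (low + high) / 2.0
        if p_inside(mid) >= p_thr:
            high = mid
        else:
            low = mid
    return high


if __name__ == "__main__":
    # Hypothetical posterior SD of 0.03% PSC.
    print(max_radius_activated(0.263, 0.03))
    print(max_radius_deactivated(-0.263, 0.03))
    print(min_radius_not_activated(0.0, 0.03))
```
]]></preformat>
<p>Applying these functions voxel by voxel would give one possible way of filling a ROPE map under the &#x2018;ROPE-only&#x2019; rule; it is a sketch under the stated assumptions rather than the implementation used for Figure 10.</p>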
<p>Examples of the ROPE maps are shown in <xref ref-type="fig" rid="F10">Figure 10</xref>. From our point of view, ROPE maps, as well as unstandardized effect size (PSC) maps, may facilitate an intuitive understanding of the spatial distribution of a physiological effect under investigation (<xref ref-type="bibr" rid="B16">Chen et al., 2017</xref>). They can also be a valuable addition to standard PPMs, allowing researchers to flexibly choose the ES threshold based on the expected effect size for specific experimental conditions, ROIs and MR acquisition parameters. The default ES thresholds may be more conservative for brain areas near air&#x2013;tissue interfaces due to signal dropout. The researcher may choose a lower ES threshold to increase sensitivity in these brain areas.</p>
<fig id="F10" position="float">
<label>FIGURE 10</label>
<caption><p>The ROPE maps. Four contrasts were chosen for illustration purposes (two event-related and two block-design tasks). The ROPE maps are presented using different colors for the &#x2018;activated,&#x2019; &#x2018;deactivated,&#x2019; and &#x2018;not activated&#x2019; voxels. The green bars represent the minimum ROPE radii at which voxels with a PSC close to zero can be classified as &#x2018;not activated&#x2019; based on the &#x2018;ROPE-only&#x2019; decision rule. The red and blue bars represent the maximum ROPE radii at which voxels whose PSC deviates from zero can be classified as &#x2018;activated&#x2019; and &#x2018;deactivated,&#x2019; respectively.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g010.tif"/>
</fig>
</sec>
<sec id="S4.SS6">
<title>Effects of Spatial Smoothing on Classical Null Hypothesis Significance Testing and Bayesian Parameter Inference</title>
<p>Two main effects of spatial smoothing were identified. Firstly, higher spatial smoothing increased the number of both &#x2018;activated/deactivated&#x2019; and &#x2018;not activated&#x2019; voxels classified by BPI, reducing the number of &#x2018;low confidence&#x2019; voxels. Secondly, higher smoothing blurred the spatial localisation of local maxima of t-maps and PPMs (<italic>LPO</italic>-maps) to a different extent. Consider, for example, the emotion processing task (&#x2018;Emotion &#x003E; Shape&#x2019; contrast). The broadening of two peaks in the left and right amygdala was more noticeable on the t-map than on the PPM (see <xref ref-type="fig" rid="F11">Figure 11</xref>).</p>
<fig id="F11" position="float">
<label>FIGURE 11</label>
<caption><p>Influence of spatial smoothing on classical NHST and BPI: results for the emotion processing task (&#x2018;Emotion &#x003E; Shape&#x2019; contrast). Classical NHST was implemented using voxel-wise FWE correction (&#x03B1; = 0.05). BPI was implemented using the &#x2018;ROPE-only&#x2019; decision rule, <italic>P</italic><sub><italic>thr</italic></sub> = 95% (<italic>LPO</italic> &#x003E; 3) and <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub> = 0.135%. Axial slice <italic>z</italic> = &#x2013;14 mm (MNI152 standard space). Slice images have different outlines due to spatial smoothing (higher spatial smoothing increases the size of implicit masks for single subjects and group of subjects). In the panels on the right, 1-D images are presented for t-values and <italic>LPOs</italic> along the x-axis for <italic>y</italic> = &#x2013;4 mm. The red arrows indicate a noticeable broadening of two peaks of local maxima (left and right amygdala) at higher smoothing.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g011.tif"/>
</fig>
<p>Smoothing was previously shown to have a nonlinear effect on the voxel variances and thus to affect t-maps more than &#x03B2; value maps, sometimes leading to counterintuitive artifacts (<xref ref-type="bibr" rid="B115">Reimold et al., 2005</xref>). This is especially noticeable at the border between two different tissues or between the two narrow peaks of the local maxima. If the peak is localized close to white matter voxels with low variability, then smoothing can shift the peak to the white matter. If low-variance white matter voxels separate two close peaks, then after smoothing, they may serve as a &#x2018;bridge&#x2019; between the two peaks. To avoid this problem, <xref ref-type="bibr" rid="B115">Reimold et al. (2005)</xref> recommended using masked &#x03B2; value maps. In the present study, we suggest that PPMs based on BOLD PSC thresholding can mitigate this problem. Importantly, smoothing artifacts can also arise on Cohen&#x2019;s d maps. Therefore, PPMs based on PSC thresholding may be preferable to PPMs based on Cohen&#x2019;s d thresholding.</p>
</sec>
<sec id="S4.SS7">
<title>Sample Size Dependencies for Classical Null Hypothesis Significance Testing and Bayesian Parameter Inference</title>
<p>Increasing the sample size led to an increase in the number of &#x2018;activated&#x2019; and &#x2018;not activated&#x2019; voxels, and a decrease in the number of &#x2018;low confidence&#x2019; voxels. This is due to a decrease in the posterior variance. The curve of the &#x2018;activated&#x2019; voxels rose much more slowly than that of the &#x2018;not activated&#x2019; voxels. For the emotion processing task (&#x2018;Emotion &#x003E; Shape&#x2019; contrast, block-design, two sessions, 352 scans), the largest gain in the number of &#x2018;activated&#x2019; and &#x2018;not activated&#x2019; voxels can be noted from 20 to 30 subjects (see <xref ref-type="fig" rid="F12">Figure 12A</xref>). With a sample size of <italic>N</italic> &#x003E; 30, the number of &#x2018;activated&#x2019; and &#x2018;not activated&#x2019; voxels increased less steeply. The &#x2018;not activated&#x2019; and &#x2018;low confidence&#x2019; voxels curves intersected at <italic>N</italic> = 30 subjects. After the intersection point, the graphs reached a plateau.</p>
<fig id="F12" position="float">
<label>FIGURE 12</label>
<caption><p>Dependencies of the number of &#x2018;activated,&#x2019; &#x2018;not activated,&#x2019; and &#x2018;low confidence&#x2019; voxels on the sample size. BPI was implemented using <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub>. <bold>(A)</bold> The emotional processing task (&#x2018;Emotion &#x003E; Shape&#x2019; contrast, two sessions). <bold>(B)</bold> The emotional processing task (&#x2018;Emotion &#x003E; Shape&#x2019; contrast, one session). <bold>(C)</bold> The stop-signal task (&#x2018;Correct Stop &#x003E; Go&#x2019; contrast). The error bars represent the mean and standard deviation across ten random groups.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g012.tif"/>
</fig>
<p>Considering only half of the emotional processing task data (one session, 176 scans), the intersection point shifted from <italic>N</italic> = 30 to <italic>N</italic> = 60 (see <xref ref-type="fig" rid="F12">Figure 12B</xref>). For the event-related task (&#x2018;Correct Stop &#x003E; Go&#x2019; contrast, the stop-signal task, 184 scans), all considered dependencies had the same features as for the block-design task, and the point of intersection was at <italic>N</italic> = 60 subjects (see <xref ref-type="fig" rid="F12">Figure 12C</xref>). For the fixed ES threshold, the moment at which the graphs reach a plateau depends on task design, data quality and the amount of data at the subject level, that is, on the number of scans, blocks, and events. The task designs from the HCP and UCLA datasets have relatively short durations (for example, the stop-signal task has approximately 15 &#x2018;Correct Stop&#x2019; trials per subject). Studies with a shorter scanning time generally require a larger sample size to enable inferences to be made with confidence. Lowering the ES threshold would also require a larger sample size to reach a plateau.</p>
<p>Classical NHST with the voxel-wise FWE correction showed a steady linear increase in the number of &#x2018;activated&#x2019; voxels with increasing sample size (see <xref ref-type="fig" rid="F13">Figure 13</xref>). With a further increase in the sample size, the number of statistically significant voxels revealed by classical NHST is expected to approach 100% (see, for example, <xref ref-type="bibr" rid="B53">Gonzalez-Castillo et al., 2012</xref>; <xref ref-type="bibr" rid="B134">Smith and Nichols, 2018</xref>). In contrast, the BPI with the <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub> threshold demonstrated hyperbolic dependencies. We observed a steeper increase at small and moderate sample sizes (<italic>N</italic> = 15&#x2212;60). The curve of the &#x2018;activated&#x2019; voxels flattened at large sample sizes (<italic>N</italic> &#x003E; 80). BPI offers protection against the detection of &#x2018;trivial&#x2019; effects that can appear as a result of an increased sample size if classical NHST with the point-null hypothesis is used (<xref ref-type="bibr" rid="B41">Friston et al., 2002a</xref>; <xref ref-type="bibr" rid="B39">Friston, 2012</xref>; <xref ref-type="bibr" rid="B16">Chen et al., 2017</xref>). This is achieved by the ES threshold <italic>&#x03B3;</italic>, which eliminates physiologically (practically) negligible effects. <xref ref-type="fig" rid="F13">Figure 13</xref> presents an illustration of the Jeffreys-Lindley paradox, that is, the discrepancy between results obtained using classical and Bayesian inference, which is usually manifested at higher sample sizes (<xref ref-type="bibr" rid="B70">Jeffreys, 1939/1948</xref>; <xref ref-type="bibr" rid="B86">Lindley, 1957</xref>; <xref ref-type="bibr" rid="B39">Friston, 2012</xref>).</p>
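<p>The discrepancy between the two approaches at large sample sizes can be illustrated with a deliberately simplified sketch for a single voxel with a &#x2018;trivial&#x2019; effect. The assumptions are ours and differ from the analysis reported here: the noise SD is known, the prior is flat, the observed group mean equals the true effect, and an uncorrected two-sided <italic>z</italic>-test (&#x03B1; = 0.001) replaces the voxel-wise FWE correction. The effect of 0.05%, the SD of 0.3% and the function names are hypothetical.</p>
<preformat><![CDATA[
```python
from math import sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def nhst_significant(sample_mean, sd, n, alpha=0.001):
    """Two-sided z-test of the point null theta = 0 with known SD."""
    z = sample_mean / (sd / sqrt(n))
    return abs(z) > STD_NORMAL.inv_cdf(1.0 - alpha / 2.0)


def bpi_label(sample_mean, sd, n, gamma, p_thr=0.95):
    """'ROPE-only' decision from the flat-prior posterior N(sample_mean, sd**2 / n)."""
    posterior = NormalDist(sample_mean, sd / sqrt(n))
    p_activated = 1.0 - posterior.cdf(gamma)
    p_deactivated = posterior.cdf(-gamma)
    p_null = 1.0 - p_activated - p_deactivated
    if p_activated >= p_thr:
        return "activated"
    if p_deactivated >= p_thr:
        return "deactivated"
    if p_null >= p_thr:
        return "not activated"
    return "low confidence"


if __name__ == "__main__":
    effect, sd, gamma = 0.05, 0.3, 0.135  # % PSC; effect is well inside the ROPE
    for n in (20, 100, 500):
        print(n, nhst_significant(effect, sd, n), bpi_label(effect, sd, n, gamma))
```
]]></preformat>
<p>In this sketch, the voxel is &#x2018;low confidence&#x2019; at <italic>N</italic> = 20 and &#x2018;not activated&#x2019; at <italic>N</italic> = 100, whereas at <italic>N</italic> = 500 the point-null test becomes significant while BPI still labels the voxel as &#x2018;not activated.&#x2019;</p>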
<fig id="F13" position="float">
<label>FIGURE 13</label>
<caption><p>Dependencies of the number of &#x2018;activated&#x2019; voxels on the sample size. Classical NHST was implemented using FWE correction (&#x03B1; = 0.05). BPI was implemented using <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub>. <bold>(A)</bold> The emotional processing task (block design, &#x2018;Emotion &#x003E; Shape&#x2019; contrast). <bold>(B)</bold> The stop-signal task (event-related design, &#x2018;Correct Stop &#x003E; Go&#x2019; contrast). The error bars represent the mean and standard deviation across ten random groups.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g013.tif"/>
</fig>
</sec>
<sec id="S4.SS8">
<title>Normality Check</title>
<p>For the block-design task (&#x2018;Emotion &#x003E; Shape&#x2019; contrast), the number of significantly non-Gaussian voxels was 17% with &#x03B1;<sub><italic>uncorr</italic></sub> = 0.001 and 2% with &#x03B1;<sub><italic>Bonf</italic></sub> = 0.05. The median kurtosis and skewness across voxels were <italic>Ku</italic> = 3.77 and <italic>Sk</italic> = 0.05. For the event-related task (&#x2018;Correct Stop &#x003E; Go&#x2019; contrast), the number of significantly non-Gaussian voxels was 19% with &#x03B1;<sub><italic>uncorr</italic></sub> = 0.001 and 4% with &#x03B1;<sub><italic>Bonf</italic></sub> = 0.05. The median kurtosis and skewness across voxels were <italic>Ku</italic> = 3.77 and <italic>Sk</italic> = 0.05. In general, the data are consistent with the normality assumption, though some voxels violate it.</p>
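<p>This section does not state which estimators were used for <italic>Ku</italic> and <italic>Sk</italic>. As a hedged illustration, the sketch below computes the simple moment-based estimates for the values of one voxel across subjects, using the Pearson definition of kurtosis so that a normal distribution corresponds to <italic>Ku</italic> = 3 and <italic>Sk</italic> = 0. The function name and the toy data are hypothetical.</p>
<preformat><![CDATA[
```python
def skewness_and_kurtosis(values):
    """Moment-based (biased) sample skewness and Pearson kurtosis.

    Pearson kurtosis equals 3 for a normal distribution.
    Assumes at least two distinct values, so the variance is non-zero.
    """
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    return m3 / m2 ** 1.5, m4 / m2 ** 2


if __name__ == "__main__":
    # Toy per-subject values for one voxel (hypothetical, symmetric around zero).
    sk, ku = skewness_and_kurtosis([-2.0, -1.0, 0.0, 1.0, 2.0])
    print(sk, ku)
```
]]></preformat>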
</sec>
<sec id="S4.SS9">
<title>Simulations</title>
<p>The simulation results reproduced the results obtained from the empirical data (see <xref ref-type="fig" rid="F14">Figure 14</xref> for an overview of the simulations). Further, they allowed us to demonstrate how various factors affect BPI performance with the known ground truth.</p>
<fig id="F14" position="float">
<label>FIGURE 14</label>
<caption><p>Simulations overview. <bold>(A)</bold> Ground truth axial slice <italic>z</italic> = 36 mm (MNI152 standard space). &#x2018;Activated&#x2019; and &#x2018;deactivated&#x2019; voxels are marked in red and blue, respectively. &#x2018;Trivial&#x2019; voxels that should be classified as &#x2018;not activated&#x2019; (practically equivalent to the null value) are marked in green. Data were drawn from the normal (<italic>Ku</italic> = 3, <italic>Sk</italic> = 0, the red line) and non-normal distributions. <bold>(B)</bold> Classical NHST results for <italic>N</italic> = 200 images, moderate effect and medium noise (&#x03B8; = 0.2%, <italic>SD</italic> = 0.3%), obtained using voxel-wise FWE correction (&#x03B1; = 0.05). <bold>(C)</bold> BPI results for <italic>N</italic> = 200 images, moderate effect and medium noise (&#x03B8; = 0.2%, <italic>SD</italic> = 0.3%), obtained using the &#x2018;ROPE-only&#x2019; decision rule, <italic>P</italic><sub><italic>thr</italic></sub> = 95% (<italic>LPO</italic> &#x003E; 3) and <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub>.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g014.tif"/>
</fig>
<sec id="S4.SS9.SSS1">
<title>Dependence of the Number of &#x2018;Activated&#x2019; Voxels on the Sample Size</title>
<p>The number of &#x2018;activated&#x2019; voxels revealed by BPI with the <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub> threshold approaches the true number of practically significant voxels and stops increasing (see <xref ref-type="fig" rid="F15">Figure 15</xref>). Classical NHST shows a further increase in the number of &#x2018;activated&#x2019; voxels as the sample size increases, as it considers only statistical significance. This is more evident for low and medium noise cases (<italic>SD</italic> = 0.2, 0.3%). For the high noise case (<italic>SD</italic> = 0.4%), the sample size should be larger than <italic>N</italic> = 500 for the discrepancy between NHST and BPI results to become evident.</p>
<fig id="F15" position="float">
<label>FIGURE 15</label>
<caption><p>Simulation results for the dependencies of the number of &#x2018;activated&#x2019; voxels on the sample size. Data were drawn from normal distributions with different mean effect &#x03B8; and noise <italic>SD</italic>. Classical NHST was implemented using FWE correction (&#x03B1; = 0.05). BPI was implemented using <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub>. The error bars represent the mean and standard deviation across ten random groups. Horizontal lines indicate the true number of practically significant voxels.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g015.tif"/>
</fig>
</sec>
<sec id="S4.SS9.SSS2">
<title>Dependence of the Correct and Low Confidence Decision Rates on the Sample Size</title>
<p>For the weak effect size (&#x03B8; = 0.1%), the BPI with the <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub> threshold is more sensitive for &#x2018;activated&#x2019; than for &#x2018;not activated&#x2019; voxels (see <xref ref-type="fig" rid="F16">Figure 16</xref>). This is because the <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub> threshold is smaller for the weak effect size. For the moderate and strong effects (&#x03B8; = 0.2, 0.3%), this difference in sensitivity becomes less evident. The low confidence decisions are prevalent in the &#x2018;weak effect plus high noise&#x2019; case. It becomes more challenging to distinguish between &#x2018;activated&#x2019; and &#x2018;not activated&#x2019; voxels when the data are noisy, and the PSC in the &#x2018;activated&#x2019; voxels is close to the PSC in &#x2018;trivial&#x2019; voxels. For the intermediate case (moderate effect plus medium noise), the correct decision rates for &#x2018;activated&#x2019; and &#x2018;not activated&#x2019; voxels reached 80% at the sample sizes <italic>N</italic> = 80 and <italic>N</italic> = 150, respectively. For larger effect sizes and lower noise, a smaller sample size will be required to achieve the correct decision rate of 80% (and vice versa). The &#x2018;ROPE-only&#x2019; decision rule is more sensitive to both &#x2018;activated&#x2019; and &#x2018;not activated&#x2019; voxels than the &#x2018;HDI+ROPE&#x2019; decision rule.</p>
<fig id="F16" position="float">
<label>FIGURE 16</label>
<caption><p>Simulation results for the dependencies of the correct and low confidence decision rates on the sample size. Data were drawn from normal distributions with different mean effect &#x03B8; and noise <italic>SD</italic>. BPI was implemented using <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub>. The plots for &#x2018;deactivated&#x2019; voxels closely follow the plots for &#x2018;activated&#x2019; voxels and have therefore been omitted for visualization purposes. The error bars represent the mean and standard deviation across ten random groups.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g016.tif"/>
</fig>
</sec>
<sec id="S4.SS9.SSS3">
<title>Robustness of Bayesian Parameter Inference to Violations of the Normality Assumption</title>
<p>Non-normal distributions with positive and negative skewness increase incorrect decision rates for &#x2018;deactivated&#x2019; and &#x2018;activated&#x2019; voxels, respectively (<xref ref-type="fig" rid="F17">Figure 17</xref>). Application of the &#x2018;ROPE-only&#x2019; decision rule results in higher incorrect decision rates than the &#x2018;HDI+ROPE&#x2019; decision rule. However, even in the worst case (weak effect plus high noise), the incorrect decision rates for BPI with the <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub> threshold did not exceed 5%. This result shows that BPI is robust to violations of the normality assumption. The &#x2018;ROPE-only&#x2019; rule may be preferable to the &#x2018;HDI+ROPE&#x2019; rule, as both rules protect against incorrect decisions, but the &#x2018;ROPE-only&#x2019; rule is more sensitive to the true effects when using the <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub> threshold.</p>
<fig id="F17" position="float">
<label>FIGURE 17</label>
<caption><p>Simulation results for the dependencies of the incorrect decision rate on the sample size. Data were drawn from normal (<italic>Ku</italic> = 3, <italic>Sk</italic> = 0) and non-normal distributions with weak effect and high noise (&#x03B8; = 0.1%, <italic>SD</italic> = 0.4%). BPI was implemented using <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub>. The error bars represent the mean and standard deviation across ten random groups.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g017.tif"/>
</fig>
</sec>
<sec id="S4.SS9.SSS4">
<title>Dependence of the Correct and Incorrect Decision Rates on the Effect Size Threshold</title>
<p>The optimal ES threshold should provide high sensitivity to both &#x2018;activated&#x2019; and &#x2018;not activated&#x2019; voxels (e.g., higher than 80%) while protecting against incorrect decisions (e.g., lower than 5%). The range of ES thresholds that meets these criteria decreases for lower true effects and higher noise (see <xref ref-type="fig" rid="F18">Figure 18</xref>). At the sample size <italic>N</italic> = 200, the default <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub> threshold fell within the range of optimal ES thresholds in the majority of the cases. For the weak effect plus high noise case, one should choose between high sensitivity to &#x2018;activated&#x2019; or &#x2018;not activated&#x2019; voxels. In this scenario, to achieve high sensitivity to both types of voxels, it is necessary to obtain a very large sample size (<italic>N</italic> &#x003E; 500). In all considered cases, the default ES threshold provided approximately equal correct decision rates for &#x2018;activated&#x2019; and &#x2018;not activated&#x2019; voxels and protected against incorrect decisions. This result confirmed that the default ES threshold is optimal in most scenarios, except for the scenario with low effect and high noise level.</p>
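<p>The trade-off between the two correct decision rates can be explored with a small Monte Carlo sketch. It is not the simulation pipeline used in this study: as simplifying assumptions, the group mean of each voxel is drawn directly from its sampling distribution, the noise SD is known, the prior is flat, and &#x2018;trivial&#x2019; voxels have a true effect of exactly zero. The function name, the number of voxels and the seed are hypothetical.</p>
<preformat><![CDATA[
```python
import random
from math import sqrt
from statistics import NormalDist


def decision_rates(theta, sd, n, gamma, n_voxels=2000, p_thr=0.95, seed=0):
    """Correct 'activated' and 'not activated' rates under the 'ROPE-only' rule.

    theta: true effect in truly active voxels; trivial voxels have effect 0.
    Returns (rate_activated, rate_not_activated).
    """
    rng = random.Random(seed)
    se = sd / sqrt(n)  # SD of the group mean and of the flat-prior posterior
    hits_activated = 0
    hits_not_activated = 0
    for _ in range(n_voxels):
        # Truly active voxel: correct if P(effect > gamma) >= p_thr.
        posterior = NormalDist(rng.gauss(theta, se), se)
        if 1.0 - posterior.cdf(gamma) >= p_thr:
            hits_activated += 1
        # Trivial voxel: correct if P(-gamma <= effect <= gamma) >= p_thr.
        posterior = NormalDist(rng.gauss(0.0, se), se)
        if posterior.cdf(gamma) - posterior.cdf(-gamma) >= p_thr:
            hits_not_activated += 1
    return hits_activated / n_voxels, hits_not_activated / n_voxels


if __name__ == "__main__":
    # Strong effect, low noise, N = 200; sweep a few ES thresholds (% PSC).
    for gamma in (0.01, 0.135, 0.4):
        print(gamma, decision_rates(0.3, 0.2, 200, gamma))
```
]]></preformat>
<p>In this sketch, a very small threshold makes it impossible to classify trivial voxels as &#x2018;not activated,&#x2019; and a threshold above the true effect makes it impossible to classify active voxels as &#x2018;activated,&#x2019; which is why only an intermediate range of thresholds gives high rates for both.</p>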
<fig id="F18" position="float">
<label>FIGURE 18</label>
<caption><p>Simulation results for the dependencies of the correct and incorrect decision rates on the ES threshold <italic>&#x03B3;</italic>. Data were drawn from normal distributions with different mean effect &#x03B8; and noise <italic>SD</italic>. Sample size <italic>N</italic> = 200 images, results for one random group. The plots for &#x2018;deactivated&#x2019; voxels closely follow the plots for &#x2018;activated&#x2019; voxels and have therefore been omitted for visualization purposes. Vertical lines indicate the default ES threshold <italic>&#x03B3;</italic> = 1 <italic>prior SD</italic><sub>&#x03B8;</sub>. The light blue areas indicate ES thresholds at which the incorrect decision rates do not exceed 5% for both &#x2018;activated&#x2019; and &#x2018;not activated&#x2019; voxels. The dark blue areas indicate ES thresholds at which the correct decision rates exceed 80% for both &#x2018;activated&#x2019; and &#x2018;not activated&#x2019; voxels.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g018.tif"/>
</fig>
</sec>
</sec>
<sec id="S4.SS10">
<title>Example of Practical Application of Bayesian Parameter Inference</title>
<p>In contrast to classical NHST, Bayesian inference allows us to:</p>
<list list-type="simple">
<list-item>
<label>(1)</label>
<p>Provide evidence that there is no practically meaningful BOLD signal change in the brain area when comparing the two task conditions.</p>
</list-item>
<list-item>
<label>(2)</label>
<p>Establish double dissociations; that is to state that one area responds to A <italic>but not</italic> B condition and another responds to B <italic>but not</italic> A condition (<xref ref-type="bibr" rid="B41">Friston et al., 2002a</xref>).</p>
</list-item>
<list-item>
<label>(3)</label>
<p>Provide evidence for practically equivalent engagement of one area under different experimental conditions in terms of local brain activity.</p>
</list-item>
<list-item>
<label>(4)</label>
<p>Provide evidence for the absence of a practically meaningful difference in BOLD signals between groups of subjects or repeated measures.</p>
</list-item>
</list>
<p>To illustrate a possible application of Bayesian inference in research practice, we used a working memory task. Let us consider an overlap between the &#x2018;2-back &#x003E; baseline&#x2019; and &#x2018;0-back &#x003E; baseline&#x2019; contrasts (see <xref ref-type="fig" rid="F19">Figure 19</xref>, purple areas). We cannot claim that brain areas revealed by this conjunction analysis were equally engaged in the &#x2018;2-back&#x2019; and &#x2018;0-back&#x2019; conditions. To provide evidence for this notion, we can use BPI and attempt to identify voxels with a practically equivalent BOLD signal in the &#x2018;2-back&#x2019; and &#x2018;0-back&#x2019; conditions (see <xref ref-type="fig" rid="F19">Figure 19</xref>, green areas). Overlap between the &#x2018;2-back &#x003E; baseline&#x2019; and &#x2018;0-back &#x003E; baseline&#x2019; and the &#x2018;2-back = 0-back&#x2019; effects was found in several brain areas: visual cortex (V1, V2, V3), frontal eye field (FEF), superior eye field (SEF), parietal eye field (PEF, or posterior parietal cortex), lateral geniculate nucleus (LGN) and left primary motor cortex (M1) (see <xref ref-type="fig" rid="F19">Figure 19</xref>, white areas). This result can be easily explained by the fact that both experimental conditions require the subject to analyze perceptually similar visual stimuli and push response buttons with the right hand, which should not depend much on the working memory load. At the same time, it does not follow directly from simple conjunction analysis.</p>
<fig id="F19" position="float">
<label>FIGURE 19</label>
<caption><p>Example of possible application of BPI based on the working memory task. L/R, left/right; V1, V2, V3, primary, secondary, and third visual cortex; FEF, frontal eye field; SEF, superior eye field; PEF, parietal eye field; LGN, lateral geniculate nucleus; M1, primary motor cortex.</p></caption>
<graphic mimetype="image" mime-subtype="tiff" xlink:href="fninf-15-738342-g019.tif"/>
</fig>
</sec>
</sec>
<sec id="S5" sec-type="discussion">
<title>Discussion</title>
<p>Over-reliance on classical NHST promotes publication bias toward statistically significant findings. However, the null result can be just as valuable and exciting as the statistically significant result. Furthermore, not every statistically significant result has practical significance. In recent years, statistical practice has seen a gradual shift from point-null hypothesis testing to interval-null hypothesis testing and interval estimation, as well as from frequentist to Bayesian approaches. Frequentist and Bayesian interval-based approaches allow us to assess the &#x2018;null effects&#x2019; and thus overcome prejudice against the null hypothesis. While both approaches may lead to similar results (if specifically calibrated to do so), we discussed conceptual and practical reasons for preferring the Bayesian approach. One of the main conceptual difficulties of the frequentist approach is that it is based on the probabilistic &#x2018;proof by contradiction,&#x2019; which results in the &#x2018;inverse probability&#x2019; fallacy, that is, a widespread misinterpretation of <italic>p</italic>-values and confidence intervals as posterior probabilities and credible intervals. Although the Bayesian approach does not automatically guarantee correct interpretations, it can be more intuitive and straightforward than the frequentist approach (particularly, Bayesian inference based on the posterior probability distributions of the parameters or BPI).</p>
<p>At the same time, from the frequentist point of view, the main conceptual disadvantage of the Bayesian approach is the need to specify our prior beliefs about the model parameters. Sometimes it is argued that we do not want our result to depend on a subjective prior decision. However, in the frequentist framework, we also make prior assumptions when subjectively choosing a model or ignoring the prior distributions of model parameters (implicitly using a &#x2018;flat&#x2019; prior). From this point of view, the explicit choice of the prior may rather be an advantage. We can choose a prior based on theoretical arguments (e.g., biophysical or anatomical priors) or derive it from hierarchically organized data (empirical Bayes approach). In this way, we limit the subjectivity of the choice of the prior.</p>
<p>Another potential obstacle to using Bayesian statistics is its computational complexity. Integrals in Bayes&#x2019; rule can be solved analytically only for relatively simple models. In other cases, numerical integration approaches should be used to calculate the posterior probability, which are time-consuming, especially when considering thousands of voxels. Alternatively, one can use computationally efficient analytical approximations to the posterior distributions, which, however, can be less accurate for high-dimensional parameter spaces (multivariate analysis).</p>
<p>Despite the substantial development of Bayesian techniques, to date, &#x2018;null effect&#x2019; assessment remains uncommon in the neuroimaging field and, in particular, in fMRI studies. One of the possible reasons for this may be the lack of tools available to the end-user. To facilitate the &#x2018;null effect&#x2019; assessment for fMRI practitioners, we developed an SPM12-based toolbox for group-level Bayesian inference<sup>4</sup>. We evaluated the BPI approach on empirical and simulated data and discussed its possible application in fMRI studies.</p>
<p>Bayesian parameter inference allows us to simultaneously find &#x2018;activated/deactivated,&#x2019; &#x2018;not activated,&#x2019; and &#x2018;low confidence&#x2019; voxels using a single decision rule. The &#x2018;not activated&#x2019; decision means that the effect is practically non-significant and can be considered equivalent to the null for practical purposes. The &#x2018;low confidence&#x2019; decision means we need more data to make a confident inference, that is, we need to increase the scanning time, sample size, or data quality, or revisit the task design. The use of parametric empirical Bayes with the &#x2018;global shrinkage&#x2019; prior enables us to check the results as the sample size increases and allows us to decide whether to stop the experiment if the obtained data are sufficient to make a confident inference. All the above features are absent from the classical NHST framework, which is limited to the point-null hypothesis with a pre-determined stopping rule.</p>
<p>An important advantage of Bayesian inference is that we can use graphs such as those shown in <xref ref-type="fig" rid="F12">Figure 12</xref> to determine when the obtained data are sufficient to make a confident inference. We can plot such graphs for the whole brain or for <italic>a priori</italic> defined ROIs. When the curves reach a plateau, the data collection can be stopped. If the brain area can be labeled as either &#x2018;activated/deactivated&#x2019; or &#x2018;not activated&#x2019; at a relatively small sample size, it will still be so at larger sample sizes. If the brain area can be labeled as &#x2018;low confidence,&#x2019; we must increase the sample size to make a confident inference. At a certain sample size, it could possibly be labeled as either &#x2018;activated/deactivated&#x2019; or &#x2018;not activated.&#x2019; In the worst case, we can reach the plateau and still label the brain area as &#x2018;low confidence.&#x2019; However, even in this case, we can make a definite conclusion: the task design is not sensitive to the effect and should be revised. Empirical Bayes with the &#x2018;global shrinkage&#x2019; prior allows us to monitor the evidence for the alternative or null hypotheses after each participant without special adjustment for multiplicity (<xref ref-type="bibr" rid="B32">Edwards et al., 1963</xref>; <xref ref-type="bibr" rid="B10">Berger and Berry, 1988</xref>; <xref ref-type="bibr" rid="B143">Wagenmakers, 2007</xref>; <xref ref-type="bibr" rid="B119">Rouder, 2014</xref>; <xref ref-type="bibr" rid="B80">Kruschke and Liddell, 2017b</xref>; <xref ref-type="bibr" rid="B127">Sch&#x00F6;nbrodt et al., 2017</xref>).
Optional stopping of the experiment not only allows more freedom in terms of the experimental design, but also saves limited resources and is even more ethically justified in certain cases<sup><xref ref-type="fn" rid="footnote6">6</xref></sup> (<xref ref-type="bibr" rid="B32">Edwards et al., 1963</xref>; <xref ref-type="bibr" rid="B143">Wagenmakers, 2007</xref>). To strike a balance between analytical flexibility and subjectivity of analysis, one may pre-register the hypotheses, models, priors, and desired level of evidence without being limited by a predefined sample size.</p>
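<p>The monitoring procedure described above can be illustrated with a simplified sketch. It does not reproduce the parametric empirical Bayes estimation: as a stand-in, it assumes a zero-mean Gaussian prior with a fixed SD, a known between-subject noise SD, and one effect estimate per participant, so that the posterior can be updated in closed form after each participant. The function name <italic>monitor_evidence</italic>, the 0.95 cutoff, and all numerical values are hypothetical.</p>
<preformat>
```python
from statistics import NormalDist


def monitor_evidence(effects, prior_sd, noise_sd, es_threshold,
                     prob_threshold=0.95):
    """Update a Gaussian posterior after each participant; stop when decisive.

    Simplifying assumptions (not the empirical Bayes estimation itself):
    zero-mean Gaussian prior with fixed prior_sd, known between-subject
    noise_sd, and one effect estimate per participant.
    Returns (number of participants used, label).
    """
    prior_precision = 1.0 / prior_sd ** 2
    noise_precision = 1.0 / noise_sd ** 2
    running_sum = 0.0
    n = 0
    label = "low confidence"
    for n, effect in enumerate(effects, start=1):
        running_sum += effect
        # Conjugate normal update with known noise variance.
        post_precision = prior_precision + n * noise_precision
        post_mean = noise_precision * running_sum / post_precision
        posterior = NormalDist(mu=post_mean, sigma=post_precision ** -0.5)
        p_below = posterior.cdf(-es_threshold)
        p_above = 1.0 - posterior.cdf(es_threshold)
        p_rope = 1.0 - p_below - p_above
        if p_above > prob_threshold or p_below > prob_threshold:
            label = "activated/deactivated"
        elif p_rope > prob_threshold:
            label = "not activated"
        else:
            label = "low confidence"
        if label != "low confidence":
            break  # evidence is sufficient; data collection could stop here
    return n, label


if __name__ == "__main__":
    # Hypothetical subject-level effects in percent signal change.
    print(monitor_evidence([0.3] * 20, 1.0, 0.3, es_threshold=0.1))
    print(monitor_evidence([0.0] * 200, 1.0, 0.3, es_threshold=0.1))
```
</preformat>
<p>Under these assumptions, a consistent effect well above the ES threshold becomes decisive after a few participants, whereas a true null effect needs considerably more participants before the posterior is concentrated within the ROPE.</p>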
<p>In contrast, frequentist inference depends on the researcher&#x2019;s intention to stop data collection and thus requires a definition of the stopping rule based on <italic>a priori</italic> power analysis. Sequential analysis and optional stopping in frequentist inference inflate the number of false positives and require special multiplicity adjustments. Moreover, even if the <italic>a priori</italic> defined sample size is reached, the researcher can still obtain a non-significant result. In this case, the researcher can follow two controversial paths within the classical NHST framework. Firstly, the sample size could be further increased to force an indecisive result to a decisive conclusion. The problem is that this conclusion would always be against the null hypothesis. Thus, an unbounded increase in the sample size introduces a discrepancy between classical NHST and Bayesian inference, also known as the Jeffreys-Lindley paradox. Secondly, one may argue that high <italic>a priori</italic> power and non-significant results provide evidence for the null hypothesis (see, for example, <xref ref-type="bibr" rid="B20">Cohen, 1990</xref>). However, even high <italic>a priori</italic> power and non-significant results do not provide direct evidence for the null hypothesis. In fact, a high-powered non-significant result may arise when the obtained data provide no evidence for the null over the alternative hypothesis, according to Bayesian inference (<xref ref-type="bibr" rid="B31">Dienes and Mclatchie, 2017</xref>). This does not mean that power analysis is irrelevant from a Bayesian perspective. Although power analysis is not necessary for Bayesian inference, it can still be used within the Bayesian framework for study planning (<xref ref-type="bibr" rid="B80">Kruschke and Liddell, 2017b</xref>). At the same time, power analysis is a critical part of frequentist inference, as the latter depends on the researcher&#x2019;s intentions, such as the stopping intention.</p>
<p>The main difficulty with the application of BPI is the need to define the ES threshold. However, the problem of choosing a practically meaningful effect size is not unique to fMRI studies, as it arises in every mature field of science. It should not discourage us from using BPI, as the point-null hypothesis is never true in the soft sciences. From our perspective, there are several ways to address this problem. Firstly, the ES threshold can be chosen based on effect sizes previously reported in studies with a similar design, or a pilot study can be performed to estimate the expected effect size.</p>
<p>Based on the fMRI literature, the largest BOLD responses are evoked by sensory stimulation and vary within 1&#x2013;5% of the overall mean whole-brain activity. In contrast, BOLD responses induced by cognitive tasks vary within 0.1&#x2013;0.5% (<xref ref-type="bibr" rid="B42">Friston et al., 2002b</xref>; <xref ref-type="bibr" rid="B111">Poldrack et al., 2011</xref>; <xref ref-type="bibr" rid="B16">Chen et al., 2017</xref>). The results obtained in this study support this notion. Primary sensory effects were &#x003E;1%, and motor effects were &#x003E;0.3%. Cognitive effects can be classified into three categories.</p>
<list list-type="simple">
<list-item>
<label>(1)</label>
<p>&#x2018;Strong&#x2019; effects of 0.2&#x2013;0.3% (for example, emotion processing in the amygdala, language processing in Broca&#x2019;s area),</p>
</list-item>
<list-item>
<label>(2)</label>
<p>&#x2018;Moderate&#x2019; effects of 0.1&#x2013;0.2% (for example, working memory load in DLPFC, social cognition in IPL, response inhibition in IFG/FO),</p>
</list-item>
<list-item>
<label>(3)</label>
<p>&#x2018;Weak&#x2019; effects of 0.05&#x2013;0.1% in contrasts without robust activations (for example, reward processing in the nucleus accumbens, relational processing in DLPFC).</p>
</list-item>
</list>
<p>However, choosing the ES threshold based on previous studies can be challenging because fMRI designs become increasingly complex over time, and it can be difficult to find previous experiments with a similar design that report unbiased effect sizes. In this case, one can use an ES threshold equal to <italic>one prior SD</italic> of the effect (<xref ref-type="bibr" rid="B44">Friston and Penny, 2003</xref>), which can be thought of as a neuronal &#x2018;background noise level&#x2019; or a level of activity that is generic to the whole brain (<xref ref-type="bibr" rid="B33">Eickhoff et al., 2008</xref>). As empirical and simulation analysis results show, BPI with this ES threshold generally works well for both &#x2018;activated/deactivated&#x2019; and &#x2018;not activated&#x2019; voxel detection. However, it may not be suitable in cases with weak effects and high noise. In addition, researchers who rely more on frequentist inference may use the <italic>&#x03B3;</italic>(<italic>Dice</italic><sub><italic>max</italic></sub>) threshold to replicate the results obtained previously with classical NHST and additionally search for &#x2018;not activated&#x2019; and &#x2018;low confidence&#x2019; voxels. Finally, the degree to which the posterior probability is contained within ROPEs of different widths could be specified, or the ROPE maps in which the thresholding sequence is inverted could be calculated. The ROPE maps can be shared in public repositories, such as NeuroVault, along with PPMs, and subsequently thresholded by any reasonable ES threshold.</p>
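<p>One possible reading of the inverted thresholding sequence is sketched below: instead of fixing the ES threshold and computing the posterior probability, the probability level is fixed and, for each voxel, the smallest ROPE half-width that contains this posterior mass is stored. The sketch assumes a Gaussian posterior at each voxel; the function names, the 0.95 level, and the bisection search are illustrative assumptions and do not describe the toolbox code.</p>
<preformat>
```python
from statistics import NormalDist


def min_rope_halfwidth(post_mean, post_sd, prob_threshold=0.95, tol=1e-6):
    """Smallest half-width g whose ROPE [-g, g] holds prob_threshold of the
    posterior mass, found by bisection (Gaussian posterior assumed)."""
    posterior = NormalDist(mu=post_mean, sigma=post_sd)

    def mass_inside(g):
        return posterior.cdf(g) - posterior.cdf(-g)

    # The upper bound lies at least 10 SDs beyond the mean on both sides,
    # so it contains essentially all of the posterior mass.
    lo, hi = 0.0, abs(post_mean) + 10.0 * post_sd
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mass_inside(mid) >= prob_threshold:
            hi = mid
        else:
            lo = mid
    return hi


def threshold_rope_map(rope_map, es_threshold):
    """Flag voxels whose stored half-width fits within the chosen ES threshold,
    i.e., voxels that would be labeled 'not activated' at that threshold."""
    return [es_threshold >= width for width in rope_map]


if __name__ == "__main__":
    # Hypothetical posterior (mean, SD) pairs for three voxels.
    voxels = [(0.00, 0.03), (0.05, 0.04), (0.30, 0.05)]
    rope_map = [min_rope_halfwidth(m, s) for m, s in voxels]
    print(rope_map)
    print(threshold_rope_map(rope_map, es_threshold=0.1))
    print(threshold_rope_map(rope_map, es_threshold=0.2))
```
</preformat>
<p>Because the stored map does not depend on a particular ES threshold, a reader of the shared map could apply any threshold afterward without access to the original posterior estimates.</p>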
<p>The ability to provide evidence for the null hypothesis may be especially beneficial for clinical neuroimaging. Possible issues that can be resolved using this approach are:</p>
<list list-type="simple">
<list-item>
<label>(1)</label>
<p>Suppose that, without any treatment, a neurodegenerative process decreases the brain activity in certain ROIs by more than <italic>&#x03B3;</italic> per year on average. To prove that a new treatment <italic>effectively protects against neurodegenerative processes</italic>, we can provide evidence that, within 1 year of treatment, brain activity was reduced by less than X%.</p>
</list-item>
<list-item>
<label>(2)</label>
<p>Assume that an effective treatment should change the brain activity in certain ROIs by at least X%. Then, we can prove that a new treatment is <italic>practically ineffective</italic> if the activity has changed by less than X%.</p>
</list-item>
<list-item>
<label>(3)</label>
<p>Consider two groups of subjects taking a new treatment and a placebo, respectively. Using BPI, we can provide evidence that the result of the new treatment <italic>does not differ from that of the placebo</italic>.</p>
</list-item>
<list-item>
<label>(4)</label>
<p>Consider two groups of subjects taking an old effective treatment and a new treatment. Using BPI, we can provide evidence that the new treatment is <italic>no worse than the old effective treatment</italic>.</p>
</list-item>
<list-item>
<label>(5)</label>
<p>Consider a new treatment for a disease that <italic>is not related to brain function</italic>. Using BPI, we can provide evidence that the new treatment <italic>does not have side effects</italic> on brain activity.</p>
</list-item>
</list>
</sec>
<sec id="S6" sec-type="conclusion">
<title>Conclusion</title>
<p>Herein, a discussion of the use of the Bayesian and frequentist approaches to assess the &#x2018;null effects&#x2019; in fMRI studies was presented. We demonstrated that group-level Bayesian inference may be more intuitive and convenient in practice than frequentist inference. Crucially, Bayesian inference can detect &#x2018;activated/deactivated,&#x2019; &#x2018;not activated,&#x2019; and &#x2018;low confidence&#x2019; voxels using a single decision rule. Moreover, it allows for interim analysis and optional stopping when the obtained sample size is sufficient to make a confident inference. We considered the problem of defining a threshold for the effect size and provided a reference set of typical effect sizes in different fMRI designs. Bayesian inference and assessment of the &#x2018;null effects&#x2019; may be especially beneficial for basic and applied clinical neuroimaging. The developed SPM12-based toolbox with a simple GUI is expected to be useful for the assessment of &#x2018;null effects&#x2019; using BPI.</p>
</sec>
<sec id="S7">
<title>Limitations and Future Work</title>
<p>Firstly, we did not consider BMI, which is currently mainly used for the analysis of effective connectivity. A promising area of future research would be to compare the advantages of BMI and BPI when analyzing local brain activity. Secondly, the &#x2018;global shrinkage&#x2019; prior must be compared with other possible priors, in particular with priors that take into account the spatial dependency between voxels. Thirdly, we used Bayesian statistics only at the group level. Future studies could consider the advantages of using the Bayesian approach at both the subject and group levels.</p>
</sec>
<sec id="S8" sec-type="data-availability">
<title>Data Availability Statement</title>
<p>The datasets analyzed for this study can be found in the Human Connectome Project (<ext-link ext-link-type="uri" xlink:href="https://www.humanconnectome.org/study/hcp-young-adult/document/1200-subjects-data-release">https://www.humanconnectome.org/study/hcp-young-adult/document/1200-subjects-data-release</ext-link>) and the UCLA Consortium for Neuropsychiatric Phenomics study (<ext-link ext-link-type="uri" xlink:href="https://openneuro.org/datasets/ds000030/versions/1.0.0">https://openneuro.org/datasets/ds000030/versions/1.0.0</ext-link>). Bayesian parameter inference was performed using the SPM12-based toolbox available at <ext-link ext-link-type="uri" xlink:href="https://github.com/Masharipov/Bayesian_inference">https://github.com/Masharipov/Bayesian_inference</ext-link>.</p>
</sec>
<sec id="S9">
<title>Author Contributions</title>
<p>RM, AK, and MK contributed to conceptualization of the research. MK supervised the project. RM, IK, and YN contributed to statistical analysis and programming. RM and IK performed simulations. MD, DC, and MK acquired funding. All authors contributed to the text of this article and approved the submitted version.</p>
</sec>
<sec id="conf1" sec-type="COI-statement">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec id="pudiscl1" sec-type="disclaimer">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
</body>
<back>
<sec id="S10" sec-type="funding-information">
<title>Funding</title>
<p>RM, AK, and MK were supported by the Russian Science Foundation Grant #19-18-00454. IK, YN, MD, and DC were supported by the state assignment of the Ministry of Education and Science of the Russian Federation (theme number AAAA-A19-119101890066-2). Data were provided [in part] by the Human Connectome Project, WU-Minn Consortium (Principal Investigators: David Van Essen and Kamil Ugurbil; 1U54MH091657) funded by the 16 NIH Institutes and Centers that support the NIH Blueprint for Neuroscience Research; and by the McDonnell Center for Systems Neuroscience at Washington University. The remaining data came from the UCLA dataset, which was obtained from the OpenfMRI database (accession number <ext-link ext-link-type="DDBJ/EMBL/GenBank" xlink:href="ds000030">ds000030</ext-link>); its data collection was funded by the Consortium for Neuropsychiatric Phenomics (NIH Roadmap for Medical Research grants UL1-DE019580, RL1MH083268, RL1MH083269, RL1DA024853, RL1MH083270, RL1LM009833, PL1MH083271, and PL1NS062410).</p>
</sec>
<ack>
<p>We thank Andrey Ogai for his valuable help with the code for visualizing statistical maps.</p>
</ack>
<sec id="S12" sec-type="supplementary-material">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fninf.2021.738342/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fninf.2021.738342/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Data_Sheet_1.docx" id="DS1" mimetype="application/vnd.openxmlformats-officedocument.wordprocessingml.document" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Acar</surname> <given-names>F.</given-names></name> <name><surname>Seurinck</surname> <given-names>R.</given-names></name> <name><surname>Eickhoff</surname> <given-names>S. B.</given-names></name> <name><surname>Moerkerke</surname> <given-names>B.</given-names></name></person-group> (<year>2018</year>). <article-title>Assessing robustness against potential publication bias in activation likelihood estimation (ALE) meta-analyses for fMRI.</article-title> <source><italic>PLoS One</italic></source> <volume>13</volume>:<issue>e0208177</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0208177</pub-id> <pub-id pub-id-type="pmid">30500854</pub-id></citation></ref>
<ref id="B2"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Aczel</surname> <given-names>B.</given-names></name> <name><surname>Palfi</surname> <given-names>B.</given-names></name> <name><surname>Szollosi</surname> <given-names>A.</given-names></name> <name><surname>Kovacs</surname> <given-names>M.</given-names></name> <name><surname>Szaszi</surname> <given-names>B.</given-names></name> <name><surname>Szecsi</surname> <given-names>P.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>Quantifying support for the null hypothesis in psychology: an empirical investigation.</article-title> <source><italic>Adv. Methods Pract. Psychol. Sci.</italic></source> <volume>1</volume> <fpage>357</fpage>&#x2013;<lpage>366</lpage>. <pub-id pub-id-type="doi">10.1177/2515245918773742</pub-id></citation></ref>
<ref id="B3"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Alberton</surname> <given-names>B. A.</given-names></name> <name><surname>Nichols</surname> <given-names>T. E.</given-names></name> <name><surname>Gamba</surname> <given-names>H. R.</given-names></name> <name><surname>Winkler</surname> <given-names>A. M.</given-names></name></person-group> (<year>2020</year>). <article-title>Multiple testing correction over contrasts for brain imaging.</article-title> <source><italic>Neuroimage</italic></source> <volume>216</volume>:<issue>116760</issue>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2020.116760</pub-id> <pub-id pub-id-type="pmid">32201328</pub-id></citation></ref>
<ref id="B4"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Altman</surname> <given-names>D. G.</given-names></name> <name><surname>Bland</surname> <given-names>J. M.</given-names></name></person-group> (<year>1995</year>). <article-title>Statistics notes: absence of evidence is not evidence of absence.</article-title> <source><italic>BMJ</italic></source> <volume>311</volume>:<issue>485</issue>. <pub-id pub-id-type="doi">10.1136/bmj.311.7003.485</pub-id> <pub-id pub-id-type="pmid">7647644</pub-id></citation></ref>
<ref id="B5"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Amrhein</surname> <given-names>V.</given-names></name> <name><surname>Korner-Nievergelt</surname> <given-names>F.</given-names></name> <name><surname>Roth</surname> <given-names>T.</given-names></name></person-group> (<year>2017</year>). <article-title>The earth is flat (p &#x003E; 0.05): significance thresholds and the crisis of unreplicable research.</article-title> <source><italic>PeerJ</italic></source> <volume>5</volume>:<issue>e3544</issue>. <pub-id pub-id-type="doi">10.7717/peerj.3544</pub-id> <pub-id pub-id-type="pmid">28698825</pub-id></citation></ref>
<ref id="B6"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Baguley</surname> <given-names>T.</given-names></name></person-group> (<year>2009</year>). <article-title>Standardized or simple effect size: what should be reported?</article-title> <source><italic>Br. J. Psychol.</italic></source> <volume>100</volume> <fpage>603</fpage>&#x2013;<lpage>617</lpage>. <pub-id pub-id-type="doi">10.1348/000712608x377117</pub-id> <pub-id pub-id-type="pmid">19017432</pub-id></citation></ref>
<ref id="B7"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barch</surname> <given-names>D. M.</given-names></name> <name><surname>Burgess</surname> <given-names>G. C.</given-names></name> <name><surname>Harms</surname> <given-names>M. P.</given-names></name> <name><surname>Petersen</surname> <given-names>S. E.</given-names></name> <name><surname>Schlaggar</surname> <given-names>B. L.</given-names></name> <name><surname>Corbetta</surname> <given-names>M.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>Function in the human connectome: task-fMRI and individual differences in behavior.</article-title> <source><italic>Neuroimage</italic></source> <volume>80</volume> <fpage>169</fpage>&#x2013;<lpage>189</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2013.05.033</pub-id> <pub-id pub-id-type="pmid">23684877</pub-id></citation></ref>
<ref id="B8"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Belia</surname> <given-names>S.</given-names></name> <name><surname>Fidler</surname> <given-names>F.</given-names></name> <name><surname>Williams</surname> <given-names>J.</given-names></name> <name><surname>Cumming</surname> <given-names>G.</given-names></name></person-group> (<year>2005</year>). <article-title>Researchers misunderstand confidence intervals and standard error bars.</article-title> <source><italic>Psychol. Methods</italic></source> <volume>10</volume> <fpage>389</fpage>&#x2013;<lpage>396</lpage>. <pub-id pub-id-type="doi">10.1037/1082-989x.10.4.389</pub-id> <pub-id pub-id-type="pmid">16392994</pub-id></citation></ref>
<ref id="B9"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Berger</surname> <given-names>J. O.</given-names></name></person-group> (<year>2003</year>). <article-title>Could Fisher, Jeffreys and Neyman have agreed on testing?</article-title> <source><italic>Stat. Sci.</italic></source> <volume>18</volume> <fpage>1</fpage>&#x2013;<lpage>32</lpage>. <pub-id pub-id-type="doi">10.1214/ss/1056397485</pub-id></citation></ref>
<ref id="B10"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Berger</surname> <given-names>J. O.</given-names></name> <name><surname>Berry</surname> <given-names>D. A.</given-names></name></person-group> (<year>1988</year>). <article-title>Statistical analysis and the illusion of objectivity.</article-title> <source><italic>Am. Sci.</italic></source> <volume>76</volume> <fpage>159</fpage>&#x2013;<lpage>165</lpage>.</citation></ref>
<ref id="B11"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Berger</surname> <given-names>J. O.</given-names></name> <name><surname>Sellke</surname> <given-names>T.</given-names></name></person-group> (<year>1987</year>). <article-title>Testing a point null hypothesis: the irreconcilability of p values and evidence: rejoinder.</article-title> <source><italic>J. Am. Stat. Assoc.</italic></source> <volume>82</volume>:<issue>135</issue>. <pub-id pub-id-type="doi">10.2307/2289139</pub-id></citation></ref>
<ref id="B12"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Berry</surname> <given-names>D.</given-names></name></person-group> (<year>1988</year>). &#x201C;<article-title>Multiple comparisons, multiple tests, and data dredging: a Bayesian perspective</article-title>,&#x201D; in <source><italic>Bayesian Statistics</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Bernardo</surname> <given-names>J.</given-names></name> <name><surname>DeGroot</surname> <given-names>M.</given-names></name> <name><surname>Lindley</surname> <given-names>D.</given-names></name> <name><surname>Smith</surname> <given-names>A.</given-names></name></person-group> (<publisher-loc>Oxford</publisher-loc>: <publisher-name>Oxford University Press</publisher-name>), <fpage>79</fpage>&#x2013;<lpage>94</lpage>.</citation></ref>
<ref id="B13"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Berry</surname> <given-names>D. A.</given-names></name> <name><surname>Hochberg</surname> <given-names>Y.</given-names></name></person-group> (<year>1999</year>). <article-title>Bayesian perspectives on multiple comparisons.</article-title> <source><italic>J. Stat. Plan. Inference</italic></source> <volume>82</volume> <fpage>215</fpage>&#x2013;<lpage>227</lpage>. <pub-id pub-id-type="doi">10.1016/s0378-3758(99)00044-0</pub-id></citation></ref>
<ref id="B14"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Campbell</surname> <given-names>H.</given-names></name> <name><surname>Gustafson</surname> <given-names>P.</given-names></name></person-group> (<year>2018</year>). <article-title>Conditional equivalence testing: an alternative remedy for publication bias.</article-title> <source><italic>PLoS One</italic></source> <volume>13</volume>:<issue>e0195145</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0195145</pub-id> <pub-id pub-id-type="pmid">29652891</pub-id></citation></ref>
<ref id="B15"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>G.</given-names></name> <name><surname>Cox</surname> <given-names>R. W.</given-names></name> <name><surname>Glen</surname> <given-names>D. R.</given-names></name> <name><surname>Rajendra</surname> <given-names>J. K.</given-names></name> <name><surname>Reynolds</surname> <given-names>R. C.</given-names></name> <name><surname>Taylor</surname> <given-names>P. A.</given-names></name></person-group> (<year>2018</year>). <article-title>A tail of two sides: artificially doubled false positive rates in neuroimaging due to the sidedness choice with t-tests.</article-title> <source><italic>Hum. Brain Mapp.</italic></source> <volume>40</volume> <fpage>1037</fpage>&#x2013;<lpage>1043</lpage>. <pub-id pub-id-type="doi">10.1002/hbm.24399</pub-id> <pub-id pub-id-type="pmid">30265768</pub-id></citation></ref>
<ref id="B16"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>G.</given-names></name> <name><surname>Taylor</surname> <given-names>P. A.</given-names></name> <name><surname>Cox</surname> <given-names>R. W.</given-names></name></person-group> (<year>2017</year>). <article-title>Is the statistic value all we should care about in neuroimaging?</article-title> <source><italic>Neuroimage</italic></source> <volume>147</volume> <fpage>952</fpage>&#x2013;<lpage>959</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2016.09.066</pub-id> <pub-id pub-id-type="pmid">27729277</pub-id></citation></ref>
<ref id="B17"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>G.</given-names></name> <name><surname>Taylor</surname> <given-names>P. A.</given-names></name> <name><surname>Cox</surname> <given-names>R. W.</given-names></name> <name><surname>Pessoa</surname> <given-names>L.</given-names></name></person-group> (<year>2020</year>). <article-title>Fighting or embracing multiplicity in neuroimaging? Neighborhood leverage versus global calibration.</article-title> <source><italic>Neuroimage</italic></source> <volume>206</volume>:<issue>116320</issue>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2019.116320</pub-id> <pub-id pub-id-type="pmid">31698079</pub-id></citation></ref>
<ref id="B18"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Chen</surname> <given-names>G.</given-names></name> <name><surname>Xiao</surname> <given-names>Y.</given-names></name> <name><surname>Taylor</surname> <given-names>P. A.</given-names></name> <name><surname>Rajendra</surname> <given-names>J. K.</given-names></name> <name><surname>Riggins</surname> <given-names>T.</given-names></name> <name><surname>Geng</surname> <given-names>F.</given-names></name><etal/></person-group> (<year>2019</year>). <article-title>Handling multiplicity in neuroimaging through Bayesian lenses with multilevel modeling.</article-title> <source><italic>Neuroinformatics</italic></source> <volume>17</volume> <fpage>515</fpage>&#x2013;<lpage>545</lpage>. <pub-id pub-id-type="doi">10.1007/s12021-018-9409-6</pub-id> <pub-id pub-id-type="pmid">30649677</pub-id></citation></ref>
<ref id="B19"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cohen</surname> <given-names>J.</given-names></name></person-group> (<year>1965</year>). &#x201C;<article-title>Some statistical issues in psychological research</article-title>,&#x201D; in <source><italic>Handbook of Clinical Psychology</italic></source>, <role>ed.</role> <person-group person-group-type="editor"><name><surname>Wolman</surname> <given-names>B. B.</given-names></name></person-group> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>McGraw-Hill</publisher-name>), <fpage>95</fpage>&#x2013;<lpage>121</lpage>.</citation></ref>
<ref id="B20"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cohen</surname> <given-names>J.</given-names></name></person-group> (<year>1990</year>). <article-title>Things I have learned (so far).</article-title> <source><italic>Am. Psychol.</italic></source> <volume>45</volume> <fpage>1304</fpage>&#x2013;<lpage>1312</lpage>. <pub-id pub-id-type="doi">10.1037/0003-066x.45.12.1304</pub-id></citation></ref>
<ref id="B21"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cohen</surname> <given-names>J.</given-names></name></person-group> (<year>1994</year>). <article-title>The earth is round (p &#x003C; .05).</article-title> <source><italic>Am. Psychol.</italic></source> <volume>49</volume> <fpage>997</fpage>&#x2013;<lpage>1003</lpage>. <pub-id pub-id-type="doi">10.1037/0003-066x.49.12.997</pub-id></citation></ref>
<ref id="B22"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cornfield</surname> <given-names>J.</given-names></name></person-group> (<year>1966</year>). <article-title>Sequential trials, sequential analysis and the likelihood principle.</article-title> <source><italic>Am. Stat.</italic></source> <volume>20</volume> <fpage>18</fpage>&#x2013;<lpage>23</lpage>. <pub-id pub-id-type="doi">10.1080/00031305.1966.10479786</pub-id></citation></ref>
<ref id="B23"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cortina</surname> <given-names>J. M.</given-names></name> <name><surname>Dunlap</surname> <given-names>W. P.</given-names></name></person-group> (<year>1997</year>). <article-title>On the logic and purpose of significance testing.</article-title> <source><italic>Psychol. Methods</italic></source> <volume>2</volume> <fpage>161</fpage>&#x2013;<lpage>172</lpage>. <pub-id pub-id-type="doi">10.1037/1082-989x.2.2.161</pub-id></citation></ref>
<ref id="B24"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cramer</surname> <given-names>A. O. J.</given-names></name> <name><surname>van Ravenzwaaij</surname> <given-names>D.</given-names></name> <name><surname>Matzke</surname> <given-names>D.</given-names></name> <name><surname>Steingroever</surname> <given-names>H.</given-names></name> <name><surname>Wetzels</surname> <given-names>R.</given-names></name> <name><surname>Grasman</surname> <given-names>R. P. P. P.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Hidden multiplicity in exploratory multiway ANOVA: prevalence and remedies.</article-title> <source><italic>Psychon. Bull. Rev.</italic></source> <volume>23</volume> <fpage>640</fpage>&#x2013;<lpage>647</lpage>. <pub-id pub-id-type="doi">10.3758/s13423-015-0913-5</pub-id> <pub-id pub-id-type="pmid">26374437</pub-id></citation></ref>
<ref id="B25"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cremers</surname> <given-names>H. R.</given-names></name> <name><surname>Wager</surname> <given-names>T. D.</given-names></name> <name><surname>Yarkoni</surname> <given-names>T.</given-names></name></person-group> (<year>2017</year>). <article-title>The relation between statistical power and inference in fMRI.</article-title> <source><italic>PLoS One</italic></source> <volume>12</volume>:<issue>e0184923</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0184923</pub-id> <pub-id pub-id-type="pmid">29155843</pub-id></citation></ref>
<ref id="B26"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Cumming</surname> <given-names>G.</given-names></name></person-group> (<year>2013</year>). <article-title>The new statistics: why and how.</article-title> <source><italic>Psychol. Sci.</italic></source> <volume>25</volume> <fpage>7</fpage>&#x2013;<lpage>29</lpage>. <pub-id pub-id-type="doi">10.1177/0956797613504966</pub-id> <pub-id pub-id-type="pmid">24220629</pub-id></citation></ref>
<ref id="B27"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dandolo</surname> <given-names>L. C.</given-names></name> <name><surname>Schwabe</surname> <given-names>L.</given-names></name></person-group> (<year>2019</year>). <article-title>Time-dependent motor memory representations in prefrontal cortex.</article-title> <source><italic>Neuroimage</italic></source> <volume>197</volume> <fpage>143</fpage>&#x2013;<lpage>155</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2019.04.051</pub-id> <pub-id pub-id-type="pmid">31015028</pub-id></citation></ref>
<ref id="B28"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>David</surname> <given-names>S. P.</given-names></name> <name><surname>Naudet</surname> <given-names>F.</given-names></name> <name><surname>Laude</surname> <given-names>J.</given-names></name> <name><surname>Radua</surname> <given-names>J.</given-names></name> <name><surname>Fusar-Poli</surname> <given-names>P.</given-names></name> <name><surname>Chu</surname> <given-names>I.</given-names></name><etal/></person-group> (<year>2018</year>). <article-title>Potential reporting bias in neuroimaging studies of sex differences.</article-title> <source><italic>Sci. Rep.</italic></source> <volume>8</volume>:<issue>6082</issue>. <pub-id pub-id-type="doi">10.1038/s41598-018-23976-1</pub-id> <pub-id pub-id-type="pmid">29666377</pub-id></citation></ref>
<ref id="B29"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>de Winter</surname> <given-names>J. C.</given-names></name> <name><surname>Dodou</surname> <given-names>D.</given-names></name></person-group> (<year>2015</year>). <article-title>A surge of p-values between 0.041 and 0.049 in recent decades (but negative results are increasing rapidly too).</article-title> <source><italic>PeerJ</italic></source> <volume>3</volume>:<issue>e733</issue>. <pub-id pub-id-type="doi">10.7717/peerj.733</pub-id> <pub-id pub-id-type="pmid">25650272</pub-id></citation></ref>
<ref id="B30"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dienes</surname> <given-names>Z.</given-names></name></person-group> (<year>2014</year>). <article-title>Using Bayes to get the most out of non-significant results.</article-title> <source><italic>Front. Psychol.</italic></source> <volume>5</volume>:<issue>781</issue>. <pub-id pub-id-type="doi">10.3389/fpsyg.2014.00781</pub-id> <pub-id pub-id-type="pmid">25120503</pub-id></citation></ref>
<ref id="B31"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dienes</surname> <given-names>Z.</given-names></name> <name><surname>McLatchie</surname> <given-names>N.</given-names></name></person-group> (<year>2017</year>). <article-title>Four reasons to prefer Bayesian analyses over significance testing.</article-title> <source><italic>Psychon. Bull. Rev.</italic></source> <volume>25</volume> <fpage>207</fpage>&#x2013;<lpage>218</lpage>. <pub-id pub-id-type="doi">10.3758/s13423-017-1266-z</pub-id> <pub-id pub-id-type="pmid">28353065</pub-id></citation></ref>
<ref id="B32"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Edwards</surname> <given-names>W.</given-names></name> <name><surname>Lindman</surname> <given-names>H.</given-names></name> <name><surname>Savage</surname> <given-names>L. J.</given-names></name></person-group> (<year>1963</year>). <article-title>Bayesian statistical inference for psychological research.</article-title> <source><italic>Psychol. Rev.</italic></source> <volume>70</volume> <fpage>193</fpage>&#x2013;<lpage>242</lpage>. <pub-id pub-id-type="doi">10.1037/h0044139</pub-id></citation></ref>
<ref id="B33"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Eickhoff</surname> <given-names>S. B.</given-names></name> <name><surname>Grefkes</surname> <given-names>C.</given-names></name> <name><surname>Fink</surname> <given-names>G. R.</given-names></name> <name><surname>Zilles</surname> <given-names>K.</given-names></name></person-group> (<year>2008</year>). <article-title>Functional lateralization of face, hand, and trunk representation in anatomically defined human somatosensory areas.</article-title> <source><italic>Cereb. Cortex</italic></source> <volume>18</volume> <fpage>2820</fpage>&#x2013;<lpage>2830</lpage>. <pub-id pub-id-type="doi">10.1093/cercor/bhn039</pub-id> <pub-id pub-id-type="pmid">18372289</pub-id></citation></ref>
<ref id="B34"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Eklund</surname> <given-names>A.</given-names></name> <name><surname>Nichols</surname> <given-names>T. E.</given-names></name> <name><surname>Knutsson</surname> <given-names>H.</given-names></name></person-group> (<year>2016</year>). <article-title>Cluster failure: why fMRI inferences for spatial extent have inflated false-positive rates.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>113</volume> <fpage>7900</fpage>&#x2013;<lpage>7905</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1602413113</pub-id> <pub-id pub-id-type="pmid">27357684</pub-id></citation></ref>
<ref id="B35"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Falk</surname> <given-names>R.</given-names></name> <name><surname>Greenbaum</surname> <given-names>C. W.</given-names></name></person-group> (<year>1995</year>). <article-title>Significance tests die hard.</article-title> <source><italic>Theory Psychol.</italic></source> <volume>5</volume> <fpage>75</fpage>&#x2013;<lpage>98</lpage>. <pub-id pub-id-type="doi">10.1177/0959354395051004</pub-id></citation></ref>
<ref id="B36"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Feng</surname> <given-names>C.</given-names></name> <name><surname>Forthman</surname> <given-names>K. L.</given-names></name> <name><surname>Kuplicki</surname> <given-names>R.</given-names></name> <name><surname>Yeh</surname> <given-names>H. W.</given-names></name> <name><surname>Stewart</surname> <given-names>J. L.</given-names></name> <name><surname>Paulus</surname> <given-names>M. P.</given-names></name></person-group> (<year>2019</year>). <article-title>Neighborhood affluence is not associated with positive and negative valence processing in adults with mood and anxiety disorders: a Bayesian inference approach.</article-title> <source><italic>Neuroimage Clin.</italic></source> <volume>22</volume>:<issue>101738</issue>. <pub-id pub-id-type="doi">10.1016/j.nicl.2019.101738</pub-id> <pub-id pub-id-type="pmid">30870735</pub-id></citation></ref>
<ref id="B37"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fidler</surname> <given-names>F.</given-names></name> <name><surname>Burgman</surname> <given-names>M. A.</given-names></name> <name><surname>Cumming</surname> <given-names>G.</given-names></name> <name><surname>Buttrose</surname> <given-names>R.</given-names></name> <name><surname>Thomason</surname> <given-names>N.</given-names></name></person-group> (<year>2006</year>). <article-title>Impact of criticism of null-hypothesis significance testing on statistical reporting practices in conservation biology.</article-title> <source><italic>Conserv. Biol.</italic></source> <volume>20</volume> <fpage>1539</fpage>&#x2013;<lpage>1544</lpage>. <pub-id pub-id-type="doi">10.1111/j.1523-1739.2006.00525.x</pub-id> <pub-id pub-id-type="pmid">17002771</pub-id></citation></ref>
<ref id="B38"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Finch</surname> <given-names>S.</given-names></name> <name><surname>Cumming</surname> <given-names>G.</given-names></name> <name><surname>Thomason</surname> <given-names>N.</given-names></name></person-group> (<year>2001</year>). <article-title>Colloquium on effect sizes: the roles of editors, textbook authors, and the publication manual.</article-title> <source><italic>Educ. Psychol. Meas.</italic></source> <volume>61</volume> <fpage>181</fpage>&#x2013;<lpage>210</lpage>. <pub-id pub-id-type="doi">10.1177/0013164401612001</pub-id></citation></ref>
<ref id="B39"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Friston</surname> <given-names>K.</given-names></name></person-group> (<year>2012</year>). <article-title>Ten ironic rules for non-statistical reviewers.</article-title> <source><italic>Neuroimage</italic></source> <volume>61</volume> <fpage>1300</fpage>&#x2013;<lpage>1310</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2012.04.018</pub-id> <pub-id pub-id-type="pmid">22521475</pub-id></citation></ref>
<ref id="B40"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Friston</surname> <given-names>K.</given-names></name></person-group> (<year>2013</year>). <article-title>Sample size and the fallacies of classical inference.</article-title> <source><italic>Neuroimage</italic></source> <volume>81</volume> <fpage>503</fpage>&#x2013;<lpage>504</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2013.02.057</pub-id> <pub-id pub-id-type="pmid">23583356</pub-id></citation></ref>
<ref id="B41"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Friston</surname> <given-names>K.</given-names></name> <name><surname>Penny</surname> <given-names>W.</given-names></name> <name><surname>Phillips</surname> <given-names>C.</given-names></name> <name><surname>Kiebel</surname> <given-names>S.</given-names></name> <name><surname>Hinton</surname> <given-names>G.</given-names></name> <name><surname>Ashburner</surname> <given-names>J.</given-names></name></person-group> (<year>2002a</year>). <article-title>Classical and Bayesian inference in neuroimaging: theory.</article-title> <source><italic>Neuroimage</italic></source> <volume>16</volume> <fpage>465</fpage>&#x2013;<lpage>483</lpage>. <pub-id pub-id-type="doi">10.1006/nimg.2002.1090</pub-id> <pub-id pub-id-type="pmid">12030832</pub-id></citation></ref>
<ref id="B42"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Friston</surname> <given-names>K.</given-names></name> <name><surname>Glaser</surname> <given-names>D.</given-names></name> <name><surname>Henson</surname> <given-names>R.</given-names></name> <name><surname>Kiebel</surname> <given-names>S.</given-names></name> <name><surname>Phillips</surname> <given-names>C.</given-names></name> <name><surname>Ashburner</surname> <given-names>J.</given-names></name></person-group> (<year>2002b</year>). <article-title>Classical and Bayesian inference in neuroimaging: applications.</article-title> <source><italic>Neuroimage</italic></source> <volume>16</volume> <fpage>484</fpage>&#x2013;<lpage>512</lpage>. <pub-id pub-id-type="doi">10.1006/nimg.2002.1091</pub-id> <pub-id pub-id-type="pmid">12030833</pub-id></citation></ref>
<ref id="B43"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Friston</surname> <given-names>K.</given-names></name> <name><surname>Mattout</surname> <given-names>J.</given-names></name> <name><surname>Trujillo-Barreto</surname> <given-names>N.</given-names></name> <name><surname>Ashburner</surname> <given-names>J.</given-names></name> <name><surname>Penny</surname> <given-names>W.</given-names></name></person-group> (<year>2007</year>). <article-title>Variational free energy and the Laplace approximation.</article-title> <source><italic>Neuroimage</italic></source> <volume>34</volume> <fpage>220</fpage>&#x2013;<lpage>234</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2006.08.035</pub-id> <pub-id pub-id-type="pmid">17055746</pub-id></citation></ref>
<ref id="B44"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Friston</surname> <given-names>K.</given-names></name> <name><surname>Penny</surname> <given-names>W.</given-names></name></person-group> (<year>2003</year>). <article-title>Posterior probability maps and SPMs.</article-title> <source><italic>Neuroimage</italic></source> <volume>19</volume> <fpage>1240</fpage>&#x2013;<lpage>1249</lpage>. <pub-id pub-id-type="doi">10.1016/s1053-8119(03)00144-7</pub-id></citation></ref>
<ref id="B45"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Friston</surname> <given-names>K.</given-names></name> <name><surname>Penny</surname> <given-names>W.</given-names></name></person-group> (<year>2011</year>). <article-title>Post hoc Bayesian model selection.</article-title> <source><italic>Neuroimage</italic></source> <volume>56</volume> <fpage>2089</fpage>&#x2013;<lpage>2099</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2011.03.062</pub-id> <pub-id pub-id-type="pmid">21459150</pub-id></citation></ref>
<ref id="B46"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Friston</surname> <given-names>K. J.</given-names></name> <name><surname>Holmes</surname> <given-names>A. P.</given-names></name> <name><surname>Worsley</surname> <given-names>K. J.</given-names></name> <name><surname>Poline</surname> <given-names>J. P.</given-names></name> <name><surname>Frith</surname> <given-names>C. D.</given-names></name> <name><surname>Frackowiak</surname> <given-names>R. S. J.</given-names></name></person-group> (<year>1994</year>). <article-title>Statistical parametric maps in functional imaging: a general linear approach.</article-title> <source><italic>Hum. Brain Mapp.</italic></source> <volume>2</volume> <fpage>189</fpage>&#x2013;<lpage>210</lpage>. <pub-id pub-id-type="doi">10.1002/hbm.460020402</pub-id></citation></ref>
<ref id="B47"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Friston</surname> <given-names>K. J.</given-names></name> <name><surname>Williams</surname> <given-names>S.</given-names></name> <name><surname>Howard</surname> <given-names>R.</given-names></name> <name><surname>Frackowiak</surname> <given-names>R. S. J.</given-names></name> <name><surname>Turner</surname> <given-names>R.</given-names></name></person-group> (<year>1996</year>). <article-title>Movement-related effects in fMRI time-series.</article-title> <source><italic>Magn. Reson. Med.</italic></source> <volume>35</volume> <fpage>346</fpage>&#x2013;<lpage>355</lpage>. <pub-id pub-id-type="doi">10.1002/mrm.1910350312</pub-id> <pub-id pub-id-type="pmid">8699946</pub-id></citation></ref>
<ref id="B48"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gelman</surname> <given-names>A.</given-names></name> <name><surname>Hill</surname> <given-names>J.</given-names></name> <name><surname>Yajima</surname> <given-names>M.</given-names></name></person-group> (<year>2012</year>). <article-title>Why we (usually) don&#x2019;t have to worry about multiple comparisons.</article-title> <source><italic>J. Res. Educ. Eff.</italic></source> <volume>5</volume> <fpage>189</fpage>&#x2013;<lpage>211</lpage>. <pub-id pub-id-type="doi">10.1080/19345747.2011.618213</pub-id></citation></ref>
<ref id="B49"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Genovese</surname> <given-names>C. R.</given-names></name></person-group> (<year>2000</year>). <article-title>A Bayesian time-course model for functional magnetic resonance imaging data: rejoinder.</article-title> <source><italic>J. Am. Stat. Assoc.</italic></source> <volume>95</volume>:<issue>716</issue>. <pub-id pub-id-type="doi">10.2307/2669451</pub-id></citation></ref>
<ref id="B50"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Genovese</surname> <given-names>C. R.</given-names></name> <name><surname>Lazar</surname> <given-names>N. A.</given-names></name> <name><surname>Nichols</surname> <given-names>T.</given-names></name></person-group> (<year>2002</year>). <article-title>Thresholding of statistical maps in functional neuroimaging using the false discovery rate.</article-title> <source><italic>Neuroimage</italic></source> <volume>15</volume> <fpage>870</fpage>&#x2013;<lpage>878</lpage>. <pub-id pub-id-type="doi">10.1006/nimg.2001.1037</pub-id> <pub-id pub-id-type="pmid">11906227</pub-id></citation></ref>
<ref id="B51"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gigerenzer</surname> <given-names>G.</given-names></name></person-group> (<year>1993</year>). &#x201C;<article-title>The superego, the ego, and the id in statistical reasoning</article-title>,&#x201D; in <source><italic>A Handbook for Data Analysis in the Behavioral Sciences: Methodological Issues</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Keren</surname> <given-names>G.</given-names></name> <name><surname>Lewis</surname> <given-names>C.</given-names></name></person-group> (<publisher-loc>Mahwah, NJ</publisher-loc>: <publisher-name>Lawrence Erlbaum Associates, Inc.</publisher-name>), <fpage>311</fpage>&#x2013;<lpage>339</lpage>.</citation></ref>
<ref id="B52"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Glasser</surname> <given-names>M. F.</given-names></name> <name><surname>Sotiropoulos</surname> <given-names>S. N.</given-names></name> <name><surname>Wilson</surname> <given-names>J. A.</given-names></name> <name><surname>Coalson</surname> <given-names>T. S.</given-names></name> <name><surname>Fischl</surname> <given-names>B.</given-names></name> <name><surname>Andersson</surname> <given-names>J. L.</given-names></name><etal/></person-group> (<year>2013</year>). <article-title>The minimal preprocessing pipelines for the Human Connectome Project.</article-title> <source><italic>Neuroimage</italic></source> <volume>80</volume> <fpage>105</fpage>&#x2013;<lpage>124</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2013.04.127</pub-id> <pub-id pub-id-type="pmid">23668970</pub-id></citation></ref>
<ref id="B53"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gonzalez-Castillo</surname> <given-names>J.</given-names></name> <name><surname>Saad</surname> <given-names>Z. S.</given-names></name> <name><surname>Handwerker</surname> <given-names>D. A.</given-names></name> <name><surname>Inati</surname> <given-names>S. J.</given-names></name> <name><surname>Brenowitz</surname> <given-names>N.</given-names></name> <name><surname>Bandettini</surname> <given-names>P. A.</given-names></name></person-group> (<year>2012</year>). <article-title>Whole-brain, time-locked activation with simple tasks revealed using massive averaging and model-free analysis.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>109</volume> <fpage>5487</fpage>&#x2013;<lpage>5492</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1121049109</pub-id> <pub-id pub-id-type="pmid">22431587</pub-id></citation></ref>
<ref id="B54"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Goodman</surname> <given-names>S.</given-names></name></person-group> (<year>2008</year>). <article-title>A dirty dozen: twelve P-value misconceptions.</article-title> <source><italic>Semin. Hematol.</italic></source> <volume>45</volume> <fpage>135</fpage>&#x2013;<lpage>140</lpage>. <pub-id pub-id-type="doi">10.1053/j.seminhematol.2008.04.003</pub-id> <pub-id pub-id-type="pmid">18582619</pub-id></citation></ref>
<ref id="B55"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Goodman</surname> <given-names>S. N.</given-names></name></person-group> (<year>1993</year>). <article-title>p values, hypothesis tests, and likelihood: implications for epidemiology of a neglected historical debate.</article-title> <source><italic>Am. J. Epidemiol.</italic></source> <volume>137</volume> <fpage>485</fpage>&#x2013;<lpage>496</lpage>. <pub-id pub-id-type="doi">10.1093/oxfordjournals.aje.a116700</pub-id> <pub-id pub-id-type="pmid">8465801</pub-id></citation></ref>
<ref id="B56"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gopalan</surname> <given-names>R.</given-names></name> <name><surname>Berry</surname> <given-names>D. A.</given-names></name></person-group> (<year>1998</year>). <article-title>Bayesian multiple comparisons using Dirichlet process priors.</article-title> <source><italic>J. Am. Stat. Assoc.</italic></source> <volume>93</volume> <fpage>1130</fpage>&#x2013;<lpage>1139</lpage>. <pub-id pub-id-type="doi">10.1080/01621459.1998.10473774</pub-id></citation></ref>
<ref id="B57"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gorgolewski</surname> <given-names>K. J.</given-names></name> <name><surname>Durnez</surname> <given-names>J.</given-names></name> <name><surname>Poldrack</surname> <given-names>R. A.</given-names></name></person-group> (<year>2017</year>). <article-title>Preprocessed consortium for neuropsychiatric phenomics dataset.</article-title> <source><italic>F1000Res.</italic></source> <volume>6</volume>:<issue>1262</issue>. <pub-id pub-id-type="doi">10.12688/f1000research.11964.2</pub-id> <pub-id pub-id-type="pmid">29152222</pub-id></citation></ref>
<ref id="B58"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Greenland</surname> <given-names>S.</given-names></name></person-group> (<year>2019</year>). <article-title>Valid P-values behave exactly as they should: some misleading criticisms of P-values and their resolution with S-values.</article-title> <source><italic>Am. Stat.</italic></source> <volume>73</volume>(<issue>Suppl. 1</issue>) <fpage>106</fpage>&#x2013;<lpage>114</lpage>. <pub-id pub-id-type="doi">10.1080/00031305.2018.1529625</pub-id></citation></ref>
<ref id="B59"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Greenland</surname> <given-names>S.</given-names></name> <name><surname>Senn</surname> <given-names>S. J.</given-names></name> <name><surname>Rothman</surname> <given-names>K. J.</given-names></name> <name><surname>Carlin</surname> <given-names>J. B.</given-names></name> <name><surname>Poole</surname> <given-names>C.</given-names></name> <name><surname>Goodman</surname> <given-names>S. N.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>Statistical tests, p values, confidence intervals, and power: a guide to misinterpretations.</article-title> <source><italic>Eur. J. Epidemiol.</italic></source> <volume>31</volume> <fpage>337</fpage>&#x2013;<lpage>350</lpage>. <pub-id pub-id-type="doi">10.1007/s10654-016-0149-3</pub-id> <pub-id pub-id-type="pmid">27209009</pub-id></citation></ref>
<ref id="B60"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Greenwald</surname> <given-names>A. G.</given-names></name></person-group> (<year>1975</year>). <article-title>Consequences of prejudice against the null hypothesis.</article-title> <source><italic>Psychol. Bull.</italic></source> <volume>82</volume> <fpage>1</fpage>&#x2013;<lpage>20</lpage>. <pub-id pub-id-type="doi">10.1037/h0076157</pub-id></citation></ref>
<ref id="B61"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gusnard</surname> <given-names>D. A.</given-names></name> <name><surname>Raichle</surname> <given-names>M. E.</given-names></name></person-group> (<year>2001</year>). <article-title>Searching for a baseline: functional imaging and the resting human brain.</article-title> <source><italic>Nat. Rev. Neurosci.</italic></source> <volume>2</volume> <fpage>685</fpage>&#x2013;<lpage>694</lpage>. <pub-id pub-id-type="doi">10.1038/35094500</pub-id> <pub-id pub-id-type="pmid">11584306</pub-id></citation></ref>
<ref id="B62"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hagen</surname> <given-names>R. L.</given-names></name></person-group> (<year>1997</year>). <article-title>In praise of the null hypothesis statistical test.</article-title> <source><italic>Am. Psychol.</italic></source> <volume>52</volume> <fpage>15</fpage>&#x2013;<lpage>24</lpage>. <pub-id pub-id-type="doi">10.1037/0003-066x.52.1.15</pub-id></citation></ref>
<ref id="B63"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hodges</surname> <given-names>J. L.</given-names></name> <name><surname>Lehmann</surname> <given-names>E. L.</given-names></name></person-group> (<year>1954</year>). <article-title>Testing the approximate validity of statistical hypotheses.</article-title> <source><italic>J. R. Stat. Soc. Ser. B Methodol.</italic></source> <volume>16</volume> <fpage>261</fpage>&#x2013;<lpage>268</lpage>. <pub-id pub-id-type="doi">10.1111/j.2517-6161.1954.tb00169.x</pub-id></citation></ref>
<ref id="B64"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hoekstra</surname> <given-names>R.</given-names></name> <name><surname>Finch</surname> <given-names>S.</given-names></name> <name><surname>Kiers</surname> <given-names>H. A. L.</given-names></name> <name><surname>Johnson</surname> <given-names>A.</given-names></name></person-group> (<year>2006</year>). <article-title>Probability as certainty: dichotomous thinking and the misuse of p values.</article-title> <source><italic>Psychon. Bull. Rev.</italic></source> <volume>13</volume> <fpage>1033</fpage>&#x2013;<lpage>1037</lpage>. <pub-id pub-id-type="doi">10.3758/bf03213921</pub-id> <pub-id pub-id-type="pmid">17484431</pub-id></citation></ref>
<ref id="B65"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hoekstra</surname> <given-names>R.</given-names></name> <name><surname>Morey</surname> <given-names>R. D.</given-names></name> <name><surname>Rouder</surname> <given-names>J. N.</given-names></name> <name><surname>Wagenmakers</surname> <given-names>E. J.</given-names></name></person-group> (<year>2014</year>). <article-title>Robust misinterpretation of confidence intervals.</article-title> <source><italic>Psychon. Bull. Rev.</italic></source> <volume>21</volume> <fpage>1157</fpage>&#x2013;<lpage>1164</lpage>. <pub-id pub-id-type="doi">10.3758/s13423-013-0572-3</pub-id> <pub-id pub-id-type="pmid">24420726</pub-id></citation></ref>
<ref id="B66"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hubbard</surname> <given-names>R.</given-names></name> <name><surname>Bayarri</surname> <given-names>M. J.</given-names></name></person-group> (<year>2003</year>). <article-title>Confusion over measures of evidence (p&#x2019;s) versus errors (&#x03B1;&#x2019;s) in classical statistical testing.</article-title> <source><italic>Am. Stat.</italic></source> <volume>57</volume> <fpage>171</fpage>&#x2013;<lpage>178</lpage>. <pub-id pub-id-type="doi">10.1198/0003130031856</pub-id> <pub-id pub-id-type="pmid">12611515</pub-id></citation></ref>
<ref id="B67"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hubbard</surname> <given-names>R.</given-names></name> <name><surname>Lindsay</surname> <given-names>R. M.</given-names></name></person-group> (<year>2008</year>). <article-title>Why p values are not a useful measure of evidence in statistical significance testing.</article-title> <source><italic>Theory Psychol.</italic></source> <volume>18</volume> <fpage>69</fpage>&#x2013;<lpage>88</lpage>. <pub-id pub-id-type="doi">10.1177/0959354307086923</pub-id></citation></ref>
<ref id="B68"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ioannidis</surname> <given-names>J. P.</given-names></name> <name><surname>Munaf&#x00F2;</surname> <given-names>M. R.</given-names></name> <name><surname>Fusar-Poli</surname> <given-names>P.</given-names></name> <name><surname>Nosek</surname> <given-names>B. A.</given-names></name> <name><surname>David</surname> <given-names>S. P.</given-names></name></person-group> (<year>2014</year>). <article-title>Publication and other reporting biases in cognitive sciences: detection, prevalence, and prevention.</article-title> <source><italic>Trends Cogn. Sci.</italic></source> <volume>18</volume> <fpage>235</fpage>&#x2013;<lpage>241</lpage>. <pub-id pub-id-type="doi">10.1016/j.tics.2014.02.010</pub-id> <pub-id pub-id-type="pmid">24656991</pub-id></citation></ref>
<ref id="B69"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ioannidis</surname> <given-names>J. P. A.</given-names></name></person-group> (<year>2019</year>). <article-title>What have we (not) learnt from millions of scientific papers with p values?</article-title> <source><italic>Am. Stat.</italic></source> <volume>73</volume>(<issue>Suppl. 1</issue>) <fpage>20</fpage>&#x2013;<lpage>25</lpage>. <pub-id pub-id-type="doi">10.1080/00031305.2018.1447512</pub-id></citation></ref>
<ref id="B70"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jeffreys</surname> <given-names>H.</given-names></name></person-group> (<year>1939/1948</year>). <source><italic>Theory of Probability</italic></source>, <edition>2nd Edn</edition>. <publisher-loc>Oxford</publisher-loc>: <publisher-name>The Clarendon Press</publisher-name>.</citation></ref>
<ref id="B71"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Jennings</surname> <given-names>R. G.</given-names></name> <name><surname>Van Horn</surname> <given-names>J. D.</given-names></name></person-group> (<year>2012</year>). <article-title>Publication bias in neuroimaging research: implications for meta-analyses.</article-title> <source><italic>Neuroinformatics</italic></source> <volume>10</volume> <fpage>67</fpage>&#x2013;<lpage>80</lpage>. <pub-id pub-id-type="doi">10.1007/s12021-011-9125-y</pub-id> <pub-id pub-id-type="pmid">21643733</pub-id></citation></ref>
<ref id="B72"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Johansson</surname> <given-names>T.</given-names></name></person-group> (<year>2011</year>). <article-title>Hail the impossible: p-values, evidence, and likelihood.</article-title> <source><italic>Scand. J. Psychol.</italic></source> <volume>52</volume> <fpage>113</fpage>&#x2013;<lpage>125</lpage>. <pub-id pub-id-type="doi">10.1111/j.1467-9450.2010.00852.x</pub-id> <pub-id pub-id-type="pmid">21077903</pub-id></citation></ref>
<ref id="B73"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Johnson</surname> <given-names>N. L.</given-names></name> <name><surname>Kotz</surname> <given-names>S.</given-names></name> <name><surname>Balakrishnan</surname> <given-names>N.</given-names></name></person-group> (<year>1994</year>). <source><italic>Continuous Univariate Distributions</italic></source>, <volume>Vol. 6</volume>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>John Wiley and Sons</publisher-name>, <fpage>1</fpage>&#x2013;<lpage>119</lpage>.</citation></ref>
<ref id="B74"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Joyce</surname> <given-names>K. E.</given-names></name> <name><surname>Hayasaka</surname> <given-names>S.</given-names></name></person-group> (<year>2012</year>). <article-title>Development of PowerMap: a software package for statistical power calculation in neuroimaging studies.</article-title> <source><italic>Neuroinformatics</italic></source> <volume>10</volume> <fpage>351</fpage>&#x2013;<lpage>365</lpage>. <pub-id pub-id-type="doi">10.1007/s12021-012-9152-3</pub-id> <pub-id pub-id-type="pmid">22644868</pub-id></citation></ref>
<ref id="B75"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kass</surname> <given-names>R. E.</given-names></name> <name><surname>Raftery</surname> <given-names>A. E.</given-names></name></person-group> (<year>1995</year>). <article-title>Bayes factors.</article-title> <source><italic>J. Am. Stat. Assoc.</italic></source> <volume>90</volume> <fpage>773</fpage>&#x2013;<lpage>795</lpage>. <pub-id pub-id-type="doi">10.1080/01621459.1995.10476572</pub-id></citation></ref>
<ref id="B76"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kirk</surname> <given-names>R. E.</given-names></name></person-group> (<year>1996</year>). <article-title>Practical significance: a concept whose time has come.</article-title> <source><italic>Educ. Psychol. Meas.</italic></source> <volume>56</volume> <fpage>746</fpage>&#x2013;<lpage>759</lpage>. <pub-id pub-id-type="doi">10.1177/0013164496056005002</pub-id></citation></ref>
<ref id="B77"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Knief</surname> <given-names>U.</given-names></name> <name><surname>Forstmeier</surname> <given-names>W.</given-names></name></person-group> (<year>2021</year>). <article-title>Violating the normality assumption may be the lesser of two evils.</article-title> <source><italic>Behav. Res. Methods</italic></source> <fpage>1</fpage>&#x2013;<lpage>15</lpage>. <pub-id pub-id-type="doi">10.3758/s13428-021-01587-5</pub-id> <pub-id pub-id-type="pmid">33963496</pub-id></citation></ref>
<ref id="B78"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kruschke</surname> <given-names>J. K.</given-names></name></person-group> (<year>2010</year>). <article-title>What to believe: Bayesian methods for data analysis.</article-title> <source><italic>Trends Cogn. Sci.</italic></source> <volume>14</volume> <fpage>293</fpage>&#x2013;<lpage>300</lpage>. <pub-id pub-id-type="doi">10.1016/j.tics.2010.05.001</pub-id> <pub-id pub-id-type="pmid">20542462</pub-id></citation></ref>
<ref id="B79"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kruschke</surname> <given-names>J. K.</given-names></name></person-group> (<year>2011</year>). <article-title>Bayesian assessment of null values via parameter estimation and model comparison.</article-title> <source><italic>Perspect. Psychol. Sci.</italic></source> <volume>6</volume> <fpage>299</fpage>&#x2013;<lpage>312</lpage>. <pub-id pub-id-type="doi">10.1177/1745691611406925</pub-id> <pub-id pub-id-type="pmid">26168520</pub-id></citation></ref>
<ref id="B80"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kruschke</surname> <given-names>J. K.</given-names></name> <name><surname>Liddell</surname> <given-names>T. M.</given-names></name></person-group> (<year>2017b</year>). <article-title>The Bayesian new statistics: hypothesis testing, estimation, meta-analysis, and power analysis from a Bayesian perspective.</article-title> <source><italic>Psychon. Bull. Rev.</italic></source> <volume>25</volume> <fpage>178</fpage>&#x2013;<lpage>206</lpage>. <pub-id pub-id-type="doi">10.3758/s13423-016-1221-4</pub-id> <pub-id pub-id-type="pmid">28176294</pub-id></citation></ref>
<ref id="B81"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kruschke</surname> <given-names>J. K.</given-names></name> <name><surname>Liddell</surname> <given-names>T. M.</given-names></name></person-group> (<year>2017a</year>). <article-title>Bayesian data analysis for newcomers.</article-title> <source><italic>Psychon. Bull. Rev.</italic></source> <volume>25</volume> <fpage>155</fpage>&#x2013;<lpage>177</lpage>. <pub-id pub-id-type="doi">10.3758/s13423-017-1272-1</pub-id> <pub-id pub-id-type="pmid">28405907</pub-id></citation></ref>
<ref id="B82"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lakens</surname> <given-names>D.</given-names></name></person-group> (<year>2017</year>). <article-title>Equivalence tests.</article-title> <source><italic>Soc. Psychol. Pers. Sci.</italic></source> <volume>8</volume> <fpage>355</fpage>&#x2013;<lpage>362</lpage>. <pub-id pub-id-type="doi">10.1177/1948550617697177</pub-id> <pub-id pub-id-type="pmid">28736600</pub-id></citation></ref>
<ref id="B83"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lakens</surname> <given-names>D.</given-names></name> <name><surname>McLatchie</surname> <given-names>N.</given-names></name> <name><surname>Isager</surname> <given-names>P. M.</given-names></name> <name><surname>Scheel</surname> <given-names>A. M.</given-names></name> <name><surname>Dienes</surname> <given-names>Z.</given-names></name></person-group> (<year>2018</year>). <article-title>Improving inferences about null effects with Bayes factors and equivalence tests.</article-title> <source><italic>J. Gerontol. Ser. B</italic></source> <volume>75</volume> <fpage>45</fpage>&#x2013;<lpage>57</lpage>. <pub-id pub-id-type="doi">10.1093/geronb/gby065</pub-id> <pub-id pub-id-type="pmid">29878211</pub-id></citation></ref>
<ref id="B84"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Liao</surname> <given-names>J. G.</given-names></name> <name><surname>Midya</surname> <given-names>V.</given-names></name> <name><surname>Berg</surname> <given-names>A.</given-names></name></person-group> (<year>2019</year>). <article-title>Connecting Bayes factor and the region of practical equivalence (ROPE) procedure for testing interval null hypothesis.</article-title> <source><italic>arXiv</italic></source> <comment>[Preprint] arXiv:1903.03153</comment>.</citation></ref>
<ref id="B85"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lindley</surname> <given-names>D.</given-names></name></person-group> (<year>1965</year>). <source><italic>Introduction to Probability and Statistics from a Bayesian Viewpoint</italic></source>, <edition>1st Edn</edition>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation></ref>
<ref id="B86"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lindley</surname> <given-names>D. V.</given-names></name></person-group> (<year>1957</year>). <article-title>A statistical paradox.</article-title> <source><italic>Biometrika</italic></source> <volume>44</volume> <fpage>187</fpage>&#x2013;<lpage>192</lpage>. <pub-id pub-id-type="doi">10.2307/2333251</pub-id></citation></ref>
<ref id="B87"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lindley</surname> <given-names>D. V.</given-names></name></person-group> (<year>1975</year>). <article-title>The future of statistics: a Bayesian 21st century.</article-title> <source><italic>Adv. Appl. Probab.</italic></source> <volume>7</volume> <fpage>106</fpage>&#x2013;<lpage>115</lpage>. <pub-id pub-id-type="doi">10.2307/1426315</pub-id></citation></ref>
<ref id="B88"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lindley</surname> <given-names>D. V.</given-names></name></person-group> (<year>1990</year>). <article-title>The 1988 Wald memorial lectures: the present position in Bayesian statistics.</article-title> <source><italic>Stat. Sci.</italic></source> <volume>5</volume> <fpage>44</fpage>&#x2013;<lpage>65</lpage>. <pub-id pub-id-type="doi">10.1214/ss/1177012253</pub-id></citation></ref>
<ref id="B89"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Magerkurth</surname> <given-names>J.</given-names></name> <name><surname>Mancini</surname> <given-names>L.</given-names></name> <name><surname>Penny</surname> <given-names>W.</given-names></name> <name><surname>Flandin</surname> <given-names>G.</given-names></name> <name><surname>Ashburner</surname> <given-names>J.</given-names></name> <name><surname>Micallef</surname> <given-names>C.</given-names></name><etal/></person-group> (<year>2015</year>). <article-title>Objective Bayesian fMRI analysis&#x2013;a pilot study in different clinical environments.</article-title> <source><italic>Front. Neurosci.</italic></source> <volume>9</volume>:<issue>168</issue>. <pub-id pub-id-type="doi">10.3389/fnins.2015.00168</pub-id> <pub-id pub-id-type="pmid">26029041</pub-id></citation></ref>
<ref id="B90"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meehl</surname> <given-names>P. E.</given-names></name></person-group> (<year>1967</year>). <article-title>Theory-testing in psychology and physics: a methodological paradox.</article-title> <source><italic>Philos. Sci.</italic></source> <volume>34</volume> <fpage>103</fpage>&#x2013;<lpage>115</lpage>. <pub-id pub-id-type="doi">10.1086/288135</pub-id></citation></ref>
<ref id="B91"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meehl</surname> <given-names>P. E.</given-names></name></person-group> (<year>1978</year>). <article-title>Theoretical risks and tabular asterisks: Sir Karl, Sir Ronald, and the slow progress of soft psychology.</article-title> <source><italic>J. Consult. Clin. Psychol.</italic></source> <volume>46</volume> <fpage>806</fpage>&#x2013;<lpage>834</lpage>. <pub-id pub-id-type="doi">10.1037/0022-006X.46.4.806</pub-id></citation></ref>
<ref id="B92"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Meyners</surname> <given-names>M.</given-names></name></person-group> (<year>2012</year>). <article-title>Equivalence tests &#x2013; a review.</article-title> <source><italic>Food Qual. Prefer.</italic></source> <volume>26</volume> <fpage>231</fpage>&#x2013;<lpage>245</lpage>. <pub-id pub-id-type="doi">10.1016/j.foodqual.2012.05.003</pub-id></citation></ref>
<ref id="B93"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Morey</surname> <given-names>R. D.</given-names></name> <name><surname>Hoekstra</surname> <given-names>R.</given-names></name> <name><surname>Rouder</surname> <given-names>J. N.</given-names></name> <name><surname>Wagenmakers</surname> <given-names>E. J.</given-names></name></person-group> (<year>2015</year>). <article-title>Continued misinterpretation of confidence intervals: response to Miller and Ulrich.</article-title> <source><italic>Psychon. Bull. Rev.</italic></source> <volume>23</volume> <fpage>131</fpage>&#x2013;<lpage>140</lpage>. <pub-id pub-id-type="doi">10.3758/s13423-015-0955-8</pub-id> <pub-id pub-id-type="pmid">26620955</pub-id></citation></ref>
<ref id="B94"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Morey</surname> <given-names>R. D.</given-names></name> <name><surname>Rouder</surname> <given-names>J. N.</given-names></name></person-group> (<year>2011</year>). <article-title>Bayes factor approaches for testing interval null hypotheses.</article-title> <source><italic>Psychol. Methods</italic></source> <volume>16</volume> <fpage>406</fpage>&#x2013;<lpage>419</lpage>. <pub-id pub-id-type="doi">10.1037/a0024377</pub-id> <pub-id pub-id-type="pmid">21787084</pub-id></citation></ref>
<ref id="B95"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Muller</surname> <given-names>P.</given-names></name> <name><surname>Parmigiani</surname> <given-names>G.</given-names></name> <name><surname>Rice</surname> <given-names>K.</given-names></name></person-group> (<year>2006</year>). &#x201C;<article-title>FDR and Bayesian multiple comparisons rules</article-title>,&#x201D; in <source><italic>Proceedings of the 8th Valencia International Meeting Bayesian Statistics 8</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Bernardo</surname> <given-names>J. M.</given-names></name> <name><surname>Bayarri</surname> <given-names>M. J.</given-names></name> <name><surname>Berger</surname> <given-names>J. O.</given-names></name> <name><surname>Dawid</surname> <given-names>A. P.</given-names></name> <name><surname>Heckerman</surname> <given-names>D.</given-names></name> <name><surname>Smith</surname> <given-names>A. F. M.</given-names></name><etal/></person-group> (<publisher-loc>Oxford</publisher-loc>: <publisher-name>Oxford University Press</publisher-name>), <fpage>366</fpage>&#x2013;<lpage>368</lpage>.</citation></ref>
<ref id="B96"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mumford</surname> <given-names>J. A.</given-names></name></person-group> (<year>2012</year>). <article-title>A power calculation guide for fMRI studies.</article-title> <source><italic>Soc. Cogn. Affect. Neurosci.</italic></source> <volume>7</volume> <fpage>738</fpage>&#x2013;<lpage>742</lpage>. <pub-id pub-id-type="doi">10.1093/scan/nss059</pub-id> <pub-id pub-id-type="pmid">22641837</pub-id></citation></ref>
<ref id="B97"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mumford</surname> <given-names>J. A.</given-names></name> <name><surname>Nichols</surname> <given-names>T. E.</given-names></name></person-group> (<year>2008</year>). <article-title>Power calculation for group fMRI studies accounting for arbitrary design and temporal autocorrelation.</article-title> <source><italic>Neuroimage</italic></source> <volume>39</volume> <fpage>261</fpage>&#x2013;<lpage>268</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2007.07.061</pub-id> <pub-id pub-id-type="pmid">17919925</pub-id></citation></ref>
<ref id="B98"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Murphy</surname> <given-names>K. R.</given-names></name> <name><surname>Myors</surname> <given-names>B.</given-names></name></person-group> (<year>1999</year>). <article-title>Testing the hypothesis that treatments have negligible effects: minimum-effect tests in the general linear model.</article-title> <source><italic>J. Appl. Psychol.</italic></source> <volume>84</volume> <fpage>234</fpage>&#x2013;<lpage>248</lpage>. <pub-id pub-id-type="doi">10.1037/0021-9010.84.2.234</pub-id></citation></ref>
<ref id="B99"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Murphy</surname> <given-names>K. R.</given-names></name> <name><surname>Myors</surname> <given-names>B.</given-names></name></person-group> (<year>2004</year>). <source><italic>Statistical Power Analysis: A Simple and General Model for Traditional and Modern Hypothesis Tests</italic></source>, <edition>2nd Edn</edition>. <publisher-loc>Mahwah, NJ</publisher-loc>: <publisher-name>Lawrence Erlbaum Associates</publisher-name>.</citation></ref>
<ref id="B100"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nichols</surname> <given-names>T.</given-names></name> <name><surname>Hayasaka</surname> <given-names>S.</given-names></name></person-group> (<year>2003</year>). <article-title>Controlling the familywise error rate in functional neuroimaging: a comparative review.</article-title> <source><italic>Stat. Methods Med. Res.</italic></source> <volume>12</volume> <fpage>419</fpage>&#x2013;<lpage>446</lpage>. <pub-id pub-id-type="doi">10.1191/0962280203sm341ra</pub-id> <pub-id pub-id-type="pmid">14599004</pub-id></citation></ref>
<ref id="B101"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nichols</surname> <given-names>T. E.</given-names></name></person-group> (<year>2012</year>). <article-title>Multiple testing corrections, nonparametric methods, and random field theory.</article-title> <source><italic>Neuroimage</italic></source> <volume>62</volume> <fpage>811</fpage>&#x2013;<lpage>815</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2012.04.014</pub-id> <pub-id pub-id-type="pmid">22521256</pub-id></citation></ref>
<ref id="B102"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nickerson</surname> <given-names>R. S.</given-names></name></person-group> (<year>2000</year>). <article-title>Null hypothesis significance testing: a review of an old and continuing controversy.</article-title> <source><italic>Psychol. Methods</italic></source> <volume>5</volume> <fpage>241</fpage>&#x2013;<lpage>301</lpage>. <pub-id pub-id-type="doi">10.1037/1082-989x.5.2.241</pub-id> <pub-id pub-id-type="pmid">10937333</pub-id></citation></ref>
<ref id="B103"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Penny</surname> <given-names>W.</given-names></name> <name><surname>Flandin</surname> <given-names>G.</given-names></name> <name><surname>Trujillo-Barreto</surname> <given-names>N.</given-names></name></person-group> (<year>2007</year>). <article-title>Bayesian comparison of spatially regularised general linear models.</article-title> <source><italic>Hum. Brain Mapp.</italic></source> <volume>28</volume> <fpage>275</fpage>&#x2013;<lpage>293</lpage>. <pub-id pub-id-type="doi">10.1002/hbm.20327</pub-id> <pub-id pub-id-type="pmid">17133400</pub-id></citation></ref>
<ref id="B104"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Penny</surname> <given-names>W.</given-names></name> <name><surname>Kiebel</surname> <given-names>S.</given-names></name> <name><surname>Friston</surname> <given-names>K.</given-names></name></person-group> (<year>2003</year>). <article-title>Variational Bayesian inference for fMRI time series.</article-title> <source><italic>Neuroimage</italic></source> <volume>19</volume> <fpage>727</fpage>&#x2013;<lpage>741</lpage>. <pub-id pub-id-type="doi">10.1016/s1053-8119(03)00071-5</pub-id></citation></ref>
<ref id="B105"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Penny</surname> <given-names>W. D.</given-names></name> <name><surname>Ridgway</surname> <given-names>G. R.</given-names></name></person-group> (<year>2013</year>). <article-title>Efficient posterior probability mapping using Savage-Dickey ratios.</article-title> <source><italic>PLoS One</italic></source> <volume>8</volume>:<issue>e59655</issue>. <pub-id pub-id-type="doi">10.1371/journal.pone.0059655</pub-id> <pub-id pub-id-type="pmid">23533640</pub-id></citation></ref>
<ref id="B106"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Penny</surname> <given-names>W. D.</given-names></name> <name><surname>Trujillo-Barreto</surname> <given-names>N. J.</given-names></name> <name><surname>Friston</surname> <given-names>K. J.</given-names></name></person-group> (<year>2005</year>). <article-title>Bayesian fMRI time series analysis with spatial priors.</article-title> <source><italic>Neuroimage</italic></source> <volume>24</volume> <fpage>350</fpage>&#x2013;<lpage>362</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2004.08.034</pub-id> <pub-id pub-id-type="pmid">15627578</pub-id></citation></ref>
<ref id="B107"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Perezgonzalez</surname> <given-names>J. D.</given-names></name></person-group> (<year>2015</year>). <article-title>Fisher, Neyman-Pearson or NHST? A tutorial for teaching data testing.</article-title> <source><italic>Front. Psychol.</italic></source> <volume>6</volume>:<issue>223</issue>. <pub-id pub-id-type="doi">10.3389/fpsyg.2015.00223</pub-id> <pub-id pub-id-type="pmid">25784889</pub-id></citation></ref>
<ref id="B108"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pernet</surname> <given-names>C. R.</given-names></name></person-group> (<year>2014</year>). <article-title>Misconceptions in the use of the general linear model applied to functional MRI: a tutorial for junior neuro-imagers.</article-title> <source><italic>Front. Neurosci.</italic></source> <volume>8</volume>:<issue>1</issue>. <pub-id pub-id-type="doi">10.3389/fnins.2014.00001</pub-id> <pub-id pub-id-type="pmid">24478622</pub-id></citation></ref>
<ref id="B109"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Poldrack</surname> <given-names>R.</given-names></name> <name><surname>Congdon</surname> <given-names>E.</given-names></name> <name><surname>Triplett</surname> <given-names>W.</given-names></name> <name><surname>Gorgolewski</surname> <given-names>K.</given-names></name> <name><surname>Karlsgodt</surname> <given-names>K.</given-names></name> <name><surname>Mumford</surname> <given-names>J.</given-names></name><etal/></person-group> (<year>2016</year>). <article-title>A phenome-wide examination of neural and cognitive function.</article-title> <source><italic>Sci. Data</italic></source> <volume>3</volume>:<issue>160110</issue>. <pub-id pub-id-type="doi">10.1038/sdata.2016.110</pub-id> <pub-id pub-id-type="pmid">27922632</pub-id></citation></ref>
<ref id="B110"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Poldrack</surname> <given-names>R. A.</given-names></name> <name><surname>Baker</surname> <given-names>C. I.</given-names></name> <name><surname>Durnez</surname> <given-names>J.</given-names></name> <name><surname>Gorgolewski</surname> <given-names>K. J.</given-names></name> <name><surname>Matthews</surname> <given-names>P. M.</given-names></name> <name><surname>Munaf&#x00F2;</surname> <given-names>M. R.</given-names></name><etal/></person-group> (<year>2017</year>). <article-title>Scanning the horizon: towards transparent and reproducible neuroimaging research.</article-title> <source><italic>Nat. Rev. Neurosci.</italic></source> <volume>18</volume> <fpage>115</fpage>&#x2013;<lpage>126</lpage>. <pub-id pub-id-type="doi">10.1038/nrn.2016.167</pub-id> <pub-id pub-id-type="pmid">28053326</pub-id></citation></ref>
<ref id="B111"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Poldrack</surname> <given-names>R. A.</given-names></name> <name><surname>Mumford</surname> <given-names>J. A.</given-names></name> <name><surname>Nichols</surname> <given-names>T. E.</given-names></name></person-group> (<year>2011</year>). <source><italic>Handbook of Functional MRI Data Analysis.</italic></source> <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation></ref>
<ref id="B112"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Poline</surname> <given-names>J. B.</given-names></name> <name><surname>Brett</surname> <given-names>M.</given-names></name></person-group> (<year>2012</year>). <article-title>The general linear model and fMRI: does love last forever?</article-title> <source><italic>Neuroimage</italic></source> <volume>62</volume> <fpage>871</fpage>&#x2013;<lpage>880</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2012.01.133</pub-id> <pub-id pub-id-type="pmid">22343127</pub-id></citation></ref>
<ref id="B113"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pollard</surname> <given-names>P.</given-names></name> <name><surname>Richardson</surname> <given-names>J. T.</given-names></name></person-group> (<year>1987</year>). <article-title>On the probability of making type I errors.</article-title> <source><italic>Psychol. Bull.</italic></source> <volume>102</volume> <fpage>159</fpage>&#x2013;<lpage>163</lpage>. <pub-id pub-id-type="doi">10.1037/0033-2909.102.1.159</pub-id></citation></ref>
<ref id="B114"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Raichle</surname> <given-names>M. E.</given-names></name> <name><surname>Gusnard</surname> <given-names>D. A.</given-names></name></person-group> (<year>2002</year>). <article-title>Appraising the brain&#x2019;s energy budget.</article-title> <source><italic>Proc. Natl. Acad. Sci. U.S.A.</italic></source> <volume>99</volume> <fpage>10237</fpage>&#x2013;<lpage>10239</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.172399499</pub-id> <pub-id pub-id-type="pmid">12149485</pub-id></citation></ref>
<ref id="B115"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Reimold</surname> <given-names>M.</given-names></name> <name><surname>Slifstein</surname> <given-names>M.</given-names></name> <name><surname>Heinz</surname> <given-names>A.</given-names></name> <name><surname>Mueller-Schauenburg</surname> <given-names>W.</given-names></name> <name><surname>Bares</surname> <given-names>R.</given-names></name></person-group> (<year>2005</year>). <article-title>Effect of spatial smoothing on t-maps: arguments for going back from t-maps to masked contrast images.</article-title> <source><italic>J. Cereb. Blood Flow Metab.</italic></source> <volume>26</volume> <fpage>751</fpage>&#x2013;<lpage>759</lpage>. <pub-id pub-id-type="doi">10.1038/sj.jcbfm.9600231</pub-id> <pub-id pub-id-type="pmid">16208316</pub-id></citation></ref>
<ref id="B116"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rogers</surname> <given-names>J. L.</given-names></name> <name><surname>Howard</surname> <given-names>K. I.</given-names></name> <name><surname>Vessey</surname> <given-names>J. T.</given-names></name></person-group> (<year>1993</year>). <article-title>Using significance tests to evaluate equivalence between two experimental groups.</article-title> <source><italic>Psychol. Bull.</italic></source> <volume>113</volume> <fpage>553</fpage>&#x2013;<lpage>565</lpage>. <pub-id pub-id-type="doi">10.1037/0033-2909.113.3.553</pub-id> <pub-id pub-id-type="pmid">8316613</pub-id></citation></ref>
<ref id="B117"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rosa</surname> <given-names>M.</given-names></name> <name><surname>Friston</surname> <given-names>K.</given-names></name> <name><surname>Penny</surname> <given-names>W.</given-names></name></person-group> (<year>2012</year>). <article-title>Post-hoc selection of dynamic causal models.</article-title> <source><italic>J. Neurosci. Methods</italic></source> <volume>208</volume> <fpage>66</fpage>&#x2013;<lpage>78</lpage>. <pub-id pub-id-type="doi">10.1016/j.jneumeth.2012.04.013</pub-id> <pub-id pub-id-type="pmid">22561579</pub-id></citation></ref>
<ref id="B118"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rosenthal</surname> <given-names>R.</given-names></name></person-group> (<year>1979</year>). <article-title>The file drawer problem and tolerance for null results.</article-title> <source><italic>Psychol. Bull.</italic></source> <volume>86</volume> <fpage>638</fpage>&#x2013;<lpage>641</lpage>. <pub-id pub-id-type="doi">10.1037/0033-2909.86.3.638</pub-id></citation></ref>
<ref id="B119"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rouder</surname> <given-names>J. N.</given-names></name></person-group> (<year>2014</year>). <article-title>Optional stopping: no problem for Bayesians.</article-title> <source><italic>Psychon. Bull. Rev.</italic></source> <volume>21</volume> <fpage>301</fpage>&#x2013;<lpage>308</lpage>. <pub-id pub-id-type="doi">10.3758/s13423-014-0595-4</pub-id> <pub-id pub-id-type="pmid">24659049</pub-id></citation></ref>
<ref id="B120"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rouder</surname> <given-names>J. N.</given-names></name> <name><surname>Speckman</surname> <given-names>P. L.</given-names></name> <name><surname>Sun</surname> <given-names>D.</given-names></name> <name><surname>Morey</surname> <given-names>R. D.</given-names></name> <name><surname>Iverson</surname> <given-names>G.</given-names></name></person-group> (<year>2009</year>). <article-title>Bayesian t tests for accepting and rejecting the null hypothesis.</article-title> <source><italic>Psychon. Bull. Rev.</italic></source> <volume>16</volume> <fpage>225</fpage>&#x2013;<lpage>237</lpage>. <pub-id pub-id-type="doi">10.3758/pbr.16.2.225</pub-id> <pub-id pub-id-type="pmid">19293088</pub-id></citation></ref>
<ref id="B121"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Royall</surname> <given-names>R. M.</given-names></name></person-group> (<year>1986</year>). <article-title>The effect of sample size on the meaning of significance tests.</article-title> <source><italic>Am. Stat.</italic></source> <volume>40</volume> <fpage>313</fpage>&#x2013;<lpage>315</lpage>. <pub-id pub-id-type="doi">10.2307/2684616</pub-id></citation></ref>
<ref id="B122"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Royall</surname> <given-names>R. M.</given-names></name></person-group> (<year>1997</year>). <source><italic>Statistical Evidence: A Likelihood Paradigm.</italic></source> <publisher-loc>Boca Raton, FL</publisher-loc>: <publisher-name>CRC Press</publisher-name>.</citation></ref>
<ref id="B123"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Samartsidis</surname> <given-names>P.</given-names></name> <name><surname>Montagna</surname> <given-names>S.</given-names></name> <name><surname>Laird</surname> <given-names>A. R.</given-names></name> <name><surname>Fox</surname> <given-names>P. T.</given-names></name> <name><surname>Johnson</surname> <given-names>T. D.</given-names></name> <name><surname>Nichols</surname> <given-names>T. E.</given-names></name></person-group> (<year>2020</year>). <article-title>Estimating the prevalence of missing experiments in a neuroimaging meta-analysis.</article-title> <source><italic>Res. Synth. Methods</italic></source> <volume>11</volume> <fpage>866</fpage>&#x2013;<lpage>883</lpage>. <pub-id pub-id-type="doi">10.1002/jrsm.1448</pub-id> <pub-id pub-id-type="pmid">32860642</pub-id></citation></ref>
<ref id="B124"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schatz</surname> <given-names>P.</given-names></name> <name><surname>Jay</surname> <given-names>K.</given-names></name> <name><surname>McComb</surname> <given-names>J.</given-names></name> <name><surname>McLaughlin</surname> <given-names>J.</given-names></name></person-group> (<year>2005</year>). <article-title>Misuse of statistical tests in publications.</article-title> <source><italic>Arch. Clin. Neuropsychol.</italic></source> <volume>20</volume> <fpage>1053</fpage>&#x2013;<lpage>1059</lpage>. <pub-id pub-id-type="doi">10.1016/j.acn.2005.06.006</pub-id> <pub-id pub-id-type="pmid">16095871</pub-id></citation></ref>
<ref id="B125"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schneider</surname> <given-names>J. W.</given-names></name></person-group> (<year>2014</year>). <article-title>Null hypothesis significance tests. A mix-up of two different theories: the basis for widespread confusion and numerous misinterpretations.</article-title> <source><italic>Scientometrics</italic></source> <volume>102</volume> <fpage>411</fpage>&#x2013;<lpage>432</lpage>. <pub-id pub-id-type="doi">10.1007/s11192-014-1251-5</pub-id></citation></ref>
<ref id="B126"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schneider</surname> <given-names>J. W.</given-names></name></person-group> (<year>2018</year>). <article-title>NHST is still logically flawed.</article-title> <source><italic>Scientometrics</italic></source> <volume>115</volume> <fpage>627</fpage>&#x2013;<lpage>635</lpage>. <pub-id pub-id-type="doi">10.1007/s11192-018-2655-4</pub-id></citation></ref>
<ref id="B127"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sch&#x00F6;nbrodt</surname> <given-names>F. D.</given-names></name> <name><surname>Wagenmakers</surname> <given-names>E.-J.</given-names></name> <name><surname>Zehetleitner</surname> <given-names>M.</given-names></name> <name><surname>Perugini</surname> <given-names>M.</given-names></name></person-group> (<year>2017</year>). <article-title>Sequential hypothesis testing with Bayes factors: efficiently testing mean differences.</article-title> <source><italic>Psychol. Methods</italic></source> <volume>22</volume> <fpage>322</fpage>&#x2013;<lpage>339</lpage>. <pub-id pub-id-type="doi">10.1037/met0000061</pub-id> <pub-id pub-id-type="pmid">26651986</pub-id></citation></ref>
<ref id="B128"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schuirmann</surname> <given-names>D. J.</given-names></name></person-group> (<year>1987</year>). <article-title>A comparison of the two one-sided tests procedure and the power approach for assessing the equivalence of average bioavailability.</article-title> <source><italic>J. Pharmacokinet. Biopharm.</italic></source> <volume>15</volume> <fpage>657</fpage>&#x2013;<lpage>680</lpage>. <pub-id pub-id-type="doi">10.1007/bf01068419</pub-id> <pub-id pub-id-type="pmid">3450848</pub-id></citation></ref>
<ref id="B129"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schwartzman</surname> <given-names>A.</given-names></name> <name><surname>Dougherty</surname> <given-names>R.</given-names></name> <name><surname>Lee</surname> <given-names>J.</given-names></name> <name><surname>Ghahremani</surname> <given-names>D.</given-names></name> <name><surname>Taylor</surname> <given-names>J.</given-names></name></person-group> (<year>2009</year>). <article-title>Empirical null and false discovery rate analysis in neuroimaging.</article-title> <source><italic>Neuroimage</italic></source> <volume>44</volume> <fpage>71</fpage>&#x2013;<lpage>82</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2008.04.182</pub-id> <pub-id pub-id-type="pmid">18547821</pub-id></citation></ref>
<ref id="B130"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Serlin</surname> <given-names>R. C.</given-names></name> <name><surname>Lapsley</surname> <given-names>D. K.</given-names></name></person-group> (<year>1985</year>). <article-title>Rationality in psychological research: the good-enough principle.</article-title> <source><italic>Am. Psychol.</italic></source> <volume>40</volume> <fpage>73</fpage>&#x2013;<lpage>83</lpage>. <pub-id pub-id-type="doi">10.1037/0003-066x.40.1.73</pub-id></citation></ref>
<ref id="B131"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Serlin</surname> <given-names>R. C.</given-names></name> <name><surname>Lapsley</surname> <given-names>D. K.</given-names></name></person-group> (<year>1993</year>). &#x201C;<article-title>Rational appraisal of psychological research and the good-enough principle</article-title>,&#x201D; in <source><italic>A Handbook for Data Analysis in the Behavioral Sciences: Methodological Issues</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Keren</surname> <given-names>G.</given-names></name> <name><surname>Lewis</surname> <given-names>C.</given-names></name></person-group> (<publisher-loc>Mahwah, NJ</publisher-loc>: <publisher-name>Lawrence Erlbaum Associates, Inc.</publisher-name>), <fpage>199</fpage>&#x2013;<lpage>228</lpage>.</citation></ref>
<ref id="B132"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shapiro</surname> <given-names>S. S.</given-names></name> <name><surname>Wilk</surname> <given-names>M. B.</given-names></name></person-group> (<year>1965</year>). <article-title>An analysis of variance test for normality (complete samples).</article-title> <source><italic>Biometrika</italic></source> <volume>52</volume> <fpage>591</fpage>&#x2013;<lpage>611</lpage>. <pub-id pub-id-type="doi">10.1093/biomet/52.3-4.591</pub-id></citation></ref>
<ref id="B133"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sj&#x00F6;lander</surname> <given-names>A.</given-names></name> <name><surname>Vansteelandt</surname> <given-names>S.</given-names></name></person-group> (<year>2019</year>). <article-title>Frequentist versus Bayesian approaches to multiple testing.</article-title> <source><italic>Eur. J. Epidemiol.</italic></source> <volume>34</volume> <fpage>809</fpage>&#x2013;<lpage>821</lpage>. <pub-id pub-id-type="doi">10.1007/s10654-019-00517-2</pub-id> <pub-id pub-id-type="pmid">31087218</pub-id></citation></ref>
<ref id="B134"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Smith</surname> <given-names>S. M.</given-names></name> <name><surname>Nichols</surname> <given-names>T. E.</given-names></name></person-group> (<year>2018</year>). <article-title>Statistical challenges in &#x201C;big data&#x201D; human neuroimaging.</article-title> <source><italic>Neuron</italic></source> <volume>97</volume> <fpage>263</fpage>&#x2013;<lpage>268</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuron.2017.12.018</pub-id> <pub-id pub-id-type="pmid">29346749</pub-id></citation></ref>
<ref id="B135"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sober</surname> <given-names>E.</given-names></name></person-group> (<year>2008</year>). <source><italic>Evidence and Evolution: The Logic Behind the Science.</italic></source> <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>.</citation></ref>
<ref id="B136"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Storey</surname> <given-names>J. D.</given-names></name></person-group> (<year>2003</year>). <article-title>The positive false discovery rate: a Bayesian interpretation and the q-value.</article-title> <source><italic>Ann. Stat.</italic></source> <volume>31</volume> <fpage>2013</fpage>&#x2013;<lpage>2035</lpage>. <pub-id pub-id-type="doi">10.1214/aos/1074290335</pub-id></citation></ref>
<ref id="B137"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Streiner</surname> <given-names>D. L.</given-names></name></person-group> (<year>2015</year>). <article-title>Best (but oft-forgotten) practices: the multiple problems of multiplicity&#x2014;whether and how to correct for many statistical tests.</article-title> <source><italic>Am. J. Clin. Nutr.</italic></source> <volume>102</volume> <fpage>721</fpage>&#x2013;<lpage>728</lpage>. <pub-id pub-id-type="doi">10.3945/ajcn.115.113548</pub-id> <pub-id pub-id-type="pmid">26245806</pub-id></citation></ref>
<ref id="B138"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Svensson</surname> <given-names>J.</given-names></name> <name><surname>Schain</surname> <given-names>M.</given-names></name> <name><surname>Knudsen</surname> <given-names>G. M.</given-names></name> <name><surname>Ogden</surname> <given-names>T.</given-names></name> <name><surname>Plav&#x00E9;n-Sigray</surname> <given-names>P.</given-names></name></person-group> (<year>2020</year>). <article-title>Early stopping in clinical PET studies: how to reduce expense and exposure.</article-title> <source><italic>medRxiv</italic></source> <comment>[Preprint]</comment> <pub-id pub-id-type="doi">10.1101/2020.09.13.20192856</pub-id></citation></ref>
<ref id="B139"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Szucs</surname> <given-names>D.</given-names></name> <name><surname>Ioannidis</surname> <given-names>J. P. A.</given-names></name></person-group> (<year>2020</year>). <article-title>Sample size evolution in neuroimaging research: an evaluation of highly-cited studies (1990&#x2013;2012) and of latest practices (2017&#x2013;2018) in high-impact journals.</article-title> <source><italic>Neuroimage</italic></source> <volume>221</volume>:<issue>117164</issue>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2020.117164</pub-id> <pub-id pub-id-type="pmid">32679253</pub-id></citation></ref>
<ref id="B140"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Szucs</surname> <given-names>D.</given-names></name> <name><surname>Ioannidis</surname> <given-names>J. P. A.</given-names></name></person-group> (<year>2017</year>). <article-title>When null hypothesis significance testing is unsuitable for research: a reassessment.</article-title> <source><italic>Front. Hum. Neurosci.</italic></source> <volume>11</volume>:<issue>390</issue>. <pub-id pub-id-type="doi">10.3389/fnhum.2017.00390</pub-id> <pub-id pub-id-type="pmid">28824397</pub-id></citation></ref>
<ref id="B141"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Turkheimer</surname> <given-names>F. E.</given-names></name> <name><surname>Aston</surname> <given-names>J. A. D.</given-names></name> <name><surname>Cunningham</surname> <given-names>V. J.</given-names></name></person-group> (<year>2004</year>). <article-title>On the logic of hypothesis testing in functional imaging.</article-title> <source><italic>Eur. J. Nucl. Med. Mol. Imaging</italic></source> <volume>31</volume> <fpage>725</fpage>&#x2013;<lpage>732</lpage>. <pub-id pub-id-type="doi">10.1007/s00259-003-1387-7</pub-id> <pub-id pub-id-type="pmid">14730402</pub-id></citation></ref>
<ref id="B142"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Uludag</surname> <given-names>K.</given-names></name> <name><surname>M&#x00FC;ller-Bierl</surname> <given-names>B.</given-names></name> <name><surname>Ugurbil</surname> <given-names>K.</given-names></name></person-group> (<year>2009</year>). <article-title>An integrative model for neuronal activity-induced signal changes for gradient and spin echo functional imaging.</article-title> <source><italic>Neuroimage</italic></source> <volume>47</volume>:<issue>S56</issue>. <pub-id pub-id-type="doi">10.1016/s1053-8119(09)70204-6</pub-id></citation></ref>
<ref id="B143"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wagenmakers</surname> <given-names>E. J.</given-names></name></person-group> (<year>2007</year>). <article-title>A practical solution to the pervasive problems of p values.</article-title> <source><italic>Psychon. Bull. Rev.</italic></source> <volume>14</volume> <fpage>779</fpage>&#x2013;<lpage>804</lpage>. <pub-id pub-id-type="doi">10.3758/bf03194105</pub-id> <pub-id pub-id-type="pmid">18087943</pub-id></citation></ref>
<ref id="B144"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wagenmakers</surname> <given-names>E. J.</given-names></name> <name><surname>Lee</surname> <given-names>M.</given-names></name> <name><surname>Lodewyckx</surname> <given-names>T.</given-names></name> <name><surname>Iverson</surname> <given-names>G. J.</given-names></name></person-group> (<year>2008</year>). &#x201C;<article-title>Bayesian versus frequentist inference</article-title>,&#x201D; in <source><italic>Bayesian Evaluation of Informative Hypotheses. Statistics for Social and Behavioral Sciences</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Hoijtink</surname> <given-names>H.</given-names></name> <name><surname>Klugkist</surname> <given-names>I.</given-names></name> <name><surname>Boelen</surname> <given-names>P. A.</given-names></name></person-group> (<publisher-loc>New York, NY</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>181</fpage>&#x2013;<lpage>207</lpage>.</citation></ref>
<ref id="B145"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wagenmakers</surname> <given-names>E. J.</given-names></name> <name><surname>Lodewyckx</surname> <given-names>T.</given-names></name> <name><surname>Kuriyal</surname> <given-names>H.</given-names></name> <name><surname>Grasman</surname> <given-names>R.</given-names></name></person-group> (<year>2010</year>). <article-title>Bayesian hypothesis testing for psychologists: a tutorial on the Savage&#x2013;Dickey method.</article-title> <source><italic>Cogn. Psychol.</italic></source> <volume>60</volume> <fpage>158</fpage>&#x2013;<lpage>189</lpage>. <pub-id pub-id-type="doi">10.1016/j.cogpsych.2009.12.001</pub-id> <pub-id pub-id-type="pmid">20064637</pub-id></citation></ref>
<ref id="B146"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wagenmakers</surname> <given-names>E. J.</given-names></name> <name><surname>Verhagen</surname> <given-names>J.</given-names></name> <name><surname>Ly</surname> <given-names>A.</given-names></name> <name><surname>Matzke</surname> <given-names>D.</given-names></name> <name><surname>Steingroever</surname> <given-names>H.</given-names></name> <name><surname>Rouder</surname> <given-names>J. N.</given-names></name><etal/></person-group> (<year>2017</year>). &#x201C;<article-title>The need for Bayesian hypothesis testing in psychological science</article-title>,&#x201D; in <source><italic>Psychological Science Under Scrutiny: Recent Challenges and Proposed Solutions</italic></source>, <role>eds</role> <person-group person-group-type="editor"><name><surname>Lilienfeld</surname> <given-names>S. O.</given-names></name> <name><surname>Waldman</surname> <given-names>I. D.</given-names></name></person-group> (<publisher-loc>Hoboken, NJ</publisher-loc>: <publisher-name>Wiley Blackwell</publisher-name>), <fpage>123</fpage>&#x2013;<lpage>138</lpage>.</citation></ref>
<ref id="B147"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wasserstein</surname> <given-names>R. L.</given-names></name> <name><surname>Lazar</surname> <given-names>N. A.</given-names></name></person-group> (<year>2016</year>). <article-title>The ASA statement on p-values: context, process, and purpose.</article-title> <source><italic>Am. Stat.</italic></source> <volume>70</volume> <fpage>129</fpage>&#x2013;<lpage>133</lpage>. <pub-id pub-id-type="doi">10.1080/00031305.2016.1154108</pub-id></citation></ref>
<ref id="B148"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Wellek</surname> <given-names>S.</given-names></name></person-group> (<year>2010</year>). <source><italic>Testing Statistical Hypotheses of Equivalence and Noninferiority</italic></source>, <edition>2nd Edn</edition>. <publisher-loc>Milton Park</publisher-loc>: <publisher-name>Taylor &#x0026; Francis</publisher-name>.</citation></ref>
<ref id="B149"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Westfall</surname> <given-names>P.</given-names></name> <name><surname>Johnson</surname> <given-names>W. O.</given-names></name> <name><surname>Utts</surname> <given-names>J. M.</given-names></name></person-group> (<year>1997</year>). <article-title>A Bayesian perspective on the Bonferroni adjustment.</article-title> <source><italic>Biometrika</italic></source> <volume>84</volume> <fpage>419</fpage>&#x2013;<lpage>427</lpage>. <pub-id pub-id-type="doi">10.1093/biomet/84.2.419</pub-id></citation></ref>
<ref id="B150"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Westlake</surname> <given-names>W. J.</given-names></name></person-group> (<year>1972</year>). <article-title>Use of confidence intervals in analysis of comparative bioavailability trials.</article-title> <source><italic>J. Pharm. Sci.</italic></source> <volume>61</volume> <fpage>1340</fpage>&#x2013;<lpage>1341</lpage>. <pub-id pub-id-type="doi">10.1002/jps.2600610845</pub-id> <pub-id pub-id-type="pmid">5050398</pub-id></citation></ref>
<ref id="B151"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Woo</surname> <given-names>C. W.</given-names></name> <name><surname>Krishnan</surname> <given-names>A.</given-names></name> <name><surname>Wager</surname> <given-names>T. D.</given-names></name></person-group> (<year>2014</year>). <article-title>Cluster-extent based thresholding in fMRI analyses: pitfalls and recommendations.</article-title> <source><italic>Neuroimage</italic></source> <volume>91</volume> <fpage>412</fpage>&#x2013;<lpage>419</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2013.12.058</pub-id> <pub-id pub-id-type="pmid">24412399</pub-id></citation></ref>
<ref id="B152"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Woolrich</surname> <given-names>M. W.</given-names></name> <name><surname>Behrens</surname> <given-names>T. E.</given-names></name> <name><surname>Beckmann</surname> <given-names>C. F.</given-names></name> <name><surname>Jenkinson</surname> <given-names>M.</given-names></name> <name><surname>Smith</surname> <given-names>S. M.</given-names></name></person-group> (<year>2004</year>). <article-title>Multilevel linear modelling for FMRI group analysis using Bayesian inference.</article-title> <source><italic>Neuroimage</italic></source> <volume>21</volume> <fpage>1732</fpage>&#x2013;<lpage>1747</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2003.12.023</pub-id> <pub-id pub-id-type="pmid">15050594</pub-id></citation></ref>
<ref id="B153"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Woolrich</surname> <given-names>M. W.</given-names></name> <name><surname>Jbabdi</surname> <given-names>S.</given-names></name> <name><surname>Patenaude</surname> <given-names>B.</given-names></name> <name><surname>Chappell</surname> <given-names>M.</given-names></name> <name><surname>Makni</surname> <given-names>S.</given-names></name> <name><surname>Behrens</surname> <given-names>T.</given-names></name><etal/></person-group> (<year>2009</year>). <article-title>Bayesian analysis of neuroimaging data in FSL.</article-title> <source><italic>Neuroimage</italic></source> <volume>45</volume> <fpage>S173</fpage>&#x2013;<lpage>S186</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2008.10.055</pub-id> <pub-id pub-id-type="pmid">19059349</pub-id></citation></ref>
<ref id="B154"><citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yarkoni</surname> <given-names>T.</given-names></name> <name><surname>Poldrack</surname> <given-names>R. A.</given-names></name> <name><surname>Nichols</surname> <given-names>T. E.</given-names></name> <name><surname>Van Essen</surname> <given-names>D. C.</given-names></name> <name><surname>Wager</surname> <given-names>T. D.</given-names></name></person-group> (<year>2011</year>). <article-title>Large-scale automated synthesis of human functional neuroimaging data.</article-title> <source><italic>Nat. Methods</italic></source> <volume>8</volume> <fpage>665</fpage>&#x2013;<lpage>670</lpage>. <pub-id pub-id-type="doi">10.1038/nmeth.1635</pub-id> <pub-id pub-id-type="pmid">21706013</pub-id></citation></ref>
</ref-list>
<fn-group>
<fn id="footnote1">
<label>1</label>
<p>Examples of &#x2018;no effect&#x2019; conclusions found in the fMRI literature include: (a) the brain area was not activated; (b) the brain area was not involved in the function; (c) no effect was found in the brain area (<italic>p</italic> &#x003E; 0.05); (d) both groups showed no differences, which can be interpreted as evidence against the alternative hypothesis; (e) patients have similar responses to both conditions (<italic>p</italic> &#x003E; 0.05), that is, they have difficulties in differentiating these conditions; (f) the lack of a significant correlation during treatment suggests a protective impact of the therapy on brain areas.</p></fn>
<fn id="footnote2">
<label>2</label>
<p><ext-link ext-link-type="uri" xlink:href="https://www.fil.ion.ucl.ac.uk/spm/software/spm12">https://www.fil.ion.ucl.ac.uk/spm/software/spm12</ext-link></p></fn>
<fn id="footnote3">
<label>3</label>
<p>FDR correction controls the expected proportion of false discoveries (false positives in frequentist terminology) among all voxels declared significant. FWE correction controls the probability of obtaining at least one false positive anywhere in the whole brain.</p></fn>
<fn id="footnote4">
<label>4</label>
<p><ext-link ext-link-type="uri" xlink:href="https://github.com/Masharipov/Bayesian_inference">https://github.com/Masharipov/Bayesian_inference</ext-link></p></fn>
<fn id="footnote5">
<label>5</label>
<p><ext-link ext-link-type="uri" xlink:href="https://github.com/Masharipov/BPI_2021/tree/main/simulations">https://github.com/Masharipov/BPI_2021/tree/main/simulations</ext-link></p></fn>
<fn id="footnote6">
<label>6</label>
<p>This is especially true for PET studies. The BPI method described in this work can also be applied to PET data to reduce the sample size and thus exposure to radioactivity (<xref ref-type="bibr" rid="B138">Svensson et al., 2020</xref>).</p></fn>
</fn-group>
</back>
</article>
