<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Psychol.</journal-id>
<journal-title>Frontiers in Psychology</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Psychol.</abbrev-journal-title>
<issn pub-type="epub">1664-1078</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fpsyg.2016.01027</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Psychology</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Modeling the Overalternating Bias with an Asymmetric Entropy Measure</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name><surname>Gronchi</surname> <given-names>Giorgio</given-names></name>
<xref ref-type="aff" rid="aff1"><sup>1</sup></xref>
<xref ref-type="author-notes" rid="fn001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/346903/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Raglianti</surname> <given-names>Marco</given-names></name>
<xref ref-type="aff" rid="aff2"><sup>2</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/352379/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Noventa</surname> <given-names>Stefano</given-names></name>
<xref ref-type="aff" rid="aff3"><sup>3</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/358034/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Lazzeri</surname> <given-names>Alessandro</given-names></name>
<xref ref-type="aff" rid="aff4"><sup>4</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/346915/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Guazzini</surname> <given-names>Andrea</given-names></name>
<xref ref-type="aff" rid="aff5"><sup>5</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/81073/overview"/>
</contrib>
</contrib-group>
<aff id="aff1"><sup>1</sup><institution>Department of Neuroscience, Psychology, Drug Research and Child&#x00027;s Health - Section of Psychology, University of Florence</institution> <country>Florence, Italy</country></aff>
<aff id="aff2"><sup>2</sup><institution>Formerly affiliated with the BioRobotics Institute, Scuola Superiore Sant&#x00027;Anna</institution> <country>Pisa, Italy</country></aff>
<aff id="aff3"><sup>3</sup><institution>Center for Assessment, University of Verona</institution> <country>Verona, Italy</country></aff>
<aff id="aff4"><sup>4</sup><institution>Department of Information Engineering, University of Pisa</institution> <country>Pisa, Italy</country></aff>
<aff id="aff5"><sup>5</sup><institution>Department of Sciences of Education and Psychology, University of Florence</institution> <country>Florence, Italy</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Pietro Cipresso, Istituto Auxologico Italiano - Istituto di Ricovero e Cura a Carattere Scientifico, Italy</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Luis Diambra, Universidad Nacional de La Plata, Argentina; Andrei Martinez-Finkelshtein, University of Almeria, Spain</p></fn>
<fn fn-type="corresp" id="fn001"><p>&#x0002A;Correspondence: Giorgio Gronchi <email>giorgio.gronchi&#x00040;gmail.com</email></p></fn>
<fn fn-type="other" id="fn002"><p>This article was submitted to Quantitative Psychology and Measurement, a section of the journal Frontiers in Psychology</p></fn>
</author-notes>
<pub-date pub-type="epub">
<day>06</day>
<month>07</month>
<year>2016</year>
</pub-date>
<pub-date pub-type="collection">
<year>2016</year>
</pub-date>
<volume>7</volume>
<elocation-id>1027</elocation-id>
<history>
<date date-type="received">
<day>03</day>
<month>05</month>
<year>2016</year>
</date>
<date date-type="accepted">
<day>22</day>
<month>06</month>
<year>2016</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2016 Gronchi, Raglianti, Noventa, Lazzeri and Guazzini.</copyright-statement>
<copyright-year>2016</copyright-year>
<copyright-holder>Gronchi, Raglianti, Noventa, Lazzeri and Guazzini</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) or licensor are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license>
</permissions>
<abstract><p>Psychological research has found that human perception of randomness is biased. In particular, people consistently show the overalternating bias: they rate binary sequences of symbols (such as Heads and Tails in coin flipping) with an excess of alternation as more random than prescribed by the normative criterion of Shannon&#x00027;s entropy. Within data mining for medical applications, Marcellin proposed an asymmetric measure of entropy that may be ideal for accounting for such a bias and for quantifying subjective randomness. Using the Differential Evolution algorithm, we fitted Marcellin&#x00027;s entropy and Renyi&#x00027;s entropy (a generalized form of uncertainty measure comprising many different kinds of entropies) to experimental data found in the literature. We observed a better fit for Marcellin&#x00027;s entropy than for Renyi&#x00027;s entropy. The fitted asymmetric entropy measure also showed good predictive properties when applied to different datasets of randomness-related tasks. We conclude that Marcellin&#x00027;s entropy can be a parsimonious and effective measure of subjective randomness, useful in psychological research about randomness perception.</p></abstract>
<kwd-group><kwd>randomness perception</kwd>
<kwd>overalternating bias</kwd>
<kwd>asymmetric entropy</kwd>
<kwd>Renyi&#x00027;s entropy</kwd>
<kwd>Marcellin&#x00027;s entropy</kwd>
<kwd>Shannon&#x00027;s entropy</kwd>
<kwd>Differential Evolution algorithm</kwd>
</kwd-group>
<contract-num rid="cn001">611299</contract-num>
<contract-sponsor id="cn001">European Commission<named-content content-type="fundref-id">10.13039/501100000780</named-content></contract-sponsor>
<counts>
<fig-count count="3"/>
<table-count count="2"/>
<equation-count count="9"/>
<ref-count count="22"/>
<page-count count="8"/>
<word-count count="6442"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<sec>
<title>1.1. Inductive reasoning and subjective randomness</title>
<p>Explaining how people reason inductively (e.g., inferring general laws or principles from the observation of particular instances) is a central topic within the psychology of reasoning. In particular, the perception of randomness is a key aspect of these inferential processes. Perceiving a situation as non-random requires some kind of subjective explanation, which entails the onset of inductive reasoning (Lopes, <xref ref-type="bibr" rid="B15">1982</xref>). On the contrary, if the phenomenon is seen as a mere coincidence, the observer does not hypothesize any explanation. For example, during World War II the German air force dropped V1 bombs on London: many Londoners saw particular patterns in the impacts and consequently developed specific theories about German strategy (e.g., thinking that poor districts of London were privileged targets). However, a statistical analysis of the bombing patterns carried out after the end of the war revealed that the distribution of the impacts was not statistically different from an actual random pattern (Hastie and Dawes, <xref ref-type="bibr" rid="B12">2010</xref>). The opposite mistake happens when an observer fails to detect a regularity, thus attributing a noticed potential relation to chance (Griffiths, <xref ref-type="bibr" rid="B7">2004</xref>): before Halley, no one had thought that the comets observed in 1531, 1607, and 1682 were the very same comet (Halley, <xref ref-type="bibr" rid="B11">1752</xref>). Since the 1950s, many psychological studies have been devoted to investigating randomness perception and production: an important result is that people&#x00027;s intuitive understanding of randomness in binary sequences is biased toward an over-alternation between the possible outcomes (the so-called overalternating bias).</p>
<p>Given the importance of having a viable and flexible measure of subjective randomness, this study aims to evaluate how well different kinds of entropy measures can predict judgments about sequence randomness. In particular, within the context of data mining and growing decision trees, an asymmetric measure of entropy has been developed (Marcellin et al., <xref ref-type="bibr" rid="B16">2006</xref>). Such a measure has proven very useful in dealing with unbalanced classes in medical and economic decisions. Although developed for data mining, such an asymmetric entropy measure might also be beneficial in cognitive domains. In this paper we investigate its usefulness for modeling the overalternating bias.</p>
</sec>
<sec>
<title>1.2. The overalternating bias</title>
<p>From a formal point of view, randomness is still an elusive concept and a shared definition has yet to be established. A variety of efforts have been made to provide a formal measure of randomness within mathematics, physics, and computer science (Li and Vit&#x000E1;nyi, <xref ref-type="bibr" rid="B14">1997</xref>; Volchan, <xref ref-type="bibr" rid="B22">2002</xref>). Despite the lack of a clear and shared normative criterion, psychologists have extensively investigated people&#x00027;s subjective sense of randomness. Usually, participants&#x00027; responses are compared to sampling distributions of statistics that characterize the stimuli. This strand of research has classically employed two types of tasks: production tasks and perception tasks. In the former, participants are asked to generate the outcomes of a random mechanism, for example by simulating the results of tossing a fair coin. In perception tasks, by contrast, participants have to rate on a Likert scale how random the stimulus (commonly a string of binary elements) is, or to categorize the stimulus on the basis of its generating source (e.g., has the sequence been produced by a random or a non-random mechanism?).</p>
<p>Despite some methodological issues that characterize the psychological investigation of randomness (Nickerson, <xref ref-type="bibr" rid="B17">2002</xref>), the basic finding on the generation and perception of random binary strings (and two-dimensional grids of binary elements) is the overalternating bias: people identify randomness with an excess of alternation between symbol types compared to the normative criterion employed. In other terms, the sequences that actually present the modal number of alternations expected by chance are not perceived as maximally random because they contain runs of the same element that are too long. Falk and Konold (<xref ref-type="bibr" rid="B6">1997</xref>) carried out a series of randomness perception experiments that clearly showed such an overalternating bias. They employed 21-element strings composed of two symbols, Xs and Os, such as XXXX&#x02026;OOOO. The alternation rate of such sequences can be defined through the probability of alternation [<italic>P</italic>(<italic>A</italic>)] statistic: this value is defined as the ratio between the number of actual alternations and the total number of transitions in the sequence. More formally, for strings of length <italic>n</italic> with a number of runs (i.e., unbroken subsequences of the same symbol) <italic>r</italic>, the probability of alternation is</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M1"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mi>P</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>A</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mi>r</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mo>-</mml:mo><mml:mn>1</mml:mn></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>As a normative criterion to quantify the randomness of a sequence, Falk and Konold (<xref ref-type="bibr" rid="B6">1997</xref>) employed the second order entropy of the sequence, computed with the classical Shannon entropy (Shannon, <xref ref-type="bibr" rid="B19">1948</xref>). Such a measure is based on the relative frequencies of the ordered pairs of symbols, called digrams (in the example, XO, OX, XX, and OO); in particular, it quantifies the new information (in bits) contributed by the second member of the pair. Second order entropy can be defined as the difference between the entropy of the digrams and the first order entropy (Attneave, <xref ref-type="bibr" rid="B1">1959</xref>):</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="M2"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>m</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>First order entropy can be computed through the classical Shannon formula:</p>
<disp-formula id="E3"><label>(3)</label><mml:math id="M3"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo>&#x02211;</mml:mo><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <italic>p</italic><sub><italic>i</italic></sub> is the probability of symbol <italic>i</italic>. Similarly, the entropy of the digrams can be obtained from the probabilities of the ordered pairs of symbols:</p>
<disp-formula id="E4"><label>(4)</label><mml:math id="M4"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>m</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mo>&#x02211;</mml:mo><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>m</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub><mml:mi>l</mml:mi><mml:mi>o</mml:mi><mml:mi>g</mml:mi><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>d</mml:mi><mml:mi>i</mml:mi><mml:mi>g</mml:mi><mml:mi>r</mml:mi><mml:mi>a</mml:mi><mml:mi>m</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msub></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>The relationship between the probability of alternation and the second order entropy is a symmetrical, unimodal reversed-U curve with a maximum at a probability of alternation of 0.5 (Figure <xref ref-type="fig" rid="F1">1</xref>). However, when subjective randomness ratings of binary strings were measured while manipulating the probability of alternation, participants rated as most random the sequences with a probability of alternation of about 0.7. The resulting function is an asymmetrical, negatively skewed reversed-U relationship (Figure <xref ref-type="fig" rid="F1">1</xref>). This is a clear example of the overalternating bias: the empirical function of subjective randomness differs from the function obtained by computing the second order entropy as a normative criterion of randomness.</p>
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p><bold>The empirical subjective randomness measured by Falk and Konold (solid line) and the second order entropy computed by Shannon&#x00027;s formula (dashed line)</bold>.</p></caption>
<graphic xlink:href="fpsyg-07-01027-g0001.tif"/>
</fig>
<p>For example, strings such as XXXOXXOOXOO [<italic>P</italic>(<italic>A</italic>) &#x0003D; 0.5] are rated as less random than OXXOXXOXOOX [<italic>P</italic>(<italic>A</italic>) &#x0003D; 0.7], although this is not true from a normative point of view. This kind of result is very robust and has been found in a variety of studies. Reviewing the literature, Falk and Konold (<xref ref-type="bibr" rid="B6">1997</xref>) found that the sequences rated as most random ranged from <italic>P</italic>(<italic>A</italic>) &#x0003D; 0.57 to 0.8. These works, however, employed a variety of stimuli (strings or two-dimensional grids), different sizes of the stimulus set, and a variety of task instructions (such as selecting the most random sequence or rating randomness on a Likert scale).</p>
</sec>
<sec>
<title>1.3. Measuring subjective randomness</title>
<p>Within the psychological literature, two main measures of subjective randomness for strings of symbols have been proposed: the Difficulty Predictor (DP) (Falk and Konold, <xref ref-type="bibr" rid="B6">1997</xref>) and the model of Griffiths and Tenenbaum (<xref ref-type="bibr" rid="B8">2003</xref>, <xref ref-type="bibr" rid="B9">2004</xref>). Both measures try to quantify the complexity of a sequence in order to compute a score of subjective randomness in accordance with empirical data on human judgments.</p>
<p>The DP score is computed by counting the number of runs (any uninterrupted subsequence of the same symbol) and adding twice the number of subsequences in which the symbols alternate. For example, the sequence XXXOOOXOXO is composed of a run of Xs, a run of Os, and an alternating subsequence (XOXO), for a total DP value of 4. If there are multiple ways to segment the string, DP is calculated on the partition that results in the lowest score. DP correlates very highly with randomness judgments and with a variety of related tasks. However, DP is a parameter-free score, so it cannot be used to quantify how subjective randomness changes across conditions (e.g., by fitting DP to data obtained with different tasks to investigate the variation of the parameters). Moreover, as Griffiths and Tenenbaum (<xref ref-type="bibr" rid="B8">2003</xref>) observed, DP cannot account for subjective randomness in strings of different length: for example, XXXXXXXXXXXOOXO and XXXOOXO have the same DP value (4), but clearly the long uninterrupted run of Xs in the former provides stronger evidence for some kind of regularity.</p>
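<p>A possible implementation of the scoring rule described above (our own sketch, not the original authors&#x00027; code) searches over segmentations of the string by dynamic programming, charging 1 for each run and 2 for each alternating subsequence, and keeps the partition with the lowest total.</p>
<preformat>
def dp_score(seq):
    """Difficulty Predictor sketch: minimum over partitions of
    (number of runs) + 2 * (number of alternating subsequences)."""
    n = len(seq)
    best = [float("inf")] * (n + 1)
    best[0] = 0
    for i in range(1, n + 1):
        for j in range(i):
            piece = seq[j:i]
            if len(set(piece)) == 1:                       # a run of identical symbols
                best[i] = min(best[i], best[j] + 1)
            if len(piece) >= 2 and all(a != b for a, b in zip(piece, piece[1:])):
                best[i] = min(best[i], best[j] + 2)        # an alternating subsequence
    return best[n]

print(dp_score("XXXOOOXOXO"))       # 4, as in the example above
print(dp_score("XXXXXXXXXXXOOXO"))  # 4
print(dp_score("XXXOOXO"))          # 4
</preformat>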
<p>Griffiths and Tenenbaum instead employed the Bayesian framework to develop a probabilistic model of the overalternating bias (Griffiths and Tenenbaum, <xref ref-type="bibr" rid="B8">2003</xref>, <xref ref-type="bibr" rid="B9">2004</xref>). The randomness perception task is addressed in terms of the statistical problem of Bayesian model selection: given a string, it has to be inferred whether the process that generated it was random or regular. From a rational point of view, the probability of obtaining a specific binary string given a random generating process is constant and equal to <inline-formula><mml:math id="M5"><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:mfrac></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>k</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> where <italic>k</italic> is the number of elements of the string. Conversely, the probability of obtaining that particular sequence given a regular generating process is computed by means of a Hidden Markov Model (HMM): through the parameters of the model it is possible to determine the regularities that people perceive when judging the randomness of a binary sequence. In sum, the authors showed how, within a Bayesian framework, it is possible to model the perceived randomness of binary sequences and its sensitivity to motif repetition and to other kinds of regularities (such as various types of symmetry). By means of these models it is possible to predict human judgments accurately, including the overalternating bias. Depending on the kinds of regularities that can be detected, it is possible to specify models of increasing complexity (from 4 to 8 parameters). Overall, results show that the model with the highest number of parameters accounts better for the observed data and that its parameters vary coherently with different experimental conditions (Griffiths and Tenenbaum, <xref ref-type="bibr" rid="B8">2003</xref>, <xref ref-type="bibr" rid="B9">2004</xref>). This model, however, has a very high number of parameters and it is deeply grounded in a specific psychological theoretical framework (the Bayesian probabilistic perspective), which greatly complicates its use for those who do not adhere to such a perspective.</p>
<p>DP and the Griffiths and Tenenbaum model are highly correlated and both account very well for randomness judgments. The aim of the present work is therefore to explore the possibility of modeling randomness judgments with a parsimonious, parameter-based model that is not grounded in any specific psychological framework. To this purpose, we focused on some of the measures of entropy proposed within mathematics, physics, and the information sciences.</p>
</sec>
</sec>
<sec id="s2">
<title>2. Measures of uncertainty: Renyi&#x00027;s entropy and the asymmetric entropy of Marcellin</title>
<p>As we have seen in Section 1.2, information theory provides a normative criterion (the second order entropy) to quantify the uncertainty of the strings of characters employed in experimental psychology.</p>
<p>Moving, however, from Shannon&#x00027;s definition and relaxing some of its assumptions, other generalized versions and families of information entropies have been obtained by authors like R&#x000E9;nyi (<xref ref-type="bibr" rid="B18">1961</xref>), Beck and Cohen (<xref ref-type="bibr" rid="B2">2003</xref>), Tsekouras and Tsallis (<xref ref-type="bibr" rid="B21">2005</xref>), and Marcellin et al. (<xref ref-type="bibr" rid="B16">2006</xref>). Indeed, given a distribution over a set of events <inline-formula><mml:math id="M6"><mml:mi mathvariant="-tex-caligraphic">P</mml:mi></mml:math></inline-formula> &#x0003D; (<italic>p</italic><sub>1</sub>, &#x02026;, <italic>p</italic><sub><italic>N</italic></sub>), information entropy <italic>H</italic> was derived as a measure of the choice involved in the selection of an event (or of the uncertainty about the outcome), by requiring continuity in the probabilities <italic>p</italic><sub><italic>i</italic></sub>, monotonicity in <italic>N</italic> when equiprobability holds, and that, if a choice can be broken down into two successive choices, the original <italic>H</italic> should be the weighted sum of the individual values of <italic>H</italic> (Shannon, <xref ref-type="bibr" rid="B19">1948</xref>). By relaxing, for instance, the third requirement to a less restrictive form of additivity, in which not only weighted sums but more general additive functions are allowed, R&#x000E9;nyi (<xref ref-type="bibr" rid="B18">1961</xref>) obtained the following generalization</p>
<disp-formula id="E5"><label>(5)</label><mml:math id="M7"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">P</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:mi>&#x003B1;</mml:mi></mml:mrow></mml:mfrac><mml:mo class="qopname">log</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:msubsup><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where &#x003B1; is a non-negative real parameter and the scaling factor <inline-formula><mml:math id="M8"><mml:mfrac><mml:mrow><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mtext>&#x000A0;</mml:mtext><mml:mo>-</mml:mo><mml:mtext>&#x000A0;</mml:mtext><mml:mi>&#x003B1;</mml:mi></mml:mrow></mml:mfrac></mml:math></inline-formula> is chosen so that for a uniform distribution <inline-formula><mml:math id="M9"><mml:mi mathvariant="-tex-caligraphic">U</mml:mi></mml:math></inline-formula> it always holds <italic>H</italic><sub>&#x003B1;</sub>(<inline-formula><mml:math id="M10"><mml:mi mathvariant="-tex-caligraphic">U</mml:mi></mml:math></inline-formula>) &#x0003D; log<italic>N</italic>, for all values of &#x003B1;. The previous expression is defined as the Renyi entropy of order &#x003B1; of a distribution <inline-formula><mml:math id="M11"><mml:mi mathvariant="-tex-caligraphic">P</mml:mi></mml:math></inline-formula>, and it is widely used in statistics, biology, and quantum information theory as a measure of entanglement. It is a bounded, continuous, non-increasing and non-negative function of &#x003B1;; it is concave for &#x003B1; &#x02264; 1 and loses concavity above a critical value &#x003B1;<sub><italic>c</italic></sub> which is a function of <italic>N</italic>. In passing, notice also that Renyi&#x00027;s entropy can be given an interpretation in terms of a p-norm on a simplex in <italic>N</italic> dimensions. Most of all, it obeys additivity, meaning that given two independent distributions <inline-formula><mml:math id="M12"><mml:mi mathvariant="-tex-caligraphic">P</mml:mi></mml:math></inline-formula> and <inline-formula><mml:math id="M13"><mml:mi mathvariant="-tex-caligraphic">Q</mml:mi></mml:math></inline-formula>, it holds:</p>
<disp-formula id="E6"><mml:math id="M14"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">P</mml:mi><mml:mo>&#x0002A;</mml:mo><mml:mi mathvariant="-tex-caligraphic">Q</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">P</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">Q</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Interestingly, entropy (5) encompasses several measures of uncertainty such as Hartley&#x00027;s entropy, quadratic entropy, min entropy, and the Shannon entropy. Indeed, changes in the parameter &#x003B1; imply that probabilities are sort of weighted. More in detail, for &#x003B1; &#x0003D; 0 it returns the Hartley (max) entropy <italic>H</italic><sub>0</sub>(<inline-formula><mml:math id="M15"><mml:mi mathvariant="-tex-caligraphic">P</mml:mi></mml:math></inline-formula>) &#x0003D; log<italic>N</italic>, so that lower values of &#x003B1; move toward equiprobability; if instead &#x003B1; &#x0003D; 2 it returns the quadratic (collision) entropy <inline-formula><mml:math id="M16"><mml:mrow><mml:msub><mml:mi>H</mml:mi><mml:mn>2</mml:mn></mml:msub><mml:mo stretchy='false'>(</mml:mo><mml:mi mathvariant="-tex-caligraphic">P</mml:mi><mml:mo stretchy='false'>)</mml:mo><mml:mo>=</mml:mo><mml:mo>&#x02212;</mml:mo><mml:mi>log</mml:mi><mml:mo stretchy='false'>(</mml:mo><mml:mstyle displaystyle='true'><mml:msubsup><mml:mo>&#x02211;</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mi>N</mml:mi></mml:msubsup><mml:mrow><mml:msubsup><mml:mi>p</mml:mi><mml:mi>i</mml:mi><mml:mn>2</mml:mn></mml:msubsup></mml:mrow></mml:mstyle><mml:mo stretchy='false'>)</mml:mo></mml:mrow></mml:math></inline-formula>; while in the limit &#x003B1; &#x02192; &#x0221E; it returns the min entropy <inline-formula><mml:math id="M17"><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x0221E;</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">P</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:munder class="msub"><mml:mrow><mml:mo class="qopname">min</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:munder><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mo>-</mml:mo><mml:mo class="qopname">log</mml:mo><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> so that higher values of &#x003B1; shift the attention toward the event with maximum probability. Finally, in the limit &#x003B1; &#x02192; 1, by means of L&#x00027;Hopital&#x00027;s rule, one can show that Renyi&#x00027;s entropy becomes exactly Shannon&#x00027;s entropy, which is the only limit in which the chain rule (or glomming formula) for conditional probability is satisfied.</p>
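<p>A small numerical sketch of Equation (5) can make these special cases concrete. Natural logarithms and a probability vector summing to one are assumed, and the &#x003B1; &#x02192; 1 case is handled separately as the Shannon limit; the function name is ours.</p>
<preformat>
from math import log

def renyi_entropy(probs, alpha):
    """Renyi entropy of order alpha (Equation 5); alpha = 1 returns the Shannon limit."""
    probs = [p for p in probs if p > 0]
    if alpha == 1:
        return -sum(p * log(p) for p in probs)
    return log(sum(p ** alpha for p in probs)) / (1.0 - alpha)

uniform = [0.25] * 4
print(renyi_entropy(uniform, 0))     # Hartley (max) entropy: log N
print(renyi_entropy(uniform, 2))     # collision entropy; log N again for the uniform case
print(renyi_entropy([0.7, 0.3], 1))  # Shannon entropy in the alpha -> 1 limit
</preformat>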
<p>Alternatively, one might characterize a generic measure of entropy (including Renyi&#x00027;s) as a non-negative, symmetric and strictly concave function, which is also bounded between a minimum (usually zero, attained when there is one <italic>p</italic><sub><italic>k</italic></sub> &#x0003D; 1 while all others <italic>p</italic><sub><italic>i</italic></sub> &#x0003D; 0 for <italic>i</italic> &#x02260; <italic>k</italic>) and a maximum (attained for the uniform distribution). Within several fields like medicine, marketing, and fraud detection, however, two assumptions from the previous set can become critical: namely, the symmetrical behavior with respect to different permutations of the probabilities, and the association of the maximum entropy with the uniform distribution (which is essentially Laplace&#x00027;s principle of indifference). Entropy measures are indeed often employed in learning tasks and, in particular, in growing decision trees, in order to assign a leaf of the tree to a specific class by means of suitable splitting rules. Marcellin et al. (<xref ref-type="bibr" rid="B16">2006</xref>) noticed that in these cases a symmetric measure of uncertainty can be deceiving, since the different classes are not necessarily balanced, meaning that their distribution is not a priori uniform. Moreover, the meaning of detecting a particular class can vary: for example, predicting a wrong disease (a false positive) has different consequences than missing a disease (a false negative), which results in unequal misclassification costs. In order to overcome these limits, an asymmetrical measure of entropy was proposed:</p>
<disp-formula id="E7"><label>(6)</label><mml:math id="M18"><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>H</mml:mi></mml:mrow><mml:mrow><mml:mi mathvariant="-tex-caligraphic">W</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi mathvariant="-tex-caligraphic">P</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:mi>N</mml:mi></mml:mrow></mml:munderover></mml:mstyle><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>1</mml:mn><mml:mo>-</mml:mo><mml:mn>2</mml:mn><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow></mml:mfrac></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where <inline-formula><mml:math id="M19"><mml:mi mathvariant="-tex-caligraphic">W</mml:mi></mml:math></inline-formula> &#x0003D; (<italic>w</italic><sub>1</sub>, &#x02026;, <italic>w</italic><sub><italic>N</italic></sub>) is the worst distribution for which the maximum value is attained. Such a measure of entropy is non-negative, asymmetric (symmetry is restored if <inline-formula><mml:math id="M20"><mml:mi mathvariant="-tex-caligraphic">W</mml:mi></mml:math></inline-formula> is uniform) and it is bounded between zero and a maximum.</p>
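<p>As a sketch, Equation (6) translates directly into code; the worst distribution <italic>W</italic> is passed explicitly, as it is in the fitting procedure of Section 3, and the function name is ours.</p>
<preformat>
def marcellin_entropy(probs, worst):
    """Asymmetric entropy of Marcellin et al. (Equation 6).
    probs: observed distribution (p_1, ..., p_N);
    worst: the distribution W = (w_1, ..., w_N) at which the maximum is attained."""
    return sum(p * (1.0 - p) / ((1.0 - 2.0 * w) * p + w ** 2)
               for p, w in zip(probs, worst))

# With a uniform W the measure is symmetric again and peaks at the uniform distribution;
# in general, each term reaches its maximum (equal to 1) when p_i equals w_i.
print(marcellin_entropy([0.5, 0.5], [0.5, 0.5]))  # 2.0
print(marcellin_entropy([0.7, 0.3], [0.7, 0.3]))  # 2.0
</preformat>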
</sec>
<sec id="s3">
<title>3. Fitting Renyi&#x00027;s and Marcellin&#x00027;s entropies to randomness judgements</title>
<sec>
<title>3.1. Materials and methods</title>
<sec>
<title>3.1.1. Rationale of the study</title>
<p>Given the properties of Marcellin&#x00027;s asymmetric entropy, such a measure might represent a suitable tool to model the overalternating bias. The most notable feature of Marcellin&#x00027;s entropy is that the most uncertain distribution must be estimated from data if it is not available a priori. This feature reflects the idea that an asymmetry of the entropy as a function of the probability of alternation implies that maximum randomness is perceived when the frequencies of alternating and non-alternating digrams are not equal. In our specific case, maximum randomness is expected to be perceived when alternating digrams exceed non-alternating ones. On this basis, we fitted the second order entropy computed with Marcellin&#x00027;s measure, employing four parameters: <italic>w</italic><sub><italic>OO</italic></sub>, <italic>w</italic><sub><italic>XX</italic></sub>, <italic>w</italic><sub><italic>XO</italic></sub>, <italic>w</italic><sub><italic>OX</italic></sub>. The first pair is related to uniform digrams, whereas the second pair is related to alternating digrams. Given that XX and OO should be equivalent from a psychological point of view (as should XO and OX), we constrained the corresponding parameters to be close to each other (see Section 3.2.1, Equations (7) and (8), for further details).</p>
<p>Fitting Marcellin&#x00027;s entropy to data, we expect <italic>w</italic><sub><italic>XO</italic></sub> and <italic>w</italic><sub><italic>OX</italic></sub> to lie between 0.5 and 1, thus maximally contributing to the overall entropy of the sequence when the alternating digrams are more likely to appear. Their contribution to the sequence&#x00027;s entropy should decrease as the probability of alternating digrams approaches 1.0, since such a value corresponds to a completely alternating sequence, such as XOXOXOXO, which does not result in a high subjective randomness. On the contrary, we expect <italic>w</italic><sub><italic>OO</italic></sub> and <italic>w</italic><sub><italic>XX</italic></sub> to lie between 0 and 0.5, so that a low proportion of uniform digrams suggests a high subjective randomness. Similarly to the previous case, the parameters <italic>w</italic><sub><italic>OO</italic></sub> and <italic>w</italic><sub><italic>XX</italic></sub> must be higher than 0, because a complete absence of uniform digrams does not suggest a high subjective randomness (as in the XOXOXOXO string).</p>
<p>We compared the fit of Marcellin&#x00027;s measure of entropy with the second order entropy computed with the Shannon formula (as a reference) and with Renyi&#x00027;s entropy (because it encompasses several measures of uncertainty). We fitted these three measures of entropy to the 10 mean points of subjective ratings observed in Falk and Konold&#x00027;s experiment (Falk and Konold, <xref ref-type="bibr" rid="B6">1997</xref>). Finally, employing the parameters of Marcellin&#x00027;s entropy estimated on these means, we computed the correlations between such a measure and other datasets of randomness judgments (obtained by Gronchi and Sloman, <xref ref-type="bibr" rid="B10">2009</xref>), and compared them with the corresponding correlations of DP and of the Griffiths and Tenenbaum model predictions.</p>
</sec>
<sec>
<title>3.1.2. Target values</title>
<p>We used Falk and Konold&#x00027;s (<xref ref-type="bibr" rid="B6">1997</xref>) results to fit the models. As described before, in that work the authors asked participants to &#x0201C;rate each sequence on a scale of 0 to 10 according to your intuition of how likely it is that such a sequence was obtained by flipping a fair coin.&#x0201D; They employed forty strings of 21 binary symbols (O and X), organized in four alternative sets. Each set was composed of 10 sequences with a probability of alternation ranging from 0.1 to 1 (in intervals of 0.1). Half of the sequences had 11 Xs and 10 Os, and the other half 10 Xs and 11 Os. For each value of the probability of alternation, the mean randomness rating was computed, obtaining a set of 10 points. We employed those values as the target function for the fitting problem.</p>
</sec>
<sec>
<title>3.1.3. Parameter fitting</title>
<p>In this section we present the approach followed to find the parameters of Renyi&#x00027;s and Marcellin&#x00027;s entropy models that best fit the target function. We adopted the Euclidean distance between the target function and the <italic>H</italic><sub>2</sub> of model (2) as a fitness measure to quantify the goodness of a solution. More precisely, we wanted to find (i) an optimal alpha for Renyi&#x00027;s entropy and (ii) an optimal set of weights for Marcellin&#x00027;s entropy. Adapting the parameters to minimize a fitness measure is a classic optimization problem.</p>
<p>In several application domains researchers employ <italic>search</italic> methods, i.e., algorithms that test solutions of the problem until a satisfactory condition is met. These methods are usually adopted because they are &#x0201C;black box&#x0201D; approaches, that is, they do not rely on the formal properties of the quality function. As a consequence, convergence to the optimal solution is not guaranteed, and statistical measures are needed to assess the goodness of the solution.</p>
<p>We used the Differential Evolution (DE) algorithm (Storn and Price, <xref ref-type="bibr" rid="B20">1997</xref>) to solve our problem. DE has recently been used by researchers for several optimization problems because of its performance in unimodal, multimodal, separable, and non-separable problems (Das and Suganthan, <xref ref-type="bibr" rid="B5">2011</xref>). DE is a population-based algorithm, in which a member of the population is a vector that represents the parameters of the model. The size <italic>N</italic> of the population is usually between 2 and 20 times the number of elements of the vector. A large <italic>N</italic> increases the time to compute a new generation, but speeds up the convergence of the algorithm. To balance the two aspects we used <italic>N</italic> &#x0003D; 20. Each member of the population is evaluated via the fitness measure previously described. DE iteratively improves the population by selecting a target member <italic>v</italic><sub><italic>ta</italic></sub> and comparing it with a trial member <italic>v</italic><sub><italic>tr</italic></sub>. The trial is generated in two steps: mutation and crossover. In the mutation, three random vectors <italic>v</italic><sub>1</sub>, <italic>v</italic><sub>2</sub>, <italic>v</italic><sub>3</sub> from the population, excluding the target, are combined into a mutant vector: <italic>v</italic><sub><italic>m</italic></sub> &#x0003D; <italic>v</italic><sub>1</sub> &#x0002B; <italic>F</italic> &#x000B7; (<italic>v</italic><sub>2</sub> &#x02212; <italic>v</italic><sub>3</sub>), where <italic>F</italic> &#x02208; [0, 2] is the <italic>differential weight</italic>. In the crossover, given the <italic>crossover rate CR</italic> &#x02208; [0, 1], each element of the trial member is taken either from the target or from the mutant, with probability 1&#x02212;<italic>CR</italic> and <italic>CR</italic>, respectively. Finally, DE compares the fitness measures of the target and the trial vectors: the one with the better value remains in the next generation and the other is discarded. The interested reader can refer to Cimino et al. (<xref ref-type="bibr" rid="B4">2015</xref>) for further information on the parameterization and the behavior of DE.</p>
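<p>A compact sketch of the DE loop described above is given below. It is not the exact implementation used in this study: it assumes a fitness function to be minimized, box bounds for the parameters, and the common rand/1/bin scheme with differential weight <italic>F</italic> and crossover rate <italic>CR</italic>; mutant components are simply clipped to the bounds.</p>
<preformat>
import random

def differential_evolution(fitness, bounds, pop_size=20, F=0.6, CR=0.9, generations=100):
    """Minimal DE sketch; bounds is a list of (low, high) pairs, one per parameter."""
    pop = [[random.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    for _ in range(generations):
        for t, target in enumerate(pop):
            # Mutation: combine three distinct random vectors, excluding the target
            others = [v for k, v in enumerate(pop) if k != t]
            v1, v2, v3 = random.sample(others, 3)
            mutant = [min(hi, max(lo, a + F * (b - c)))
                      for (a, b, c), (lo, hi) in zip(zip(v1, v2, v3), bounds)]
            # Crossover: take each element from the mutant with probability CR
            trial = [m if random.random() &lt; CR else x for m, x in zip(mutant, target)]
            # Selection: the better of target and trial survives into the next generation
            if fitness(trial) &lt; fitness(target):
                pop[t] = trial
    return min(pop, key=fitness)

# e.g., one bounded parameter (alpha for Renyi) or four bounded weights for Marcellin:
# best_alpha = differential_evolution(renyi_fitness, bounds=[(0.0, 5.0)])
</preformat>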
<p>To identify proper values of <italic>F</italic> and <italic>CR</italic> we ran the algorithm 10 times for 100 generations with the following combinations of parameters: <italic>F</italic> from 0.1 to 2 in steps of 0.1, and <italic>CR</italic> from 0.1 to 0.9 in steps of 0.1. We considered two criteria: (i) whether the algorithm converges to the best fitness and (ii) the average number of generations needed to find the solution. The convergence condition is met when the fitness of at least one member is lower than the best fitness among all trials increased by 1%.</p>
</sec>
</sec>
<sec>
<title>3.2. Results</title>
<sec>
<title>3.2.1. Parameter fitting&#x00027;s results</title>
<p>The best set of <italic>CR</italic> and <italic>F</italic> is 0.9 and 0.6, respectively. With this setting we ran DE 100 times; Table <xref ref-type="table" rid="T1">1</xref> summarizes the results. For Renyi&#x00027;s model optimization, all the runs converged toward the same solution. For Marcellin&#x00027;s model, 95% of the runs converged toward the best solution found among all trials; only 5% converged toward a local minimum. However, the worst solution found by DE with Marcellin&#x00027;s model is still better than the best found with Renyi&#x00027;s model.</p>
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p><bold>Best parameter tuning for different entropy models and their respective best and worst fitness measures (and percentage of convergence) found by DE</bold>.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th valign="top" align="left"><bold>Entropy</bold></th>
<th valign="top" align="left"><bold>Best parameters tuning</bold></th>
<th valign="top" align="center" colspan="2" style="border-bottom: thin solid #000000;"><bold>Fitness measure value (% convergence)</bold></th>
</tr>
<tr>
<th/>
<th/>
<th valign="top" align="center"><bold>Best</bold></th>
<th valign="top" align="center"><bold>Worst</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Shannon</td>
<td valign="top" align="left">&#x02013;</td>
<td valign="top" align="center">0.949</td>
<td valign="top" align="center">x</td>
</tr>
<tr>
<td valign="top" align="left">Renyi</td>
<td valign="top" align="left">&#x003B1; &#x0003D; 2.37</td>
<td valign="top" align="center">0.875 (100%)</td>
<td valign="top" align="center">(0%)</td>
</tr>
<tr>
<td valign="top" align="left">Marcellin</td>
<td valign="top" align="left"><italic>w<sub>OO</sub></italic> &#x0003D; 0.33, <italic>w<sub>XO</sub></italic> &#x0003D; 0.68, <italic>w<sub>OX</sub></italic> &#x0003D; 0.69, <italic>w<sub>XX</sub></italic> &#x0003D; 0.30</td>
<td valign="top" align="center">0.728 (95%)</td>
<td valign="top" align="center">0.755 (5%)</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>As described in Section 3.1.1, Marcellin&#x00027;s model is subject to two constraints: the first binds the weights <italic>w</italic><sub><italic>OO</italic></sub> and <italic>w</italic><sub><italic>XX</italic></sub> to be close to one another, and the second binds <italic>w</italic><sub><italic>XO</italic></sub> to <italic>w</italic><sub><italic>OX</italic></sub>. We implemented the constraints by requiring the relative distance between the weights <italic>w</italic><sub><italic>OO</italic></sub> and <italic>w</italic><sub><italic>XX</italic></sub> to be less than or equal to 0.1, as in Equation (7) [and likewise for the weights <italic>w</italic><sub><italic>OX</italic></sub> and <italic>w</italic><sub><italic>XO</italic></sub> in Equation (8)]. This implementation restricts the search space of DE: trial vectors violating at least one of the constraints are discarded and a new one is generated instead. These constraints reflect the fact that the uniform digram XX should be equivalent to OO from a psychological point of view (as XO should be to OX).</p>
<disp-formula id="E8"><label>(7)</label><mml:math id="M21"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mn>2</mml:mn><mml:mo>&#x000B7;</mml:mo><mml:mfrac><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>O</mml:mi><mml:mi>O</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>X</mml:mi><mml:mi>X</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>O</mml:mi><mml:mi>O</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>X</mml:mi><mml:mi>X</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mo>&#x02264;</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>1</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<disp-formula id="E9"><label>(8)</label><mml:math id="M22"><mml:mtable class="eqnarray" columnalign="right center left"><mml:mtr><mml:mtd><mml:mn>2</mml:mn><mml:mo>&#x000B7;</mml:mo><mml:mfrac><mml:mrow><mml:mo>|</mml:mo><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>O</mml:mi><mml:mi>X</mml:mi></mml:mrow></mml:msub><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>X</mml:mi><mml:mi>O</mml:mi></mml:mrow></mml:msub><mml:mo>|</mml:mo></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>O</mml:mi><mml:mi>X</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>w</mml:mi></mml:mrow><mml:mrow><mml:mi>X</mml:mi><mml:mi>O</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:mfrac><mml:mo>&#x02264;</mml:mo><mml:mn>0</mml:mn><mml:mo>.</mml:mo><mml:mn>1</mml:mn></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>In line with our expectations, we found that <italic>w</italic><sub><italic>OO</italic></sub> and <italic>w</italic><sub><italic>XX</italic></sub> lay between 0 and 0.5, whereas <italic>w</italic><sub><italic>XO</italic></sub> and <italic>w</italic><sub><italic>OX</italic></sub> lay between 0.5 and 1 (Table <xref ref-type="table" rid="T1">1</xref>). Figure <xref ref-type="fig" rid="F2">2</xref> shows the best fit of the empirical data from Falk and Konold (target function, solid line) by the three entropy models. While Shannon&#x00027;s (dashed line) and Renyi&#x00027;s (dotted line) models show a symmetrical curve centered at <italic>P</italic>(<italic>A</italic>) &#x0003D; 0.5, Marcellin&#x00027;s model (solid with circles) shows an asymmetrical curve, with its maximum shifted toward higher values of <italic>P</italic>(<italic>A</italic>), that more closely approximates Falk and Konold&#x00027;s data.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p><bold>The target function (solid) is the empirical data from Falk and Konold</bold>. The entropies are respectively computed by Shannon (dashed), Renyi (dotted), and Marcellin (solid with circles) formulas after parameter fitting.</p></caption>
<graphic xlink:href="fpsyg-07-01027-g0002.tif"/>
</fig>
</sec>
<sec>
<title>3.2.2. Validation of Marcellin&#x00027;s entropy as a measure of subjective randomness</title>
<p>Given the parameters obtained for Marcellin&#x00027;s entropy in the previous section, we computed its Pearson product-moment correlations with the results of other randomness task experiments and with two other subjective randomness scores (DP and the Griffiths and Tenenbaum model). The Griffiths and Tenenbaum parameters were estimated in a separate experiment (Gronchi and Sloman, <xref ref-type="bibr" rid="B10">2009</xref>). The two datasets that we employed for validation were based on a categorization task: participants observed sequences of eight binary elements (Heads and Tails). There are 256 possible sequences of eight elements but, since there are two sequences for each configuration of elements (e.g., TTTTTTTT is equivalent to HHHHHHHH), only half of them (128) were employed. Participants were instructed that they were going to see sequences that had either been produced by a random process (flipping a fair coin) or by some other process in which the sequence of outcomes was not random, and they had to classify these sequences according to what they believed to be their source (random or regular). For each sequence, the authors computed the proportion of participants that classified the string as random (thus obtaining a value between 0 and 1). Experiment A (Gronchi and Sloman, <xref ref-type="bibr" rid="B10">2009</xref>) was conducted without measuring participants&#x00027; reaction times, whereas in Experiment B (Gronchi and Sloman, <xref ref-type="bibr" rid="B10">2009</xref>) participants were required to respond as fast as they could and reaction times were recorded.</p>
<p>The relationship between the percentage of random responses given to each of the 128 sequences of Experiment A (Figure <xref ref-type="fig" rid="F3">3A</xref>) and Experiment B (Figure <xref ref-type="fig" rid="F3">3B</xref>) and their Marcellin&#x00027;s entropy (with fitted parameters) resulted in Pearson&#x00027;s <italic>r</italic> &#x0003D; 0.60 and <italic>r</italic> &#x0003D; 0.67, respectively (Figure <xref ref-type="fig" rid="F3">3</xref>). Correlations were also computed between the percentage of random responses and the other measures of subjective randomness (DP and the Griffiths and Tenenbaum model, Table <xref ref-type="table" rid="T2">2</xref>). Experiment A results were highly correlated with all measures of subjective randomness: we observed correlations of 0.60 for Marcellin, 0.67 for DP, and 0.76 for the Griffiths and Tenenbaum model. With regard to Experiment B, the correlations were 0.67, 0.73, and 0.80 for Marcellin, DP, and Griffiths and Tenenbaum, respectively.</p>
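<p>The validation step reduces to a Pearson product-moment correlation between the proportion of &#x0201C;random&#x0201D; responses and the model score of each sequence; a self-contained sketch with hypothetical placeholder values (the real datasets contain one entry for each of the 128 sequences) is the following.</p>
<preformat>
from math import sqrt

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical placeholders, one value per sequence:
proportion_random = [0.81, 0.35, 0.62]  # share of participants answering "random"
marcellin_scores = [1.92, 0.74, 1.40]   # Marcellin entropy with the fitted parameters
print(pearson_r(proportion_random, marcellin_scores))
</preformat>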
<table-wrap position="float" id="T2">
<label>Table 2</label>
<caption><p><bold>Pearson product-moment correlations of Experiment A and B results with different subjective randomness scores (Marcellin&#x00027;s Entropy, Difficulty Predictor, Griffiths and Tenenbaum&#x00027;s model)</bold>.</p></caption>
<table frame="hsides" rules="groups">
<thead>
<tr>
<th/>
<th valign="top" align="center"><bold>Marcellin</bold></th>
<th valign="top" align="center"><bold>DP</bold></th>
<th valign="top" align="center"><bold>G&#x00026;T</bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Experiment A</td>
<td valign="top" align="center">0.60</td>
<td valign="top" align="center">0.67</td>
<td valign="top" align="center">0.76</td>
</tr>
<tr>
<td valign="top" align="left">Experiment B</td>
<td valign="top" align="center">0.67</td>
<td valign="top" align="center">0.73</td>
<td valign="top" align="center">0.80</td>
</tr>
</tbody>
</table>
</table-wrap>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p><bold>Relationship between the percentage of random responses for the set of 128 sequences of Experiment A (A) and B (B) and their Marcellin&#x00027;s entropy (with fitted parameters, Table <xref ref-type="table" rid="T1">1</xref>)</bold>. Pearson&#x00027;s <italic>r</italic> &#x0003D; 0.60 and <italic>r</italic> &#x0003D; 0.67, respectively.</p></caption>
<graphic xlink:href="fpsyg-07-01027-g0003.tif"/>
</fig>
</sec>
</sec>
</sec>
<sec id="s4">
<title>4. Discussion and conclusions</title>
<p>In this paper we investigated the potential of Marcellin&#x00027;s asymmetric entropy for predicting randomness judgments and the overalternating bias. Fitting Marcellin&#x00027;s entropy to randomness ratings, we observed a better fit compared to subjective randomness measures based on the classical Shannon entropy and on Renyi&#x00027;s entropy, which represents a generalized form of such measures of uncertainty comprising many different kinds of symmetric entropies. The estimated parameters for Marcellin&#x00027;s entropy are consistent with the overalternating bias: an overabundance of alternating substrings corresponds to a high value of subjective randomness (compared to the Shannon entropy criterion, which prescribes an equal proportion of these substrings). In the same way, a lack of uniform substrings indicates a high value of subjective randomness. The fitted weights correspond to frequencies of about 68% for the alternating substrings and 30&#x02013;32% for the uniform ones. Differently from Marcellin&#x00027;s, the other entropy measures are symmetric around the equipartition of the events, so they are unable to account for the overalternating bias.</p>
<p>We validated the asymmetric-entropy-based measure of subjective randomness by correlating it with different datasets and with other subjective randomness scores (DP and the Griffiths and Tenenbaum model). As expected from previous literature (Falk and Konold, <xref ref-type="bibr" rid="B6">1997</xref>; Griffiths and Tenenbaum, <xref ref-type="bibr" rid="B8">2003</xref>, <xref ref-type="bibr" rid="B9">2004</xref>), DP and Griffiths and Tenenbaum&#x00027;s model were highly correlated with the empirical judgments. Although these correlations were higher than those of Marcellin&#x00027;s entropy, the Pearson&#x00027;s <italic>r</italic>-values of the latter are still high, with a minimum value of 0.60. Given the very noisy nature of these experiments, this result confirms the potential of Marcellin&#x00027;s asymmetric entropy for modeling randomness judgments.</p>
<p>Marcellin&#x00027;s entropy may thus represent a viable alternative to DP and to Griffiths and Tenenbaum&#x00027;s measure. As a matter of fact, there are cases in which those measures are of limited use. The simplicity of DP, a parameter-free measure with no underlying theoretical framework, is both its strength and its weakness. On the one hand, DP can easily be computed to quantify subjective randomness without any fitting procedure. On the other hand, DP cannot be used to investigate how different factors affect randomness judgments and the overalternating bias. Moreover, DP is a coherent measure only when computed over strings of the same length, because it is not affected by the length of uniform subsequences of outcomes (such as XXX or XXXXX) in a sequence. Indeed, the subjective randomness of a uniform subsequence decreases as the length of the substring increases.</p>
<p>By contrast, the measure of Griffiths and Tenenbaum (<xref ref-type="bibr" rid="B8">2003</xref>, <xref ref-type="bibr" rid="B9">2004</xref>) has several parameters and is theoretically grounded in the Bayesian probabilistic framework. Its complexity is counterbalanced by the possibility of modeling which kinds of regularities (motif repetition and length, symmetries, duplications) influence randomness judgments. Using the Griffiths and Tenenbaum (<xref ref-type="bibr" rid="B8">2003</xref>, <xref ref-type="bibr" rid="B9">2004</xref>) model, it is therefore possible to investigate how different factors alter subjective randomness and to test hypotheses about randomness perception. For example, employing this model, Hsu et al. (<xref ref-type="bibr" rid="B13">2010</xref>) explored the hypothesis that the regularities detected in two-dimensional binary visual arrays (and the resulting randomness evaluations) are affected by the statistical structure of our visual world. Furthermore, the model of Griffiths and Tenenbaum was conceived by combining the rational statistical inference approach with algorithmic information theory<xref ref-type="fn" rid="fn0001"><sup>1</sup></xref>. Being grounded in well-known mathematical and information-theoretic frameworks, their measure enjoys the advantages of being expressed in formal terms. Significantly, the authors demonstrated that the Bayesian probabilistic modeling approach (which has been shown to account for many psychological phenomena) is also able to address the domain of randomness perception. However, this aspect can also be a limiting factor, because the use of the Bayesian approach in psychology is still controversial (Bowers and Davis, <xref ref-type="bibr" rid="B3">2012</xref>) and there is no unanimously accepted view of its application to modeling cognitive processes.</p>
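<p>For comparison, the log-odds form underlying Griffiths and Tenenbaum&#x00027;s (<xref ref-type="bibr" rid="B8">2003</xref>, <xref ref-type="bibr" rid="B9">2004</xref>) measure can be sketched as follows; the "regular" generative process in this fragment is a hypothetical mixture of two simple Markov chains and is only a stand-in for, not a reproduction of, their motif-based model.</p>
<preformat>
# Illustrative sketch of a Bayesian log-odds randomness score in the spirit of
# Griffiths and Tenenbaum (2003, 2004):
#   randomness(x) = log2 P(x | random) - log2 P(x | regular).
# The "regular" process below is a hypothetical mixture of a repetition-biased and
# an alternation-biased Markov chain, a deliberate simplification of their model.
import math

def p_markov(seq, p_repeat):
    """Probability of seq when each symbol repeats the previous one with
    probability p_repeat (first symbol equiprobable)."""
    p = 0.5
    for prev, cur in zip(seq, seq[1:]):
        p *= p_repeat if cur == prev else 1.0 - p_repeat
    return p

def randomness_score(seq):
    """Log-odds of the fair-coin process against the toy regular process."""
    p_random = 0.5 ** len(seq)
    p_regular = 0.5 * p_markov(seq, 0.8) + 0.5 * p_markov(seq, 0.2)
    return math.log2(p_random) - math.log2(p_regular)

for s in ("XXXXXXXX", "XOXOXOXO", "XOOXXOXO"):
    print(s, round(randomness_score(s), 2))
</preformat>
<p>Under these toy assumptions, perfectly uniform and perfectly alternating strings receive low scores and irregular strings receive higher ones; the full model replaces the mixture with an explicit space of regularities such as motifs, repetitions, symmetries, and duplications.</p>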
<p>Marcellin&#x00027;s entropy as a measure of subjective randomness occupies a middle ground between DP and Griffiths and Tenenbaum&#x00027;s model. Unlike DP, Marcellin&#x00027;s entropy is a parameter-based measure, yet it is simpler and more parsimonious than Griffiths and Tenenbaum&#x00027;s model. It can be employed to quantify how much randomness judgments are distorted toward the overalternating bias, making it possible to investigate how different factors may affect participants&#x00027; responses. However, the greater parsimony of Marcellin&#x00027;s measure means that it cannot assess which kinds of regularities influence the judgments. In sum, Marcellin&#x00027;s entropy is a measure, defined in formal terms and drawn from the current data mining literature, whose parameters appear to suitably characterize the bias in our perception, and it does not require accepting the Bayesian approach as a theoretical reference framework. Such a measure can be considered a suitable alternative to DP and to Griffiths and Tenenbaum&#x00027;s model for quantifying subjective randomness in future psychological studies.</p>
</sec>
<sec id="s5">
<title>Author contributions</title>
<p>GG conceived of the study, participated in the simulations, drafted and discussed the initial manuscript with MR and SN, and corrected and organized the subsequent versions. MR assisted in drafting the initial manuscript and developed the MATLAB scripts for carrying out the initial analysis. SN wrote Section 2 and contributed to the formal analysis of the comparisons among measures. AL implemented the Differential Evolution algorithm and carried out the fitting procedures. AG participated in the simulations and data analyses and reviewed the work. All authors approved the final manuscript as submitted.</p>
</sec>
<sec id="s6">
<title>Funding</title>
<p>This work was supported by the EU Commission (FP7-ICT-2013-10), Proposal No. 611299, SciCafe 2.0.</p>
<sec>
<title>Conflict of interest statement</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</sec>
</body>
<back>
<ack><p>The authors would like to thank Thomas Griffiths for sharing his work (together with Josh Tenenbaum) with us, and Luis Diambra and Andrei Martinez-Finkelshtein for their helpful criticism, which improved the quality of the final manuscript.</p>
</ack>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Attneave</surname> <given-names>F.</given-names></name></person-group> (<year>1959</year>). <source>Applications of Information Theory to Psychology: A Summary of Basic Concepts, Methods, and Results</source>. <publisher-loc>NewYork, NY</publisher-loc>: <publisher-name>Holt-Dryden</publisher-name>.</citation>
</ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Beck</surname> <given-names>C.</given-names></name> <name><surname>Cohen</surname> <given-names>E. G. D.</given-names></name></person-group> (<year>2003</year>). <article-title>Superstatistics</article-title>. <source>Phys. A Stat. Mech. Appl.</source> <volume>322</volume>, <fpage>267</fpage>&#x02013;<lpage>275</lpage>. <pub-id pub-id-type="doi">10.1016/S0378-4371(03)00019-0</pub-id></citation>
</ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bowers</surname> <given-names>J. S.</given-names></name> <name><surname>Davis</surname> <given-names>C. J.</given-names></name></person-group> (<year>2012</year>). <article-title>Bayesian just-so stories in psychology and neuroscience</article-title>. <source>Psychol. Bull.</source> <volume>138</volume>, <fpage>389</fpage>. <pub-id pub-id-type="doi">10.1037/a0026450</pub-id><pub-id pub-id-type="pmid">22545686</pub-id></citation>
</ref>
<ref id="B4">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Cimino</surname> <given-names>M. G. C. A.</given-names></name> <name><surname>Lazzeri</surname> <given-names>A.</given-names></name> <name><surname>Vaglini</surname> <given-names>G.</given-names></name></person-group> (<year>2015</year>). <article-title>Improving the analysis of context-aware information via marker-based stigmergy and differential evolution</article-title>, in <source>14th International Conference on Artificial Intelligence and Soft Computing (ICAISC 2015)</source>, eds <person-group person-group-type="editor"><name><surname>Rutkowski</surname> <given-names>L.</given-names></name> <name><surname>Korytkowski</surname> <given-names>M.</given-names></name> <name><surname>Scherer</surname> <given-names>R.</given-names></name> <name><surname>Tadeusiewicz</surname> <given-names>R.</given-names></name> <name><surname>Zadeh</surname> <given-names>L. A.</given-names></name> <name><surname>Zurada</surname> <given-names>J.</given-names></name></person-group> (<publisher-loc>Zakopane</publisher-loc>: <publisher-name>Springer International Publishing</publisher-name>), <fpage>341</fpage>&#x02013;<lpage>352</lpage>.</citation>
</ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Das</surname> <given-names>S.</given-names></name> <name><surname>Suganthan</surname> <given-names>P. N.</given-names></name></person-group> (<year>2011</year>). <article-title>Differential evolution: a survey of the state-of-the-art</article-title>. <source>IEEE Trans. Evol. Comput.</source> <volume>15</volume>, <fpage>4</fpage>&#x02013;<lpage>31</lpage>. <pub-id pub-id-type="doi">10.1109/TEVC.2010.2059031</pub-id></citation>
</ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Falk</surname> <given-names>R.</given-names></name> <name><surname>Konold</surname> <given-names>C.</given-names></name></person-group> (<year>1997</year>). <article-title>Making sense of randomness: implicit encoding as a basis for judgment</article-title>. <source>Psychol. Rev.</source> <volume>104</volume>, <fpage>301</fpage>&#x02013;<lpage>318</lpage>. <pub-id pub-id-type="doi">10.1037/0033-295X.104.2.301</pub-id></citation>
</ref>
<ref id="B7">
<citation citation-type="other"><person-group person-group-type="author"><name><surname>Griffiths</surname> <given-names>T. L.</given-names></name></person-group> (<year>2004</year>). <source>Causes, Coincidences, and Theories</source>. Ph.D. thesis, Stanford University.</citation>
</ref>
<ref id="B8">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Griffiths</surname> <given-names>T. L.</given-names></name> <name><surname>Tenenbaum</surname> <given-names>J. B.</given-names></name></person-group> (<year>2003</year>). <article-title>Probability, algorithmic complexity, and subjective randomness</article-title>, in <source>25th Annual Conference of the Cognitive Science Society</source>, eds <person-group person-group-type="editor"><name><surname>Altman</surname> <given-names>R.</given-names></name> <name><surname>Kirsh</surname> <given-names>D.</given-names></name></person-group> (<publisher-loc>Boston, MA</publisher-loc>: <publisher-name>Cognitive Science Society</publisher-name>), <fpage>480</fpage>&#x02013;<lpage>485</lpage>.</citation>
</ref>
<ref id="B9">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Griffiths</surname> <given-names>T. L.</given-names></name> <name><surname>Tenenbaum</surname> <given-names>J. B.</given-names></name></person-group> (<year>2004</year>). <article-title>From algorithmic to subjective randomness</article-title>, in <source>Advances in Neural Information Processing Systems</source>, <volume>Vol. 16</volume>, eds <person-group person-group-type="editor"><name><surname>Thrun</surname> <given-names>S.</given-names></name> <name><surname>Lawrence</surname> <given-names>K. S.</given-names></name> <name><surname>Sch&#x000F6;lkopf</surname> <given-names>B.</given-names></name></person-group> (<publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>MIT Press</publisher-name>), <fpage>953</fpage>&#x02013;<lpage>960</lpage>.</citation>
</ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gronchi</surname> <given-names>G.</given-names></name> <name><surname>Sloman</surname> <given-names>S. A.</given-names></name></person-group> (<year>2009</year>). <article-title>Using reaction times to compare two models of randomness perception</article-title>, in <source>31st Annual Conference of the Cognitive Science Society</source>, eds <person-group person-group-type="editor"><name><surname>Taatgen</surname> <given-names>N.</given-names></name> <name><surname>van Rijn</surname> <given-names>H.</given-names></name></person-group> (Cognitive Science Society), <fpage>993</fpage>.</citation>
</ref>
<ref id="B11">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Halley</surname> <given-names>E.</given-names></name></person-group> (<year>1752</year>). <source>Astronomical Tables with Precepts both in English and in Latin for Computing the Places of the Sun, Moon, Planets, and Comets</source>. <publisher-loc>London</publisher-loc>: <publisher-name>W. Innys</publisher-name>.</citation>
</ref>
<ref id="B12">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Hastie</surname> <given-names>R.</given-names></name> <name><surname>Dawes</surname> <given-names>R. M.</given-names></name></person-group> (<year>2010</year>). <source>Rational Choice in an Uncertain World: The Psychology of Judgment and Decision Making</source>. <publisher-loc>Thousands Oaks, CA</publisher-loc>: <publisher-name>Sage</publisher-name>.</citation>
</ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Hsu</surname> <given-names>A. S.</given-names></name> <name><surname>Griffiths</surname> <given-names>T. L.</given-names></name> <name><surname>Schreiber</surname> <given-names>E.</given-names></name></person-group> (<year>2010</year>). <article-title>Subjective randomness and natural scene statistics</article-title>. <source>Psychon. Bull. Rev.</source> <volume>17</volume>, <fpage>624</fpage>&#x02013;<lpage>629</lpage>. <pub-id pub-id-type="doi">10.3758/PBR.17.5.624</pub-id><pub-id pub-id-type="pmid">21037158</pub-id></citation>
</ref>
<ref id="B14">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Li</surname> <given-names>M.</given-names></name> <name><surname>Vit&#x000E1;nyi</surname> <given-names>P.</given-names></name></person-group> (<year>1997</year>). <source>An Introduction to Kolmogorov Complexity and its Applications</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Springer Heidelberg</publisher-name>.</citation>
</ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Lopes</surname> <given-names>L. L.</given-names></name></person-group> (<year>1982</year>). <article-title>Doing the impossible: a note on induction and the experience of randomness</article-title>. <source>J. Exp. Psychol. Learn. Mem. Cogn.</source> <volume>8</volume>, <fpage>626</fpage>&#x02013;<lpage>636</lpage>.</citation>
</ref>
<ref id="B16">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Marcellin</surname> <given-names>S.</given-names></name> <name><surname>Zighed</surname> <given-names>D. A.</given-names></name> <name><surname>Ritschard</surname> <given-names>G.</given-names></name></person-group> (<year>2006</year>). <article-title>An asymmetric entropy measure for decision trees</article-title>, in <source>The 11th International Conference on Information Processing and Management of Uncertainty (IPMU)</source>, eds <person-group person-group-type="editor"><name><surname>Bouchon-Meunier</surname> <given-names>B.</given-names></name> <name><surname>Yager</surname> <given-names>R. R.</given-names></name></person-group> (<publisher-loc>Paris</publisher-loc>: <publisher-name>Editions EDK</publisher-name>), <fpage>1292</fpage>&#x02013;<lpage>1299</lpage>.</citation>
</ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Nickerson</surname> <given-names>R. S.</given-names></name></person-group> (<year>2002</year>). <article-title>The production and perception of randomness</article-title>. <source>Psychol. Rev.</source> <volume>109</volume>, <fpage>330</fpage>&#x02013;<lpage>357</lpage>. <pub-id pub-id-type="doi">10.1016/0196-8858(91)90029-I</pub-id><pub-id pub-id-type="pmid">11990321</pub-id></citation>
</ref>
<ref id="B18">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>R&#x000E9;nyi</surname> <given-names>A.</given-names></name></person-group> (<year>1961</year>). <article-title>On measures of entropy and information</article-title>, in <source>4th Berkeley Symposium on Mathematics, Statistics and Probability, Vol. 1: Contributions to the Theory of Statistics</source>, ed <person-group person-group-type="editor"><name><surname>Neymanpages</surname> <given-names>J.</given-names></name></person-group> (<publisher-loc>Berkeley, CA</publisher-loc>: <publisher-name>University of California Press</publisher-name>), <fpage>547</fpage>&#x02013;<lpage>561</lpage>.</citation>
</ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shannon</surname> <given-names>C. E.</given-names></name></person-group> (<year>1948</year>). <article-title>A mathematical theory of communication</article-title>. <source>Bell Syst. Tech. J.</source> <volume>27</volume>, <fpage>379</fpage>&#x02013;<lpage>423</lpage>.</citation>
</ref>
<ref id="B20">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Storn</surname> <given-names>R.</given-names></name> <name><surname>Price</surname> <given-names>K.</given-names></name></person-group> (<year>1997</year>). <article-title>Differential evolution&#x02013;a simple and efficient heuristic for global optimization over continuous spaces</article-title>. <source>J. Glob. Optimization</source> <volume>11</volume>, <fpage>341</fpage>&#x02013;<lpage>359</lpage>.</citation>
</ref>
<ref id="B21">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tsekouras</surname> <given-names>G. A.</given-names></name> <name><surname>Tsallis</surname> <given-names>C.</given-names></name></person-group> (<year>2005</year>). <article-title>Generalized entropy arising from a distribution of q indices</article-title>. <source>Phys. Rev. E</source> <volume>71</volume>:<fpage>046144</fpage>. <pub-id pub-id-type="doi">10.1103/PhysRevE.71.046144</pub-id><pub-id pub-id-type="pmid">15903763</pub-id></citation>
</ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Volchan</surname> <given-names>S. B.</given-names></name></person-group> (<year>2002</year>). <article-title>What is a random sequence?</article-title> <source>Am. Math. Mon.</source> <volume>109</volume>, <fpage>46</fpage>&#x02013;<lpage>63</lpage>. <pub-id pub-id-type="doi">10.2307/2695767</pub-id><pub-id pub-id-type="pmid">22233803</pub-id></citation>
</ref>
</ref-list>
<fn-group>
<fn id="fn0001"><p><sup>1</sup>Within algorithmic information theory, the concept of Kolmogorov&#x00027;s complexity has been used to mathematically define the randomness of sequences of outcomes (Li and Vit&#x000E1;nyi, <xref ref-type="bibr" rid="B14">1997</xref>).</p></fn>
</fn-group>
</back>
</article>