<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink" article-type="research-article">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Neurorobot.</journal-id>
<journal-title>Frontiers in Neurorobotics</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Neurorobot.</abbrev-journal-title>
<issn pub-type="epub">1662-5218</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="doi">10.3389/fnbot.2021.648527</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Neuroscience</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>A Dynamical Generative Model of Social Interactions</article-title>
</title-group>
<contrib-group>
<contrib contrib-type="author">
<name><surname>Salatiello</surname> <given-names>Alessandro</given-names></name>
<xref ref-type="author-notes" rid="fn001"><sup>&#x02020;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/1173795/overview"/>
</contrib>
<contrib contrib-type="author">
<name><surname>Hovaidi-Ardestani</surname> <given-names>Mohammad</given-names></name>
<xref ref-type="author-notes" rid="fn001"><sup>&#x02020;</sup></xref>
</contrib>
<contrib contrib-type="author" corresp="yes">
<name><surname>Giese</surname> <given-names>Martin A.</given-names></name>
<xref ref-type="corresp" rid="c001"><sup>&#x0002A;</sup></xref>
<uri xlink:href="http://loop.frontiersin.org/people/11058/overview"/>
</contrib>
</contrib-group>
<aff><institution>Section for Computational Sensomotorics, Department of Cognitive Neurology, Centre for Integrative Neuroscience, Hertie Institute for Clinical Brain Research, University Clinic T&#x000FC;bingen</institution>, <addr-line>T&#x000FC;bingen</addr-line>, <country>Germany</country></aff>
<author-notes>
<fn fn-type="edited-by"><p>Edited by: Letizia Marchegiani, Aalborg University, Denmark</p></fn>
<fn fn-type="edited-by"><p>Reviewed by: Ashley Liddiard, Ford Motor Company, United States; Bin Zhi Li, Chongqing Institute of Green and Intelligent Technology (CAS), China</p></fn>
<corresp id="c001">&#x0002A;Correspondence: Martin A. Giese <email>martin.giese&#x00040;uni-tuebingen.de</email></corresp>
<fn fn-type="other" id="fn001"><p>&#x02020;These authors have contributed equally to this work</p></fn></author-notes>
<pub-date pub-type="epub">
<day>09</day>
<month>06</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>15</volume>
<elocation-id>648527</elocation-id>
<history>
<date date-type="received">
<day>31</day>
<month>12</month>
<year>2020</year>
</date>
<date date-type="accepted">
<day>23</day>
<month>04</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#x000A9; 2021 Salatiello, Hovaidi-Ardestani and Giese.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Salatiello, Hovaidi-Ardestani and Giese</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/"><p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.</p></license></permissions>
<abstract><p>The ability to make accurate social inferences enables humans to navigate and act in their social environment effortlessly. Converging evidence shows that motion is one of the most informative cues in shaping the perception of social interactions. However, the scarcity of parameterized generative models for the generation of highly-controlled stimuli has slowed down both the identification of the most critical motion features and the understanding of the computational mechanisms underlying their extraction and processing from rich visual inputs. In this work, we introduce a novel generative model for the automatic generation of an arbitrarily large number of videos of socially interacting agents for comprehensive studies of social perception. The proposed framework, validated with three psychophysical experiments, allows generating as many as 15 distinct interaction classes. The model builds on classical dynamical system models of biological navigation and is able to generate visual stimuli that are parametrically controlled and representative of a heterogeneous set of social interaction classes. The proposed method thus represents an important tool for experiments aimed at unveiling the computational mechanisms mediating the perception of social interactions. The ability to generate highly-controlled stimuli makes the model valuable not only to conduct behavioral and neuroimaging studies, but also to develop and validate neural models of social inference, and machine vision systems for the automatic recognition of social interactions. In fact, contrasting human and model responses to a heterogeneous set of highly-controlled stimuli can help to identify critical computational steps in the processing of social interaction stimuli.</p></abstract>
<kwd-group>
<kwd>social interactions</kwd>
<kwd>generative model</kwd>
<kwd>motion cues</kwd>
<kwd>social perception</kwd>
<kwd>social inference</kwd>
</kwd-group>
<counts>
<fig-count count="6"/>
<table-count count="2"/>
<equation-count count="7"/>
<ref-count count="68"/>
<page-count count="13"/>
<word-count count="8988"/>
</counts>
</article-meta>
</front>
<body>
<sec sec-type="intro" id="s1">
<title>1. Introduction</title>
<p>Human and non-human primates are able to recognize the social interactions taking place in their environment quickly and effortlessly: with a few glances out of the window, we can easily understand whether two people are following each other, avoiding each other, fighting, or are engaging in some other form of social behavior. Notably, such interactive behaviors can be recognized even when the available visual information is poor: for example, when the scene we are watching is unfolding behind the leaves of a tree, at a considerable distance from us, or in a low-resolution video. In some of these situations, critical visual cues such as facial expressions might be completely occluded, yet our ability to make social inferences is largely unaffected. Such perceptual ability is instrumental in allowing us to move in our social environment and flexibly interact with it, while abiding by social norms (Troje et al., <xref ref-type="bibr" rid="B63">2013</xref>). Therefore, it constitutes an important social skill that is worth characterizing and modeling, also for the development of social robots.</p>
<p>Understanding the neural mechanisms underlying the inference of animacy and social interactions from visual inputs is a long-standing research challenge (Heider and Simmel, <xref ref-type="bibr" rid="B22">1944</xref>; Michotte, <xref ref-type="bibr" rid="B31">1946</xref>; Scholl and Tremoulet, <xref ref-type="bibr" rid="B44">2000</xref>; Troje et al., <xref ref-type="bibr" rid="B63">2013</xref>). Recent work has started identifying some of the responsible neural circuits (Castelli et al., <xref ref-type="bibr" rid="B10">2000</xref>; Isik et al., <xref ref-type="bibr" rid="B23">2017</xref>; Sliwa and Freiwald, <xref ref-type="bibr" rid="B55">2017</xref>; Walbrin et al., <xref ref-type="bibr" rid="B66">2018</xref>; Freiwald, <xref ref-type="bibr" rid="B14">2020</xref>). Even though the detailed computational mechanisms mediating the formation of social percepts from visual inputs remain largely unknown, converging evidence has shown that the observation of biological motion alone is enough for humans to make accurate social inferences (e.g., Heider and Simmel, <xref ref-type="bibr" rid="B22">1944</xref>; Tremoulet and Feldman, <xref ref-type="bibr" rid="B61">2000</xref>; McAleer and Pollick, <xref ref-type="bibr" rid="B30">2008</xref>; Roether et al., <xref ref-type="bibr" rid="B40">2009</xref>). For example, Heider and Simmel (<xref ref-type="bibr" rid="B22">1944</xref>) demonstrated that humans can reliably decode animacy and social interactions from strongly impoverished stimuli consisting of simple geometrical figures moving around in the two-dimensional plane. Remarkably, despite their highly abstract nature, the visual stimuli used in this study were perceived as <italic>alive</italic> and sometimes even <italic>anthropomorphic</italic>: the agents were often considered as endowed with intentions, emotions, and even personality traits.</p>
<p>Several subsequent studies (e.g., Oatley and Yuill, <xref ref-type="bibr" rid="B33">1985</xref>; Rim&#x000E9; et al., <xref ref-type="bibr" rid="B38">1985</xref>; Springer et al., <xref ref-type="bibr" rid="B56">1996</xref>; Castelli et al., <xref ref-type="bibr" rid="B10">2000</xref>, <xref ref-type="bibr" rid="B9">2002</xref>) replicated these findings using similar stimuli and showed that the inference of social interactions from impoverished stimuli is a cross-cultural phenomenon (Rim&#x000E9; et al., <xref ref-type="bibr" rid="B38">1985</xref>) that is present even in 5-year-old preschoolers (Springer et al., <xref ref-type="bibr" rid="B56">1996</xref>). Taken together, these findings support the view that the perception of animacy and social interactions might rely on some innate and automatic processing of low-level kinematic features present in the visual inputs, rather than on higher-level cognitive processing (Scholl and Gao, <xref ref-type="bibr" rid="B43">2013</xref>).</p>
<p>The identification of the most critical visual features that shape these social percepts has also received great attention (Tremoulet and Feldman, <xref ref-type="bibr" rid="B61">2000</xref>, <xref ref-type="bibr" rid="B62">2006</xref>). For example, influential work suggested that these percepts are mediated by the detection of apparent violations of the principle of conservation of energy (Dittrich and Lea, <xref ref-type="bibr" rid="B12">1994</xref>; Gelman et al., <xref ref-type="bibr" rid="B18">1995</xref>; Csibra, <xref ref-type="bibr" rid="B11">2008</xref>; Kaduk et al., <xref ref-type="bibr" rid="B24">2013</xref>). Later research showed that an agent&#x00027;s orientation, velocity, and acceleration also play a major role (Szego and Rutherford, <xref ref-type="bibr" rid="B58">2008</xref>; Tr&#x000E4;uble et al., <xref ref-type="bibr" rid="B60">2014</xref>). At the same time, neuroimaging work has shed light on some of the brain regions mediating these phenomena: the right posterior superior temporal sulcus (pSTS&#x02014;Isik et al., <xref ref-type="bibr" rid="B23">2017</xref>; Walbrin et al., <xref ref-type="bibr" rid="B66">2018</xref>), the medial prefrontal cortex (mPFC&#x02014;Castelli et al., <xref ref-type="bibr" rid="B10">2000</xref>; Sliwa and Freiwald, <xref ref-type="bibr" rid="B55">2017</xref>), and the right temporoparietal junction (TPJ&#x02014;Castelli et al., <xref ref-type="bibr" rid="B10">2000</xref>; Saxe and Kanwisher, <xref ref-type="bibr" rid="B42">2003</xref>) are among the brain regions most frequently reported as being involved in the perception of social interaction. Interestingly, Schultz and B&#x000FC;lthoff (<xref ref-type="bibr" rid="B48">2019</xref>) recently identified another region&#x02014;the right intraparietal sulcus (IPS)&#x02014;that seems to be exclusively engaged during the perception of animacy.</p>
<p>Clearly, the success of both behavioral and neuroimaging social perception studies is tightly linked to the ability to finely control the visual stimuli that participants are exposed to. Specifically, such stimuli should ideally be generated through a process that allows complete parametric control, the creation of a high number of replicates with sufficient variety, and the gradual reduction of complexity. <italic>Parametric control</italic> (e.g., over agents&#x00027; speed) facilitates the identification of brain regions and individual neurons whose activation covaries with the kinematic features of agents&#x00027; behavior. <italic>Variety</italic> in classes of social interaction allows the characterization of the class-specific and general response properties of such brain regions. <italic>Numerosity</italic> allows averaging out response properties that are independent of social interaction processing. Finally, the ability to control stimulus complexity allows the generation of <italic>impoverished stimuli</italic> that are fundamental to minimize the impact of confounding factors, inevitably present, for example, in real videos. Similarly, such properties are also desirable when designing and validating neural and mechanistic models of human social perception: contrasting human and model responses to a variety of highly controlled stimuli can help separate the computational mechanisms that the models capture well from those that need further refinement. This is especially critical for state-of-the-art deep learning models (e.g., Yamins et al., <xref ref-type="bibr" rid="B68">2014</xref>), which can easily have millions of parameters and be prone to over-fitting.</p>
<p>Currently, no well-established method can generate visual stimuli for the analysis of social perception that satisfy all of the above conditions. Because of this, researchers often have to resort to time-consuming, class-specific, heuristic procedures. One creative approach to this problem was adopted by Gordon and Roemmele (<xref ref-type="bibr" rid="B20">2014</xref>), who assigned the task of generating videos to a set of participants, each asked to create their own videos of socially interacting geometrical shapes and to label them accordingly. However, researchers typically use visual stimuli where agents&#x00027; trajectories are hand-crafted or hard-coded (e.g., Heider and Simmel, <xref ref-type="bibr" rid="B22">1944</xref>; Oatley and Yuill, <xref ref-type="bibr" rid="B33">1985</xref>; Rim&#x000E9; et al., <xref ref-type="bibr" rid="B38">1985</xref>; Springer et al., <xref ref-type="bibr" rid="B56">1996</xref>; Castelli et al., <xref ref-type="bibr" rid="B10">2000</xref>, <xref ref-type="bibr" rid="B9">2002</xref>; Baker et al., <xref ref-type="bibr" rid="B1">2009</xref>; Gao et al., <xref ref-type="bibr" rid="B17">2009</xref>, <xref ref-type="bibr" rid="B16">2010</xref>; Kaduk et al., <xref ref-type="bibr" rid="B24">2013</xref>; Tr&#x000E4;uble et al., <xref ref-type="bibr" rid="B60">2014</xref>; Isik et al., <xref ref-type="bibr" rid="B23">2017</xref>; van Buren et al., <xref ref-type="bibr" rid="B64">2017</xref>; Walbrin et al., <xref ref-type="bibr" rid="B66">2018</xref>), based on rules (e.g., Kerr and Cohen, <xref ref-type="bibr" rid="B26">2010</xref>; Pantelis et al., <xref ref-type="bibr" rid="B34">2014</xref>), or derived from real videos (e.g., McAleer and Pollick, <xref ref-type="bibr" rid="B30">2008</xref>; McAleer et al., <xref ref-type="bibr" rid="B29">2011</xref>; Thurman and Lu, <xref ref-type="bibr" rid="B59">2014</xref>; Sliwa and Freiwald, <xref ref-type="bibr" rid="B55">2017</xref>; Shu et al., <xref ref-type="bibr" rid="B53">2018</xref>). All of these approaches suffer from significant limitations. Hand-crafted trajectories need to be generated <italic>de novo</italic> for each experimental condition and are not easily amenable to parametric control. Likewise, the extraction of trajectories from real videos comes with its own burdens: real videos need to be recorded, labeled, and heavily processed to remove unwanted background information. Rule-based approaches offer an interesting alternative. However, it is generally difficult to define natural classes of social interactions using rules akin to those used in Kerr and Cohen (<xref ref-type="bibr" rid="B26">2010</xref>) and Pantelis et al. (<xref ref-type="bibr" rid="B34">2014</xref>). Recent work (Schultz and B&#x000FC;lthoff, <xref ref-type="bibr" rid="B48">2019</xref>; Shu et al., <xref ref-type="bibr" rid="B54">2019</xref>, <xref ref-type="bibr" rid="B52">2020</xref>) has generated visual stimuli using model-based methods; however, these models can only generate limited and generic classes of social interaction (namely, cooperative and obstructive behaviors). Finally, the specialized literature on the collective behavior of humans and animals has produced a wealth of influential models (Blackwell, <xref ref-type="bibr" rid="B6">1997</xref>; Paris et al., <xref ref-type="bibr" rid="B35">2007</xref>; Luo et al., <xref ref-type="bibr" rid="B28">2008</xref>; Russell et al., <xref ref-type="bibr" rid="B41">2017</xref>); however, such models also typically account only for simple behaviors (e.g., feeding, resting, and traveling) and basic interactions (e.g., avoidance and following).</p>
<p>To overcome the limitations of the above methods, in this work, we introduce a dynamical generative model of social interactions. In stark contrast to previous work, our model is able to automatically generate an arbitrary number of parameterized motion trajectories to animate virtual agents with 15 distinct interactive motion styles; the modeled trajectories include the six fundamental interaction categories frequently used in psychophysical experiments (i.e., <italic>Chasing, Fighting, Flirting, Following, Guarding</italic>, and <italic>Playing</italic>&#x02014;Blythe et al. <xref ref-type="bibr" rid="B7">1999</xref>; Barrett et al. <xref ref-type="bibr" rid="B2">2005</xref>; McAleer and Pollick <xref ref-type="bibr" rid="B30">2008</xref>) and nine relevant others. The model controls <italic>speed</italic> and <italic>motion direction</italic>, arguably the two most critical determinants of social interaction perception (Tremoulet and Feldman, <xref ref-type="bibr" rid="B61">2000</xref>; Szego and Rutherford, <xref ref-type="bibr" rid="B58">2008</xref>; Tr&#x000E4;uble et al., <xref ref-type="bibr" rid="B60">2014</xref>). Finally, we validated the model with three psychophysical experiments, which demonstrate that participants are able to consistently attribute the intended interaction classes to the animations generated with our model.</p>
<p>The rest of the paper is organized as follows. In section 2, we describe the generative model and the experiments we conducted to validate it. Next, in section 3, we summarize the experimental results. Finally, in section 4, we (1) explain how our results validate the developed model, (2) compare the model to related work, and (3) discuss the main limitations of our model and future directions.</p>
</sec>
<sec sec-type="methods" id="s2">
<title>2. Methods</title>
<sec>
<title>2.1. Related Modeling Work</title>
<p>The generative model we introduce in this work builds on classical models of biological and robotic navigation. In the classical work by Reichardt and Poggio (<xref ref-type="bibr" rid="B36">1976</xref>), the authors proposed a dynamical model to describe the navigation behavior of flies intent on chasing moving targets as part of their mating behavior. The core idea was to consider the moving targets as <italic>attractors</italic> of the dynamical system describing the flies&#x00027; trajectories. Subsequently, Sch&#x000F6;ner and Dose (<xref ref-type="bibr" rid="B46">1992</xref>) and Sch&#x000F6;ner et al. (<xref ref-type="bibr" rid="B47">1995</xref>) used a similar approach to develop a biomimetic control system for the navigation of autonomous robots. Critically, such a system was also able to deal with the presence of obstacles in the environment, which were modeled as <italic>repellors</italic>. Extending this system, Fajen and Warren (<xref ref-type="bibr" rid="B13">2003</xref>) built a model of human navigation that was able to closely capture the trajectories described by their participants as they walked naturally toward targets while avoiding obstacles on their way. Specifically, this model was able to describe the dynamics of the participants&#x00027; average heading direction very accurately; however, their speed was roughly approximated as constant.</p>
<p>Alternative approaches can characterize richer navigation behaviors by jointly modeling both heading direction and speed dynamics. This idea was successfully used to control the motion of both autonomous vehicles (Bicho and Sch&#x000F6;ner, <xref ref-type="bibr" rid="B5">1997</xref>; Bicho et al., <xref ref-type="bibr" rid="B4">2000</xref>) and robotic arms (Reimann et al., <xref ref-type="bibr" rid="B37">2011</xref>). Similar approaches have also been used in computer graphics to model the navigation of articulated agents (Mukovskiy et al., <xref ref-type="bibr" rid="B32">2013</xref>).</p>
</sec>
<sec>
<title>2.2. The Generative Model</title>
<p>To model the interactive behavior of two virtual agents, we define, for each agent <italic>i</italic>, a dynamical system of two nonlinear differential equations. Specifically, the equations describe the dynamics of the agent&#x00027;s heading direction &#x003D5;<sub><italic>i</italic></sub>(<italic>t</italic>) and instantaneous propagation speed <italic>s</italic><sub><italic>i</italic></sub>(<italic>t</italic>).</p>
<p>The heading direction dynamics, derived from Fajen and Warren (<xref ref-type="bibr" rid="B13">2003</xref>), are defined by:</p>
<disp-formula id="E1"><label>(1)</label><mml:math id="M1"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003D5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x000A8;</mml:mo></mml:mover><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:mi>b</mml:mi><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003D5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x002D9;</mml:mo></mml:mover><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mi>A</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003D5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C8;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:mi>R</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003D5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mstyle 
mathvariant="bold-italic"><mml:mi>&#x003C8;</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>In this equation, <inline-formula><mml:math id="M2"><mml:mi>A</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003D5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C8;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> defines the <italic>attraction</italic> of agent <italic>i</italic> to the goal <italic>g</italic> located along the direction <inline-formula><mml:math id="M3"><mml:msubsup><mml:mrow><mml:mi>&#x003C8;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>, at a distance <inline-formula><mml:math id="M4"><mml:msubsup><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> from it. 
Similarly, <inline-formula><mml:math id="M5"><mml:mi>R</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003D5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>&#x003C8;</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>o</mml:mi></mml:mstyle></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> defines the <italic>repulsion</italic> of agent <italic>i</italic> for the obstacles <inline-formula><mml:math id="M6"><mml:mstyle mathvariant="bold-italic"><mml:mi>o</mml:mi></mml:mstyle><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mrow><mml:mo stretchy="false">[</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub><mml:mo>,</mml:mo><mml:mo>.</mml:mo><mml:mo>.</mml:mo><mml:mo>.</mml:mo><mml:mo>,</mml:mo><mml:msub><mml:mrow><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi><mml:mi>b</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">]</mml:mo></mml:mrow></mml:mrow><mml:mrow><mml:mi>T</mml:mi></mml:mrow></mml:msup></mml:math></inline-formula> located along the directions <inline-formula><mml:math id="M7"><mml:msubsup><mml:mrow><mml:mstyle 
mathvariant="bold-italic"><mml:mi>&#x003C8;</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>o</mml:mi></mml:mstyle></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>, at a distance <inline-formula><mml:math id="M8"><mml:msubsup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>d</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mstyle mathvariant="bold"><mml:mstyle mathvariant="bold-italic"><mml:mi>o</mml:mi></mml:mstyle></mml:mstyle></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula> from it. These two functions are given by:</p>
<disp-formula id="E2"><label>(2)</label><mml:math id="M9"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mtable columnalign="left"><mml:mtr><mml:mtd><mml:mi>A</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003D5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C8;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:msup><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003D5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C8;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo 
stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mn>1</mml:mn></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msup><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mi>R</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003D5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mstyle mathvariant="bold-italic"><mml:mi>&#x003C8;</mml:mi></mml:mstyle></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:msup><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi></mml:mrow></mml:msup><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" 
accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>n</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>o</mml:mi><mml:mi>b</mml:mi><mml:mi>s</mml:mi><mml:mi>t</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munderover></mml:mstyle><mml:msup><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003D5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>The contributions of the individual obstacles to the repulsion function are given by:</p>
<disp-formula id="E3"><label>(3)</label><mml:math id="M10"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msup><mml:mrow><mml:mi>r</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003D5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>&#x003D5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C8;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mn>3</mml:mn></mml:mrow></mml:msub><mml:mo stretchy="false">|</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003D5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo 
stretchy="false">)</mml:mo></mml:mrow><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C8;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo stretchy="false">|</mml:mo></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msup><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mn>4</mml:mn></mml:mrow></mml:msub><mml:msubsup><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>In these equations, <italic>k</italic><sup><italic>j</italic></sup> and <italic>c</italic><sub><italic>j</italic></sub> are constants; <italic>o</italic><sub><italic>n</italic></sub> indicates the <italic>nth</italic> obstacle. Note that, in general, <inline-formula><mml:math id="M11"><mml:msubsup><mml:mrow><mml:mi>&#x003C8;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>o</mml:mi></mml:mrow><mml:mrow><mml:mi>n</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>, which is the direction of the <italic>nth</italic> obstacle of the <italic>ith</italic> agent, is time-dependent; for example, depending on the specific social interaction class, it might be a function of the instantaneous heading direction of other agents.</p>
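The repulsion term of Equations (2) and (3) can be sketched numerically as follows. This is an illustrative Python sketch, not the authors' implementation; the function name and default constants are assumptions.

```python
import numpy as np

def obstacle_repulsion(phi, psi_obs, d_obs, k_o=1.0, c3=1.0, c4=1.0):
    """Total heading repulsion R (Eqs. 2-3), summed over obstacles.

    phi     : current heading direction of the agent (rad)
    psi_obs : array of obstacle directions psi_i^{o_n} (rad)
    d_obs   : array of agent-obstacle distances d_i^{o_n}
    k_o, c3, c4 correspond to k^o, c_3, c_4; values here are placeholders.
    """
    psi_obs = np.asarray(psi_obs, dtype=float)
    d_obs = np.asarray(d_obs, dtype=float)
    delta = phi - psi_obs
    # Each obstacle pushes the heading away from its own direction; the
    # strength decays exponentially in angular offset and in distance.
    r_n = delta * np.exp(-c3 * np.abs(delta)) * np.exp(-c4 * d_obs)
    return k_o * np.sum(r_n)
```

As the form of Equation (3) implies, an obstacle exactly ahead (zero angular offset) exerts no repulsion, and the term is odd in the angular offset, so it always steers the heading away from the obstacle.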
<p>The propagation speed dynamics are specified by the following stochastic differential equation:</p>
<disp-formula id="E4"><label>(4)</label><mml:math id="M12"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:mi>&#x003C4;</mml:mi><mml:mover accent="true"><mml:mrow><mml:msub><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo>&#x002D9;</mml:mo></mml:mover><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>s</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>F</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msubsup><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>g</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow></mml:msubsup><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>t</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>where &#x003F5;<sub><italic>i</italic></sub>(<italic>t</italic>) is Gaussian white noise. The nonlinear function <italic>F</italic><sub><italic>i</italic></sub> specifies how the agent&#x00027;s speed changes as a function of the distance from its goal:</p>
<disp-formula id="E5"><label>(5)</label><mml:math id="M13"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mi>F</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>d</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mfrac><mml:mrow><mml:msub><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mn>5</mml:mn></mml:mrow></mml:msub></mml:mrow><mml:mrow><mml:mn>1</mml:mn><mml:mo>&#x0002B;</mml:mo><mml:msup><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mn>6</mml:mn></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>d</mml:mi><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mn>7</mml:mn></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mrow></mml:msup></mml:mrow></mml:mfrac><mml:mo>-</mml:mo><mml:msubsup><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mn>8</mml:mn></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup><mml:msup><mml:mrow><mml:mi>e</mml:mi></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:msub><mml:mrow><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mi>d</mml:mi></mml:mrow></mml:msup><mml:mo>&#x0002B;</mml:mo><mml:msubsup><mml:mrow><mml:mi>c</mml:mi></mml:mrow><mml:mrow><mml:mn>9</mml:mn></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>Critically, we choose this specific functional form because it provides us with enough flexibility to reproduce several relevant interaction classes, including the six fundamental interaction categories traditionally studied in psychophysical experiments (Blythe et al., <xref ref-type="bibr" rid="B7">1999</xref>; Barrett et al., <xref ref-type="bibr" rid="B2">2005</xref>; McAleer and Pollick, <xref ref-type="bibr" rid="B30">2008</xref>): <italic>Chasing, Fighting, Flirting, Following, Guarding</italic>, and <italic>Playing</italic>.</p>
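The speed dynamics of Equations (4) and (5) can be sketched with a simple Euler-Maruyama discretization. This is a minimal illustration; the step size, time constant, and default parameter values are assumptions, not the values of Table 1.

```python
import numpy as np

def speed_drive(d, c5=1.0, c6=1.0, c7=5.0, c8=0.0, c9=0.0, k=1.0):
    """Nonlinear speed drive F(d) of Eq. (5): a sigmoid in the goal
    distance d, minus a short-range exponential term, plus an offset."""
    return c5 / (1.0 + np.exp(-c6 * (d - c7))) - c8 * np.exp(-k * d) + c9

def speed_step(s, d, dt=0.01, tau=1.0, k_eps=0.0, rng=None):
    """One Euler-Maruyama step of Eq. (4):
    tau * ds/dt = -s + F(d) + k_eps * white noise."""
    rng = rng or np.random.default_rng()
    noise = k_eps * rng.standard_normal() / np.sqrt(dt)
    return s + (dt / tau) * (-s + speed_drive(d) + noise)
```

With the noise gain set to zero, the speed relaxes toward the fixed point F(d); the sigmoid makes the agent slow down as it approaches its goal (for d below c7) and travel at roughly c5 + c9 far from it.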
<p>To generate the trajectories, we first randomly sample a series of goal points for the first agent from a uniform distribution over the 2D plane of action. Such goal points are commonly referred to as <italic>via points</italic>. We then use the instantaneous position of the first agent as the goal position for the second agent. Samples that are too close to the current agent&#x00027;s position are rejected. Further details about the implementation of the generative model are provided in the Algorithm 1 box. Representative trajectories of six example social interactions are illustrated in <xref ref-type="fig" rid="F1">Figure 1</xref>. Note that the speed control dynamics are not influenced by the presence of obstacles, since their effect was not needed to realistically capture the social interactive behaviors we chose to model.</p>
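The via-point sampling with rejection described above can be sketched as follows; the bounds of the action plane and the minimum admissible distance are hypothetical values chosen for illustration.

```python
import numpy as np

def sample_via_point(current_pos, bounds=(0.0, 10.0), min_dist=2.0,
                     rng=None, max_tries=1000):
    """Draw a goal (via point) uniformly over the 2D action plane,
    rejecting candidates too close to the agent's current position."""
    rng = rng or np.random.default_rng()
    lo, hi = bounds
    for _ in range(max_tries):
        candidate = rng.uniform(lo, hi, size=2)
        if np.linalg.norm(candidate - current_pos) >= min_dist:
            return candidate
    raise RuntimeError("no admissible via point found")
```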
<fig id="F1" position="float">
<label>Figure 1</label>
<caption><p>Trajectories of six example social interactions. Color indicates agent identity: agent 1 is represented in blue; agent 2 is represented in red. Color saturation indicates time: darker colors indicate recent time samples.</p></caption>
<graphic xlink:href="fnbot-15-648527-g0001.tif"/>
</fig>
<table-wrap position="float">
<label>Algorithm 1</label>
<caption><p>Pseudocode for trajectory generation</p></caption>
<graphic xlink:href="fnbot-15-648527-i0001.tif"/>
</table-wrap>
</sec>
<sec>
<title>2.3. Model Validation</title>
<p>To assess whether our model is able to generate perceptually valid socially interactive behaviors, we carried out three behavioral experiments. In these experiments, we asked participants to categorize videos of interacting agents generated with our model in a free-choice task (Experiment 1), and in a forced-choice task (Experiment 2). Finally, we analyzed the semantic similarities between the labels chosen by the participants (Experiment 3).</p>
<sec>
<title>2.3.1. Dataset Generation</title>
<p>To validate our approach, we chose to model the six fundamental interaction classes (i.e., <italic>Chasing, Fighting, Flirting, Following, Guarding</italic>, and <italic>Playing</italic>; Blythe et al. <xref ref-type="bibr" rid="B7">1999</xref>; Barrett et al. <xref ref-type="bibr" rid="B2">2005</xref>; McAleer and Pollick <xref ref-type="bibr" rid="B30">2008</xref>), and nine other relevant ones (i.e., <italic>Avoiding, Bumping, Dodging, Frightening, Meeting, Pulling, Pushing, Tug of War</italic>, and <italic>Walking</italic>), resulting in a total of 15 interaction classes. To generate the trajectories corresponding to these classes, we simulated the model with 15 distinct parameter sets, which we identified through a simulation-based heuristic procedure. A list of the most critical parameters is presented in <xref ref-type="table" rid="T1">Table 1</xref>. The complete dataset we generated for our experiments included five random realizations of each interaction class, for a total of 75 videos. Each random realization is defined by different via points and noise realizations.</p>
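The dataset layout described above (15 classes, five random realizations each, 75 videos in total) can be sketched as follows; `simulate_interaction` is a hypothetical stand-in for the full generative model of the previous section.

```python
import numpy as np

# The 15 interaction classes of Table 1.
CLASSES = ["Avoiding", "Bumping", "Chasing", "Dodging", "Fighting",
           "Flirting", "Following", "Frightening", "Guarding", "Meeting",
           "Playing", "Pulling", "Pushing", "Tug of War", "Walking"]

def build_dataset(simulate_interaction, n_realizations=5, seed=0):
    """Assemble (label, trajectories) pairs: each realization uses
    different via points and noise draws from the shared rng."""
    rng = np.random.default_rng(seed)
    dataset = []
    for label in CLASSES:
        for _ in range(n_realizations):
            dataset.append((label, simulate_interaction(label, rng)))
    return dataset
```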
<table-wrap position="float" id="T1">
<label>Table 1</label>
<caption><p>Main model parameters.</p></caption>
<table frame="hsides" rules="groups">
<thead><tr>
<th valign="top" align="left"><bold>Interaction class</bold></th>
<th valign="top" align="center" style="border-bottom: thin solid #000000;" colspan="7"><bold>Agent 1</bold></th>
<th valign="top" align="center" style="border-bottom: thin solid #000000;" colspan="7"><bold>Agent 2</bold></th>
</tr>
<tr>
<th/>
<th valign="top" align="center"><bold><italic>k</italic></bold></th>
<th valign="top" align="center"><bold><italic>k</italic><sup><bold>&#x003F5;</bold></sup></bold></th>
<th valign="top" align="center"><bold><italic>c</italic><sub><bold>5</bold></sub></bold></th>
<th valign="top" align="center"><bold><italic>c</italic><sub><bold>6</bold></sub></bold></th>
<th valign="top" align="center"><bold><italic>c</italic><sub><bold>7</bold></sub></bold></th>
<th valign="top" align="center"><bold><italic>c</italic><sub><bold>8</bold></sub></bold></th>
<th valign="top" align="center"><bold><italic>c</italic><sub><bold>9</bold></sub></bold></th>
<th valign="top" align="center"><bold><italic>k</italic></bold></th>
<th valign="top" align="center"><bold><italic>k</italic><sup><bold>&#x003F5;</bold></sup></bold></th>
<th valign="top" align="center"><bold><italic>c</italic><sub><bold>5</bold></sub></bold></th>
<th valign="top" align="center"><bold><italic>c</italic><sub><bold>6</bold></sub></bold></th>
<th valign="top" align="center"><bold><italic>c</italic><sub><bold>7</bold></sub></bold></th>
<th valign="top" align="center"><bold><italic>c</italic><sub><bold>8</bold></sub></bold></th>
<th valign="top" align="center"><bold><italic>c</italic><sub><bold>9</bold></sub></bold></th>
</tr>
</thead>
<tbody>
<tr>
<td valign="top" align="left">Avoiding</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.4</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">2.7</td>
<td valign="top" align="center">0</td>
</tr>
<tr>
<td valign="top" align="left">Bumping</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.9</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0.8</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0.8</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0</td>
</tr>
<tr>
<td valign="top" align="left">Chasing</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">7</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">7</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
</tr>
<tr>
<td valign="top" align="left">Dodging</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0.5</td>
<td valign="top" align="center">7</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
</tr>
<tr>
<td valign="top" align="left">Fighting</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0</td>
</tr>
<tr>
<td valign="top" align="left">Flirting</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.5</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0.6</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0</td>
</tr>
<tr>
<td valign="top" align="left">Following</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">7</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">4</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
</tr>
<tr>
<td valign="top" align="left">Frightening</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.5</td>
</tr>
<tr>
<td valign="top" align="left">Guarding</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.5</td>
</tr>
<tr>
<td valign="top" align="left">Meeting</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">2</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">6</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.5</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0.22</td>
<td valign="top" align="center">3</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">6</td>
<td valign="top" align="center">0</td>
</tr>
<tr>
<td valign="top" align="left">Playing</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.5</td>
</tr>
<tr>
<td valign="top" align="left">Pulling</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">2.6</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.9</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">2.6</td>
<td valign="top" align="center">0</td>
</tr>
<tr>
<td valign="top" align="left">Pushing</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">2.5</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.1</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">2.5</td>
</tr>
<tr>
<td valign="top" align="left">Tug of War</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">6</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.5</td>
<td valign="top" align="center">0.9</td>
<td valign="top" align="center">5</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.5</td>
</tr>
<tr>
<td valign="top" align="left">Walking</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.2</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">1</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0.22</td>
<td valign="top" align="center">10</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
<td valign="top" align="center">0</td>
</tr>
</tbody>
</table>
</table-wrap>
</sec>
<sec>
<title>2.3.2. Participants</title>
<p>A total of 39 participants with normal or corrected vision took part in the experiments: 13 in Experiment 1 (9 females, 4 males), 10 in Experiment 2 (5 females, 5 males), and 16 in Experiment 3 (9 females, 7 males). All participants were college students attending the University of T&#x000FC;bingen and provided written informed consent before the experiments. All experiments were in full compliance with the Declaration of Helsinki. Participants were na&#x000EF;ve to the purpose of the study and were financially compensated for their participation.</p>
</sec>
<sec>
<title>2.3.3. Experiment Setup</title>
<p>In Experiment 1 and Experiment 2, participants sat in a dimly lit room in front of an LCD monitor (resolution: 1,920 &#x000D7; 1,080, refresh rate: 60<italic>Hz</italic>), at a distance of 60<italic>cm</italic> from it. To ensure that all participants would observe the stimuli with the same view parameters and the same distance from the screen, they were asked to place their heads in a chin-and-forehead rest during the experimental sessions. The experiments started with a short familiarization session during which the participants learned to use the computer interface. Subsequently, the participants were shown the videos generated with our model. Their task was to describe the videos by using their own words (Experiment 1) or by selecting labels among those provided to them (Experiment 2), and to provide animacy ratings through a standard 0&#x02013;10 Likert scale. To increase the confidence in their answers, we gave participants the opportunity to re-watch each video up to three times. The videos were presented in pseudo-randomized order over five blocks. Five-minute rest breaks were given after each block. The animated videos always showed two agents moving in a 2D plane following speed and direction dynamics generated offline with our model. Critically, unlike in previous work (Blythe et al., <xref ref-type="bibr" rid="B7">1999</xref>; Barrett et al., <xref ref-type="bibr" rid="B2">2005</xref>), our agents were very simple geometrical shapes, namely a blue circle and a red rectangle (as in Tremoulet and Feldman, <xref ref-type="bibr" rid="B61">2000</xref>); this choice ensured that participants&#x00027; perception would not be biased by additional visual cues beyond the agents&#x00027; motion and relative positions. In Experiment 3, subjects were asked to fill out a questionnaire to rate the semantic similarity between social interaction classes (0&#x02013;10 Likert scale).</p>
</sec>
<sec>
<title>2.3.4. Experiment 1</title>
<p>The first experiment was aimed at assessing whether subjects would perceive the motion of virtual agents generated with our model as a social interaction. The second goal of this experiment was the identification of unequivocal labels for the interaction classes generated with our model. To this end, we asked participants to watch all the videos in our stimulus set (section 2.3.1). After watching the videos, subjects were asked to provide their own interpretations by summarizing what they had perceived with a few sentences or keywords. Importantly, in this experiment, to make sure we would not bias the participants&#x00027; perceptions, we did not provide them with any labels or other cues: they had to come up with their own words. In addition, subjects were asked to provide an animacy rating for each agent. The most commonly reported keywords were used as <italic>ground-truth</italic> interaction labels for the remaining experiments.</p>
<p>To test whether participants assigned different animacy ratings depending on agent identity and social interaction class, we fitted a linear mixed-effect model to the animacy ratings, with Agent and Social Interaction as fixed effects, and Subject as random effect:</p>
<disp-formula id="E7"><label>(6)</label><mml:math id="M24"><mml:mtable class="eqnarray" columnalign="left"><mml:mtr><mml:mtd><mml:msub><mml:mrow><mml:mtext>Animacy</mml:mtext></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003B1;</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>a</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munderover></mml:mstyle><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x000B7;</mml:mo><mml:mtext>Agent</mml:mtext><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>l</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:mtd></mml:mtr><mml:mtr><mml:mtd><mml:mtext>&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;&#x000A0;</mml:mtext><mml:mo>&#x0002B;</mml:mo><mml:mstyle displaystyle="true"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munderover></mml:mstyle><mml:msub><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>&#x000B7;</mml:mo><mml:mtext>SocialInteraction</mml:mtext><mml:mrow><mml:mo 
stretchy="false">(</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>l</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:mtd></mml:mtr></mml:mtable></mml:math></disp-formula>
<p>In this model, Animacy<sub><italic>sl</italic></sub> is the <italic>lth</italic> animacy rating reported by subject <italic>s</italic>, with <italic>s</italic> &#x0003D; 1, 2, ..., <italic>N</italic><sub><italic>s</italic></sub> and <italic>l</italic> &#x0003D; 1, 2, ..., <italic>N</italic><sub><italic>a</italic></sub><italic>N</italic><sub><italic>c</italic></sub>; <italic>N</italic><sub><italic>a</italic></sub>, <italic>N</italic><sub><italic>c</italic></sub>, and <italic>N</italic><sub><italic>s</italic></sub> are the number of agents, social interaction classes, and subjects, respectively. Moreover, Agent(<italic>i, l</italic>) is a dummy variable that is equal to 1 when the rating <italic>l</italic> is for agent <italic>i</italic>, and 0 otherwise. Similarly, SocialInteraction(<italic>i, l</italic>) is a dummy variable that is equal to 1 when the rating <italic>l</italic> is for social interaction <italic>i</italic>, and 0 otherwise. Finally, <italic>b</italic><sub>0<italic>s</italic></sub> is the subject-specific random effect [<inline-formula><mml:math id="M25"><mml:msub><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mn>0</mml:mn><mml:mi>s</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msubsup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mi>b</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msubsup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>] and &#x003F5;<sub><italic>sl</italic></sub> are the residual error terms [<inline-formula><mml:math id="M26"><mml:msub><mml:mrow><mml:mi>&#x003F5;</mml:mi></mml:mrow><mml:mrow><mml:mi>s</mml:mi><mml:mi>l</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0007E;</mml:mo><mml:mi>N</mml:mi><mml:mrow><mml:mo 
stretchy="false">(</mml:mo><mml:mrow><mml:mn>0</mml:mn><mml:mo>,</mml:mo><mml:msup><mml:mrow><mml:mi>&#x003C3;</mml:mi></mml:mrow><mml:mrow><mml:mn>2</mml:mn></mml:mrow></mml:msup></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>]. Notably, the model was fitted with a sum-to-zero constraint, that is, <inline-formula><mml:math id="M27"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>a</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munderover><mml:msub><mml:mrow><mml:mi>&#x003B2;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:math></inline-formula> and <inline-formula><mml:math id="M28"><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munderover><mml:msub><mml:mrow><mml:mi>&#x003B3;</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:math></inline-formula>; therefore, in this model, &#x003B1;<sub>0</sub> represents the overall average animacy rating. All the analyses described in this and in the next sections were performed in MATLAB R2020a (The MathWorks, Natick, MA).</p>
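The sum-to-zero constraint corresponds to deviation (sum) coding of the fixed effects. The paper's analysis was carried out in MATLAB; the Python sketch below illustrates only the coding scheme, with hypothetical names, showing why the intercept then estimates the grand mean.

```python
import numpy as np

def sum_coded_design(labels):
    """Deviation (sum-to-zero) coding: K levels map to K-1 columns;
    the last level is coded as -1 in every column, so the estimated
    effects sum to zero and the intercept is the grand mean."""
    levels = sorted(set(labels))
    K = len(levels)
    X = np.zeros((len(labels), K - 1))
    for row, lab in enumerate(labels):
        k = levels.index(lab)
        if k < K - 1:
            X[row, k] = 1.0
        else:
            X[row, :] = -1.0  # reference level: minus all effects
    return X, levels
```

In a balanced design, every column of this matrix sums to zero, which is the discrete counterpart of the constraints on the beta and gamma coefficients in Equation (6).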
</sec>
<sec>
<title>2.3.5. Experiment 2</title>
<p>The second experiment was aimed at further investigating the social interaction classes perceived by the participants while watching our animated videos. To this end, new subjects were exposed to a subset of the videos in our original dataset. Specifically, for this experiment we excluded the videos corresponding to the classes <italic>Following, Guarding</italic>, and <italic>Playing</italic>, as these tended either to be frequently confused with other classes or to be labeled with a broad variety of related terms. Critically, unlike in Experiment 1, participants were asked to describe each video by choosing up to three labels among those selected in Experiment 1.</p>
<p>To assess the classification performance, we computed the confusion matrix <italic>M</italic>. In this matrix, each element <italic>m</italic><sub><italic>i,j</italic></sub> is the number of times participants assigned the class <italic>j</italic> to a video from class <italic>i</italic>. Starting from <italic>M</italic>, we computed, for each social interaction class, Recall, Precision, and <italic>F</italic><sub>1</sub> score. Recall measures the fraction of videos of class <italic>i</italic> that are correctly classified, and is defined as <inline-formula><mml:math id="M29"><mml:mi>R</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>a</mml:mi><mml:mi>l</mml:mi><mml:msub><mml:mrow><mml:mi>l</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>/</mml:mo><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munderover><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. 
Precision measures the fraction of times participants correctly assigned the class <italic>j</italic> to a video, and is defined as <inline-formula><mml:math id="M30"><mml:mi>P</mml:mi><mml:mi>r</mml:mi><mml:mi>e</mml:mi><mml:mi>c</mml:mi><mml:mi>i</mml:mi><mml:mi>s</mml:mi><mml:mi>i</mml:mi><mml:mi>o</mml:mi><mml:msub><mml:mrow><mml:mi>n</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>=</mml:mo><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>/</mml:mo><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munderover><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub></mml:math></inline-formula>. Finally, the <italic>F</italic><sub>1</sub> score is the harmonic mean of Precision and Recall; it measures the overall classification accuracy and is defined as <italic>F</italic><sub>1</sub> &#x0003D; 2&#x000B7;<italic>Precision</italic>&#x000B7;<italic>Recall</italic>/(<italic>Precision</italic> &#x0002B; <italic>Recall</italic>).</p>
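<p>These per-class measures follow mechanically from the confusion matrix. The following Python sketch (using a toy 3-class matrix, not our experimental data) computes them:</p>

```python
import numpy as np

# Toy confusion matrix for 3 classes: rows = true class, columns = reported class.
M = np.array([[8, 1, 1],
              [2, 7, 1],
              [0, 2, 8]], dtype=float)

recall = np.diag(M) / M.sum(axis=1)     # m_ii / row sum (fraction of class i correctly labeled)
precision = np.diag(M) / M.sum(axis=0)  # m_jj / column sum (fraction of label j assignments that are correct)
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall
print(recall, precision, f1)
```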
<p>To evaluate whether some classes were more likely to be confused with each other, we computed, for each pair of classes (<italic>i, j</italic>), with <italic>i</italic> &#x02260; <italic>j</italic>, the empirical pairwise mislabeling probability, defined as <inline-formula><mml:math id="M31"><mml:msub><mml:mrow><mml:mi>P</mml:mi></mml:mrow><mml:mrow><mml:mi>M</mml:mi><mml:mi>S</mml:mi></mml:mrow></mml:msub><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>=</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi><mml:mo>,</mml:mo><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x0002B;</mml:mo><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mi>i</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow><mml:mo>/</mml:mo><mml:mrow><mml:mo stretchy="false">(</mml:mo><mml:mrow><mml:munderover accentunder="false" accent="false"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:mrow><mml:mrow><mml:msub><mml:mrow><mml:mi>N</mml:mi></mml:mrow><mml:mrow><mml:mi>c</mml:mi></mml:mrow></mml:msub></mml:mrow></mml:munderover><mml:munder class="msub"><mml:mrow><mml:mo>&#x02211;</mml:mo></mml:mrow><mml:mrow><mml:mi>l</mml:mi><mml:mo>&#x02260;</mml:mo><mml:mi>k</mml:mi></mml:mrow></mml:munder><mml:msub><mml:mrow><mml:mi>m</mml:mi></mml:mrow><mml:mrow><mml:mi>k</mml:mi><mml:mo>,</mml:mo><mml:mi>l</mml:mi></mml:mrow></mml:msub></mml:mrow><mml:mo stretchy="false">)</mml:mo></mml:mrow></mml:math></inline-formula>.</p>
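<p>The empirical pairwise mislabeling probability normalizes the symmetric confusions between two classes by the total number of off-diagonal (mislabeled) entries. A minimal Python sketch with a toy matrix:</p>

```python
import numpy as np

def mislabeling_probability(M, i, j):
    # P_MS(i, j) = (m_ij + m_ji) / (sum of all off-diagonal entries of M)
    off_diag_total = M.sum() - np.trace(M)
    return (M[i, j] + M[j, i]) / off_diag_total

# Toy confusion matrix (rows = true class, columns = reported class)
M = np.array([[8, 1, 1],
              [2, 7, 1],
              [0, 2, 8]], dtype=float)
p_ms = mislabeling_probability(M, 0, 1)
print(p_ms)
```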
<p>To assess whether participants improved their classification performance during the experiment, we computed the average Precision, Recall, and <italic>F</italic><sub>1</sub> score across social interaction classes as a function of experimental block; we then fitted linear models to test whether experimental block explained a significant fraction of the variation in these performance measures.</p>
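<p>This block-wise analysis amounts to a simple linear regression of each performance measure on block number. A Python sketch with hypothetical block-wise <italic>F</italic><sub>1</sub> values (illustrative numbers, not our data) shows how the slope and its <italic>t</italic>-statistic can be obtained:</p>

```python
import numpy as np

# Hypothetical F1 scores over 5 experimental blocks; regress performance on
# block number and test the slope, mirroring the linear-model analysis.
blocks = np.arange(1, 6)
f1 = np.array([0.55, 0.60, 0.63, 0.66, 0.70])

slope, intercept = np.polyfit(blocks, f1, 1)
resid = f1 - (slope * blocks + intercept)
dof = len(blocks) - 2
# standard error of the slope: sqrt(SSE / dof / Sxx)
se_slope = np.sqrt(resid @ resid / dof / ((blocks - blocks.mean()) ** 2).sum())
t_stat = slope / se_slope
print(slope, t_stat)
```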
</sec>
<sec>
<title>2.3.6. Experiment 3</title>
<p>The third and last experiment was aimed at assessing whether there are interpretable semantic similarities among the labels provided in Experiment 2. Some interaction classes were misclassified by the participants in Experiment 2. This suggests either that the generated animated videos are not distinctive enough or that the classes semantically overlap with each other. To disambiguate between these two options, we ran a semantic survey with a new set of participants. Participants in this experiment did not watch any video. After providing them with precise definitions of each social interaction class, we asked them to indicate the level of semantic similarity of each pair of classes by providing ratings ranging from 0 to 10. Specifically, using this scoring system, participants were asked to assign 0 to pairs of classes perceived as sharing no semantic similarity, and 10 to pairs perceived as semantically equivalent.</p>
<p>To assess the geometry of the semantic similarity space, we first transformed all the similarity ratings <italic>s</italic> into distance ratings <italic>d</italic> by computing their complement (i.e., <italic>d</italic> &#x0003D; 10 &#x02212; <italic>s</italic>), and then rescaled them between 0 and 1. All the resulting semantic distances collected from participant <italic>i</italic> were then stored in a matrix <italic>D</italic><sup><italic>i</italic></sup>. In this matrix, <inline-formula><mml:math id="M32"><mml:msubsup><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>0</mml:mn></mml:math></inline-formula> if the classes <italic>j</italic> and <italic>k</italic> were considered as semantically equivalent by subject <italic>i</italic>; <inline-formula><mml:math id="M33"><mml:msubsup><mml:mrow><mml:mi>D</mml:mi></mml:mrow><mml:mrow><mml:mi>j</mml:mi><mml:mo>,</mml:mo><mml:mi>k</mml:mi></mml:mrow><mml:mrow><mml:mi>i</mml:mi></mml:mrow></mml:msubsup><mml:mo>=</mml:mo><mml:mn>1</mml:mn></mml:math></inline-formula> if the classes <italic>j</italic> and <italic>k</italic> were considered as semantically unrelated. We then used non-metric multidimensional scaling (MDS; Shepard, <xref ref-type="bibr" rid="B50">1962a</xref>,<xref ref-type="bibr" rid="B51">b</xref>) to visualize in a 2D space the underlying relational structure contained in the distance matrix.</p>
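<p>The distance transformation and embedding can be sketched as follows in Python. Note that, for brevity, this example uses classical (metric) MDS via double-centering rather than the non-metric MDS used in the actual analysis, and the similarity ratings are toy values:</p>

```python
import numpy as np

# Toy similarity ratings (0-10) for 4 classes; classes {0,1} and {2,3} form
# two semantically close pairs.
S = np.array([[10, 8, 2, 1],
              [ 8, 10, 3, 2],
              [ 2, 3, 10, 9],
              [ 1, 2, 9, 10]], dtype=float)
D = (10.0 - S) / 10.0                 # complement d = 10 - s, rescaled to [0, 1]

# Classical (metric) MDS: double-center the squared distances and take the
# top-2 eigenvectors as 2D coordinates.
n = D.shape[0]
J = np.eye(n) - np.ones((n, n)) / n   # centering matrix
B = -0.5 * J @ (D ** 2) @ J
eigval, eigvec = np.linalg.eigh(B)
order = np.argsort(eigval)[::-1]
coords = eigvec[:, order[:2]] * np.sqrt(np.maximum(eigval[order[:2]], 0))
print(coords.shape)  # (4, 2): one 2D point per class
```

<p>In the resulting map, classes rated as similar (e.g., 0 and 1 above) end up spatially close, which is the property exploited in Figure 6C.</p>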
<p>To determine whether some groups of classes were consistently considered as semantically similar, we performed agglomerative hierarchical clustering on the distance matrix <italic>D</italic> using Ward&#x00027;s linkage method (Ward, <xref ref-type="bibr" rid="B67">1963</xref>), which minimizes the within-cluster variance. Clusters were then identified using a simple cut-off method, with threshold &#x003C4; &#x0003D; 0.7 &#x000B7; <italic>M</italic><sub><italic>WD</italic></sub>, where <italic>M</italic><sub><italic>WD</italic></sub> is the maximum observed Ward&#x00027;s distance.</p>
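<p>Assuming SciPy's standard hierarchical clustering routines, the clustering and cut-off rule can be sketched as follows (with a toy distance matrix containing two obvious pairs, not our empirical <italic>D</italic>):</p>

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import squareform

# Toy semantic distance matrix for 4 classes: {0,1} and {2,3} are close pairs.
D = np.array([[0.0, 0.2, 0.8, 0.9],
              [0.2, 0.0, 0.7, 0.8],
              [0.8, 0.7, 0.0, 0.1],
              [0.9, 0.8, 0.1, 0.0]])

Z = linkage(squareform(D), method="ward")  # Ward's linkage on condensed distances
tau = 0.7 * Z[:, 2].max()                  # cut-off: 0.7 * maximum Ward distance
clusters = fcluster(Z, t=tau, criterion="distance")
print(clusters)  # two clusters: classes {0,1} and {2,3}
```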
<p>Finally, to estimate whether the semantic similarity between pairs of classes explained the mislabelings observed in Experiment 2, we computed the Pearson&#x00027;s correlation coefficient (&#x003C1;) between the empirical mislabeling probability <italic>P</italic><sub><italic>MS</italic></sub>(<italic>j, k</italic>) measured in Experiment 2 and the semantic distance <italic>D</italic>(<italic>j, k</italic>).</p>
</sec>
</sec>
</sec>
<sec sec-type="results" id="s3">
<title>3. Results</title>
<sec>
<title>3.1. Experiment 1</title>
<p>As mentioned above, participants in this experiment were completely free to provide interpretations of the videos through either labels or short sentences. For each video class, we pooled together all the definitions and labels and considered the most frequently used term as the <italic>ground-truth</italic> class label. <xref ref-type="fig" rid="F2">Figure 2</xref> summarizes the reported labels for six example social interaction classes. The pie charts show that some classes, such as <italic>Avoiding</italic> and <italic>Fighting</italic>, tended to be consistently described with very few labels (i.e., 2 &#x02212; 3), whereas other classes, such as <italic>Dodging</italic>, were described with more labels (i.e., 6). Regardless of the number of labels used to describe a social interaction class, the reported labels were generally semantically similar. For example, some classes were named interchangeably depending on the perspective from which subjects reported their interpretation of the videos. A typical example of this issue is the ambiguity between the classes <italic>Pulling</italic> and <italic>Pushing</italic>. On the other hand, some other classes (for instance, <italic>Bumping</italic> and <italic>Pushing</italic>) were sometimes misclassified regardless of the perspective from which subjects might have observed the videos.</p>
<fig id="F2" position="float">
<label>Figure 2</label>
<caption><p>Distribution of reported keywords for six example social interactions. Pie charts&#x00027; titles indicate the true classes. Individual slices are assigned to all the keywords reported in Experiment 1 occurring with a frequency &#x0003E;5%. Keywords reported with a frequency &#x0003C;5% are pooled together in the slice <italic>Other</italic> (in gray). Offset slices (in green) represent the most frequently reported keywords.</p></caption>
<graphic xlink:href="fnbot-15-648527-g0002.tif"/>
</fig>
<p>Average animacy ratings are reported in <xref ref-type="fig" rid="F3">Figure 3A</xref>, with classes sorted in ascending order of average across-agent animacy. Agents were consistently perceived as animate [&#x003B1;<sub>0</sub> &#x0003D; 53.27%, <italic>t</italic><sub>(299)</sub> &#x0003D; 11.72, <italic>p</italic> &#x0003D; 2.3&#x000B7;10<sup>&#x02212;26</sup>]. This is consistent with the fact that self-propulsion (Csibra, <xref ref-type="bibr" rid="B11">2008</xref>), goal directedness (van Buren et al., <xref ref-type="bibr" rid="B65">2016</xref>), being reactive to social contingencies (Dittrich and Lea, <xref ref-type="bibr" rid="B12">1994</xref>), acceleration (Tremoulet and Feldman, <xref ref-type="bibr" rid="B61">2000</xref>), and speed (Szego and Rutherford, <xref ref-type="bibr" rid="B58">2008</xref>) are the most prominent cues for perceived animacy in psychophysical experiments. Moreover, the blue circle was consistently rated as less animate than the red rectangle [&#x003B2;<sub>1</sub> &#x0003D; &#x02212;&#x003B2;<sub>2</sub> &#x0003D; &#x02212;8.37%, <italic>t</italic><sub>(299)</sub> &#x0003D; &#x02212;10, <italic>p</italic> &#x0003D; 1.74 &#x000B7; 10<sup>&#x02212;20</sup>], consistent with the finding that geometrical figures with a body axis are perceived as more animate than those without one, such as circles (Tremoulet and Feldman, <xref ref-type="bibr" rid="B61">2000</xref>).</p>
<fig id="F3" position="float">
<label>Figure 3</label>
<caption><p>Reported agent animacy. <bold>(A)</bold> Mean animacy ratings obtained in Experiment 1; error bars represent standard errors; results are rescaled between 0 and 100. Classes are sorted in ascending order by average across-agent animacy rating. The asterisk denotes a significant effect (<italic>p</italic> &#x0003C; 0.05) of Agent on Animacy [<italic>F</italic><sub>(1,299)</sub> &#x0003D; 99.98, <italic>p</italic> &#x0003D; 1.74 &#x000B7; 10<sup>&#x02212;20</sup>]. <bold>(B)</bold> F-statistics of <italic>post-hoc</italic> tests to assess the difference in animacy ratings between social interaction classes [i.e., <italic>F</italic><sub>(1,299)</sub>]. <bold>(C)</bold> Bonferroni adjusted <italic>p</italic>-values corresponding to the F-statistics reported in <bold>(B)</bold>; black dots represent significant pairwise differences.</p></caption>
<graphic xlink:href="fnbot-15-648527-g0003.tif"/>
</fig>
<p>We further found a significant effect of social interaction on animacy [<italic>F</italic><sub>(11,299)</sub> &#x0003D; 18.3, <italic>p</italic> &#x0003D; 8.29 &#x000B7; 10<sup>&#x02212;28</sup>]; this suggests that certain classes of social interactions tended to elicit stronger animacy percepts than others. To assess which specific pairs of classes were assigned significantly different animacy ratings, we performed <italic>post-hoc F</italic>-tests. This analysis revealed that some classes consistently received higher average animacy ratings: for example, <italic>Fighting</italic> received higher animacy ratings than all other classes [<italic>F</italic><sub>(1,299)</sub> &#x02265; 24.04, <inline-formula><mml:math id="M34"><mml:msub><mml:mrow><mml:mi>p</mml:mi></mml:mrow><mml:mrow><mml:mi>a</mml:mi><mml:mi>d</mml:mi><mml:mi>j</mml:mi></mml:mrow></mml:msub><mml:mo>&#x02264;</mml:mo><mml:mn>1</mml:mn><mml:mo>.</mml:mo><mml:mn>03</mml:mn><mml:mo>&#x000B7;</mml:mo><mml:mn>1</mml:mn><mml:msup><mml:mrow><mml:mn>0</mml:mn></mml:mrow><mml:mrow><mml:mo>-</mml:mo><mml:mn>4</mml:mn></mml:mrow></mml:msup></mml:math></inline-formula>], with the exception of <italic>Chasing</italic>, which was rated similarly [<italic>F</italic><sub>(1,299)</sub> &#x0003D; 5.25, <italic>p</italic><sub><italic>adj</italic></sub> &#x0003D; 1]. Analogously, <italic>Bumping</italic> tended to receive lower animacy ratings than all other classes [<italic>F</italic><sub>(1,299)</sub> &#x02265; 12.44, <italic>p</italic><sub><italic>adj</italic></sub> &#x02264; 0.03], with the exception of <italic>Pushing, Frightening</italic>, and <italic>Flirting</italic>, which were rated similarly [<italic>F</italic><sub>(1,299)</sub> &#x0003D; 8.42, <italic>p</italic><sub><italic>adj</italic></sub> &#x02265; 0.26]. 
We report in <xref ref-type="fig" rid="F3">Figure 3B</xref> all the <italic>post-hoc</italic> F-statistics, and in <xref ref-type="fig" rid="F3">Figure 3C</xref> all the corresponding Bonferroni adjusted <italic>p</italic>-values.</p>
</sec>
<sec>
<title>3.2. Experiment 2</title>
<p><xref ref-type="fig" rid="F4">Figure 4</xref> shows the total confusion matrix <italic>M</italic> of the classification task. Rows and columns are sorted by decreasing Recall. <italic>Avoiding</italic> was the class most accurately classified by our participants (<italic>Recall</italic> &#x0003D; 75.4%). However, even the hardest class was classified with an accuracy largely above chance (<italic>Walking</italic>: <italic>Recall</italic> &#x0003D; 53.4%; chance level: 8.3%). Nonetheless, some clear misclassifications occurred, especially between <italic>Bumping</italic> and <italic>Pushing</italic> (<italic>m</italic><sub><italic>BU,PS</italic></sub> &#x0003D; 19, <italic>m</italic><sub><italic>PS,BU</italic></sub> &#x0003D; 11), and between <italic>Fighting</italic> and <italic>Chasing</italic> (<italic>m</italic><sub><italic>FI,CH</italic></sub> &#x0003D; 17, <italic>m</italic><sub><italic>CH,FI</italic></sub> &#x0003D; 2). These two kinds of mislabeling alone accounted for a large fraction of the total number of mislabelings [<italic>P</italic><sub><italic>MS</italic></sub>(<italic>BU, PS</italic>) &#x0003D; 9.8%, <italic>P</italic><sub><italic>MS</italic></sub>(<italic>FI, CH</italic>) &#x0003D; 6.2%].</p>
<fig id="F4" position="float">
<label>Figure 4</label>
<caption><p>Average classification performance. This figure shows the confusion matrix of the classification experiment (Experiment 2). Rows represent the true interaction class; columns the interaction class reported by the participants in Experiment 2. Matrix entries <italic>m</italic><sub><italic>i,j</italic></sub> report the number of times participants assigned the class <italic>j</italic> to a video from class <italic>i</italic>. Rows and columns are sorted by decreasing Recall. AV, avoiding; BU, bumping; CH, chasing; DO, dodging; FI, fighting; FL, flirting; FR, frightening; ME, meeting; PL, pulling; PS, pushing; TG, tug of war; WA, walking.</p></caption>
<graphic xlink:href="fnbot-15-648527-g0004.tif"/>
</fig>
<p>One possible reason for these misclassifications could be that the corresponding labels are intrinsically semantically similar, such that even real videos of these types of social interactions could be mislabeled. This line of reasoning is supported by the fact that in Experiment 1, <italic>Pushing</italic> was the second most preferred keyword used to label videos of class <italic>Bumping</italic> (see <xref ref-type="fig" rid="F2">Figure 2</xref>). Interestingly, both Precision and Recall (and thus the <italic>F</italic><sub>1</sub> score) significantly improved across experimental blocks [Precision: <italic>t</italic><sub>(3)</sub> &#x0003D; 19.5, <italic>p</italic> &#x0003D; 2.93 &#x000B7; 10<sup>&#x02212;4</sup>; Recall: <italic>t</italic><sub>(3)</sub> &#x0003D; 10.8, <italic>p</italic> &#x0003D; 1.68 &#x000B7; 10<sup>&#x02212;3</sup>; see <xref ref-type="fig" rid="F5">Figure 5</xref>]. This indicates latent learning of the class categorization, which is remarkable since no external feedback about the correctness of the class assignments was provided during the experiment. Such learning was particularly evident for the following often-confused pairs: <italic>Tug of War</italic> vs. <italic>Pulling, Frightening</italic> vs. <italic>Avoiding</italic>, and <italic>Fighting</italic> vs. <italic>Pushing</italic> (not shown).</p>
<fig id="F5" position="float">
<label>Figure 5</label>
<caption><p>Classification performance across experimental blocks. <bold>(A)</bold> Average block-wise recall. Results are averaged across subjects and social interactions; error bars represent standard errors. Insets show the slope of the estimated linear model, the corresponding <italic>t</italic>-statistic and <italic>p</italic>-value. <bold>(B)</bold> Average block-wise precision. <bold>(C)</bold> Average block-wise <italic>F</italic><sub>1</sub> score.</p></caption>
<graphic xlink:href="fnbot-15-648527-g0005.tif"/>
</fig>
</sec>
<sec>
<title>3.3. Experiment 3</title>
<p>The pairwise semantic distance matrix <italic>D</italic> is plotted in <xref ref-type="fig" rid="F6">Figure 6A</xref>: light shades of green indicate semantically close social interaction classes, while darker shades indicate semantically distant classes. The two pairs associated with the highest mislabeling probability in Experiment 2, <italic>Bumping</italic>-<italic>Pushing</italic> and <italic>Fighting</italic>-<italic>Chasing</italic> [<italic>P</italic><sub><italic>MS</italic></sub>(<italic>BU, PS</italic>) &#x0003D; 9.8%, <italic>P</italic><sub><italic>MS</italic></sub>(<italic>FI, CH</italic>) &#x0003D; 6.2%], were generally considered as semantically similar [<italic>d</italic>(<italic>BU, PS</italic>) &#x0003D; 0.49, <italic>d</italic>(<italic>FI, CH</italic>) &#x0003D; 0.65]; however, they were not the most similar pairs. Rather, the three most semantically similar pairs were <italic>Pulling</italic>-<italic>Tug of War, Avoiding</italic>-<italic>Dodging</italic>, and <italic>Bumping</italic>-<italic>Fighting</italic> [<italic>d</italic>(<italic>PL, TG</italic>) &#x0003D; 0.23, <italic>d</italic>(<italic>AV, DO</italic>) &#x0003D; 0.27, <italic>d</italic>(<italic>BU, FI</italic>) &#x0003D; 0.32]. Nevertheless, regardless of this apparent discrepancy for these few extreme examples, mislabeling probability <italic>P</italic><sub><italic>MS</italic></sub>(<italic>i, j</italic>) and semantic distance <italic>d</italic>(<italic>i, j</italic>) were significantly anti-correlated [&#x003C1; &#x0003D; &#x02212;0.58, <italic>t</italic><sub>(64)</sub> &#x0003D; &#x02212;5.7, <italic>p</italic> &#x0003D; 3.24 &#x000B7; 10<sup>&#x02212;7</sup>; <xref ref-type="fig" rid="F6">Figure 6D</xref>]; this suggests that the more semantically similar two social interaction classes are, the more likely they are to be confused in a video labeling task.</p>
<fig id="F6" position="float">
<label>Figure 6</label>
<caption><p>Cluster analysis results. <bold>(A)</bold> Average semantic distances obtained in Experiment 3. <bold>(B)</bold> Dendrogram of hierarchical clustering; the horizontal line represents the cut-off threshold used to identify the clusters (i.e., 0.7 &#x000B7; <italic>M</italic><sub><italic>WD</italic></sub>, where <italic>M</italic><sub><italic>WD</italic></sub> is the maximum Ward distance). <bold>(C)</bold> Clusters of social interactions plotted in a low-dimensional distance-preserving 2D space identified with Multidimensional Scaling (MDS). <bold>(D)</bold> Average mislabeling probability (Experiment 2) as a function of semantic distance (Experiment 3); inset reports the Pearson&#x00027;s correlation coefficient, the corresponding <italic>t</italic>-statistic and <italic>p</italic>-value. AV, avoiding; BU, bumping; CH, chasing; DO, dodging; FI, fighting; FL, flirting; FR, frightening; ME, meeting; PL, pulling; PS, pushing; TG, tug of war; WA, walking.</p></caption>
<graphic xlink:href="fnbot-15-648527-g0006.tif"/>
</fig>
<p>Multidimensional scaling (MDS) provides a compact 2D visualization of the semantic similarity space (<xref ref-type="fig" rid="F6">Figure 6C</xref>). Since MDS is inherently spatial, items that were rated as being highly similar are spatially close to each other in the final map. The map effectively shows which classes of social interactions are semantically similar and which are not. For example, let us consider the hypothetical groups <italic>G</italic><sub>1</sub> ={Tug of War, Pulling} and <italic>G</italic><sub>2</sub> ={Frightening, Avoiding, Dodging}. Participants recognized that <italic>Tug of War</italic> and <italic>Pulling</italic> involve similar interactions between the agents, and that these interactions are different from those occurring in the classes <italic>Frightening, Avoiding</italic>, and <italic>Dodging</italic>. For this reason, participants tended to assign high pairwise similarity scores to intra-group pairs, and low to inter-group pairs. This pattern of scoring is captured by MDS and evident in the resulting map (<xref ref-type="fig" rid="F6">Figure 6C</xref>).</p>
<p>The agglomerative hierarchical cluster analysis on the distance matrix <italic>D</italic> (<xref ref-type="fig" rid="F6">Figure 6B</xref>) confirms this intuition and identifies four distinct semantic clusters; such clusters are visualized in the MDS map with four different symbols (<xref ref-type="fig" rid="F6">Figure 6C</xref>). This analysis supports the conclusion that misclassified labels tend to belong to the same semantic cluster. While not all misclassifications can be explained by semantic similarity, many confusions can be accounted for by this factor, for example, <italic>Pushing</italic> vs. <italic>Bumping, Walking</italic> vs. <italic>Meeting</italic>, and <italic>Avoiding</italic> vs. <italic>Dodging</italic>.</p>
<p>To summarize, our analysis of semantic similarity shows that many of the labeling confusions observed in Experiment 2 can be explained by the semantic similarity of the class labels.</p>
</sec>
</sec>
<sec sec-type="discussion" id="s4">
<title>4. Discussion</title>
<p>In this work, we introduced a novel framework for the automatic generation of videos of socially interacting virtual agents. The underlying model is a nonlinear dynamical system that specifies the heading direction and forward speed of the agents. Our model is able to generate as many as 15 different interaction classes, defined by different parameter sets. We validated our model with three different behavioral experiments, in which participants were able to consistently identify the intended interaction classes. Our model is thus suitable for the automatic generation of animations of socially interacting agents. Furthermore, the generation process is amenable to full parametric control. This feature allows the creation of highly-controlled and arbitrarily-large datasets for in-depth psychophysical and electrophysiological characterization of the perception of social interactions. The model thus overcomes the major limitations that come with hand-crafted, hard-coded, rule-based, and real-video-based approaches to visual stimulus generation. Importantly, the generative nature of the model makes it a valuable tool also for the development of mechanistic and neural <italic>decoder</italic> models of social perception: model responses to the heterogeneous set of highly-controlled social stimuli introduced here can be rigorously tested for the development of more accurate and brain-like decoder models that replicate human behavioral and neural responses. Recent work aimed at building a mechanistic model of social inference used a similar approach (Shu et al., <xref ref-type="bibr" rid="B53">2018</xref>, <xref ref-type="bibr" rid="B54">2019</xref>, <xref ref-type="bibr" rid="B52">2020</xref>).</p>
<p>Shu et al. (<xref ref-type="bibr" rid="B54">2019</xref>, <xref ref-type="bibr" rid="B52">2020</xref>) also proposed generative models of social interactions. Unlike the ones proposed in these studies, the generative model introduced in this work does not directly lend itself to the study of the interactions between intuitive physics and social inferences (Battaglia et al., <xref ref-type="bibr" rid="B3">2013</xref>). However, substantial evidence suggests that physical and social judgments are mediated by different brain regions (Isik et al., <xref ref-type="bibr" rid="B23">2017</xref>; Sliwa and Freiwald, <xref ref-type="bibr" rid="B55">2017</xref>). More importantly, our model is not limited to describing cooperative and obstructive behaviors and thus seems better suited to study more general social interaction classes.</p>
<p>The identification of suitable parameters for the classes modeled in this work was not automatic: it was conducted using a simulation-based heuristic procedure. This is an obvious limitation of our work. Nevertheless, once the parameters are available, they can be used to automatically generate arbitrary numbers of coupled trajectories for each interaction class (by randomly sampling initial conditions, via-points, and noise). With this procedure, we were able to find suitable parameters for only 15 specific interaction classes. However, to the best of our knowledge, no other method is able to automatically generate more than a handful of individual or socially-interactive behaviors (Blackwell, <xref ref-type="bibr" rid="B6">1997</xref>; Paris et al., <xref ref-type="bibr" rid="B35">2007</xref>; Luo et al., <xref ref-type="bibr" rid="B28">2008</xref>; Russell et al., <xref ref-type="bibr" rid="B41">2017</xref>; Shu et al., <xref ref-type="bibr" rid="B54">2019</xref>, <xref ref-type="bibr" rid="B52">2020</xref>). Future work will extend the range of modeled classes by using system identification methods (e.g., Sch&#x000F6;n et al., <xref ref-type="bibr" rid="B45">2011</xref>; Gao et al., <xref ref-type="bibr" rid="B15">2018</xref>; Gon&#x000E7;alves et al., <xref ref-type="bibr" rid="B19">2020</xref>) to automatically extract model parameters from preexisting trajectories&#x02014;extracted, for example, from real videos.</p>
<p>Another possible limitation of our work is that all our participants were recruited from a German university; while this might, in theory, represent a biased sample, previous studies (Rim&#x000E9; et al., <xref ref-type="bibr" rid="B38">1985</xref>) suggest that the perception of social interactions from impoverished stimuli is a phenomenon that is highly stable across cultures. Specifically, these authors showed that African, European, and Northern American participants provided similar interpretations to animated videos of geometrical shapes. This suggests that our findings would not have significantly changed if we had recruited a more heterogeneous sample.</p>
<p>In this work, we used the trajectories generated by our model to animate simple geometrical figures. The resulting abstract visual stimuli can be directly applied to characterize the kinematic features underlying the inference of social interactions. However, the trajectories can also be used as a basis for richer visual stimuli. For example, in ongoing work, we have been developing methods to link the speed and direction dynamics generated by the model to articulating movements of three-dimensional animal models. This approach allows the generation of highly controlled and realistic videos of interacting animals, which can be used to study social interaction perception in the corresponding animal models with ecologically valid stimuli. Furthermore, contrasting the neural responses to impoverished and realistic visual stimuli can help identify the brain regions and neural computations mediating the extraction of the relevant kinematic features and the subsequent construction of social percepts.</p>
<p>Finally, even though the proposed model is mainly aimed at providing a tool to facilitate the design of in-depth psychophysical and electrophysiological studies of social interaction perception, we speculate that it can also be helpful in the development of machine vision systems for the automatic detection of social interactions. Specifically, the development of effective modern machine vision systems tends to be heavily dependent on the availability of large numbers of appropriately-labeled videos of social interactions (Rodr&#x000ED;guez-Moreno et al., <xref ref-type="bibr" rid="B39">2019</xref>; Stergiou and Poppe, <xref ref-type="bibr" rid="B57">2019</xref>). A popular approach to this problem is to use clips extracted from already existing (YouTube) videos and movies. However, one of the reasons why feature-based (e.g., Kumar and John, <xref ref-type="bibr" rid="B27">2016</xref>; Sehgal, <xref ref-type="bibr" rid="B49">2018</xref>) and especially deep-neural-network-based (e.g., Karpathy et al., <xref ref-type="bibr" rid="B25">2014</xref>; Carreira and Zisserman, <xref ref-type="bibr" rid="B8">2017</xref>; Gupta et al., <xref ref-type="bibr" rid="B21">2018</xref>) vision systems require <italic>big data</italic> is that they need to learn to ignore irrelevant information that is inevitably present in real videos. Therefore, we hypothesize that pre-training such systems with stylized videos of socially interacting agents&#x02014;such as the very ones generated by our model or appropriate avatar-based extensions&#x02014;might greatly reduce their training time and possibly improve their performance. Future work will test this hypothesis.</p>
<p>To sum up, this work introduced a novel generative model of social interactions. The results of our psychophysical experiments suggest that the model is suitable for the automatic generation of arbitrarily numerous, highly controlled videos of socially interacting agents for comprehensive studies of animacy and social interaction perception. Our model can also potentially be used to create large, noise-free, and annotated datasets that can facilitate the development of mechanistic and neural models of social perception, as well as the design of machine vision systems for the automatic recognition of human interactions.</p>
</sec>
<sec sec-type="data-availability-statement" id="s5">
<title>Data Availability Statement</title>
<p>The original contributions presented in the study are included in the article/<xref ref-type="supplementary-material" rid="SM1">Supplementary Material</xref>; further inquiries can be directed to the corresponding author/s.</p>
</sec>
<sec id="s6">
<title>Ethics Statement</title>
<p>The studies involving human participants were reviewed and approved by the ethics board of the University of T&#x000FC;bingen. The patients/participants provided their written informed consent to participate in this study.</p>
</sec>
<sec id="s7">
<title>Author Contributions</title>
<p>All authors listed have made a substantial, direct and intellectual contribution to the work, and approved it for publication.</p>
</sec>
<sec sec-type="COI-statement" id="conf1">
<title>Conflict of Interest</title>
<p>The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
</body>
<back>
<ack><p>The authors thank the International Max Planck Research School for Intelligent Systems (IMPRS-IS) for supporting AS. The authors would also like to thank the participants who took part in the study.</p>
</ack><sec sec-type="supplementary-material" id="s8">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fnbot.2021.648527/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fnbot.2021.648527/full#supplementary-material</ext-link></p>
<supplementary-material xlink:href="Data_Sheet_1.ZIP" id="SM1" mimetype="application/zip" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<ref-list>
<title>References</title>
<ref id="B1">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Baker</surname> <given-names>C. L.</given-names></name> <name><surname>Saxe</surname> <given-names>R.</given-names></name> <name><surname>Tenenbaum</surname> <given-names>J. B.</given-names></name></person-group> (<year>2009</year>). <article-title>Action understanding as inverse planning</article-title>. <source>Cognition</source> <volume>113</volume>, <fpage>329</fpage>&#x02013;<lpage>349</lpage>. <pub-id pub-id-type="doi">10.1016/j.cognition.2009.07.005</pub-id></citation></ref>
<ref id="B2">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Barrett</surname> <given-names>H. C.</given-names></name> <name><surname>Todd</surname> <given-names>P. M.</given-names></name> <name><surname>Miller</surname> <given-names>G. F.</given-names></name> <name><surname>Blythe</surname> <given-names>P. W.</given-names></name></person-group> (<year>2005</year>). <article-title>Accurate judgments of intention from motion cues alone: a cross-cultural study</article-title>. <source>Evol. Hum. Behav</source>. <volume>26</volume>, <fpage>313</fpage>&#x02013;<lpage>331</lpage>. <pub-id pub-id-type="doi">10.1016/j.evolhumbehav.2004.08.015</pub-id></citation></ref>
<ref id="B3">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Battaglia</surname> <given-names>P. W.</given-names></name> <name><surname>Hamrick</surname> <given-names>J. B.</given-names></name> <name><surname>Tenenbaum</surname> <given-names>J. B.</given-names></name></person-group> (<year>2013</year>). <article-title>Simulation as an engine of physical scene understanding</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A</source>. <volume>110</volume>, <fpage>18327</fpage>&#x02013;<lpage>18332</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1306572110</pub-id><pub-id pub-id-type="pmid">24145417</pub-id></citation></ref>
<ref id="B4">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bicho</surname> <given-names>E.</given-names></name> <name><surname>Mallet</surname> <given-names>P.</given-names></name> <name><surname>Sch&#x000F6;ner</surname> <given-names>G.</given-names></name></person-group> (<year>2000</year>). <article-title>Target representation on an autonomous vehicle with low-level sensors</article-title>. <source>Int. J. Robot. Res</source>. <volume>19</volume>, <fpage>424</fpage>&#x02013;<lpage>447</lpage>. <pub-id pub-id-type="doi">10.1177/02783640022066950</pub-id></citation></ref>
<ref id="B5">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Bicho</surname> <given-names>E.</given-names></name> <name><surname>Sch&#x000F6;ner</surname> <given-names>G.</given-names></name></person-group> (<year>1997</year>). <article-title>The dynamic approach to autonomous robotics demonstrated on a low-level vehicle platform</article-title>. <source>Robot. Auton. Syst</source>. <volume>21</volume>, <fpage>23</fpage>&#x02013;<lpage>35</lpage>. <pub-id pub-id-type="doi">10.1016/S0921-8890(97)00004-3</pub-id></citation></ref>
<ref id="B6">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Blackwell</surname> <given-names>P.</given-names></name></person-group> (<year>1997</year>). <article-title>Random diffusion models for animal movement</article-title>. <source>Ecol. Model</source>. <volume>100</volume>, <fpage>87</fpage>&#x02013;<lpage>102</lpage>. <pub-id pub-id-type="doi">10.1016/S0304-3800(97)00153-1</pub-id></citation></ref>
<ref id="B7">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Blythe</surname> <given-names>P. W.</given-names></name> <name><surname>Todd</surname> <given-names>P. M.</given-names></name> <name><surname>Miller</surname> <given-names>G. F.</given-names></name></person-group> (<year>1999</year>). <article-title>&#x0201C;How motion reveals intention: categorizing social interactions,&#x0201D;</article-title> in <source>Simple Heuristics That Make Us Smart</source>, eds G. Gigerenzer and P. M. Todd (<publisher-name>Oxford University Press</publisher-name>), <fpage>257</fpage>&#x02013;<lpage>285</lpage>.</citation></ref>
<ref id="B8">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Carreira</surname> <given-names>J.</given-names></name> <name><surname>Zisserman</surname> <given-names>A.</given-names></name></person-group> (<year>2017</year>). <article-title>&#x0201C;Quo vadis, action recognition? A new model and the kinetics dataset,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source> (<publisher-loc>Honolulu, HI</publisher-loc>), <fpage>6299</fpage>&#x02013;<lpage>6308</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2017.502</pub-id></citation></ref>
<ref id="B9">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Castelli</surname> <given-names>F.</given-names></name> <name><surname>Frith</surname> <given-names>C.</given-names></name> <name><surname>Happ&#x000E9;</surname> <given-names>F.</given-names></name> <name><surname>Frith</surname> <given-names>U.</given-names></name></person-group> (<year>2002</year>). <article-title>Autism, asperger syndrome and brain mechanisms for the attribution of mental states to animated shapes</article-title>. <source>Brain</source> <volume>125</volume>, <fpage>1839</fpage>&#x02013;<lpage>1849</lpage>. <pub-id pub-id-type="doi">10.1093/brain/awf189</pub-id><pub-id pub-id-type="pmid">12135974</pub-id></citation></ref>
<ref id="B10">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Castelli</surname> <given-names>F.</given-names></name> <name><surname>Happ&#x000E9;</surname> <given-names>F.</given-names></name> <name><surname>Frith</surname> <given-names>U.</given-names></name> <name><surname>Frith</surname> <given-names>C.</given-names></name></person-group> (<year>2000</year>). <article-title>Movement and mind: a functional imaging study of perception and interpretation of complex intentional movement patterns</article-title>. <source>Neuroimage</source> <volume>12</volume>, <fpage>314</fpage>&#x02013;<lpage>325</lpage>. <pub-id pub-id-type="doi">10.1006/nimg.2000.0612</pub-id><pub-id pub-id-type="pmid">10944414</pub-id></citation></ref>
<ref id="B11">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Csibra</surname> <given-names>G.</given-names></name></person-group> (<year>2008</year>). <article-title>Goal attribution to inanimate agents by 6.5-month-old infants</article-title>. <source>Cognition</source> <volume>107</volume>, <fpage>705</fpage>&#x02013;<lpage>717</lpage>. <pub-id pub-id-type="doi">10.1016/j.cognition.2007.08.001</pub-id><pub-id pub-id-type="pmid">17869235</pub-id></citation></ref>
<ref id="B12">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Dittrich</surname> <given-names>W. H.</given-names></name> <name><surname>Lea</surname> <given-names>S. E.</given-names></name></person-group> (<year>1994</year>). <article-title>Visual perception of intentional motion</article-title>. <source>Perception</source> <volume>23</volume>, <fpage>253</fpage>&#x02013;<lpage>268</lpage>. <pub-id pub-id-type="doi">10.1068/p230253</pub-id></citation></ref>
<ref id="B13">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Fajen</surname> <given-names>B. R.</given-names></name> <name><surname>Warren</surname> <given-names>W. H.</given-names></name></person-group> (<year>2003</year>). <article-title>Behavioral dynamics of steering, obstacle avoidance, and route selection</article-title>. <source>J. Exp. Psychol</source>. <volume>29</volume>:<fpage>343</fpage>. <pub-id pub-id-type="doi">10.1037/0096-1523.29.2.343</pub-id><pub-id pub-id-type="pmid">12760620</pub-id></citation></ref>
<ref id="B14">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Freiwald</surname> <given-names>W. A.</given-names></name></person-group> (<year>2020</year>). <article-title>The neural mechanisms of face processing: cells, areas, networks, and models</article-title>. <source>Curr. Opin. Neurobiol</source>. <volume>60</volume>, <fpage>184</fpage>&#x02013;<lpage>191</lpage>. <pub-id pub-id-type="doi">10.1016/j.conb.2019.12.007</pub-id><pub-id pub-id-type="pmid">31958622</pub-id></citation></ref>
<ref id="B15">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gao</surname> <given-names>S.</given-names></name> <name><surname>Zhou</surname> <given-names>M.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name> <name><surname>Cheng</surname> <given-names>J.</given-names></name> <name><surname>Yachi</surname> <given-names>H.</given-names></name> <name><surname>Wang</surname> <given-names>J.</given-names></name></person-group> (<year>2018</year>). <article-title>Dendritic neuron model with effective learning algorithms for classification, approximation, and prediction</article-title>. <source>IEEE Trans. Neural Netw. Learn. Syst</source>. <volume>30</volume>, <fpage>601</fpage>&#x02013;<lpage>614</lpage>. <pub-id pub-id-type="doi">10.1109/TNNLS.2018.2846646</pub-id><pub-id pub-id-type="pmid">30004892</pub-id></citation></ref>
<ref id="B16">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gao</surname> <given-names>T.</given-names></name> <name><surname>McCarthy</surname> <given-names>G.</given-names></name> <name><surname>Scholl</surname> <given-names>B. J.</given-names></name></person-group> (<year>2010</year>). <article-title>The wolfpack effect: perception of animacy irresistibly influences interactive behavior</article-title>. <source>Psychol. Sci</source>. <volume>21</volume>, <fpage>1845</fpage>&#x02013;<lpage>1853</lpage>. <pub-id pub-id-type="doi">10.1177/0956797610388814</pub-id><pub-id pub-id-type="pmid">21078895</pub-id></citation></ref>
<ref id="B17">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gao</surname> <given-names>T.</given-names></name> <name><surname>Newman</surname> <given-names>G. E.</given-names></name> <name><surname>Scholl</surname> <given-names>B. J.</given-names></name></person-group> (<year>2009</year>). <article-title>The psychophysics of chasing: a case study in the perception of animacy</article-title>. <source>Cogn. Psychol</source>. <volume>59</volume>, <fpage>154</fpage>&#x02013;<lpage>179</lpage>. <pub-id pub-id-type="doi">10.1016/j.cogpsych.2009.03.001</pub-id><pub-id pub-id-type="pmid">19500784</pub-id></citation></ref>
<ref id="B18">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Gelman</surname> <given-names>R.</given-names></name> <name><surname>Durgin</surname> <given-names>F.</given-names></name> <name><surname>Kaufman</surname> <given-names>L.</given-names></name></person-group> (<year>1995</year>). <article-title>&#x0201C;Distinguishing between animates and inanimates: not by motion alone,&#x0201D;</article-title> in <source>Causal Cognition: A Multidisciplinary Debate</source>, eds D. Sperber, D. Premack, and A. J. Premack (<publisher-loc>Oxford</publisher-loc>: <publisher-name>Clarendon Press</publisher-name>), <fpage>150</fpage>&#x02013;<lpage>184</lpage>. <pub-id pub-id-type="doi">10.1093/acprof:oso/9780198524021.003.0006</pub-id></citation></ref>
<ref id="B19">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Gon&#x000E7;alves</surname> <given-names>P. J.</given-names></name> <name><surname>Lueckmann</surname> <given-names>J.-M.</given-names></name> <name><surname>Deistler</surname> <given-names>M.</given-names></name> <name><surname>Nonnenmacher</surname> <given-names>M.</given-names></name> <name><surname>&#x000D6;cal</surname> <given-names>K.</given-names></name> <name><surname>Bassetto</surname> <given-names>G.</given-names></name> <etal/></person-group>. (<year>2020</year>). <article-title>Training deep neural density estimators to identify mechanistic models of neural dynamics</article-title>. <source>Elife</source> <volume>9</volume>:<fpage>e56261</fpage>. <pub-id pub-id-type="doi">10.7554/eLife.56261</pub-id><pub-id pub-id-type="pmid">32940606</pub-id></citation></ref>
<ref id="B20">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Gordon</surname> <given-names>A. S.</given-names></name> <name><surname>Roemmele</surname> <given-names>M.</given-names></name></person-group> (<year>2014</year>). <article-title>&#x0201C;An authoring tool for movies in the style of Heider and Simmel,&#x0201D;</article-title> in <source>International Conference on Interactive Digital Storytelling</source> (<publisher-loc>Singapore</publisher-loc>: <publisher-name>Springer</publisher-name>), <fpage>49</fpage>&#x02013;<lpage>60</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-12337-0_5</pub-id></citation></ref>
<ref id="B21">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Gupta</surname> <given-names>A.</given-names></name> <name><surname>Johnson</surname> <given-names>J.</given-names></name> <name><surname>Fei-Fei</surname> <given-names>L.</given-names></name> <name><surname>Savarese</surname> <given-names>S.</given-names></name> <name><surname>Alahi</surname> <given-names>A.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Social GAN: socially acceptable trajectories with generative adversarial networks,&#x0201D;</article-title> in <source>Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition</source> (<publisher-loc>Salt Lake City, UT</publisher-loc>), <fpage>2255</fpage>&#x02013;<lpage>2264</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2018.00240</pub-id></citation></ref>
<ref id="B22">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Heider</surname> <given-names>F.</given-names></name> <name><surname>Simmel</surname> <given-names>M.</given-names></name></person-group> (<year>1944</year>). <article-title>An experimental study of apparent behavior</article-title>. <source>Am. J. Psychol</source>. <volume>57</volume>, <fpage>243</fpage>&#x02013;<lpage>259</lpage>. <pub-id pub-id-type="doi">10.2307/1416950</pub-id></citation></ref>
<ref id="B23">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Isik</surname> <given-names>L.</given-names></name> <name><surname>Koldewyn</surname> <given-names>K.</given-names></name> <name><surname>Beeler</surname> <given-names>D.</given-names></name> <name><surname>Kanwisher</surname> <given-names>N.</given-names></name></person-group> (<year>2017</year>). <article-title>Perceiving social interactions in the posterior superior temporal sulcus</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A</source>. <volume>114</volume>, <fpage>E9145</fpage>&#x02013;<lpage>E9152</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1714471114</pub-id><pub-id pub-id-type="pmid">29279406</pub-id></citation></ref>
<ref id="B24">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Kaduk</surname> <given-names>K.</given-names></name> <name><surname>Elsner</surname> <given-names>B.</given-names></name> <name><surname>Reid</surname> <given-names>V. M.</given-names></name></person-group> (<year>2013</year>). <article-title>Discrimination of animate and inanimate motion in 9-month-old infants: an ERP study</article-title>. <source>Dev. Cogn. Neurosci</source>. <volume>6</volume>, <fpage>14</fpage>&#x02013;<lpage>22</lpage>. <pub-id pub-id-type="doi">10.1016/j.dcn.2013.05.003</pub-id><pub-id pub-id-type="pmid">23811318</pub-id></citation></ref>
<ref id="B25">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Karpathy</surname> <given-names>A.</given-names></name> <name><surname>Toderici</surname> <given-names>G.</given-names></name> <name><surname>Shetty</surname> <given-names>S.</given-names></name> <name><surname>Leung</surname> <given-names>T.</given-names></name> <name><surname>Sukthankar</surname> <given-names>R.</given-names></name> <name><surname>Fei-Fei</surname> <given-names>L.</given-names></name></person-group> (<year>2014</year>). <article-title>&#x0201C;Large-scale video classification with convolutional neural networks,&#x0201D;</article-title> in <source>Proceedings of the IEEE conference on Computer Vision and Pattern Recognition</source> (<publisher-loc>Columbus, OH</publisher-loc>), <fpage>1725</fpage>&#x02013;<lpage>1732</lpage>. <pub-id pub-id-type="doi">10.1109/CVPR.2014.223</pub-id></citation></ref>
<ref id="B26">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Kerr</surname> <given-names>W.</given-names></name> <name><surname>Cohen</surname> <given-names>P.</given-names></name></person-group> (<year>2010</year>). <article-title>&#x0201C;Recognizing behaviors and the internal state of the participants,&#x0201D;</article-title> in <source>2010 IEEE 9th International Conference on Development and Learning</source> (<publisher-loc>Ann Arbor, MI</publisher-loc>), <fpage>33</fpage>&#x02013;<lpage>38</lpage>. <pub-id pub-id-type="doi">10.1109/DEVLRN.2010.5578868</pub-id></citation></ref>
<ref id="B27">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Kumar</surname> <given-names>S. S.</given-names></name> <name><surname>John</surname> <given-names>M.</given-names></name></person-group> (<year>2016</year>). <article-title>&#x0201C;Human activity recognition using optical flow based feature set,&#x0201D;</article-title> in <source>2016 IEEE International Carnahan Conference on Security Technology (ICCST)</source> (<publisher-loc>Orlando, FL</publisher-loc>), <fpage>1</fpage>&#x02013;<lpage>5</lpage>. <pub-id pub-id-type="doi">10.1109/CCST.2016.7815694</pub-id></citation></ref>
<ref id="B28">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Luo</surname> <given-names>L.</given-names></name> <name><surname>Zhou</surname> <given-names>S.</given-names></name> <name><surname>Cai</surname> <given-names>W.</given-names></name> <name><surname>Low</surname> <given-names>M. Y. H.</given-names></name> <name><surname>Tian</surname> <given-names>F.</given-names></name> <name><surname>Wang</surname> <given-names>Y.</given-names></name> <etal/></person-group>. (<year>2008</year>). <article-title>Agent-based human behavior modeling for crowd simulation</article-title>. <source>Comput. Anim. Virt. Worlds</source> <volume>19</volume>, <fpage>271</fpage>&#x02013;<lpage>281</lpage>. <pub-id pub-id-type="doi">10.1002/cav.238</pub-id></citation></ref>
<ref id="B29">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>McAleer</surname> <given-names>P.</given-names></name> <name><surname>Kay</surname> <given-names>J. W.</given-names></name> <name><surname>Pollick</surname> <given-names>F. E.</given-names></name> <name><surname>Rutherford</surname> <given-names>M.</given-names></name></person-group> (<year>2011</year>). <article-title>Intention perception in high functioning people with autism spectrum disorders using animacy displays derived from human actions</article-title>. <source>J. Autism Dev. Disord</source>. <volume>41</volume>, <fpage>1053</fpage>&#x02013;<lpage>1063</lpage>. <pub-id pub-id-type="doi">10.1007/s10803-010-1130-8</pub-id><pub-id pub-id-type="pmid">21069445</pub-id></citation></ref>
<ref id="B30">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>McAleer</surname> <given-names>P.</given-names></name> <name><surname>Pollick</surname> <given-names>F. E.</given-names></name></person-group> (<year>2008</year>). <article-title>Understanding intention from minimal displays of human activity</article-title>. <source>Behav. Res. Methods</source> <volume>40</volume>, <fpage>830</fpage>&#x02013;<lpage>839</lpage>. <pub-id pub-id-type="doi">10.3758/BRM.40.3.830</pub-id><pub-id pub-id-type="pmid">18697679</pub-id></citation></ref>
<ref id="B31">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Michotte</surname> <given-names>A.</given-names></name></person-group> (<year>1946</year>). <source>The Perception of Causality, Vol. 21</source>. <publisher-loc>New York, NY</publisher-loc>: <publisher-name>Basic Books</publisher-name>.</citation></ref>
<ref id="B32">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Mukovskiy</surname> <given-names>A.</given-names></name> <name><surname>Slotine</surname> <given-names>J.-J. E.</given-names></name> <name><surname>Giese</surname> <given-names>M. A.</given-names></name></person-group> (<year>2013</year>). <article-title>Dynamically stable control of articulated crowds</article-title>. <source>J. Comput. Sci</source>. <volume>4</volume>, <fpage>304</fpage>&#x02013;<lpage>310</lpage>. <pub-id pub-id-type="doi">10.1016/j.jocs.2012.08.019</pub-id></citation></ref>
<ref id="B33">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Oatley</surname> <given-names>K.</given-names></name> <name><surname>Yuill</surname> <given-names>N.</given-names></name></person-group> (<year>1985</year>). <article-title>Perception of personal and interpersonal action in a cartoon film</article-title>. <source>Br. J. Soc. Psychol</source>. <volume>24</volume>, <fpage>115</fpage>&#x02013;<lpage>124</lpage>. <pub-id pub-id-type="doi">10.1111/j.2044-8309.1985.tb00670.x</pub-id></citation></ref>
<ref id="B34">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Pantelis</surname> <given-names>P. C.</given-names></name> <name><surname>Baker</surname> <given-names>C. L.</given-names></name> <name><surname>Cholewiak</surname> <given-names>S. A.</given-names></name> <name><surname>Sanik</surname> <given-names>K.</given-names></name> <name><surname>Weinstein</surname> <given-names>A.</given-names></name> <name><surname>Wu</surname> <given-names>C.-C.</given-names></name> <etal/></person-group>. (<year>2014</year>). <article-title>Inferring the intentional states of autonomous virtual agents</article-title>. <source>Cognition</source> <volume>130</volume>, <fpage>360</fpage>&#x02013;<lpage>379</lpage>. <pub-id pub-id-type="doi">10.1016/j.cognition.2013.11.011</pub-id><pub-id pub-id-type="pmid">24389312</pub-id></citation></ref>
<ref id="B35">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Paris</surname> <given-names>S.</given-names></name> <name><surname>Pettr&#x000E9;</surname> <given-names>J.</given-names></name> <name><surname>Donikian</surname> <given-names>S.</given-names></name></person-group> (<year>2007</year>). <article-title>&#x0201C;Pedestrian reactive navigation for crowd simulation: a predictive approach,&#x0201D;</article-title> in <source>Computer Graphics Forum, Vol. 26</source> (<publisher-loc>Prague</publisher-loc>: <publisher-name>Wiley Online Library</publisher-name>), <fpage>665</fpage>&#x02013;<lpage>674</lpage>. <pub-id pub-id-type="doi">10.1111/j.1467-8659.2007.01090.x</pub-id></citation></ref>
<ref id="B36">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Reichardt</surname> <given-names>W.</given-names></name> <name><surname>Poggio</surname> <given-names>T.</given-names></name></person-group> (<year>1976</year>). <article-title>Visual control of orientation behaviour in the fly: Part I. A quantitative analysis</article-title>. <source>Q. Rev. Biophys</source>. <volume>9</volume>, <fpage>311</fpage>&#x02013;<lpage>375</lpage>. <pub-id pub-id-type="doi">10.1017/S0033583500002523</pub-id><pub-id pub-id-type="pmid">790441</pub-id></citation></ref>
<ref id="B37">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Reimann</surname> <given-names>H.</given-names></name> <name><surname>Iossifidis</surname> <given-names>I.</given-names></name> <name><surname>Sch&#x000F6;ner</surname> <given-names>G.</given-names></name></person-group> (<year>2011</year>). <article-title>&#x0201C;Autonomous movement generation for manipulators with multiple simultaneous constraints using the attractor dynamics approach,&#x0201D;</article-title> in <source>2011 IEEE International Conference on Robotics and Automation</source> (<publisher-loc>Shanghai</publisher-loc>), <fpage>5470</fpage>&#x02013;<lpage>5477</lpage>. <pub-id pub-id-type="doi">10.1109/ICRA.2011.5980184</pub-id></citation></ref>
<ref id="B38">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rim&#x000E9;</surname> <given-names>B.</given-names></name> <name><surname>Boulanger</surname> <given-names>B.</given-names></name> <name><surname>Laubin</surname> <given-names>P.</given-names></name> <name><surname>Richir</surname> <given-names>M.</given-names></name> <name><surname>Stroobants</surname> <given-names>K.</given-names></name></person-group> (<year>1985</year>). <article-title>The perception of interpersonal emotions originated by patterns of movement</article-title>. <source>Motiv. Emot</source>. <volume>9</volume>, <fpage>241</fpage>&#x02013;<lpage>260</lpage>. <pub-id pub-id-type="doi">10.1007/BF00991830</pub-id></citation></ref>
<ref id="B39">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Rodr&#x000ED;guez-Moreno</surname> <given-names>I.</given-names></name> <name><surname>Mart&#x000ED;nez-Otzeta</surname> <given-names>J. M.</given-names></name> <name><surname>Sierra</surname> <given-names>B.</given-names></name> <name><surname>Rodriguez</surname> <given-names>I.</given-names></name> <name><surname>Jauregi</surname> <given-names>E.</given-names></name></person-group> (<year>2019</year>). <article-title>Video activity recognition: state-of-the-art</article-title>. <source>Sensors</source> <volume>19</volume>:<fpage>3160</fpage>. <pub-id pub-id-type="doi">10.3390/s19143160</pub-id></citation></ref>
<ref id="B40">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Roether</surname> <given-names>C. L.</given-names></name> <name><surname>Omlor</surname> <given-names>L.</given-names></name> <name><surname>Christensen</surname> <given-names>A.</given-names></name> <name><surname>Giese</surname> <given-names>M. A.</given-names></name></person-group> (<year>2009</year>). <article-title>Critical features for the perception of emotion from gait</article-title>. <source>J. Vis</source>. <volume>9</volume>:<fpage>15</fpage>. <pub-id pub-id-type="doi">10.1167/9.6.15</pub-id><pub-id pub-id-type="pmid">19761306</pub-id></citation></ref>
<ref id="B41">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Russell</surname> <given-names>J. C.</given-names></name> <name><surname>Hanks</surname> <given-names>E. M.</given-names></name> <name><surname>Modlmeier</surname> <given-names>A. P.</given-names></name> <name><surname>Hughes</surname> <given-names>D. P.</given-names></name></person-group> (<year>2017</year>). <article-title>Modeling collective animal movement through interactions in behavioral states</article-title>. <source>J. Agric. Biol. Environ. Stat</source>. <volume>22</volume>, <fpage>313</fpage>&#x02013;<lpage>334</lpage>. <pub-id pub-id-type="doi">10.1007/s13253-017-0296-3</pub-id></citation></ref>
<ref id="B42">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Saxe</surname> <given-names>R.</given-names></name> <name><surname>Kanwisher</surname> <given-names>N.</given-names></name></person-group> (<year>2003</year>). <article-title>People thinking about thinking people: the role of the temporo-parietal junction in &#x0201C;theory of mind.&#x0201D;</article-title> <source>Neuroimage</source> <volume>19</volume>, <fpage>1835</fpage>&#x02013;<lpage>1842</lpage>. <pub-id pub-id-type="doi">10.1016/S1053-8119(03)00230-1</pub-id><pub-id pub-id-type="pmid">12948738</pub-id></citation></ref>
<ref id="B43">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Scholl</surname> <given-names>B. J.</given-names></name> <name><surname>Gao</surname> <given-names>T.</given-names></name></person-group> (<year>2013</year>). <article-title>&#x0201C;Perceiving animacy and intentionality: visual processing or higher-level judgment,&#x0201D;</article-title> in <source>Social Perception: Detection and Interpretation of Animacy, Agency, and Intention</source>, eds M. D. Rutherford and V. A. Kuhlmeier (<publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>MIT Press</publisher-name>), <fpage>197</fpage>&#x02013;<lpage>230</lpage>. <pub-id pub-id-type="doi">10.7551/mitpress/9780262019279.003.0009</pub-id></citation></ref>
<ref id="B44">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Scholl</surname> <given-names>B. J.</given-names></name> <name><surname>Tremoulet</surname> <given-names>P. D.</given-names></name></person-group> (<year>2000</year>). <article-title>Perceptual causality and animacy</article-title>. <source>Trends Cogn. Sci</source>. <volume>4</volume>, <fpage>299</fpage>&#x02013;<lpage>309</lpage>. <pub-id pub-id-type="doi">10.1016/S1364-6613(00)01506-0</pub-id></citation></ref>
<ref id="B45">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sch&#x000F6;n</surname> <given-names>T. B.</given-names></name> <name><surname>Wills</surname> <given-names>A.</given-names></name> <name><surname>Ninness</surname> <given-names>B.</given-names></name></person-group> (<year>2011</year>). <article-title>System identification of nonlinear state-space models</article-title>. <source>Automatica</source> <volume>47</volume>, <fpage>39</fpage>&#x02013;<lpage>49</lpage>. <pub-id pub-id-type="doi">10.1016/j.automatica.2010.10.013</pub-id></citation></ref>
<ref id="B46">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sch&#x000F6;ner</surname> <given-names>G.</given-names></name> <name><surname>Dose</surname> <given-names>M.</given-names></name></person-group> (<year>1992</year>). <article-title>A dynamical systems approach to task-level system integration used to plan and control autonomous vehicle motion</article-title>. <source>Robot. Auton. Syst</source>. <volume>10</volume>, <fpage>253</fpage>&#x02013;<lpage>267</lpage>. <pub-id pub-id-type="doi">10.1016/0921-8890(92)90004-I</pub-id></citation></ref>
<ref id="B47">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sch&#x000F6;ner</surname> <given-names>G.</given-names></name> <name><surname>Dose</surname> <given-names>M.</given-names></name> <name><surname>Engels</surname> <given-names>C.</given-names></name></person-group> (<year>1995</year>). <article-title>Dynamics of behavior: theory and applications for autonomous robot architectures</article-title>. <source>Robot. Auton. Syst</source>. <volume>16</volume>, <fpage>213</fpage>&#x02013;<lpage>245</lpage>. <pub-id pub-id-type="doi">10.1016/0921-8890(95)00049-6</pub-id></citation></ref>
<ref id="B48">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Schultz</surname> <given-names>J.</given-names></name> <name><surname>B&#x000FC;lthoff</surname> <given-names>H. H.</given-names></name></person-group> (<year>2019</year>). <article-title>Perceiving animacy purely from visual motion cues involves intraparietal sulcus</article-title>. <source>NeuroImage</source> <volume>197</volume>, <fpage>120</fpage>&#x02013;<lpage>132</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuroimage.2019.04.058</pub-id><pub-id pub-id-type="pmid">31028922</pub-id></citation></ref>
<ref id="B49">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Sehgal</surname> <given-names>S.</given-names></name></person-group> (<year>2018</year>). <article-title>&#x0201C;Human activity recognition using BPNN classifier on hog features,&#x0201D;</article-title> in <source>2018 International Conference on Intelligent Circuits and Systems (ICICS)</source> (<publisher-loc>Phagwara</publisher-loc>), <fpage>286</fpage>&#x02013;<lpage>289</lpage>. <pub-id pub-id-type="doi">10.1109/ICICS.2018.00065</pub-id></citation></ref>
<ref id="B50">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shepard</surname> <given-names>R. N.</given-names></name></person-group> (<year>1962a</year>). <article-title>The analysis of proximities: multidimensional scaling with an unknown distance function. I</article-title>. <source>Psychometrika</source> <volume>27</volume>, <fpage>125</fpage>&#x02013;<lpage>140</lpage>. <pub-id pub-id-type="doi">10.1007/BF02289630</pub-id></citation></ref>
<ref id="B51">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shepard</surname> <given-names>R. N.</given-names></name></person-group> (<year>1962b</year>). <article-title>The analysis of proximities: multidimensional scaling with an unknown distance function. II</article-title>. <source>Psychometrika</source> <volume>27</volume>, <fpage>219</fpage>&#x02013;<lpage>246</lpage>. <pub-id pub-id-type="doi">10.1007/BF02289621</pub-id></citation></ref>
<ref id="B52">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Shu</surname> <given-names>T.</given-names></name> <name><surname>Kryven</surname> <given-names>M.</given-names></name> <name><surname>Ullman</surname> <given-names>T. D.</given-names></name> <name><surname>Tenenbaum</surname> <given-names>J. B.</given-names></name></person-group> (<year>2020</year>). <article-title>&#x0201C;Adventures in flatland: perceiving social interactions under physical dynamics,&#x0201D;</article-title> in <source>Proceedings of the 42nd Annual Conference of the Cognitive Science Society</source> (<publisher-loc>Toronto</publisher-loc>).</citation></ref>
<ref id="B53">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Shu</surname> <given-names>T.</given-names></name> <name><surname>Peng</surname> <given-names>Y.</given-names></name> <name><surname>Fan</surname> <given-names>L.</given-names></name> <name><surname>Lu</surname> <given-names>H.</given-names></name> <name><surname>Zhu</surname> <given-names>S.-C.</given-names></name></person-group> (<year>2018</year>). <article-title>Perception of human interaction based on motion trajectories: from aerial videos to decontextualized animations</article-title>. <source>Top. Cogn. Sci</source>. <volume>10</volume>, <fpage>225</fpage>&#x02013;<lpage>241</lpage>. <pub-id pub-id-type="doi">10.1111/tops.12313</pub-id><pub-id pub-id-type="pmid">29214731</pub-id></citation></ref>
<ref id="B54">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Shu</surname> <given-names>T.</given-names></name> <name><surname>Peng</surname> <given-names>Y.</given-names></name> <name><surname>Lu</surname> <given-names>H.</given-names></name> <name><surname>Zhu</surname> <given-names>S.</given-names></name></person-group> (<year>2019</year>). <article-title>&#x0201C;Partitioning the perception of physical and social events within a unified psychological space,&#x0201D;</article-title> in <source>Proceedings of the 41st Annual Conference of the Cognitive Science Society</source> (<publisher-loc>Montreal</publisher-loc>).</citation></ref>
<ref id="B55">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Sliwa</surname> <given-names>J.</given-names></name> <name><surname>Freiwald</surname> <given-names>W. A.</given-names></name></person-group> (<year>2017</year>). <article-title>A dedicated network for social interaction processing in the primate brain</article-title>. <source>Science</source> <volume>356</volume>, <fpage>745</fpage>&#x02013;<lpage>749</lpage>. <pub-id pub-id-type="doi">10.1126/science.aam6383</pub-id><pub-id pub-id-type="pmid">28522533</pub-id></citation></ref>
<ref id="B56">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Springer</surname> <given-names>K.</given-names></name> <name><surname>Meier</surname> <given-names>J. A.</given-names></name> <name><surname>Berry</surname> <given-names>D. S.</given-names></name></person-group> (<year>1996</year>). <article-title>Nonverbal bases of social perception: developmental change in sensitivity to patterns of motion that reveal interpersonal events</article-title>. <source>J. Nonverb. Behav</source>. <volume>20</volume>, <fpage>199</fpage>&#x02013;<lpage>211</lpage>. <pub-id pub-id-type="doi">10.1007/BF02248673</pub-id></citation></ref>
<ref id="B57">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Stergiou</surname> <given-names>A.</given-names></name> <name><surname>Poppe</surname> <given-names>R.</given-names></name></person-group> (<year>2019</year>). <article-title>Analyzing human-human interactions: a survey</article-title>. <source>Comput. Vis. Image Understand</source>. <volume>188</volume>:<fpage>102799</fpage>. <pub-id pub-id-type="doi">10.1016/j.cviu.2019.102799</pub-id></citation></ref>
<ref id="B58">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Szego</surname> <given-names>P. A.</given-names></name> <name><surname>Rutherford</surname> <given-names>M. D.</given-names></name></person-group> (<year>2008</year>). <article-title>Dissociating the perception of speed and the perception of animacy: a functional approach</article-title>. <source>Evol. Hum. Behav</source>. <volume>29</volume>, <fpage>335</fpage>&#x02013;<lpage>342</lpage>. <pub-id pub-id-type="doi">10.1016/j.evolhumbehav.2008.04.002</pub-id></citation></ref>
<ref id="B59">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Thurman</surname> <given-names>S. M.</given-names></name> <name><surname>Lu</surname> <given-names>H.</given-names></name></person-group> (<year>2014</year>). <article-title>Perception of social interactions for spatially scrambled biological motion</article-title>. <source>PLoS ONE</source> <volume>9</volume>:<fpage>e112539</fpage>. <pub-id pub-id-type="doi">10.1371/journal.pone.0112539</pub-id><pub-id pub-id-type="pmid">25406075</pub-id></citation></ref>
<ref id="B60">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tr&#x000E4;uble</surname> <given-names>B.</given-names></name> <name><surname>Pauen</surname> <given-names>S.</given-names></name> <name><surname>Poulin-Dubois</surname> <given-names>D.</given-names></name></person-group> (<year>2014</year>). <article-title>Speed and direction changes induce the perception of animacy in 7-month-old infants</article-title>. <source>Front. Psychol</source>. <volume>5</volume>:<fpage>1141</fpage>. <pub-id pub-id-type="doi">10.3389/fpsyg.2014.01141</pub-id><pub-id pub-id-type="pmid">25346712</pub-id></citation></ref>
<ref id="B61">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tremoulet</surname> <given-names>P. D.</given-names></name> <name><surname>Feldman</surname> <given-names>J.</given-names></name></person-group> (<year>2000</year>). <article-title>Perception of animacy from the motion of a single object</article-title>. <source>Perception</source> <volume>29</volume>, <fpage>943</fpage>&#x02013;<lpage>951</lpage>. <pub-id pub-id-type="doi">10.1068/p3101</pub-id><pub-id pub-id-type="pmid">26561971</pub-id></citation></ref>
<ref id="B62">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Tremoulet</surname> <given-names>P. D.</given-names></name> <name><surname>Feldman</surname> <given-names>J.</given-names></name></person-group> (<year>2006</year>). <article-title>The influence of spatial context and the role of intentionality in the interpretation of animacy from motion</article-title>. <source>Percept. Psychophys</source>. <volume>68</volume>, <fpage>1047</fpage>&#x02013;<lpage>1058</lpage>. <pub-id pub-id-type="doi">10.3758/BF03193364</pub-id><pub-id pub-id-type="pmid">17153197</pub-id></citation></ref>
<ref id="B63">
<citation citation-type="book"><person-group person-group-type="author"><name><surname>Troje</surname> <given-names>N.</given-names></name> <name><surname>Simion</surname> <given-names>F.</given-names></name> <name><surname>Bardi</surname> <given-names>L.</given-names></name> <name><surname>Mascalzoni</surname> <given-names>E.</given-names></name> <name><surname>Regolin</surname> <given-names>L.</given-names></name> <name><surname>Grossman</surname> <given-names>E.</given-names></name> <etal/></person-group>. (<year>2013</year>). <source>Social Perception: Detection and Interpretation of Animacy, Agency, and Intention</source>. <publisher-loc>Cambridge, MA</publisher-loc>: <publisher-name>MIT Press</publisher-name>.</citation></ref>
<ref id="B64">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>van Buren</surname> <given-names>B.</given-names></name> <name><surname>Gao</surname> <given-names>T.</given-names></name> <name><surname>Scholl</surname> <given-names>B. J.</given-names></name></person-group> (<year>2017</year>). <article-title>What are the underlying units of perceived animacy? Chasing detection is intrinsically object-based</article-title>. <source>Psychon. Bull. Rev</source>. <volume>24</volume>, <fpage>1604</fpage>&#x02013;<lpage>1610</lpage>. <pub-id pub-id-type="doi">10.3758/s13423-017-1229-4</pub-id><pub-id pub-id-type="pmid">28160268</pub-id></citation></ref>
<ref id="B65">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>van Buren</surname> <given-names>B.</given-names></name> <name><surname>Uddenberg</surname> <given-names>S.</given-names></name> <name><surname>Scholl</surname> <given-names>B. J.</given-names></name></person-group> (<year>2016</year>). <article-title>The automaticity of perceiving animacy: goal-directed motion in simple shapes influences visuomotor behavior even when task-irrelevant</article-title>. <source>Psychon. Bull. Rev</source>. <volume>23</volume>, <fpage>797</fpage>&#x02013;<lpage>802</lpage>. <pub-id pub-id-type="doi">10.3758/s13423-015-0966-5</pub-id><pub-id pub-id-type="pmid">26597889</pub-id></citation></ref>
<ref id="B66">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Walbrin</surname> <given-names>J.</given-names></name> <name><surname>Downing</surname> <given-names>P.</given-names></name> <name><surname>Koldewyn</surname> <given-names>K.</given-names></name></person-group> (<year>2018</year>). <article-title>Neural responses to visually observed social interactions</article-title>. <source>Neuropsychologia</source> <volume>112</volume>, <fpage>31</fpage>&#x02013;<lpage>39</lpage>. <pub-id pub-id-type="doi">10.1016/j.neuropsychologia.2018.02.023</pub-id><pub-id pub-id-type="pmid">29476765</pub-id></citation></ref>
<ref id="B67">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Ward</surname> <given-names>J. H.</given-names> <suffix>Jr.</suffix></name></person-group> (<year>1963</year>). <article-title>Hierarchical grouping to optimize an objective function</article-title>. <source>J. Am. Stat. Assoc</source>. <volume>58</volume>, <fpage>236</fpage>&#x02013;<lpage>244</lpage>. <pub-id pub-id-type="doi">10.1080/01621459.1963.10500845</pub-id></citation></ref>
<ref id="B68">
<citation citation-type="journal"><person-group person-group-type="author"><name><surname>Yamins</surname> <given-names>D. L.</given-names></name> <name><surname>Hong</surname> <given-names>H.</given-names></name> <name><surname>Cadieu</surname> <given-names>C. F.</given-names></name> <name><surname>Solomon</surname> <given-names>E. A.</given-names></name> <name><surname>Seibert</surname> <given-names>D.</given-names></name> <name><surname>DiCarlo</surname> <given-names>J. J.</given-names></name></person-group> (<year>2014</year>). <article-title>Performance-optimized hierarchical models predict neural responses in higher visual cortex</article-title>. <source>Proc. Natl. Acad. Sci. U.S.A</source>. <volume>111</volume>, <fpage>8619</fpage>&#x02013;<lpage>8624</lpage>. <pub-id pub-id-type="doi">10.1073/pnas.1403112111</pub-id><pub-id pub-id-type="pmid">24812127</pub-id></citation></ref>
</ref-list>
<fn-group>
<fn fn-type="financial-disclosure"><p><bold>Funding.</bold> This work was supported by the German Federal Ministry of Education and Research (BMBF FKZ 01GQ1704), the Human Frontiers Science Program (HFSP RGP0036/2016), the German Research Foundation (DFG GZ: KA 1258/15-1), and the European Research Council (ERC 2019-SyG-RELEVANCE-856495).</p>
</fn>
</fn-group>
</back>
</article>