<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE article PUBLIC "-//NLM//DTD Journal Publishing DTD v2.3 20070202//EN" "journalpublishing.dtd">
<article article-type="research-article" dtd-version="2.3" xml:lang="EN" xmlns:mml="http://www.w3.org/1998/Math/MathML" xmlns:xlink="http://www.w3.org/1999/xlink">
<front>
<journal-meta>
<journal-id journal-id-type="publisher-id">Front. Commun.</journal-id>
<journal-title>Frontiers in Communication</journal-title>
<abbrev-journal-title abbrev-type="pubmed">Front. Commun.</abbrev-journal-title>
<issn pub-type="epub">2297-900X</issn>
<publisher>
<publisher-name>Frontiers Media S.A.</publisher-name>
</publisher>
</journal-meta>
<article-meta>
<article-id pub-id-type="publisher-id">662240</article-id>
<article-id pub-id-type="doi">10.3389/fcomm.2021.662240</article-id>
<article-categories>
<subj-group subj-group-type="heading">
<subject>Communication</subject>
<subj-group>
<subject>Original Research</subject>
</subj-group>
</subj-group>
</article-categories>
<title-group>
<article-title>Multimodal Gestalts and Their Change Over Time: Is Routinization Also Grammaticalization?</article-title>
<alt-title alt-title-type="left-running-head">Stukenbrock</alt-title>
<alt-title alt-title-type="right-running-head">Multimodal Gestalts, Routinization, Grammaticalization</alt-title>
</title-group>
<contrib-group>
<contrib contrib-type="author" corresp="yes">
<name>
<surname>Stukenbrock</surname>
<given-names>Anja</given-names>
</name>
<xref ref-type="corresp" rid="c001">&#x2a;</xref>
<uri xlink:href="https://loop.frontiersin.org/people/829108/overview"/>
</contrib>
</contrib-group>
<aff>Chair of German Linguistics, University of Heidelberg, <addr-line>Heidelberg</addr-line>, <country>Germany</country>
</aff>
<author-notes>
<fn fn-type="edited-by">
<p>
<bold>Edited by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/891571/overview">Simona Pekarek Doehler</ext-link>, Universit&#xe9; de Neuch&#xe2;tel, Switzerland</p>
</fn>
<fn fn-type="edited-by">
<p>
<bold>Reviewed by:</bold> <ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1254693/overview">Ali-Reza Majlesi</ext-link>, Stockholm University, Sweden</p>
<p>
<ext-link ext-link-type="uri" xlink:href="https://loop.frontiersin.org/people/1270677/overview">Makoto Hayashi</ext-link>, Nagoya University, Japan</p>
</fn>
<corresp id="c001">&#x2a;Correspondence: Anja Stukenbrock, <email>anja.stukenbrock@gs.uni-heidelberg.de</email>
</corresp>
<fn fn-type="other">
<p>This article was submitted to Language Sciences, a section of the journal Frontiers in Communication</p>
</fn>
</author-notes>
<pub-date pub-type="epub">
<day>04</day>
<month>11</month>
<year>2021</year>
</pub-date>
<pub-date pub-type="collection">
<year>2021</year>
</pub-date>
<volume>6</volume>
<elocation-id>662240</elocation-id>
<history>
<date date-type="received">
<day>31</day>
<month>01</month>
<year>2021</year>
</date>
<date date-type="accepted">
<day>17</day>
<month>09</month>
<year>2021</year>
</date>
</history>
<permissions>
<copyright-statement>Copyright &#xa9; 2021 Stukenbrock.</copyright-statement>
<copyright-year>2021</copyright-year>
<copyright-holder>Stukenbrock</copyright-holder>
<license xlink:href="http://creativecommons.org/licenses/by/4.0/">
<p>This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these&#x20;terms.</p>
</license>
</permissions>
<abstract>
<p>Recently, the claim was put forward that grammar emerges from embodied conduct. This has led to a discussion in multimodal conversation analysis and interactional linguistics whether the routinization of embodied actions can be described in terms of grammar and grammaticalization. While particular items such as exophoric demonstratives and gestures are routinely delivered as multimodal constructions, i.e.,&#x20;as part of grammar, it is debatable whether this also holds for other candidates: e.g., loose couplings of verbal and embodied conduct, locally routinized, or ephemeral gestalts that do not endure beyond the context of their&#x20;use. My paper contributes to this discussion by proposing a distinction between two kinds of multimodal gestalts: socially sedimented multimodal gestalts (multimodal constructions), and locally assembled, ephemeral multimodal gestalts. To this end, I examine sedimented couplings of demonstratives and embodied practices in instructions, and the change of a locally assembled format over time. The data are in German and come from 12&#xa0;h of video-recordings of self-defense trainings for young women. In the course of the participants&#x2019; interactional history, the multimodal format of the participants&#x2019; actions changes. The changes concern formal and functional aspects of the resources used to accomplish those actions, their multimodal orchestration, and the temporality of their delivery. The paper makes four claims: 1. In their primordial use in co-present interaction, demonstratives are coupled with embodied practices and request addressees&#x2019; attention to the speaker&#x2019;s body, i.e.,&#x20;they are tightly and intercorporeally coupled with the embodied conduct of the participants; 2. gesturally used demonstratives are socially sedimented multimodal gestalts, i.e.,&#x20;multimodal constructions; 3. multimodal gestalts may be subject to transformations in the course of multiple repetitions; 4. 
in my data, the transformations lead to the emergence of a new, reduced format, which, while being locally routinized, is neither grammatical nor grammaticalized.</p>
</abstract>
<kwd-group>
<kwd>demonstratives</kwd>
<kwd>embodied demonstrations</kwd>
<kwd>multimodal gestalts</kwd>
<kwd>routinization</kwd>
<kwd>sedimentation</kwd>
<kwd>emergence</kwd>
<kwd>grammaticalization</kwd>
</kwd-group>
</article-meta>
</front>
<body>
<sec id="s1">
<title>Introduction</title>
<p>Spoken language and embodied practices have been studied in Conversation Analysis (<xref ref-type="bibr" rid="B68">Streeck et&#x20;al., 2011</xref>; <xref ref-type="bibr" rid="B85">Stivers and Sidnell, 2012</xref>) and Interactional Linguistics (<xref ref-type="bibr" rid="B58">Selting and Couper-Kuhlen, 2001</xref>; <xref ref-type="bibr" rid="B16">Couper-Kuhlen and Selting, 2018</xref>) for decades. These closely related approaches have furthered our understanding of language as a fundamentally temporal phenomenon that adapts to, incorporates, and structurally reflects the dialogical, dynamic, and flexible nature of social interaction. Empirical studies within those frameworks provide evidence for on-line language production and understanding (<xref ref-type="bibr" rid="B4">Auer, 2009a</xref>), and for the incremental nature of grammatical and conversational structures (<xref ref-type="bibr" rid="B2">Deppermann and G&#xfc;nthner, 2015</xref>). Research on multimodality (<xref ref-type="bibr" rid="B68">Streeck et&#x20;al., 2011</xref>; <xref ref-type="bibr" rid="B1">Deppermann and Streeck, 2018</xref>) has integrated the body in studying the temporality of language-in-interaction; it has also begun to investigate the local emergence of grammar-body-gestalts (<xref ref-type="bibr" rid="B46">Keevallik, 2015</xref>, <xref ref-type="bibr" rid="B47">2018a</xref>, <xref ref-type="bibr" rid="B45">2018b</xref>) and the change of embodied practices over time (<xref ref-type="bibr" rid="B72">Streeck, 2021</xref>).</p>
<p>Conversation-analytic and interaction-linguistic approaches resonate with Emergent Grammar (<xref ref-type="bibr" rid="B41">Hopper, 1987</xref>, <xref ref-type="bibr" rid="B42">2011</xref>), a linguistic paradigm originally developed in the context of grammaticalization (<xref ref-type="bibr" rid="B44">Hopper and Traugott, 2003</xref>). Grammaticalization research is interested in the emergence of grammatical structures in diachrony. In contrast to grammaticalization&#x2019;s focus on relatively stable grammatical structures, Emergent Grammar argues that grammar is never fixed or stable but is constantly evolving (<xref ref-type="bibr" rid="B43">Hopper, 2015</xref>). In the unfinished process of grammar as <italic>emergent</italic>, grammar is not prior to, but an epiphenomenon of, verbal interaction and is ongoingly reshaped by it (<xref ref-type="bibr" rid="B42">Hopper, 2011</xref>, <xref ref-type="bibr" rid="B43">2015</xref>).</p>
<p>Interaction-linguistic work (<xref ref-type="bibr" rid="B71">Streeck, 1995</xref>, <xref ref-type="bibr" rid="B66">2009</xref>; <xref ref-type="bibr" rid="B6">Auer, 2009b</xref>; <xref ref-type="bibr" rid="B75">Stukenbrock, 2018a</xref>) provides evidence for homologies between grammar and interaction, in particular, between action projection and grammatical projection (<xref ref-type="bibr" rid="B7">Auer, 2005</xref>). These homologies are grounded in the temporal, online quality of grammar (<xref ref-type="bibr" rid="B4">Auer, 2009a</xref>; <xref ref-type="bibr" rid="B43">Hopper, 2015</xref>), suggesting a close relationship between grammar and interaction. Grammar can be seen &#x201c;as the historical result of sedimentation and (partly normative) regularization of certain interactional projection techniques&#x201d; (<xref ref-type="bibr" rid="B7">Auer, 2005</xref>:&#x20;33).</p>
<p>Interactional Linguistics explicitly &#x201c;recognizes the effects of past linguistic development, with its sedimentations and ritualizations, and of social historical institutionalization&#x201d; (<xref ref-type="bibr" rid="B16">Couper-Kuhlen and Selting, 2018</xref>: 542). An important characteristic of the interactional-linguistic conception of grammar is therefore the sedimentation of a structure in time and social space. Recently, the claim has been made that grammar also emerges from embodied conduct (<xref ref-type="bibr" rid="B47">Keevallik, 2018a</xref>). This has stimulated a discussion in Conversation Analysis (CA) and Interactional Linguistics (IL) about whether the routinization of embodied practices can be described in terms of grammar and grammaticalization, in other words, whether &#x201c;grammaticalization and bodily action [&#x2026;] go together&#x201d; (<xref ref-type="bibr" rid="B15">Couper-Kuhlen, 2018</xref>; <xref ref-type="bibr" rid="B70">Streeck, 2018</xref>).</p>
<p>The aim of my paper is to contribute to the discussion on grammar and the body by proposing a distinction between grammar-body constructions and ephemeral grammar-body gestalts, i.e.,&#x20;local, <italic>ad hoc</italic> assembled multimodal gestalts. To this end, I first investigate widely used, socially sedimented grammar-body constructions: couplings of demonstratives and embodied practices. I argue that these constitute prime examples of multimodal constructions (<xref ref-type="bibr" rid="B82">Stukenbrock, 2010</xref>, <xref ref-type="bibr" rid="B77">2015</xref>; <xref ref-type="bibr" rid="B59">Ningelgen and Auer, 2017</xref>) as part of grammar. They are grammaticalized ready-mades that language communities &#x201c;inherit&#x201d; from their ancestors. Second, I examine an <italic>ad hoc</italic> assembled multimodal gestalt and show how it changes in the course of multiple repetitions. As a locally routinized multimodal gestalt, it is not sedimented beyond the ephemeral context of its use and is therefore not grammaticalized. The data are in German and come from video-recorded self-defense trainings for young&#x20;women.</p>
<p>My paper is structured as follows: In the following section (<italic>Grammaticalization and embodied action</italic>), I discuss the central concepts that bear on my endeavor. Next, <italic>Data and Methodology</italic> are presented. In the first part of the analysis (<italic>Sedimented multimodal constructions as resources in social interaction</italic>), I analyze how grammar-body constructions (&#x201c;so&#x201d;/&#x201c;like this&#x201d; &#x2b; gaze &#x2b; embodied practices) are locally mobilized in social interaction: First, I focus on how gaze projects the focal space for an embodied action. Second, I investigate how the focal moment of bodily performance is indexed by &#x201c;so&#x201d;/&#x201c;like this&#x201d;. Third, I show that couplings of demonstratives and embodied practices form sedimented, yet temporally variable and flexible multimodal constructions. In contrast to the first part of the analysis, the second part investigates a locally assembled, ephemeral multimodal gestalt and tracks its formal and functional change through multiple repetitions: I set out with an analysis of the most elaborate format and subsequently show how the first repetition already exhibits reduction. Next, I illustrate that an increase in complexity indexes and reflects additions or changes in the speaker&#x2019;s utterance. Last, I examine how the format changes in the course of multiple repetitions and undergoes significant reductions. These emerge from routinization and promote automatization as discussed in the concluding section.</p>
<p>I put forward the following claims: 1. In their primordial use in co-present interaction, demonstratives are coupled with embodied practices and request addressees&#x2019; attention to the speaker&#x2019;s body (<xref ref-type="bibr" rid="B75">Stukenbrock, 2018a</xref>; <xref ref-type="bibr" rid="B74">2018b</xref>; <xref ref-type="bibr" rid="B78">2020a</xref>), i.e.,&#x20;they are tightly and intercorporeally coupled with the embodied conduct of the participants; 2. gesturally used demonstratives constitute socially sedimented multimodal gestalts, i.e.,&#x20;multimodal constructions; 3. multimodal gestalts (whether grammaticalized or locally assembled) may be subject to transformations in the course of multiple repetitions; 4. in my data, these transformations lead to the emergence of a new, reduced format, which, while being locally routinized, is neither grammatical nor grammaticalized (<xref ref-type="bibr" rid="B44">Hopper and Traugott, 2003</xref>).</p>
</sec>
<sec id="s2">
<title>Grammaticalization and embodied action: (when and how) do they go together?</title>
<p>The term <italic>grammaticalization</italic> refers to &#x201c;the change whereby lexical items and constructions come in certain linguistic contexts to serve grammatical functions and, once grammaticalized, continue to develop new grammatical functions&#x201d; (<xref ref-type="bibr" rid="B44">Hopper and Traugott, 2003</xref>: XV). In the process, their meaning becomes more general and abstract; they fit a broader range of contexts and increase in frequency. Generalization, change in distribution and increase in frequency are mutually reinforcing processes, since generalization facilitates use in more and varied contexts, which then also increases the frequency of the structure (<xref ref-type="bibr" rid="B14">Bybee, 2014</xref>: 157). Two perspectives are broadly distinguished: The diachronic perspective focuses on the sources and steps that linguistic structures undergo in the process of grammaticalization; in contrast, the synchronic perspective views grammaticalization as &#x201c;a syntactic, discourse pragmatic phenomenon, to be studied from the point of view of fluid patterns of language use&#x201d; (<xref ref-type="bibr" rid="B44">Hopper and Traugott, 2003</xref>:&#x20;2).</p>
<p>Grammaticalization holds that grammar &#x201c;is not a static, closed, or self-contained system, but [that it] is highly susceptible to change and highly affected by language use&#x201d; (<xref ref-type="bibr" rid="B14">Bybee, 2014</xref>: 145). The theory of <italic>Emergent Grammar</italic>, which was originally developed within the grammaticalization framework, goes much further and deconstructs the concept of grammar as a system altogether. This is expressed in the term <italic>emergent</italic>. It refers &#x201c;to the fact that a grammatical structure is always temporary and ephemeral&#x201d; (<xref ref-type="bibr" rid="B42">Hopper, 2011</xref>: 26), and that grammatical forms never become fixed or stable. In contrast, the term <italic>emerging</italic> refers to the traditional view of grammar as &#x201c;a stable system of rules and structures, which may &#x2018;emerge&#x2019; (i.e.,&#x20;come into existence) out of a less uniform mix&#x201d; (<xref ref-type="bibr" rid="B42">Hopper, 2011</xref>:&#x20;28).</p>
<p>Endeavors to adapt (<xref ref-type="bibr" rid="B5">Auer and Pf&#xe4;nder, 2011a</xref>; <xref ref-type="bibr" rid="B62">Pekarek Doehler, 2021</xref>; <xref ref-type="bibr" rid="B61">Pekarek Doehler and Balaman, 2021</xref>) and extend (<xref ref-type="bibr" rid="B31">Ford and Fox, 2015</xref>) Emergent Grammar (<xref ref-type="bibr" rid="B41">Hopper, 1987</xref>, <xref ref-type="bibr" rid="B42">2011</xref>) to examine both grammar-in-interaction and gestures (<xref ref-type="bibr" rid="B72">Streeck, 2021</xref>) document the fruitful synergies between Emergent Grammar and CA/IL. All three share the premise that the linear progression along the timeline (<xref ref-type="bibr" rid="B43">Hopper, 2015</xref>: 252) is fundamental for our understanding of language and grammar. In a recent study on the local emergence of an ephemeral grammatical practice through reuse, Ford and Fox suggest &#x201c;a cline between ephemerality and sedimentation&#x201d; (<xref ref-type="bibr" rid="B31">Ford and Fox, 2015</xref>: 96). Although the practice does not &#x201c;survive&#x201d; the situation of its creation, and therefore does not move further towards sedimentation or grammaticalization, it is &#x201c;an ephemeral, temporally specific, manifestation of emergence in grammar&#x201d; that represents &#x201c;diachrony at its micro-level&#x201d; (<xref ref-type="bibr" rid="B31">Ford and Fox, 2015</xref>: 115). The authors propose a continuum in Emergent Grammar with a radically ephemeral pole at one end and a sedimented pole at the other. Phenomena of Ephemeral Grammar are located at the far evanescent end of the continuum (<xref ref-type="bibr" rid="B31">Ford and Fox, 2015</xref>: 97). 
If we assume that phenomena of ephemeral grammar exhibit micro-level diachrony and routinization, how do we conceptualize phenomena on historical time scales, i.e.,&#x20;linguistic structures that emerge from routinization over decades and centuries, acquire high frequency and vast, context-independent distribution?</p>
<p>Key terms such as <italic>habituation</italic>, <italic>routinization</italic>, <italic>automatization,</italic> and <italic>sedimentation</italic> are used both in grammaticalization and in CA/IL. Grammaticalization researchers agree that grammaticalization is a form of <italic>ritualization</italic> (<xref ref-type="bibr" rid="B39">Haiman, 1994</xref>) or <italic>routinization</italic> (<xref ref-type="bibr" rid="B44">Hopper and Traugott, 2003</xref>). <xref ref-type="bibr" rid="B14">Bybee (2014</xref>: 153) defines it as a &#x201c;process of automatization of frequently occurring sequences of linguistic elements&#x201d; (cf. also <xref ref-type="bibr" rid="B39">Haiman, 1994</xref>). Automatization leads to repackaging of formerly separate units, which lose their identity, undergo formal reduction and semantic bleaching (<xref ref-type="bibr" rid="B44">Hopper and Traugott, 2003</xref>; <xref ref-type="bibr" rid="B14">Bybee, 2014</xref>). Bleaching, or generalization, is <italic>habituation</italic> to repeated items (<xref ref-type="bibr" rid="B14">Bybee, 2014</xref>: 157; <xref ref-type="bibr" rid="B39">Haiman, 1994</xref>). Habituation arises from &#x201c;a decline in the tendency to respond to stimuli that have become familiar&#x201d; (<xref ref-type="bibr" rid="B39">Haiman, 1994</xref>: 7); it is an effect of repetition. In short, grammatical items or constructions &#x201c;are automated, conventionalized units&#x201d; (<xref ref-type="bibr" rid="B14">Bybee, 2014</xref>: 157). Parallels have been drawn with non-linguistic habituation, ritualization, and automatization (<xref ref-type="bibr" rid="B39">Haiman, 1994</xref>; <xref ref-type="bibr" rid="B14">Bybee, 2014</xref>). These may be viewed as analogous (<xref ref-type="bibr" rid="B39">Haiman, 1994</xref>) or parallel (<xref ref-type="bibr" rid="B72">Streeck, 2021</xref>) rather than similar processes.</p>
<p>In this paper, I use the terms as follows. <italic>Routinization</italic> occurs through repetition; it is accomplished by the individual through reiterated actions and practices. <italic>Sedimentation</italic> is the social and socially shared outcome of jointly or collectively repeating and routinizing verbal and embodied practices. I distinguish between <italic>joint routinization</italic> and <italic>collective routinization</italic>. <italic>Joint routinization</italic> concerns participants engaged in a shared participation framework; they are mutually aware of one another and repeat certain practices and actions. An example would be dance classes (<xref ref-type="bibr" rid="B46">Keevallik, 2015</xref>). The encounters may take place face to face (<xref ref-type="bibr" rid="B19">Deppermann, 2018a</xref>, <xref ref-type="bibr" rid="B22">c</xref>; <xref ref-type="bibr" rid="B25">Deppermann and Schmidt, 2021</xref>) as well as in technically mediated or virtual environments (<xref ref-type="bibr" rid="B61">Pekarek Doehler and Balaman, 2021</xref>). Joint routinization may lead to local sedimentation within single encounters (<xref ref-type="bibr" rid="B80">Stukenbrock, 2020b</xref>) and across participants&#x2019; interactional histories (<xref ref-type="bibr" rid="B19">Deppermann, 2018a</xref>; <xref ref-type="bibr" rid="B25">Deppermann and Schmidt, 2021</xref>; <xref ref-type="bibr" rid="B61">Pekarek Doehler and Balaman, 2021</xref>). In contrast, <italic>collective routinization</italic> emerges across time and space among social groups whose members are not mutually aware of one another. An example would be generic uses of personal pronouns among groups of speakers who converge on this use without knowing that they do (<xref ref-type="bibr" rid="B50">Laberge and Sankoff, 1979</xref>; <xref ref-type="bibr" rid="B8">Auer and Stukenbrock, 2018</xref>). This may in the long run promote grammaticalization. 
I propose the term <italic>collective routinization</italic> as a heuristic to bridge the gap between micro-diachrony (<xref ref-type="bibr" rid="B31">Ford and Fox, 2015</xref>) and <italic>longue dur&#xe9;e,</italic> or macro-diachronic, phenomena classically studied in grammaticalization (<xref ref-type="bibr" rid="B44">Hopper and Traugott, 2003</xref>). As long as a format or structure remains a local phenomenon, it is not grammaticalized. For a format to be grammaticalized, it has to spread beyond the initial context of its use, expand and generalize across types of contexts (<xref ref-type="bibr" rid="B44">Hopper and Traugott, 2003</xref>; <xref ref-type="bibr" rid="B14">Bybee, 2014</xref>) until it becomes widely used in the language community. This is the case with demonstratives. In the course of <italic>longue dur&#xe9;e</italic> processes, they emerged as language universals (<xref ref-type="bibr" rid="B28">Diessel, 1999</xref>, <xref ref-type="bibr" rid="B27">2006</xref>; <xref ref-type="bibr" rid="B26">Diessel and Coventry, 2020</xref>) and were intricately connected to concurrent uses of embodied attention directing devices such as gestures (<xref ref-type="bibr" rid="B13">B&#xfc;hler, 1990[1934]</xref>). Gestures are an integral component of demonstratives in their primordial, exophoric use in face-to-face interaction. They are part and parcel of the grammaticalized format of demonstratives. Couplings of demonstratives and gestures are grammaticalized ready-mades that members of language communities &#x2018;inherit&#x2019; from their ancestors. This contrasts with the reduction and routinization of an <italic>ad hoc</italic> assembled multimodal gestalt. 
As the analysis will show, its transformation in the course of multiple repetitions indexically reflects and actively promotes routinization of the practices involved: first, routinization (and even automatization) of motor skills through repetition of self-defense practices; second, routinization of communicative practices through repetition of instructions.</p>
</sec>
<sec id="s3">
<title>Data and Methodology</title>
<p>The paper proposes a distinction between two kinds of multimodal gestalts: grammar-body constructions and ephemeral grammar-body assemblages. To contrast usages of a grammaticalized multimodal construction (<italic>so</italic>/&#x201c;like this&#x201d; &#x2b; embodied practices) with the emergence of an ephemeral multimodal assemblage, I track the occurrence of their uses in a series of embodied instructions delivered in self-defense trainings.</p>
<p>Instructions have been investigated in a range of settings such as driving (<xref ref-type="bibr" rid="B18">De Stefani and Gazin, 2014</xref>; <xref ref-type="bibr" rid="B19">Deppermann, 2018a</xref>, <xref ref-type="bibr" rid="B21">b</xref>, <xref ref-type="bibr" rid="B22">c</xref>; <xref ref-type="bibr" rid="B63">Rauniomaa et&#x20;al., 2018</xref>), air traffic control training (<xref ref-type="bibr" rid="B3">Arminen et&#x20;al., 2014</xref>), cooking (<xref ref-type="bibr" rid="B55">Mondada, 2014a</xref>), medical interaction (<xref ref-type="bibr" rid="B84">Svensson et&#x20;al., 2009</xref>; <xref ref-type="bibr" rid="B56">Mondada, 2014b</xref>), classroom interaction (<xref ref-type="bibr" rid="B51">Lerner, 1995</xref>; Lindwall et&#x20;al., 2015), and the teaching and learning of bodily skills (<xref ref-type="bibr" rid="B53">Lindwall and Ekstr&#xf6;m, 2012</xref>; <xref ref-type="bibr" rid="B81">Stukenbrock, 2014</xref>; <xref ref-type="bibr" rid="B46">Keevallik, 2015</xref>; <xref ref-type="bibr" rid="B29">Evans and Lindwall, 2020</xref>). The focus has been on how embodied actions figure in the sequential and temporal organization of first and second actions (<xref ref-type="bibr" rid="B53">Lindwall and Ekstr&#xf6;m, 2012</xref>; <xref ref-type="bibr" rid="B81">Stukenbrock, 2014</xref>; <xref ref-type="bibr" rid="B46">Keevallik, 2015</xref>), on multimodal practices of turn construction (<xref ref-type="bibr" rid="B46">Keevallik, 2015</xref>), and on changes of turn design over interactional histories (<xref ref-type="bibr" rid="B19">Deppermann, 2018a</xref>). Most relevant for my own interest in routinization and reduction are Deppermann&#x2019;s findings: Within the framework of interactional histories between driving instructor and student, instructions become increasingly shorter, syntactically less complex, and sequentially more condensed. A similar development will be observable in my&#x20;data.</p>
<p>My study is based on 12&#xa0;h of video material of self-defense trainings for young women. The participants followed the training voluntarily in their free time. Ethical review and approval were not required for this study. Informed consent was obtained from all participants. The data were recorded with a single, high-resolution video camera and imported into ELAN for verbal transcription and multimodal annotation. All data, including images of the participants, were anonymized. The images were transformed into drawings with the help of the program Tayasui Sketches (<ext-link ext-link-type="uri" xlink:href="https://tayasui.com/sketches/">https://tayasui.com/sketches/</ext-link>).</p>
<p>The data were recorded in different gyms with a focus on the trainer. Around 25 students participated in the classes. They had no previous experience with self-defense trainings. Apart from the trainer and the trainees, one or two student assistants regularly participated to help the trainer arrange materials such as gymnastic mats. In later sessions, they were recruited by the trainer as partners to enact movement combinations in simulated encounters between victim and aggressor.</p>
<p>For this paper, only the recordings of the initial lessons were taken into consideration. The trainer introduced basic self-defense techniques that were first practiced on their own and then combined to form an embodied whole in the course of the first lesson. A longitudinal perspective across sessions is reserved for a follow-up study on how elements that are already part of the common ground are taken up in subsequent training sessions.</p>
<p>The following analysis is concerned with instructions that refer to self-defense techniques in shared training phases. Instructions that deal with organizational issues were not taken into account. Only those cases were investigated in which instructing actions were 1) directed at the whole group and 2) designed to be followed by a performance of the instructed action.</p>
</sec>
<sec id="s4">
<title>Part I: Sedimented Multimodal Constructions as Resources in Social Interaction</title>
<p>The focus of the analysis in part I is on the grammar-body construction grounded on the demonstrative &#x201c;so&#x201d;/&#x201c;like this&#x201d;. It will be shown how embodied demonstrations of the trainer are indexed by the demonstrative &#x201c;so&#x201d; and locally designed to fit the addressees&#x2019; activities. Progressively assembling a set of resources to mark, co-index and thus emphasize significant moments of embodied actions creates multimodal densifications (&#x201c;multimodale Verdichtung&#x201d;, <xref ref-type="bibr" rid="B83">Stukenbrock, 2008</xref>, <xref ref-type="bibr" rid="B77">2015</xref>). Multimodal densifications arise from micro-projections at the beginning of an open gestalt and the fulfillment of those micro-projections within that gestalt. The term <italic>gestalt</italic> has been used in multimodal CA for more than 20&#xa0;years, most prominently in the works of Goodwin (2003, 2007), <xref ref-type="bibr" rid="B40">Heath (1986)</xref>, <xref ref-type="bibr" rid="B73">Streeck (1988)</xref> and others (<xref ref-type="bibr" rid="B68">Streeck et&#x20;al., 2011</xref>; <xref ref-type="bibr" rid="B17">De Stefani, 2014</xref>; <xref ref-type="bibr" rid="B24">Deppermann, 2015</xref>; <xref ref-type="bibr" rid="B57">Mondada, 2015</xref>, <xref ref-type="bibr" rid="B54">2016</xref>; <xref ref-type="bibr" rid="B1">Deppermann and Streeck, 2018</xref>). It has been deployed alongside other expressions such as <italic>multimodal packages</italic> or <italic>action packages</italic> (<xref ref-type="bibr" rid="B40">Heath, 1986</xref>; Goodwin, 2003, 2007; <xref ref-type="bibr" rid="B71">Streeck, 1995</xref>, <xref ref-type="bibr" rid="B66">2009</xref>). Multimodal gestalts are considered to be evanescent phenomena (<xref ref-type="bibr" rid="B57">Mondada, 2015</xref>). As such, they resemble phenomena of Ephemeral Grammar (<xref ref-type="bibr" rid="B31">Ford and Fox, 2015</xref>). 
However, couplings of demonstratives and embodied practices are not at the ephemeral end of the &#x201c;Emergent Grammar-continuum&#x201d; (<xref ref-type="bibr" rid="B42">Hopper, 2011</xref>; <xref ref-type="bibr" rid="B31">Ford and Fox, 2015</xref>). Rather, they are prime candidates to argue for multimodal constructions not as locally routinized phenomena, but as sedimented multimodal constructions. They have grammaticalized the context-bound conditions of their use&#x2014;this includes, first and foremost, embodied practices (<xref ref-type="bibr" rid="B13">B&#xfc;hler, 1990[1934]</xref>; <xref ref-type="bibr" rid="B77">Stukenbrock, 2015</xref>) to establish joint attention (<xref ref-type="bibr" rid="B28">Diessel, 1999</xref>, <xref ref-type="bibr" rid="B27">2006</xref>).</p>
<p>The analysis in the first part aims to show how a multimodal construction is deployed in social interaction. The analysis attests to stability as well as to the context-sensitive, temporal flexibility of the construction. It focuses on two components: 1. gaze as a resource to project the focal space for embodied demonstrations, 2. demonstratives as a resource to index the focal moment of an embodied demonstration and, therefore, as a request for&#x20;gaze.</p>
<p>The couplings investigated in part I are evanescent in real time in&#x20;situated social interaction. Nonetheless, they are robustly anchored in the language community&#x2019;s linguistic knowledge via the demonstrative. Demonstratives have grammaticalized our bodily experience with, and joint attention to, phenomena in shared space (<xref ref-type="bibr" rid="B26">Diessel and Coventry, 2020</xref>; <xref ref-type="bibr" rid="B77">Stukenbrock, 2015</xref>, <xref ref-type="bibr" rid="B78">2020a</xref>).</p>
<sec id="s4-1">
<title>Projecting the Focal Space for Embodied Action by Gaze</title>
<p>The first extract<xref ref-type="fn" rid="FN1">
<sup>1</sup>
</xref> (&#x201c;short like this&#x201d;) shows the beginning of the first self-defense training. The trainer has announced that the students will learn how to mobilize their voice and bodies to protect the territory of the self (<xref ref-type="bibr" rid="B33">Goffman, 1971</xref>) against potential aggressors. She decomposes the task into smaller sub-units that are later integrated. We join the group in the course of the first instruction. It concerns learning how to take a step forward. The starting point is to stand firmly on the ground. The instruction is addressed to the whole group. In order to be visible to all of them, the trainer has moved to the middle of the gym. The students are arranged around her in a full circle.</p>
<p>The instructional sequence consists of the trainer&#x2019;s instructing action (l. 1&#x2013;4) as first pair part (FPP), followed by the instructed action (l. 5) as embodied second pair part (SPP). It is brought to a close by the trainer&#x2019;s ratification (l. 6) in third position. The trainer&#x2019;s instruction is delivered as a multi-unit turn. Syntactically, it is built as a conditional construction: The protasis (l. 1&#x2013;2) formulates and bodily demonstrates the conditions under which the embodied action formulated and performed in the apodosis (l. 4) should be carried out. For now, we focus on the multimodal delivery of the first turn constructional unit (TCU), the protasis of the conditional construction. It syntactically projects, first, a subordinate clause that is dependent on the predicate (l. 1: &#x201c;MERKT&#x201d;/&#x201c;realize&#x201d;), and second, the apodosis.</p>
<fig id="FX1" position="float">
<label>EXTRACT 1</label>
<caption>
<p>&#x201c;Short Like This&#x201d;</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g010.tif"/>
</fig>
<p>Our analysis focuses on the successive mobilization of linguistic and embodied resources that the trainer uses to project and highlight focal elements of her instruction. The first important moment occurs at the end of the first intonation phrase when the trainer projects a change in the attentional focus by shifting her gaze from the addressees (<xref ref-type="fig" rid="F1">Figure&#x20;1A</xref>) to her feet (<xref ref-type="fig" rid="F1">Figure&#x20;1B</xref>). Her gaze points to a new space, invites attention-sharing and projects an embodied activity within that focal&#x20;space.</p>
<fig id="F1" position="float">
<label>FIGURE 1</label>
<caption>
<p>Speaker gaze shift from addressees to focal space.</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g001.tif"/>
</fig>
<p>Extract 1 is a prime example of how embodied demonstrations are integrated into an unfolding verbal instruction. It demonstrates a key function of gaze in conjunction with modal demonstratives (<italic>so</italic>/&#x201c;like this&#x201d;) and embodied demonstrations: Gaze projects a new space for embodied demonstrations indexed by <italic>so</italic>. Note that in the extract, the gaze shift precedes the demonstrative, which only occurs in l. 2 (see transcript above). As a visible display of human vision, eye-gaze shifts publicly document changes in the attentional focus. In the present case, the gaze shift (l. 1) points to and projects the relevant space for the upcoming demonstration. Before the trainer delivers the demonstrative (l. 2), she thus invites her addressees to follow her line of regard (<xref ref-type="bibr" rid="B78">Stukenbrock, 2020a</xref>) and to orient to where the action is going to be.<xref ref-type="fn" rid="FN2">
<sup>2</sup>
</xref> In sum, gaze orientation prepares the <bold>focal space</bold> for an embodied demonstration. As we will see in the next section, the trainer also <bold>temporally</bold> marks the <bold>focal moment</bold> of the unfolding demonstration.</p>
</sec>
<sec id="s4-2">
<title>Marking the Focal Moment of the Bodily Performance With <italic>&#x201c;so&#x201d;</italic>/&#x201c;like this&#x201d;</title>
<p>After the trainer has gaze-projected the focal space for the upcoming demonstration (extract 1), she uses the modal demonstrative &#x201c;sO&#x201d;/&#x201c;like this&#x201d; to index the focal moment and element of her demonstration. The demonstrative is part of the second TCU and precedes an adverbially used adjective (l. 2): &#x201c;ihr steht sO: KURZ da&#x201d;/&#x201c;you are standing there short like this&#x201d;. The demonstrative &#x201c;sO&#x201d;/&#x201c;like this&#x201d; is deployed in different constructions (<xref ref-type="bibr" rid="B82">Stukenbrock, 2010</xref>, <xref ref-type="bibr" rid="B77">2015</xref>) to index the manner of an action (<italic>so</italic> &#x2b; VERB), the quality of an object (<italic>so</italic> &#x2b; presentative constructions), or the degree to which an attributed quality (<italic>so</italic> &#x2b; ADJ./ADV.) applies to a phenomenon (<xref ref-type="bibr" rid="B82">Stukenbrock, 2010</xref>, <xref ref-type="bibr" rid="B77">2015</xref>). It is also used in type-indicative referential actions in conjunction with a noun phrase and a concurrent pointing gesture (<xref ref-type="bibr" rid="B11">Balantani, 2021</xref>). It is to be distinguished from uses as a discourse marker (<xref ref-type="bibr" rid="B12">Barske and Golato, 2010</xref>), a quotative (<xref ref-type="bibr" rid="B34">Golato, 2000</xref>), and various other functions (cf. <xref ref-type="bibr" rid="B81">Stukenbrock, 2014</xref>, for an overview). In our example, the demonstrative <italic>so</italic> informs the addressees that the local meaning of the gradable adjective &#x201c;KURZ&#x201d;/&#x201c;short&#x201d; is to be gathered from the trainer&#x2019;s embodied action. In temporal terms, it indexes the moment in which the trainer repositions her foot (<xref ref-type="fig" rid="F2">Figure&#x20;2A</xref>) to reduce the space between her&#x20;feet.</p>
<fig id="F2" position="float">
<label>FIGURE 2</label>
<caption>
<p>&#x201c;So&#x201d;/&#x201c;like this&#x201d;, gaze shift and pointing mark the focal moment.</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g002.tif"/>
</fig>
<p>Grammatically, &#x201c;sO&#x201d;/&#x201c;like this&#x201d; marks the informational focus of utterance and embodied demonstration; it thus &#x201c;incorporate[s] the work of the [feet] into the grammatical structure of the talk&#x201d; (<xref ref-type="bibr" rid="B69">Streeck, 2002</xref>: 582). A moment later, the trainer also mobilizes a gesture to point to the space between her feet (<xref ref-type="fig" rid="F2">Figure 2B</xref>). Gaze, demonstrative, body movement, and pointing gesture all work together to highlight (<xref ref-type="bibr" rid="B35">Goodwin, 1994</xref>: 606) the crucial moment of her demonstration. Before the trainer continues the syntactic construction (i.e.,&#x20;the projected apodosis of the conditional construction), a pause ensues (l. 3). With frozen body posture, the trainer shifts gaze to the students to monitor their attention (<xref ref-type="fig" rid="F2">Figure&#x20;2C</xref>).</p>
<p>At the beginning of the next TCU (the apodosis, l. 4), the trainer shifts gaze once more to her feet (<xref ref-type="fig" rid="F3">Figure&#x20;3A</xref>), thus projecting another embodied action to come. The students engage in self-monitoring by looking down at their feet to assess their own spatial position. While describing the corrective body movement that deals with the problematic position demonstrated before, the trainer takes a step forward and then reorients her gaze to monitor her students (<xref ref-type="fig" rid="F3">Figure&#x20;3B</xref>).</p>
<fig id="F3" position="float">
<label>FIGURE 3</label>
<caption>
<p>Gaze shift to floor and back to addressees.</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g003.tif"/>
</fig>
<p>By following the trainer&#x2019;s example and correcting their position (l. 5), the students deliver an embodied display of understanding, which is ratified by the trainer (l. 6: &#x201c;geNAU&#x201d;/&#x201c;right&#x201d;).</p>
</sec>
<sec id="s4-3">
<title>Sedimented Multimodal Constructions and Temporal Flexibility</title>
<p>In the data, we find temporally variable orders in which demonstratives, gaze shift, and embodied demonstration are mobilized in the local context. Temporal flexibility does not count as evidence against the claim that couplings of demonstratives and embodied practices are contextually independent, multimodal constructions. On the contrary, flexibility has been from the outset an interactional prerequisite without which the core function of demonstratives would not have emerged: to establish joint attention on phenomena in the shared surroundings of copresent participants. The cross-context distribution (<xref ref-type="bibr" rid="B59">Ningelgen and Auer, 2017</xref>; <xref ref-type="bibr" rid="B82">Stukenbrock, 2010</xref>, <xref ref-type="bibr" rid="B81">2014</xref>, <xref ref-type="bibr" rid="B77">2015</xref>) of these temporally flexible, yet firmly established multimodal constructions has emerged from, and fueled, the process of grammaticalization out of which demonstratives emerged as a unique class in linguistic history (<xref ref-type="bibr" rid="B27">Diessel, 2006</xref>, 2009; <xref ref-type="bibr" rid="B26">Diessel and Coventry, 2020</xref>).</p>
<p>The following extract exemplifies how temporal flexibility allows for variations within the multimodal construction. It documents a local, recipient-designed temporal ordering of gaze, modal demonstrative, and bodily action. It is delivered with respect to the participants&#x2019; attention and activities. As in extract 1, &#x201c;SO&#x201d;/&#x201c;like this&#x201d; is coupled with embodied demonstrations and speaker gaze shift from the addressees to the floor. The gaze shift indexes a new focal space to attend to. However, unlike in extract 1, gaze, demonstrative, and bodily demonstration are mobilized in a different temporal order. The trainer shifts her gaze only after the first delivery of the demonstrative, and concurrent with its repetition (l. 3). The trainer&#x2019;s body posture is already in place before the extract starts. She has remained in the stepping position that she assumed before and upholds it throughout the instruction.</p>
<fig id="FX2" position="float">
<label>EXTRACT 2</label>
<caption>
<p>&#x201c;the feet apart like this&#x201d;</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g011.tif"/>
</fig>
<p>The trainer starts a new instruction with a modal deontic (l. 1: &#x201c;ihr sollt&#x201d;/&#x201c;you must&#x201d;), moves her arms back and forth along her body, but then breaks off and pauses (l. 2) as some students are still involved in the previous exercise. She restarts with the modal demonstrative &#x201c;SO&#x201d;/&#x201c;like this&#x201d;, which is followed by a gradable adjective (&#x201c;WEIT&#x201d;/&#x201c;wide&#x201d;, l. 3). Instead of projecting a new space of attention by visibly reorienting her gaze to it, the trainer continues to monitor her addressees (<xref ref-type="fig" rid="F4">Figure&#x20;4</xref>). Since some of the students are not looking at her, the gaze shift would not be seen and would hence be interactionally useless.</p>
<fig id="F4" position="float">
<label>FIGURE 4</label>
<caption>
<p>Continuous gaze at addressees.</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g004.tif"/>
</fig>
<p>Up to this point, the demonstrative, instead of being preceded by a gaze shift, precedes the gaze shift. By this temporal ordering, the (first use of the) demonstrative serves as an audible request for addressee gaze (<xref ref-type="bibr" rid="B74">Stukenbrock, 2018b</xref>) at a moment when focused interaction and visual coorientation need to be re-established. The demonstrative hearably indexes that visible information is to be gathered from the trainer&#x2019;s embodied action. In order to understand the local meaning of &#x201c;SO&#x201d; with respect to the gradable adjective &#x201c;WEIT&#x201d;/&#x201c;wide like this&#x201d;, the addressees will have to look at the trainer.</p>
<p>After the first, multimodally &#x201c;lean&#x201d; occurrence of the demonstrative, the trainer shifts gaze from the students to the floor and performs two gestures to delineate the space projected by her body (<xref ref-type="fig" rid="F5">Figure&#x20;5A</xref>). Concurrent with her embodied actions, she repeats the modal demonstrative &#x201c;SO&#x201d;/&#x201c;like this&#x201d; (l. 2), freezes her body posture, and shifts gaze back to the students to monitor their attention (<xref ref-type="fig" rid="F5">Figure&#x20;5B</xref>).</p>
<fig id="F5" position="float">
<label>FIGURE 5</label>
<caption>
<p>Gaze shift to floor, &#x201c;so&#x201d; &#x2b; embodied actions <bold>(A)</bold>, and gaze shift back to addressees with frozen body posture <bold>(B)</bold>.</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g005.tif"/>
</fig>
<p>In contrast to the first extract, where the trainer&#x2019;s gaze shift to a new domain preceded demonstrative and embodied action, it is now the demonstrative (its first delivery) that precedes the gaze shift to the new domain: It implements a summons for addressee gaze (<xref ref-type="bibr" rid="B74">Stukenbrock, 2018b</xref>). This use is made contingent on the trainer&#x2019;s perception that some students are still engaged in finishing the previous exercise and not yet ready to look at&#x20;her.</p>
<p>The extract documents that the resources are recipient-designed to fit the addressees&#x2019; situated activities. Thus, while the resources (first and second use of modal demonstrative, embodied demonstration, gaze shift) are temporally calibrated to the addressees&#x2019; diverging foci of attention, they are still converging to &#x201c;embody&#x201d; the same kind of multimodal construction. The first, &#x201c;lean&#x201d; delivery of the format, which requested visual attention from unattending participants, is followed by a full multimodal delivery of the grammar-body construction in the course of the trainer&#x2019;s self-repair.</p>
<p>To sum up, the analysis in part I has shown that modal demonstratives (&#x201c;so&#x201d;/&#x201c;like this&#x201d;) are closely coupled with embodied actions. These constitute indispensable components without which the demonstrative would not be understood. The speaker&#x2019;s embodied actions have to be seen by the addressees in order for them to understand the local, indexical meaning of the demonstrative. Participants orient towards this need as a joint endeavor: The trainer designs and times her actions with respect to the addressees&#x2019; attention and availability. Evidence for this was given in extract 2, where the trainer deployed a modal demonstrative to summon the visual attention of non-attending addressees before she recycled the demonstrative as part of a full-fledged multimodal construction. Conversely, addressees consistently orient to exophorically used demonstratives as requests for visual attention by allocating their gaze to the speaker and attending to her embodied actions.</p>
<p>By default, requests for gaze are formulated by perceptual imperatives. However, they are also delivered by less specialized means, such as restarts and pauses (<xref ref-type="bibr" rid="B36">Goodwin, 1980</xref>), prospective indexicals (<xref ref-type="bibr" rid="B37">Goodwin, 1996</xref>), response cries (<xref ref-type="bibr" rid="B32">Goffman, 1981</xref>), noticings (<xref ref-type="bibr" rid="B48">Keisanen, 2012</xref>; <xref ref-type="bibr" rid="B76">Stukenbrock and Dao, 2019</xref>), and by combinations of those means (<xref ref-type="bibr" rid="B38">Goodwin and Goodwin, 2012</xref>). As we have seen, summons for gaze are also implemented by demonstratives. What is more, this is constitutive for the primordial function of demonstratives in phylo- and ontogenesis. The gaze-summoning property of demonstratives is inherently linked to speakers&#x2019; embodied actions and to the need of addressees to perceive those actions. Demonstratives are therefore &#x201c;by nature&#x201d; embodied&#x2014;i.e.,&#x20;multimodal constructions (<xref ref-type="bibr" rid="B59">Ningelgen and Auer, 2017</xref>; <xref ref-type="bibr" rid="B82">Stukenbrock, 2010</xref>, <xref ref-type="bibr" rid="B79">2017</xref>, <xref ref-type="bibr" rid="B75">2018a</xref>, <xref ref-type="bibr" rid="B78">2020a</xref>).</p>
</sec>
</sec>
<sec id="s5">
<title>Part II: Locally Assembled Multimodal Gestalts</title>
<p>I have argued that the multimodal couplings examined in part I are systematic and acquired as part of grammatical knowledge; they underwent grammaticalization long ago and constitute multimodal constructions. In part II, I will investigate multiple repetitions of a multimodal format in the course of the participants&#x2019; interactional history. Repetitions are crucial for the emergence of grammar: &#x201c;Grammar is nothing other (and nothing &#x201c;deeper&#x201d;) than repeated and automated motor action, and the best moment to study its emergence, as it were, is the first repetition&#x201d; (<xref ref-type="bibr" rid="B70">Streeck, 2018</xref>: 31). However, there are important differences between the local routinization of ephemeral phenomena and grammaticalization as a <italic>longue dur&#xe9;e</italic>-process (<xref ref-type="bibr" rid="B70">Streeck, 2018</xref>, <xref ref-type="bibr" rid="B72">2021</xref>); the latter transcends particular participation frameworks, local communities of practice, generations, and even centuries. The grammar-body-gestalts investigated in this section are locally routinized. <italic>Via</italic> repetition, they are sedimented within and for that group. Concurrently, the format becomes increasingly reduced.</p>
<sec id="s5-1">
<title>The Elaborate Format</title>
<p>We begin with the most elaborate format and subsequently examine how the format becomes leaner over time as components are gradually abandoned. The elaborate format consists of a&#x20;request &#x2018;to X something &#x201c;like this&#x201d; &#x2b; gaze to focal space &#x2b;&#x20;embodied demonstration&#x2019;. Extract 3 shows the full format. The trainer requests the students to place their hands on their hips in a particular way. The instructional action (l. 1&#x2013;2) is followed by an instructed action (l. 3) delivered by students. The sequence is closed as the trainer comments on the practice in third position (l.&#x20;4).</p>
<fig id="FX3" position="float">
<label>EXTRACT 3</label>
<caption>
<p>&#x201c;hands like this on the hips&#x201d; (MM_B1_00:15:22).</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g012.tif"/>
</fig>
<p>The instructional action (l. 1&#x2013;2) is delivered multimodally. At turn-beginning, the trainer is looking at her addressees (<xref ref-type="fig" rid="F6">Figure&#x20;6A</xref>). She lifts her hands, bends her head, and visibly shifts gaze to her hands (<xref ref-type="fig" rid="F6">Figure&#x20;6B</xref>), thus gaze-flagging (<xref ref-type="bibr" rid="B69">Streeck, 2002</xref>) her embodied demonstration as it emerges. She continues to gaze down as she moves her hands to her hips in a palm-away position (<xref ref-type="fig" rid="F6">Figure&#x20;6C</xref>).</p>
<fig id="F6" position="float">
<label>FIGURE 6</label>
<caption>
<p>Gaze shift from addressees to hands to index an embodied action.</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g006.tif"/>
</fig>
<p>In the course of the second intonation phrase, which contains the demonstrative &#x201c;SO&#x201d;/&#x201c;like this&#x201d; (l. 2), the trainer produces a&#x20;gestural stroke by quickly moving her hands sideways and hitting her hips (l.&#x20;1), palms away (<xref ref-type="fig" rid="F7">Figure&#x20;7A</xref>). The demonstrative is prosodically marked by a focal accent, and concurrently, the position of the hands is emphasized by a gestural beat, or baton (<xref ref-type="bibr" rid="B49">Kendon, 2004</xref>). A second, laterally performed baton occurs concurrently with the delivery of &#x201c;SO&#x201d; (l.&#x20;2).</p>
<fig id="F7" position="float">
<label>FIGURE 7</label>
<caption>
<p>Gesture strokes and gaze shift back to addressees.</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g007.tif"/>
</fig>
<p>After the multimodal gestalt is fulfilled and the turn completed, the trainer shifts gaze to the students (<xref ref-type="fig" rid="F7">Figure&#x20;7B</xref>). In conjunction with the high-rising intonation at the end (l. 2), the gaze shift mobilizes an embodied response (<xref ref-type="bibr" rid="B65">Stivers and Rossano, 2010</xref>). With a scrutinizing look (<xref ref-type="fig" rid="F7">Figure&#x20;7C</xref>), the trainer turns in a semi-circle to check how the students perform the instructed action.</p>
<p>In line with our previous analysis, we can observe a temporally fine-tuned mobilization of resources: While <bold>gaze</bold> projects <bold>the focal space</bold> for the embodied performance (cf. <italic>Projecting the focal space for embodied action by gaze</italic>), the <bold>demonstrative</bold> marks <bold>the&#x20;focal moment</bold> of the performance (cf. <italic>Marking the focal moment of the bodily performance with &#x201c;so&#x201d;/&#x201c;like this&#x201d;</italic>). Gaze, demonstrative, and gestural baton are assembled to co-index, by multimodal densification, the key moment of the trainer&#x2019;s instruction.</p>
</sec>
<sec id="s5-2">
<title>First Repetition and Reduction</title>
<p>Extract 4 documents the first repetition after the initial instruction in extract 3. Its turn-design differs from that in extract 3, and its multimodal delivery is significantly reduced. First, the trainer has to reorganize the students&#x2019; positions and manage the transition to the next round. While the discourse marker <italic>okay</italic> at turn-beginning (l. 1) marks the transition, the organizational instruction &#x201c;nochmal zuR&#xdc;CK&#x201d;/&#x201c;back again&#x201d; realigns the students in interactional space and brings them back to the by now familiar starting position. This is indicated by the temporal adverb &#x201c;nochmal&#x201d;/&#x201c;again&#x201d; (l. 1). It contrasts with the temporal marker &#x201c;erstmal&#x201d;/&#x201c;for a start&#x201d; in extract 3, and projects a second go. It is repeated with focal accent as part of the instruction proper (l. 3) and indicates familiarity to the students. The verbal instruction (l. 3) is accompanied by a hands-to-hips-movement and followed by the students&#x2019; performance of the instructed action (l.&#x20;4).</p>
<fig id="FX4" position="float">
<label>EXTRACT 4</label>
<caption>
<p>&#x201c;again hands like this on the hips&#x201d;.</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g013.tif"/>
</fig>
<p>Before the trainer delivers the instruction, she publicly displays that she is monitoring the students&#x2019; activities (l. 2, <xref ref-type="fig" rid="F8">Figure&#x20;8A</xref>). In contrast to extract 3, where she projected the focal space of the instruction by gaze, she now consistently looks at the students (<xref ref-type="fig" rid="F8">Figures&#x20;8A&#x2013;C</xref>). By turning her head and visibly letting her gaze wander across the group (<xref ref-type="fig" rid="F8">Figure&#x20;8C</xref>), she documents that she is closely monitoring the students&#x2019; embodied response.</p>
<fig id="F8" position="float">
<label>FIGURE 8</label>
<caption>
<p>Consistent look at the students, absence of gaze projection.</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g008.tif"/>
</fig>
<p>Further reductions are observable: In extract 3, the instructing action was delivered in two intonation phrases (l. 1&#x2013;2). In contrast, it is compressed into a single one in extract 4 (l. 3). Whereas the trainer used a proposition with a deictic address term (&#x201c;ihr&#x201d;/&#x201c;you&#x201d;) and an inflected verb phrase (&#x201c;nehmt&#x201d;/&#x201c;take&#x201d;) in extract 3, she now uses a truncated deontic infinitive instead (on deontic infinitives cf. <xref ref-type="bibr" rid="B20">Deppermann, 2006</xref>). Moreover, she omits the gaze shift to the focal space (spatial projection), and downgrades the prosodic design of the demonstrative<xref ref-type="fn" rid="FN3">
<sup>3</sup>
</xref> by shifting the focal accent to the adverb (l. 3: &#x201c;NOCHmal&#x201d;/&#x201c;again&#x201d;). By repeatedly indexing that the instruction is already part of the common ground, the trainer accounts for a scaled-down version of the instruction: Visibly projecting the focal space by gaze and audibly emphasizing the crucial moment by a prosodically marked demonstrative is less important when these are already known to the participants. The reduction is summarized in <xref ref-type="table" rid="T1">Table&#x20;1</xref>.</p>
<table-wrap id="T1" position="float">
<label>TABLE 1</label>
<caption>
<p>Summary of reductions.</p>
</caption>
<table>
<thead valign="top">
<tr>
<th align="left">First delivery of instructing action (extract 3)</th>
<th align="center">First repetition of instructing action (extract 4)</th>
</tr>
</thead>
<tbody valign="top">
<tr>
<td align="left">speaker gaze flag: publicly visible projection of focal space</td>
<td align="left">no speaker gaze flag: interactionally known focal space</td>
</tr>
<tr>
<td align="left">demonstrative with focal accent: audible temporal projection of focal moment</td>
<td align="left">no focal accent: interactionally known focal moment</td>
</tr>
<tr>
<td align="left">deictic address term and descriptive verb phrase</td>
<td align="left">reduction to truncated deontic infinitive</td>
</tr>
<tr>
<td align="left">multi-unit turn, two intonation phrases</td>
<td align="left">single intonation phrase</td>
</tr>
</tbody>
</table>
</table-wrap>
<p>The short excursus in the next section contrasts our analysis of repetition, routinization, and reduction with the opposite case. When the trainer introduces new elements, the instruction becomes more complex again. Against this background, the eroding effect of multiple repetitions (cf. sub-section <italic>Local routinization and sedimentation through repetition and reduction</italic>) will become even more apparent. Furthermore, we can also see from the contrasting example how incipient routinization can be stopped or blocked.</p>
</sec>
<sec id="s5-3">
<title>Excursus: Meta-Instructions to Mark an Addition, a Change, or a New Instruction</title>
<p>In this short excursus, it is argued that while multiple repetitions lead&#x20;to routinization, simplification, and reduction, the opposite&#x2014;introducing new elements&#x2014;motivates the use of extended, more complex formats. The choice and design of the format thus reflexively indexes familiarity and routinization or lack thereof.</p>
<p>The extract occurs after repetitions have already yielded initial reductions. However, it does not exhibit those reductions. On the contrary, it is more complex than the previous extract. The reason for this is that the trainer introduces a new element. She delivers a meta-instruction to announce that element. The meta-instruction establishes a hand clap as a timing signal for choric practicing.</p>
<p>Meta-instructions add a layer of reflexivity to the reflexivity and indexicality of situated social interaction by explicitly formulating an instruction about instructions. They establish local practices of co-orientation and co-ordination, and request attention to and alignment with those practices of practicing. They formulate practices for the local organization of instructions-in-interaction. Relevant for my argument is that meta-instructions, and more generally, meta-formulations (re-)increase the complexity of formats that may have begun to undergo reduction.</p>
<fig id="FX5" position="float">
<label>EXTRACT 5</label>
<caption>
<p>&#x201c;I clap my hands&#x201d;</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g014.tif"/>
</fig>
<p>The trainer starts with an announcement (l. 1&#x2013;3). She uses a pre-construction with &#x201c;SO&#x201d;/&#x201c;like this&#x201d; (l. 1) (&#x201c;Vorlaufkonstruktion mit <italic>so</italic>&#x201d;, cf. <xref ref-type="bibr" rid="B10">Auer, 2006</xref>), which projects a prosodically and syntactically complex turn. The subsequent bipartite turn delivers the meta-instruction (l. 2&#x2013;3) and fulfills the syntactic projection. It introduces the hand clap as a timing device for choric practicing. While the instructional object (stepping forward) is referred to as already known (l. 3: &#x201c;diesen schritt&#x201d;/&#x201c;this step&#x201d;), the method of choric practicing according to the hand clap is introduced as something new. The hand clap is defined as the go-ahead for the students&#x2019; performance of the instructed action. In grammatical terms, it functions like a gesturally used temporal demonstrative that points to the moment of its utterance (<xref ref-type="bibr" rid="B30">Fillmore, 1997</xref>; <xref ref-type="bibr" rid="B52">Levinson, 2005</xref>).</p>
<p>The trainer delays the delivery of the hand clap and thereby holds back the students&#x2019; response. She inserts instructional details on how the step forward should (not) be done (l. 4&#x2013;7), and announces an assessment of trouble sources that the students may be exhibiting in the course of the performance (l. 8&#x2013;9). The trainer projects and designs an action trajectory that is composed of her hand clap as FPP, the students&#x2019; performance as SPP, and subsequently, further assessment and training phases that target the students&#x2019; problems as they become visible to the trainer&#x2019;s professional vision (<xref ref-type="bibr" rid="B35">Goodwin, 1994</xref>). By publicly anticipating problems, the trainer prospectively accounts for the need for future correction and repetition.</p>
<p>The trainer performs the hand clap with a large, sweeping movement, which prepares the stage for the audible go-ahead. Additionally, the hand clap is projected by a pre-positioned, prosodically marked verbal item: the conjunction &#x201c;UND&#x201d;/&#x201c;and&#x201d; (l. 10). The students respond by stepping forward after the hand clap.<xref ref-type="fn" rid="FN4">
<sup>4</sup>
</xref> The trainer acknowledges the performance and prepares the transition to the next round with &#x201c;oKAY&#x201d; (l.&#x20;12).</p>
<p>The trainer uses the hand clap as a device to structure the instructing action, insert details, anticipate problems, and delay the students&#x2019; performance by withholding the clap and making its delivery contingent on the ongoing activities. The sequential structure can be summarized as follows:</p>
<p>
<list list-type="simple">
<list-item>
<p>I. position: complex multi-unit turn of the trainer composed of</p>
</list-item>
<list-item>
<p>&#x2009;1) announcement, couched in a pre-construction (&#x201c;Vorlaufkonstruktion&#x201d;) with &#x201c;so&#x201d;/&#x201c;like this&#x201d;</p>
</list-item>
<list-item>
<p>&#x2009;2) meta-instruction to establish trainer&#x2019;s hand clap as go-ahead for students&#x2019; step forward</p>
</list-item>
<list-item>
<p>&#x2009;3) insertion of instructional details</p>
</list-item>
<list-item>
<p>&#x2009;4) preview of further assessment and repetition sequences</p>
</list-item>
<list-item>
<p>&#x2009;5) <italic>and</italic>-prefaced hand clap as go-ahead</p>
</list-item>
<list-item>
<p>II. position: students&#x2019; embodied response</p>
</list-item>
<list-item>
<p>III. position: ratification by trainer</p>
</list-item>
</list>
</p>
<p>The analysis shows that complex, multi-unit turns with pre-positioned announcements and meta-instructions reflexively constitute and index the additional effort to formulate changes in the instructional format. The complex format used to formulate new and unfamiliar elements contrasts with reductions exhibited as the result of repeating the familiar. Multiple repetitions and reductions may ultimately lead to the local emergence of a new format. This is studied in the next sub-section.</p>
</sec>
<sec id="s5-4">
<title>Local Routinization and Sedimentation Through Repetition and Reduction</title>
<p>Previously,&#x20;we&#x20;have seen how first repetitions already exhibit reductions. The short excursus on meta-instructions, in contrast, showed how the&#x20;introduction of new&#x20;elements leads to&#x20;increased complexity, which may eventually counteract routinization and reduction. In this&#x20;sub-section, we study how the complex, multi-unit turn format is once again changed and reduced in the&#x20;course of multiple repetitions. The analysis focuses on&#x20;reductions that emerge from progressive routinization of first and second actions, and on the concurrent temporal&#x20;compression that reflects and constitutes initial automatization.</p>
<fig id="FX6" position="float">
<label>EXTRACT 6</label>
<caption>
<p>&#x201c;step forward to the clap&#x201d;</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g015.tif"/>
</fig>
<p>Extract 6 occurs right after extract 5. It exemplifies how subsequent repetitions allow for further reductions. The reductions concern both the meta-instruction and the instruction proper. The hand clap has already been put into practice as a timing signal and is reused as a go-ahead in the subsequent instruction.</p>
<p>The first reduction concerns the meta-instruction. Whereas in extract 5 it was delivered in a syntactically, prosodically, and&#x20;pragmatically complete TCU and followed by the instructing action, it is now boiled down to a prepositional phrase (l. 04: &#x201c;auf_s klatschen&#x201d;/&#x201c;to the clap&#x201d;) and integrated into the instructing action (l. 04: &#x201c;und JETZT geht ihr <underline>auf_s klatschen</underline> mit dem Andern bein vor&#x201d;/&#x201c;and now you step forward&#x20;<underline>to the clap</underline> with the other leg&#x201d;). Although the clap is&#x20;already established as a go-ahead, the trainer recycles the&#x20;meta-instruction as part of a modified instruction: She&#x20;now requests the students to step forward with the other leg (l. 04).</p>
<p>As in extract 5, the trainer projects the hand clap by a prosodically marked <italic>and</italic>-preface (l. 05). In contrast to extract&#x20;5, however, she no longer visibly puts the hand clap on stage. Instead, it is latched to the <italic>and</italic>-preface and done very quickly. The students subsequently perform the instructed action, and the sequence is closed when the trainer, after turning around to monitor the students (l. 07), utters a ratification (l. 08: &#x201c;oKAY;&#x201d;).</p>
<p>The next extract documents further reductions. Again, the&#x20;trainer uses a meta-pragmatic announcement, but marks&#x20;the practice as already familiar by the modal adverb&#x20;&#x201c;wieder&#x201d;/&#x201c;again&#x201d; (l. 01). While the practice of clapping and most of the instructed action are treated as known, a new element is introduced: raising the arm when stepping forward (l. 02). In&#x20;contrast to extract 6 where the&#x20;announcement of the clap and the instruction were delivered in a single TCU, the trainer now constructs two&#x20;TCUs and thus foregrounds the arm raise as an instructional novelty.</p>
<fig id="FX7" position="float">
<label>EXTRACT 7</label>
<caption>
<p>&#x201c;Short Like This&#x201d;</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g016.tif"/>
</fig>
<p>A formal reduction and a temporal acceleration occur in the&#x20;adjacency pair of the trainer&#x2019;s gestural go-ahead and the students&#x2019;&#x20;embodied response (l. 3&#x2013;4). The trainer now omits the <italic>and</italic>-preface, which formerly projected the hand clap and gave the&#x20;students time to prepare. Instead, she claps immediately after the delivery of the instruction (l. 3). Subsequently, the&#x20;students step forward and raise their arms (l. 4). The&#x20;trainer uses the same item for ratification (l. 5: &#x201c;oKAY&#x201d;), but now latches onto it an organizational instruction that&#x20;projects a next go. A double acceleration is thus accomplished: Omitting&#x20;the <italic>and</italic>-preface temporally compresses verbal instruction and gestural go-ahead; latching the organizational instruction to the ratification speeds up the succession of training rounds.</p>
<p>The next extract starts with a correction (l. 1) and an organizational instruction (l. 02). It implicates repetition and is marked as part of the interactional history by the temporal adverb &#x201c;NOCHmal&#x201d;/&#x201c;once more&#x201d; (l.&#x20;2).</p>
<fig id="FX8" position="float">
<label>EXTRACT 8</label>
<caption>
<p>&#x201c;and &#x2b; [hand clap &#x2b; zack]&#x201d;</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g017.tif"/>
</fig>
<p>For the first time, the trainer now leaves out the meta-pragmatic announcement. Instead, she re-introduces the <italic>and</italic>-preface (l. 04) to project the hand clap, and adds a new element: the vocalization &#x201c;ZACK&#x201d; (l. 04: &#x201c;U:ND ZACK&#x201d;/&#x201c;and zack&#x201d;). The interjection <italic>zack</italic> onomatopoetically indexes a sharp and violent movement (DWDS; GRIMM, Bd. 31, Sp. 10). In the context of self-defense training, it not only depicts these movement qualities, but also mobilizes the students to perform the instructed action with utmost force and velocity. By synchronizing the delivery of vocalization and hand clap, the trainer performs a very short and sharp go-ahead signal. In contrast to the concise, synchronized delivery of hand clap and vocalization, she lengthens the pre-positioned conjunction &#x201c;U:ND&#x201d;/&#x201c;and&#x201d; (l. 04). The delay contrasts with and thus highlights the subsequent acceleration of the go-ahead, which invites a fast and forceful response.</p>
<p>This exercise is repeated two more times, with an explanatory sequence in between. The two repetitions are delivered in the reduced format with an <italic>and</italic>-preface to project the clap and a synchronized performance of clap and vocalization as go-ahead for the embodied response.</p>
<p>After the arm raise has been repeated several times, the trainer announces the last element to be integrated. The students will now also have to perform a scream. The scream has been practiced separately before. The trainer delivers the instruction in a more complex format. This choice is in line with our observations in the excursus on the increased complexity and length of instructions that introduce additions or changes.</p>
<fig id="FX9" position="float">
<label>EXTRACT 9</label>
<caption>
<p>&#x201c;and &#x2b; [hand clap &#x2b; zack]&#x201d;</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g018.tif"/>
</fig>
<p>First, the trainer returns to the meta-pragmatic announcement (l. 1) that she had left out before; second, she delivers the instruction in two intonation phrases (l. 2&#x2013;3). These are separated by a small pause. The syntactic gestalt of the first intonation phrase (l. 2) is incomplete and projects more to come. The pause between first and second intonation phrase becomes hearable as a turn-holding device, which slightly delays the second intonation phrase (l. 3), summons the students&#x2019; attention, and brings the new instructional component, the &#x201c;no-scream&#x201d; (l. 3), into focus. Next, the trainer uses the reduced format of <italic>and</italic>-preface, simultaneous vocalization, and clap (l. 4). After the students have integrated the new element (l. 5), the trainer formulates a positive assessment (l. 6: &#x201c;SUper&#x201d;/&#x201c;great&#x201d;). Subsequently, she requests the students to step back and projects a repetition (l.&#x20;7).</p>
<p>After a brief comment on the scream, the trainer returns to the lean format. The lean version documents progressive reductions of the instructing FPP and, concomitantly, a temporal compression between FPP and embodied&#x20;SPP.</p>
<fig id="FX10" position="float">
<label>EXTRACT 10</label>
<caption>
<p>&#x201c;Short Like This&#x201d;</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g019.tif"/>
</fig>
<p>While the <italic>and</italic>-prefaced coupling of vocalization and hand clap projects the temporal slot for the students&#x2019; embodied response, they have to infer from the interactional history how to design their action. In the present case, they understand the trainer&#x2019;s minimal signal as a go-ahead to repeat the previous action (l. 2). No explicit instruction tells them that they are requested to repeat the integration of the three elements they have practiced separately before.</p>
<p>The next extract attests to the local adaptability and temporal flexibility of the format once it has been established. These variations do not constitute counter-evidence to the observed integrity of the format as an oriented-to, recognizable gestalt. They are recipient-designed temporal calibrations. The local variations reflect and orient to the addressees&#x2019; attention and participation. The fact that the reduced format can be lengthened without being fragmented is evidence of its incipient sedimentation in the local context. The formal and functional sedimentation of the format is the result of the participants&#x2019; joint routinization.</p>
<fig id="FX11" position="float">
<label>EXTRACT 11</label>
<caption>
<p>&#x201c;and &#x2b; [hand clap &#x2b; zack]&#x201d;</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g020.tif"/>
</fig>
<p>In extract 10, the trainer&#x2019;s action was designed and understood as a repetition of her action in extract 9. In the same way, her action in extract 11 is delivered and understood as a repetition of her actions in extracts 9 and 10. However, extract 11 exhibits a significant temporal variation: The <italic>and</italic>-preface is extremely lengthened and followed by a long pause before the go-ahead is delivered (l. 01). As some students are laughing among themselves (l. 01), the trainer delays her action by adapting it to the students&#x2019; activities and attentional&#x20;focus.</p>
<p>Originally, the hand clap was introduced as a go-ahead only, and later coupled with a vocalization. In order to project the occurrence of the go-ahead, the trainer used a pre-positioned conjunction, the <italic>and</italic>-preface. The format &#x201c;<italic>and</italic> &#x2b; [clap &#x2b; vocalization]&#x201d; was used in two sequential contexts: 1) after an instructing action with a new component to initiate and time the students&#x2019; performance of the new practice, 2) to invite and time repetitions of an established practice. After several alternations between 1) and 2), the format began to index, even after insertions, by inference alone, the most recent practice. In other words, it has progressively assumed the meaning and function of what has been left out, and has finally become a shibboleth for the instructing action.</p>
<p>Although both devices, the <italic>and</italic>-preface and the clap, project and time what comes in the subsequent slot, their function developed along different paths in the course of the participants&#x2019; interactional history. Whereas the trainer explicitly established the timing function of the clap by a meta-pragmatic announcement, the projecting function of the <italic>and</italic>-preface emerged in practice.<xref ref-type="fn" rid="FN5">
<sup>5</sup>
</xref> It is, moreover, based on the projective properties of syntax in German (<xref ref-type="bibr" rid="B9">Auer, 2015</xref>). The same cannot be claimed for either the clap or the vocalization, notwithstanding the fact that they also project what comes next. Their projective force is grounded in interaction, and not in grammar (<xref ref-type="bibr" rid="B7">Auer, 2005</xref>).</p>
<p>The <italic>and</italic>-preface is combined with the clap to form a syntagma&#x20;of progressively projecting timing resources. The same holds for the vocalization, which was introduced and routinized by practice and which inherited the function of the meta-pragmatically established and synchronously performed clap (<xref ref-type="fig" rid="F9">Figure 9</xref>).</p>
<fig id="F9" position="float">
<label>FIGURE 9</label>
<caption>
<p>Projection of timing resources.</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g009.tif"/>
</fig>
<p>After the format has been repeated several times, the trainer returns to the minimal version even after insertion sequences. She no longer goes back to a more complex format in order to redesign the FPP. This is further evidence of an increased sedimentation of the format &#x201c;<italic>and</italic> &#x2b; [clap &#x2b; vocalization]&#x201d;.</p>
</sec>
<sec id="s5-5">
<title>When Drill Takes Over to Automatize Motor Actions</title>
<p>The last extract documents that in the course of multiple repetitions, the format undergoes still further reduction. Extreme reduction and acceleration finally transform routinization into automatization. Note that this not only constitutes a qualitative change, but once more raises the question of grammaticalization, if grammaticalization is automatization (<xref ref-type="bibr" rid="B14">Bybee, 2014</xref>) and grammar &#x201c;nothing other than [&#x2026;] automated motor action&#x201d; (<xref ref-type="bibr" rid="B70">Streeck, 2018</xref>: 31). This question will be discussed in the final section.</p>
<p>Extract 12 shows the maximally reduced format. The trainer now simply claps, and the students subsequently perform the instructed action.</p>
<fig id="FX12" position="float">
<label>EXTRACT 12</label>
<caption>
<p>&#x201c;Clap Only&#x201d;</p>
</caption>
<graphic xlink:href="fcomm-06-662240-g021.tif"/>
</fig>
<p>This extreme reduction enables an even faster transition between first and second action, between clap and step. At the same time, it significantly accelerates the succession of repeated goes at the same action. In order to accelerate and automatize students&#x2019; motor actions, the trainer progressively shortens her action, accelerates, routinizes, and finally automatizes the temporal succession of FPP and SPP. Training units that are repeated over and over again undergo acceleration, dynamization, and automatization. These features reflect not only a quantitative, but also a qualitative change in the participants&#x2019; exercising practice: It is ultimately transformed into drill. Drill as a practice in the military and in sports serves routinization and automatization of motor actions performed innumerable times at high velocity.</p>
<p>The phenomena described in this sub-section exhibit striking parallels with processes of grammaticalization (cf. <italic>Grammaticalization and embodied action: (when and how) do they go together?</italic>). On the one hand, the extracts testify to progressive routinization and acceleration of motor actions that the students repeat multiple times <bold>in order to automatize and incarnate them</bold> as part of their repertoire of self-defense techniques. On the other hand, the extracts document a process of routinization, sedimentation, and even automatization that takes place on a different plane: communication. Embodied resources are used and coupled with speech <bold>in order to communicate, to deliver verbal actions</bold>; they are not repeated in order to learn and automatize language, as in old-school language teaching, but in order to deliver and structure verbal actions, and to project and time addressees&#x2019; embodied responses. In the activities under investigation, the latter (reducing and accelerating communicative actions) is in the service of the former (accelerating and routinizing motor actions). These processes are not separate, but intertwined; they reflexively constitute and index co-emerging properties. The communicative practices used to teach self-defense inherit properties of the motor practices being taught, while the motor practices are in turn shaped by the communicative practices used to teach them.</p>
</sec>
</sec>
<sec sec-type="discussion" id="s6">
<title>Discussion</title>
<p>The aim of this study was to contribute to the recent grammar-body-debate by proposing a distinction between two kinds of grammar-body-gestalts: 1. socially sedimented, grammaticalized multimodal constructions, and 2. locally routinized ephemeral gestalts. Evidence for the first type was provided in part I of the analysis by an examination of modal demonstratives, embodied practices and concurrent gaze behavior. The focus was on demonstrations indexed by the modal demonstrative <italic>so</italic>/&#x201c;like this&#x201d; and &#x201c;flagged&#x201d; (<xref ref-type="bibr" rid="B69">Streeck, 2002</xref>) by speaker gaze. In line with typological, historical, and interaction linguistic studies on demonstratives, it was argued that the primordial function of demonstratives is to establish joint&#x20;attention on phenomena in the participants&#x2019; surroundings, and that this makes embodied devices indispensable (<xref ref-type="bibr" rid="B13">B&#xfc;hler, 1990[1934]</xref>; <xref ref-type="bibr" rid="B26">Diessel and Coventry, 2020</xref>; <xref ref-type="bibr" rid="B78">Stukenbrock, 2020a</xref>). Embodied practices are made of participants&#x2019; motor actions; these unfold in time, exhibit &#x201c;inner duration&#x201d; (&#x201c;innere Dauer&#x201d;, <xref ref-type="bibr" rid="B67">Streeck, 2007</xref>: 158), and are interpersonally coordinated. Temporal flexibility is therefore an interactional prerequisite without which demonstratives as multimodal constructions could not have emerged. In short, temporal flexibility is the sedimented historic result of concrete, situated, temporally fine-tuned uses of those grammar-body-gestalts in language history.</p>
<p>In social interaction, the use of these constructions is made contingent on the local context; the resources are mobilized, recipient-designed, and temporally calibrated to fit participants&#x2019; ongoing activities. In other words, while these constructions are made of emerging (historically sedimented, grammaticalized) constructions, they are delivered in context-sensitive ways as emergent constructions. Here, variation and innovation take place, and new ephemeral multimodal gestalts emerge. When these are reiterated, routinized, and distributed across contexts, they may eventually become grammaticalized.</p>
<p>In part II, I investigated the emergence of such an ephemeral multimodal assemblage and its micro-diachronic changes. It was shown that in the course of multiple repetitions, the multimodal gestalt underwent formal reduction and functional change. Although the observed processes and changes are similar to those described in grammaticalization, radically different temporal scales and social-distributional dimensions are involved. As long as a format or structure remains a local phenomenon, it is not grammaticalized. It has to spread beyond the initial context of its use, expand, and generalize across types of contexts (<xref ref-type="bibr" rid="B44">Hopper and Traugott, 2003</xref>; <xref ref-type="bibr" rid="B14">Bybee, 2014</xref>) until it becomes widely used in the language community and part of its shared linguistic repertoire or knowledge as an effect of &#x201c;social historical institutionalization&#x201d; (<xref ref-type="bibr" rid="B16">Couper-Kuhlen and Selting, 2018</xref>:&#x20;542).</p>
<p>In sum, multimodal gestalts with different histories are evoked in social interaction. Ephemeral multimodal gestalts are not grammaticalized and have no place in grammar. I do not claim that locally occurring, ephemeral gestalts cannot be grammaticalized. Rather, my proposal is to distinguish between micro-diachronic and historical processes, and to consider joint routinization and collective routinization as successive stages along a path towards grammaticalization. The cradle for such a development may be the movement of a practice from the ephemeral pole to the sedimented pole of the Emergent Grammar-continuum (<xref ref-type="bibr" rid="B31">Ford and Fox, 2015</xref>). But it has to move on beyond the sedimented pole of Emergent Grammar and along the grammaticalization path (<xref ref-type="bibr" rid="B44">Hopper and Traugott, 2003</xref>) that leads to social sedimentation and institutionalization across contexts. This view approaches (multimodal) constructions both as emerging and emergent (<xref ref-type="bibr" rid="B60">Auer and Pf&#xe4;nder, 2011b</xref>). It emphasizes that &#x201c;[t]here is no need to exclude routines from an emergentist approach&#x201d; (<xref ref-type="bibr" rid="B5">Auer and Pf&#xe4;nder, 2011a</xref>: 18), and, in turn, that emergent constructions are the stuff that emerging constructions are made of. It acknowledges linguistic knowledge, <italic>longue dur&#xe9;e</italic> sedimentations, and routines as fundamental to the temporal organization of spoken language (<xref ref-type="bibr" rid="B16">Couper-Kuhlen and Selting, 2018</xref>). By fueling participants&#x2019; expectations, sedimented routines enable participants to project what comes next. 
At the same time, they lay the grounds for improvisation and breach of expectations (<xref ref-type="bibr" rid="B5">Auer and Pf&#xe4;nder, 2011a</xref>)&#x2014;and for a mutual incorporation of linguistic and embodied structures and their potential grammaticalization over historical&#x20;time.</p>
<p>My observations resonate with Streeck&#x2019;s discussion of the&#x20;parallels between grammaticalization in language and the&#x20;emancipation of gestures. Streeck observes that &#x201c;grammaticalization gives us a model how to approach the issue of gesture&#x2019;s (ongoing) evolution&#x201d; (<xref ref-type="bibr" rid="B72">Streeck, 2021</xref>: 110), and by extension, it may also give us a model of how to approach the grammar-body couplings investigated in this paper. Streeck emphasizes parallels in the evolution of gesture and language, but he does not claim that gestures are grammaticalizing. Instead, he&#x20;suggests that the processes observable in gestures and in spoken languages &#x201c;are broadly characteristic of human cultural and symbolic evolution&#x201d; (<xref ref-type="bibr" rid="B72">Streeck, 2021</xref>: 01, footnote 1). This leaves open the status&#x20;of grammar-body-couplings: Are they composed of structures that evolve in parallel, or are they integrated into a whole and undergo change, routinization, social sedimentation, and eventually grammaticalization? This paper offers a first answer to this question by distinguishing between <italic>ad hoc</italic> assembled, ephemeral grammar-body-gestalts and socially sedimented multimodal constructions that have grammaticalized the embodied context of their use over time. While repetition and <italic>joint routinization</italic> of an ephemeral gestalt may lead to the local sedimentation of that gestalt among participants who are mutually engaged in shared activities, <italic>collective routinization</italic> emerges across time and space among social groups whose members are not mutually aware of one another. From here, a practice may or may not start to move along the grammaticalization path (<xref ref-type="bibr" rid="B44">Hopper and Traugott, 2003</xref>).</p>
</sec>
</body>
<back>
<sec id="s7">
<title>Data Availability Statement</title>
<p>The video data for this study are not publicly available.</p>
</sec>
<sec id="s8">
<title>Ethics Statement</title>
<p>Ethical review and approval was not required for this study. Informed consent was obtained from all participants.</p>
</sec>
<sec id="s9">
<title>Author Contributions</title>
<p>AS developed the theoretical framework, collected the data with the support of her team, carried out the empirical analysis, and wrote the article.</p>
</sec>
<sec sec-type="COI-statement" id="s10">
<title>Conflict of Interest</title>
<p>The author declares that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.</p>
</sec>
<sec sec-type="disclaimer" id="s12">
<title>Publisher&#x2019;s Note</title>
<p>All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.</p>
</sec>
<ack>
<p>I thank the editors of this special issue for their invitation to contribute to this fascinating topic, and for embarking on the adventure of editing it in this format. I also thank the reviewers for critical reading and helpful comments on a previous version of this paper. I am indebted to the Freiburg Institute for Advanced Studies (FRIAS), University of Freiburg/Breisgau, for support of my research on deixis in face-to-face interaction.</p>
</ack>
<sec id="s11">
<title>Supplementary Material</title>
<p>The Supplementary Material for this article can be found online at: <ext-link ext-link-type="uri" xlink:href="https://www.frontiersin.org/articles/10.3389/fcomm.2021.662240/full#supplementary-material">https://www.frontiersin.org/articles/10.3389/fcomm.2021.662240/full&#x23;supplementary-material</ext-link>
</p>
<supplementary-material xlink:href="DataSheet1.docx" id="SM1" mimetype="application/docx" xmlns:xlink="http://www.w3.org/1999/xlink"/>
</sec>
<fn-group>
<fn id="FN1">
<label>1</label>
<p>For reasons of space, only one example is shown in this section. Examples of the grammar-body construction with <italic>so</italic> can be found in the literature (<xref ref-type="bibr" rid="B59">Ningelgen and Auer, 2017</xref>; <xref ref-type="bibr" rid="B82">Stukenbrock, 2010</xref>, <xref ref-type="bibr" rid="B81">2014</xref>, <xref ref-type="bibr" rid="B77">2015</xref>). Current research on demonstratives provides further evidence for embodiment as part of grammaticalization (<xref ref-type="bibr" rid="B26">Diessel and Coventry, 2020</xref>).</p>
</fn>
<fn id="FN2">
<label>2</label>
<p>Although the video data do not allow precise observations of the students&#x2019; gaze directions, those who are visible at that moment can be seen to slightly accommodate their head orientation downwards.</p>
</fn>
<fn id="FN3">
<label>3</label>
<p>Note that this does not call into question the finding that demonstratives (modal as well as spatial), when used exophorically and gesturally in face-to-face interaction, bear the focal accent of the intonation phrase. They do; only in an uptake or repeated use may the resources, in this case the accent, be reduced.</p>
</fn>
<fn id="FN4">
<label>4</label>
<p>Although the clap is projected by other resources, the students never move forward in synchrony with the&#x20;clap.</p>
</fn>
<fn id="FN5">
<label>5</label>
<p>This raises the questions of which elements lend themselves to being introduced <italic>en passant</italic>, in and through practice alone, which elements are, in contrast, metapragmatically established, and why this is so. These problems cannot be discussed here. They are topics for further investigation.</p>
</fn>
</fn-group>
<ref-list>
<title>References</title>
<ref id="B3">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Arminen</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Koskela</surname>
<given-names>I.</given-names>
</name>
<name>
<surname>Palukka</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Multimodal Production of Second Pair Parts in Air Traffic Control Training</article-title>. <source>J.&#x20;Pragmatics</source> <volume>65</volume>, <fpage>46</fpage>&#x2013;<lpage>62</lpage>. <pub-id pub-id-type="doi">10.1016/j.pragma.2014.01.004</pub-id> </citation>
</ref>
<ref id="B7">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Auer</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2005</year>). <article-title>Projection in Interaction and Projection in Grammar</article-title>. <source>Text &#x26; Talk</source> <volume>25</volume>, <fpage>7</fpage>&#x2013;<lpage>36</lpage>. <pub-id pub-id-type="doi">10.1515/text.2005.25.1.7</pub-id> </citation>
</ref>
<ref id="B10">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Auer</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2006</year>). &#x201c;<article-title>Construction Grammar meets Conversation: Einige &#xdc;berlegungen am Beispiel von &#x2018;so&#x2019;-Konstruktionen</article-title>,&#x201d; in <source>Konstruktionen in der Interaktion Linguistik</source>. Editors <person-group person-group-type="editor">
<name>
<surname>G&#xfc;nthner</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Imo</surname>
<given-names>W.</given-names>
</name>
</person-group> (<publisher-loc>Berlin, New York</publisher-loc>: <publisher-name>De Gruyter</publisher-name>), <fpage>291</fpage>&#x2013;<lpage>314</lpage>. </citation>
</ref>
<ref id="B4">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Auer</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2009a</year>). <article-title>On-line Syntax: Thoughts on the Temporality of Spoken Language</article-title>. <source>Lang. Sci.</source> <volume>31</volume>, <fpage>1</fpage>&#x2013;<lpage>13</lpage>. <pub-id pub-id-type="doi">10.1016/j.langsci.2007.10.004</pub-id> </citation>
</ref>
<ref id="B6">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Auer</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2009b</year>). <article-title>Projection and Minimalistic Syntax in Interaction</article-title>. <source>Discourse Process.</source> <volume>46</volume>, <fpage>180</fpage>&#x2013;<lpage>205</lpage>. <pub-id pub-id-type="doi">10.1080/01638530902728934</pub-id> </citation>
</ref>
<ref id="B9">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Auer</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2015</year>). &#x201c;<article-title>The Temporality of Language in Interaction: Projection and Latency</article-title>,&#x201d; in <source>Temporality in Interaction</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>G&#xfc;nthner</surname>
<given-names>S.</given-names>
</name>
</person-group> (<publisher-loc>Amsterdam, Philadelphia</publisher-loc>: <publisher-name>John Benjamins</publisher-name>), <fpage>27</fpage>&#x2013;<lpage>56</lpage>. <pub-id pub-id-type="doi">10.1075/slsi.27.01aue</pub-id> </citation>
</ref>
<ref id="B5">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Auer</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Pf&#xe4;nder</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2011a</year>). &#x201c;<article-title>Constructions: Emergent or Emerging</article-title>,&#x201d; in <source>Constructions: Emerging and Emergent</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Auer</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Pf&#xe4;nder</surname>
<given-names>S.</given-names>
</name>
</person-group> (<publisher-loc>Berlin, Boston</publisher-loc>: <publisher-name>De Gruyter</publisher-name>), <fpage>1</fpage>&#x2013;<lpage>21</lpage>. </citation>
</ref>
<ref id="B60">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Auer</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Pf&#xe4;nder</surname>
<given-names>S.</given-names>
</name>
</person-group> (Editors) (<year>2011b</year>). <source>Constructions: Emerging and Emergent</source> (<publisher-loc>Berlin, Boston</publisher-loc>: <publisher-name>De Gruyter</publisher-name>). </citation>
</ref>
<ref id="B8">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Auer</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Stukenbrock</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>When &#x27;You&#x27; Means &#x27;I&#x27;: The German 2nd Ps.Sg. Pronoun Du between Genericity and Subjectivity</article-title>. <source>Open Linguistics</source> <volume>4</volume>, <fpage>280</fpage>&#x2013;<lpage>309</lpage>. <pub-id pub-id-type="doi">10.1515/opli-2018-0015</pub-id> </citation>
</ref>
<ref id="B11">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Balantani</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Reference Construction in Interaction: The Case of Type-Indicative &#x201c;so&#x201d;</article-title>. <source>J.&#x20;Pragmatics</source> <volume>181</volume>, <fpage>241</fpage>&#x2013;<lpage>258</lpage>. <pub-id pub-id-type="doi">10.1016/j.pragma.2021.05.024</pub-id> </citation>
</ref>
<ref id="B12">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Barske</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Golato</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>German So: Managing Sequence and Action</article-title>. <source>Text &#x26; Talk</source> <volume>30</volume>, <fpage>245</fpage>&#x2013;<lpage>266</lpage>. <pub-id pub-id-type="doi">10.1515/text.2010.013</pub-id> </citation>
</ref>
<ref id="B13">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>B&#xfc;hler</surname>
<given-names>K.</given-names>
</name>
</person-group> (<year>1990[1934]</year>). <source>Theory of Language</source>. <publisher-loc>Amsterdam</publisher-loc>: <publisher-name>John Benjamins</publisher-name>. </citation>
</ref>
<ref id="B14">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Bybee</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2014</year>). &#x201c;<article-title>Cognitive Processes in Grammaticalization</article-title>,&#x201d; in <source>The New Psychology of Language. Cognitive and Functional Approaches to Language Structure</source>. Editor <person-group person-group-type="editor">
<name>
<surname>Tomasello</surname>
<given-names>M.</given-names>
</name>
</person-group> (<publisher-loc>Mahwah, NJ</publisher-loc>: <publisher-name>Lawrence Erlbaum Associates</publisher-name>). </citation>
</ref>
<ref id="B15">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Couper-Kuhlen</surname>
<given-names>E.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Finding a Place for Body Movement in Grammar</article-title>. <source>Res. Lang. Soc. Interaction</source> <volume>51</volume>, <fpage>22</fpage>&#x2013;<lpage>25</lpage>. <pub-id pub-id-type="doi">10.1080/08351813.2018.1413888</pub-id> </citation>
</ref>
<ref id="B16">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Couper-Kuhlen</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Selting</surname>
<given-names>M.</given-names>
</name>
</person-group> (<year>2018</year>). <source>Interactional Linguistics: Studying Language in Social Interaction</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>. </citation>
</ref>
<ref id="B17">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>De Stefani</surname>
<given-names>E.</given-names>
</name>
</person-group> (<year>2014</year>). &#x201c;<article-title>Establishing Joint Orientation towards Commercial Objects in a&#x20;Self-Service Store: How Practices of Categorisation Matter</article-title>,&#x201d; in <source>Interacting With Objects</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Nevile</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Haddington</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Heinemann</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Rauniomaa</surname>
<given-names>M.</given-names>
</name>
</person-group> (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>John Benjamins</publisher-name>), <fpage>271</fpage>&#x2013;<lpage>294</lpage>. <pub-id pub-id-type="doi">10.1075/z.186.12ste</pub-id> </citation>
</ref>
<ref id="B18">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>De Stefani</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Gazin</surname>
<given-names>A.-D.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Instructional Sequences in Driving Lessons: Mobile Participants and the Temporal and Sequential Organization of Actions</article-title>. <source>J.&#x20;Pragmatics</source> <volume>65</volume>, <fpage>63</fpage>&#x2013;<lpage>79</lpage>. <pub-id pub-id-type="doi">10.1016/j.pragma.2013.08.020</pub-id> </citation>
</ref>
<ref id="B20">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2006</year>). &#x201c;<article-title>Deontische Infinitivkonstruktionen: Syntax, Semantik, Pragmatik und interaktionale Verwendung</article-title>,&#x201d; in <source>Konstruktionen in der Interaktion</source>. Editors <person-group person-group-type="editor">
<name>
<surname>G&#xfc;nthner</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Imo</surname>
<given-names>W.</given-names>
</name>
</person-group> (<publisher-loc>Berlin, New York</publisher-loc>: <publisher-name>De Gruyter</publisher-name>), <fpage>239</fpage>&#x2013;<lpage>262</lpage>. </citation>
</ref>
<ref id="B24">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2015</year>). &#x201c;<article-title>Retrospection and Understanding in Interaction</article-title>,&#x201d; in <source>Temporality in Interaction</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>G&#xfc;nthner</surname>
<given-names>S.</given-names>
</name>
</person-group> (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>John Benjamins</publisher-name>), <fpage>57</fpage>&#x2013;<lpage>94</lpage>. <pub-id pub-id-type="doi">10.1075/slsi.27.02dep</pub-id> </citation>
</ref>
<ref id="B19">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2018a</year>). &#x201c;<article-title>Changes in Turn-Design over Interactional Histories - the Case of Instructions in Driving School Lessons</article-title>,&#x201d; in <source>Time in Embodied Interaction: Synchronicity and Sequentiality of Multimodal Resources</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Streeck</surname>
<given-names>J.</given-names>
</name>
</person-group> (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>John Benjamins</publisher-name>), <fpage>293</fpage>&#x2013;<lpage>324</lpage>. <pub-id pub-id-type="doi">10.1075/pbns.293.09dep</pub-id> </citation>
</ref>
<ref id="B21">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2018b</year>). <article-title>Editorial: Instructions in Driving Lessons</article-title>. <source>Int. J.&#x20;Appl. Linguist.</source> <volume>28</volume>, <fpage>221</fpage>&#x2013;<lpage>225</lpage>. <pub-id pub-id-type="doi">10.1111/ijal.12206</pub-id> </citation>
</ref>
<ref id="B22">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2018c</year>). <article-title>Instruction Practices in German Driving Lessons: Differential Uses of Declaratives and Imperatives</article-title>. <source>Int. J.&#x20;Appl. Linguist.</source> <volume>28</volume>, <fpage>265</fpage>&#x2013;<lpage>282</lpage>. <pub-id pub-id-type="doi">10.1111/ijal.12198</pub-id> </citation>
</ref>
<ref id="B2">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>G&#xfc;nthner</surname>
<given-names>S.</given-names>
</name>
</person-group> (Editors) (<year>2015</year>). <source>Temporality in Interaction</source> (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>John Benjamins</publisher-name>).</citation>
</ref>
<ref id="B1">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Streeck</surname>
<given-names>J.</given-names>
</name>
</person-group> (Editors) (<year>2018</year>). <source>Time in Embodied Interaction</source> (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>John Benjamins</publisher-name>).</citation>
</ref>
<ref id="B23">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Pekarek Doehler</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>Longitudinal Conversation Analysis - Introduction to the Special Issue</article-title>. <source>Res. Lang. Soc. Interaction</source> <volume>54</volume>, <fpage>127</fpage>&#x2013;<lpage>141</lpage>. <pub-id pub-id-type="doi">10.1080/08351813.2021.1899707</pub-id> </citation>
</ref>
<ref id="B25">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Schmidt</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>How Shared Meanings and Uses Emerge over an Interactional History: Wabi Sabi in a Series of Theater Rehearsals</article-title>. <source>Res. Lang. Soc. Interaction</source> <volume>54</volume>, <fpage>203</fpage>&#x2013;<lpage>224</lpage>. <pub-id pub-id-type="doi">10.1080/08351813.2021.1899714</pub-id> </citation>
</ref>
<ref id="B28">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Diessel</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>1999</year>). <source>Demonstratives. Form, Function, and Grammaticalization</source>. <publisher-loc>Amsterdam</publisher-loc>: <publisher-name>John Benjamins</publisher-name>. </citation>
</ref>
<ref id="B27">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Diessel</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2006</year>). <article-title>Demonstratives, Joint Attention, and the Emergence of Grammar</article-title>. <source>Cogn. Linguistics</source> <volume>17</volume>, <fpage>463</fpage>&#x2013;<lpage>489</lpage>. <pub-id pub-id-type="doi">10.1515/cog.2006.015</pub-id> </citation>
</ref>
<ref id="B26">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Diessel</surname>
<given-names>H.</given-names>
</name>
<name>
<surname>Coventry</surname>
<given-names>K. R.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Demonstratives in Spatial Language and Social Interaction: An Interdisciplinary Review</article-title>. <source>Front. Psychol.</source> <volume>11</volume>. <pub-id pub-id-type="doi">10.3389/fpsyg.2020.555265</pub-id> </citation>
</ref>
<ref id="B29">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Evans</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>Lindwall</surname>
<given-names>O.</given-names>
</name>
</person-group> (<year>2020</year>). <article-title>Show Them or Involve Them? Two Organizations of Embodied Instruction</article-title>. <source>Res. Lang. Soc. Interaction</source> <volume>53</volume>, <fpage>223</fpage>&#x2013;<lpage>246</lpage>. <pub-id pub-id-type="doi">10.1080/08351813.2020.1741290</pub-id> </citation>
</ref>
<ref id="B30">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Fillmore</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>1997</year>). <source>Lectures on Deixis</source>. <publisher-loc>Stanford, CA</publisher-loc>: <publisher-name>CSLI Publications</publisher-name>. </citation>
</ref>
<ref id="B31">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Ford</surname>
<given-names>C. E.</given-names>
</name>
<name>
<surname>Fox</surname>
<given-names>B. A.</given-names>
</name>
</person-group> (<year>2015</year>). &#x201c;<article-title>Ephemeral Grammar: At the Far End of Emergence</article-title>,&#x201d; in <source>Temporality in Interaction</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>G&#xfc;nthner</surname>
<given-names>S.</given-names>
</name>
</person-group> (<publisher-loc>Amsterdam, Philadelphia</publisher-loc>: <publisher-name>John Benjamins</publisher-name>), <fpage>95</fpage>&#x2013;<lpage>120</lpage>. <comment>Available at: <ext-link ext-link-type="uri" xlink:href="https://www.jbe-platform.com/content/books/9789027268990-slsi.27.03for">https://www.jbe-platform.com/content/books/9789027268990-slsi.27.03for</ext-link>
</comment>. <pub-id pub-id-type="doi">10.1075/slsi.27.03for</pub-id> </citation>
</ref>
<ref id="B33">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Goffman</surname>
<given-names>E.</given-names>
</name>
</person-group> (<year>1971</year>). <source>Relations in Public: Microstudies of the Public Order</source>. <publisher-loc>New York</publisher-loc>: <publisher-name>Basic Books</publisher-name>. </citation>
</ref>
<ref id="B32">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Goffman</surname>
<given-names>E.</given-names>
</name>
</person-group> (<year>1981</year>). <source>Forms of Talk</source>. <publisher-loc>Philadelphia</publisher-loc>: <publisher-name>University of Pennsylvania Press</publisher-name>. </citation>
</ref>
<ref id="B34">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Golato</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2000</year>). <article-title>An Innovative German Quotative for Reporting on Embodied Actions: Und ich so/und er so &#x27;and I&#x27;m like/and he&#x27;s like&#x27;</article-title>. <source>J.&#x20;Pragmatics</source> <volume>32</volume>, <fpage>29</fpage>&#x2013;<lpage>54</lpage>. <pub-id pub-id-type="doi">10.1016/s0378-2166(99)00030-2</pub-id> </citation>
</ref>
<ref id="B36">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Goodwin</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>1980</year>). <article-title>Restarts, Pauses, and the Achievement of a State of Mutual Gaze at Turn-Beginning</article-title>. <source>Sociological Inq.</source> <volume>50</volume>, <fpage>272</fpage>&#x2013;<lpage>302</lpage>. <pub-id pub-id-type="doi">10.1111/j.1475-682x.1980.tb00023.x</pub-id> </citation>
</ref>
<ref id="B35">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Goodwin</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>1994</year>). <article-title>Professional Vision</article-title>. <source>Am. Anthropologist</source> <volume>96</volume>, <fpage>606</fpage>&#x2013;<lpage>633</lpage>. <pub-id pub-id-type="doi">10.1525/aa.1994.96.3.02a00100</pub-id> </citation>
</ref>
<ref id="B37">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Goodwin</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>1996</year>). &#x201c;<article-title>Transparent Vision</article-title>,&#x201d; in <source>Interaction and Grammar</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Ochs</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Schegloff</surname>
<given-names>E. A.</given-names>
</name>
<name>
<surname>Thompson</surname>
<given-names>S. A.</given-names>
</name>
</person-group> (<publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>), <fpage>370</fpage>&#x2013;<lpage>404</lpage>. <pub-id pub-id-type="doi">10.1017/cbo9780511620874.008</pub-id> </citation>
</ref>
<ref id="B38">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Goodwin</surname>
<given-names>M. H.</given-names>
</name>
<name>
<surname>Goodwin</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Car Talk: Integrating Texts, Bodies, and Changing Landscapes</article-title>. <source>Semiotica</source> <volume>2012</volume>, <fpage>257</fpage>&#x2013;<lpage>286</lpage>. <pub-id pub-id-type="doi">10.1515/sem-2012-0063</pub-id> </citation>
</ref>
<ref id="B39">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Haiman</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>1994</year>). &#x201c;<article-title>Ritualization and the Development of Language</article-title>,&#x201d; in <source>Perspectives on Grammaticalization</source>. Editor <person-group person-group-type="editor">
<name>
<surname>Pagliuca</surname>
<given-names>W.</given-names>
</name>
</person-group> (<publisher-loc>Amsterdam, Philadelphia</publisher-loc>: <publisher-name>John Benjamins</publisher-name>), <fpage>3</fpage>&#x2013;<lpage>28</lpage>. <pub-id pub-id-type="doi">10.1075/cilt.109.07hai</pub-id> </citation>
</ref>
<ref id="B40">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Heath</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>1986</year>). <source>Body Movement and Speech in Medical Interaction</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>. </citation>
</ref>
<ref id="B41">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Hopper</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>1987</year>). <article-title>Emergent Grammar</article-title>. <source>Ann. Meet. Berkeley Linguist. Soc.</source> <volume>13</volume>, <fpage>139</fpage>&#x2013;<lpage>157</lpage>. <pub-id pub-id-type="doi">10.3765/bls.v13i0.1834</pub-id> </citation>
</ref>
<ref id="B42">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Hopper</surname>
<given-names>P. J.</given-names>
</name>
</person-group> (<year>2011</year>). &#x201c;<article-title>Emergent Grammar and Temporality in Interactional Linguistics</article-title>,&#x201d; in <source>Constructions: Emerging and Emergent</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Auer</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Pf&#xe4;nder</surname>
<given-names>S.</given-names>
</name>
</person-group> (<publisher-loc>Berlin, Boston</publisher-loc>: <publisher-name>De Gruyter</publisher-name>), <fpage>22</fpage>&#x2013;<lpage>44</lpage>. </citation>
</ref>
<ref id="B43">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Hopper</surname>
<given-names>P. J.</given-names>
</name>
</person-group> (<year>2015</year>). &#x201c;<article-title>Hermann Paul&#x27;s Emergent Grammar</article-title>,&#x201d; in <source>Hermann Paul&#x2019;s &#x201c;Principles of Language History&#x201d; Revisited</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Auer</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Murray</surname>
<given-names>R. W.</given-names>
</name>
<name>
<surname>Hopper</surname>
<given-names>P. J.</given-names>
</name>
</person-group> (<publisher-loc>Berlin, Boston</publisher-loc>: <publisher-name>De Gruyter</publisher-name>), <fpage>237</fpage>&#x2013;<lpage>255</lpage>. <pub-id pub-id-type="doi">10.1515/9783110348842-012</pub-id> </citation>
</ref>
<ref id="B44">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Hopper</surname>
<given-names>P. J.</given-names>
</name>
<name>
<surname>Traugott</surname>
<given-names>E. C.</given-names>
</name>
</person-group> (<year>2003</year>). <source>Grammaticalization</source>. <edition>2nd Edn.</edition> <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>. </citation>
</ref>
<ref id="B46">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Keevallik</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2015</year>). &#x201c;<article-title>Coordinating the Temporalities of Talk and Dance</article-title>,&#x201d; in <source>Temporality in Interaction</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>G&#xfc;nthner</surname>
<given-names>S.</given-names>
</name>
</person-group> (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>John Benjamins</publisher-name>), <fpage>309</fpage>&#x2013;<lpage>336</lpage>. <pub-id pub-id-type="doi">10.1075/slsi.27.10kee</pub-id> </citation>
</ref>
<ref id="B47">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Keevallik</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2018a</year>). <article-title>What Does Embodied Interaction Tell Us about Grammar?</article-title> <source>Res. Lang. Soc. Interaction</source> <volume>51</volume>, <fpage>1</fpage>&#x2013;<lpage>21</lpage>. <pub-id pub-id-type="doi">10.1080/08351813.2018.1413887</pub-id> </citation>
</ref>
<ref id="B45">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Keevallik</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2018b</year>). &#x201c;<article-title>The Temporal Organization of Conversation while Mucking Out a Sheep Stable</article-title>,&#x201d; in <source>Time in Embodied Interaction: Synchronicity and Sequentiality of Multimodal Resources</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Streeck</surname>
<given-names>J.</given-names>
</name>
</person-group> (<publisher-loc>Amsterdam, Philadelphia</publisher-loc>: <publisher-name>John Benjamins</publisher-name>), <fpage>97</fpage>&#x2013;<lpage>122</lpage>. <pub-id pub-id-type="doi">10.1075/pbns.293.03kee</pub-id> </citation>
</ref>
<ref id="B48">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Keisanen</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>&#x201c;Uh-oh, We Were Going There&#x201d;: Environmentally Occasioned Noticings of Trouble in In-Car Interaction</article-title>. <source>Semiotica</source> <volume>2012</volume>, <fpage>197</fpage>&#x2013;<lpage>222</lpage>. <pub-id pub-id-type="doi">10.1515/sem-2012-0061</pub-id> </citation>
</ref>
<ref id="B49">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Kendon</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2004</year>). <source>Gesture. Visible Action as Utterance</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>. </citation>
</ref>
<ref id="B50">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Laberge</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Sankoff</surname>
<given-names>G.</given-names>
</name>
</person-group> (<year>1979</year>). &#x201c;<article-title>Anything <italic>You</italic> Can Do</article-title>,&#x201d; in <source>Syntax and Semantics, Vol 12: Discourse and Syntax</source>. Editor <person-group person-group-type="editor">
<name>
<surname>Giv&#xf3;n</surname>
<given-names>T.</given-names>
</name>
</person-group> (<publisher-loc>New York</publisher-loc>: <publisher-name>Academic Press</publisher-name>), <fpage>419</fpage>&#x2013;<lpage>440</lpage>. <pub-id pub-id-type="doi">10.1163/9789004368897_018</pub-id> </citation>
</ref>
<ref id="B51">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lerner</surname>
<given-names>G. H.</given-names>
</name>
</person-group> (<year>1995</year>). <article-title>Turn Design and the Organization of Participation in Instructional Activities</article-title>. <source>Discourse Process.</source> <volume>19</volume>, <fpage>111</fpage>&#x2013;<lpage>131</lpage>. <pub-id pub-id-type="doi">10.1080/01638539109544907</pub-id> </citation>
</ref>
<ref id="B52">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Levinson</surname>
<given-names>S. C.</given-names>
</name>
</person-group> (<year>2005</year>). <source>Pragmatics</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>. </citation>
</ref>
<ref id="B53">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Lindwall</surname>
<given-names>O.</given-names>
</name>
<name>
<surname>Ekstr&#xf6;m</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2012</year>). <article-title>Instruction-in-Interaction: The Teaching and Learning of a Manual Skill</article-title>. <source>Hum. Stud.</source> <volume>35</volume>, <fpage>27</fpage>&#x2013;<lpage>49</lpage>. <pub-id pub-id-type="doi">10.1007/s10746-012-9213-5</pub-id> </citation>
</ref>
<ref id="B54">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mondada</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2016</year>). <article-title>Challenges of Multimodality: Language and the Body in Social Interaction</article-title>. <source>J.&#x20;Sociolinguistics</source> <volume>20</volume>, <fpage>336</fpage>&#x2013;<lpage>366</lpage>. <pub-id pub-id-type="doi">10.1111/josl.1_12177</pub-id> </citation>
</ref>
<ref id="B55">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Mondada</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2014a</year>). &#x201c;<article-title>Cooking Instructions and the Shaping of Things in the Kitchen</article-title>,&#x201d; in <source>Interacting With Objects: Language, Materiality, and Social Activity</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Nevile</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Haddington</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Heinemann</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Rauniomaa</surname>
<given-names>M.</given-names>
</name>
</person-group> (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>John Benjamins</publisher-name>), <fpage>199</fpage>&#x2013;<lpage>226</lpage>. <pub-id pub-id-type="doi">10.1075/z.186.09mon</pub-id> </citation>
</ref>
<ref id="B56">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Mondada</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2014b</year>). <article-title>Instructions in the Operating Room: How the Surgeon Directs Their Assistant&#x27;s Hands</article-title>. <source>Discourse Stud.</source> <volume>16</volume>, <fpage>131</fpage>&#x2013;<lpage>161</lpage>. <pub-id pub-id-type="doi">10.1177/1461445613515325</pub-id> </citation>
</ref>
<ref id="B57">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Mondada</surname>
<given-names>L.</given-names>
</name>
</person-group> (<year>2015</year>). &#x201c;<article-title>Multimodal Completions</article-title>,&#x201d; in <source>Temporality in Interaction</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>G&#xfc;nthner</surname>
<given-names>S.</given-names>
</name>
</person-group> (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>John Benjamins</publisher-name>), <fpage>267</fpage>&#x2013;<lpage>308</lpage>. <pub-id pub-id-type="doi">10.1075/slsi.27.09mon</pub-id> </citation>
</ref>
<ref id="B59">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Ningelgen</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Auer</surname>
<given-names>P.</given-names>
</name>
</person-group> (<year>2017</year>). <article-title>Is There a Multimodal Construction Based on Non-deictic So in German?</article-title> <source>Linguistics Vanguard</source> <volume>3</volume>, <fpage>1</fpage>&#x2013;<lpage>15</lpage>. <pub-id pub-id-type="doi">10.1515/lingvan-2016-0051</pub-id> </citation>
</ref>
<ref id="B62">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pekarek Doehler</surname>
<given-names>S.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>How Grammar Grows Out of Social Interaction: From Multi-Unit to Single Unit Question</article-title>. <source>Open Linguistics</source>. </citation>
</ref>
<ref id="B61">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Pekarek Doehler</surname>
<given-names>S.</given-names>
</name>
<name>
<surname>Balaman</surname>
<given-names>U.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>The Routinization of Grammar as a Social Action Format: A Longitudinal Study of Video-Mediated Interactions</article-title>. <source>Res. Lang. Soc. Interaction</source> <volume>54</volume>, <fpage>183</fpage>&#x2013;<lpage>202</lpage>. <pub-id pub-id-type="doi">10.1080/08351813.2021.1899710</pub-id> </citation>
</ref>
<ref id="B63">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Rauniomaa</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Lehtonen</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Summala</surname>
<given-names>H.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Noticings with Instructional Implications in Post-Licence Driver Training</article-title>. <source>Int. J.&#x20;Appl. Linguist.</source> <volume>28</volume>, <fpage>326</fpage>&#x2013;<lpage>346</lpage>. <pub-id pub-id-type="doi">10.1111/ijal.12199</pub-id> </citation>
</ref>
<ref id="B64">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Sch&#xfc;tz</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Luckmann</surname>
<given-names>T.</given-names>
</name>
</person-group> (<year>1973</year>). <source>The Structures of the Life-World</source>. <publisher-loc>London</publisher-loc>: <publisher-name>Heinemann</publisher-name>. </citation>
</ref>
<ref id="B65">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stivers</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Rossano</surname>
<given-names>F.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>Mobilizing Response</article-title>. <source>Res. Lang. Soc. Interaction</source> <volume>43</volume>, <fpage>3</fpage>&#x2013;<lpage>31</lpage>. <pub-id pub-id-type="doi">10.1080/08351810903471258</pub-id> </citation>
</ref>
<ref id="B58">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Selting</surname>
<given-names>M.</given-names>
</name>
<name>
<surname>Couper-Kuhlen</surname>
<given-names>E.</given-names>
</name>
</person-group> (Editors) (<year>2001</year>). <source>Studies in Interactional Linguistics</source> (<publisher-loc>Amsterdam; Philadelphia</publisher-loc>: <publisher-name>Benjamins</publisher-name>).</citation>
</ref>
<ref id="B73">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Streeck</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>1988</year>). <article-title>The Significance of Gesture</article-title>. <source>IPrA Papers in Pragmatics</source> <volume>2</volume>, <fpage>60</fpage>&#x2013;<lpage>83</lpage>. <pub-id pub-id-type="doi">10.1075/iprapip.2.1-2.03str</pub-id> </citation>
</ref>
<ref id="B71">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Streeck</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>1995</year>). &#x201c;<article-title>On Projection</article-title>,&#x201d; in <source>Social Intelligence and Interaction. Expressions and Implications of the Social Bias in Human Intelligence</source>. Editor <person-group person-group-type="editor">
<name>
<surname>Goody</surname>
<given-names>E. N.</given-names>
</name>
</person-group> (<publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>), <fpage>87</fpage>&#x2013;<lpage>110</lpage>. <pub-id pub-id-type="doi">10.1017/cbo9780511621710.007</pub-id> </citation>
</ref>
<ref id="B69">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Streeck</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2002</year>). <article-title>Grammars, Words, and Embodied Meanings: On the Uses and Evolution of So and Like</article-title>. <source>J.&#x20;Commun.</source> <volume>52</volume>, <fpage>581</fpage>&#x2013;<lpage>596</lpage>. <pub-id pub-id-type="doi">10.1111/j.1460-2466.2002.tb02563.x</pub-id> </citation>
</ref>
<ref id="B67">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Streeck</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2007</year>). &#x201c;<article-title>Geste und verstreichende Zeit,</article-title>&#x201d; in <source>Gespr&#xe4;ch als Prozess. Linguistische Aspekte der Zeitlichkeit verbaler Interaktion. Studien zur deutschen Sprache</source>. Editor <person-group person-group-type="editor">
<name>
<surname>Hausendorf</surname>
<given-names>H.</given-names>
</name>
</person-group> (<publisher-loc>T&#xfc;bingen</publisher-loc>: <publisher-name>Narr</publisher-name>), <fpage>157</fpage>&#x2013;<lpage>180</lpage>. </citation>
</ref>
<ref id="B66">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Streeck</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Forward-Gesturing</article-title>. <source>Discourse Process.</source> <volume>46</volume>, <fpage>161</fpage>&#x2013;<lpage>179</lpage>. <pub-id pub-id-type="doi">10.1080/01638530902728793</pub-id> </citation>
</ref>
<ref id="B70">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Streeck</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2018</year>). <article-title>Grammaticalization and Bodily Action: Do They Go Together?</article-title> <source>Res. Lang. Soc. Interaction</source> <volume>51</volume>, <fpage>26</fpage>&#x2013;<lpage>32</lpage>. <pub-id pub-id-type="doi">10.1080/08351813.2018.1413889</pub-id> </citation>
</ref>
<ref id="B72">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Streeck</surname>
<given-names>J.</given-names>
</name>
</person-group> (<year>2021</year>). <article-title>The Emancipation of Gestures</article-title>. <source>Interactional Linguistics</source> <volume>1</volume> (<issue>1</issue>), <fpage>90</fpage>&#x2013;<lpage>122</lpage>. <pub-id pub-id-type="doi">10.1075/il.20013.str</pub-id> </citation>
</ref>
<ref id="B68">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Streeck</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Goodwin</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>LeBaron</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2011</year>). <source>Embodied Interaction: Language and Body in the Material World</source>. <publisher-loc>Cambridge</publisher-loc>: <publisher-name>Cambridge University Press</publisher-name>. </citation>
</ref>
<ref id="B85">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Stivers</surname>
<given-names>T.</given-names>
</name>
<name>
<surname>Sidnell</surname>
<given-names>J.</given-names>
</name>
</person-group> (Editors) (<year>2012</year>). <source>Handbook of Conversation Analysis</source> (<publisher-loc>Malden, MA</publisher-loc>: <publisher-name>Wiley-Blackwell</publisher-name>). </citation>
</ref>
<ref id="B83">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stukenbrock</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2008</year>). <article-title>&#x201c;Wo ist der Hauptschmerz?&#x201d; - Zeigen am eigenen K&#xf6;rper in der medizinischen Kommunikation</article-title>. <source>Gespr&#xe4;chsforschung &#x2013; Online-Zeitschrift zur verbalen Interaktion</source> <volume>9</volume>, <fpage>1</fpage>&#x2013;<lpage>33</lpage>. </citation>
</ref>
<ref id="B82">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stukenbrock</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2010</year>). <article-title>&#xdc;berlegungen zu einem multimodalen Verst&#xe4;ndnis der gesprochenen Sprache am Beispiel deiktischer Verwendungsweisen des&#x20;Ausdrucks &#x201c;so&#x201d;</article-title>. <source>InLiSt - Interaction and Linguistic Structures</source> <volume>47</volume>, <fpage>1</fpage>&#x2013;<lpage>23</lpage>. </citation>
</ref>
<ref id="B81">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stukenbrock</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2014</year>). <article-title>Take the Words Out of My Mouth: Verbal Instructions as&#x20;Embodied Practices</article-title>. <source>J.&#x20;Pragmatics</source> <volume>65</volume>, <fpage>80</fpage>&#x2013;<lpage>102</lpage>. <pub-id pub-id-type="doi">10.1016/j.pragma.2013.08.017</pub-id> </citation>
</ref>
<ref id="B77">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Stukenbrock</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2015</year>). <source>Deixis in der Face-to-Face-Interaktion</source>. <publisher-loc>Berlin, Boston</publisher-loc>: <publisher-name>De Gruyter</publisher-name>. </citation>
</ref>
<ref id="B79">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Stukenbrock</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2017</year>). &#x201c;<article-title>Intercorporeal Phantasms: Kinesthetic Alignment with Imagined Bodies in Self-Defense Training</article-title>,&#x201d; in <source>Intercorporeality: Emerging Socialities in Interaction</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Meyer</surname>
<given-names>C.</given-names>
</name>
<name>
<surname>Streeck</surname>
<given-names>J.</given-names>
</name>
<name>
<surname>Jordan</surname>
<given-names>J.&#x20;S.</given-names>
</name>
</person-group> (<publisher-loc>Oxford</publisher-loc>: <publisher-name>Oxford University Press</publisher-name>), <fpage>237</fpage>&#x2013;<lpage>263</lpage>. </citation>
</ref>
<ref id="B75">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Stukenbrock</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2018a</year>). &#x201c;<article-title>Forward-Looking: Where Do We Go With Multimodal Projections?</article-title>,&#x201d; in <source>Modalities and Temporalities: Convergences and Divergences of Bodily Resources in Interaction</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Deppermann</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Streeck</surname>
<given-names>J.</given-names>
</name>
</person-group> (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>John Benjamins</publisher-name>), <fpage>31</fpage>&#x2013;<lpage>68</lpage>. <pub-id pub-id-type="doi">10.1075/pbns.293.01stu</pub-id> </citation>
</ref>
<ref id="B74">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Stukenbrock</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2018b</year>). &#x201c;<article-title>Mobile Dual Eye-Tracking in Face-to-Face Interaction: The Case of Deixis and Joint Attention</article-title>,&#x201d; in <source>Eye-Tracking in Interaction</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Br&#xf4;ne</surname>
<given-names>G.</given-names>
</name>
<name>
<surname>Oben</surname>
<given-names>B.</given-names>
</name>
</person-group> (<publisher-loc>Amsterdam</publisher-loc>: <publisher-name>John Benjamins</publisher-name>), <fpage>265</fpage>&#x2013;<lpage>302</lpage>. <pub-id pub-id-type="doi">10.1075/ais.10.11stu</pub-id> </citation>
</ref>
<ref id="B78">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Stukenbrock</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2020a</year>). <article-title>Deixis, Meta-Perceptive Gaze Practices, and the Interactional Achievement of Joint Attention</article-title>. <source>Front. Psychol.</source> <volume>11</volume>. <pub-id pub-id-type="doi">10.3389/fpsyg.2020.01779</pub-id> </citation>
</ref>
<ref id="B80">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Stukenbrock</surname>
<given-names>A.</given-names>
</name>
</person-group> (<year>2020b</year>). &#x201c;<article-title>Mit Blick auf die Geste - Multimodale Verfestigungen in der Interaktion</article-title>,&#x201d; in <source>Verfestigungen in der Interaktion</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Weidner</surname>
<given-names>B.</given-names>
</name>
<name>
<surname>K&#xf6;nig</surname>
<given-names>K.</given-names>
</name>
<name>
<surname>Imo</surname>
<given-names>W.</given-names>
</name>
<name>
<surname>Wegner</surname>
<given-names>L.</given-names>
</name>
</person-group> (<publisher-loc>Berlin, Boston</publisher-loc>: <publisher-name>De Gruyter</publisher-name>), <fpage>231</fpage>&#x2013;<lpage>262</lpage>. <pub-id pub-id-type="doi">10.1515/9783110637502-010</pub-id> </citation>
</ref>
<ref id="B76">
<citation citation-type="book">
<person-group person-group-type="author">
<name>
<surname>Stukenbrock</surname>
<given-names>A.</given-names>
</name>
<name>
<surname>Dao</surname>
<given-names>A. N.</given-names>
</name>
</person-group> (<year>2019</year>). &#x201c;<article-title>Joint Attention in Passing: What Dual&#x20;Mobile Eye Tracking Reveals about Gaze in Coordinating Embodied&#x20;Activities at a Market</article-title>,&#x201d; in <source>Embodied Activities in Face-to-Face and Mediated Settings</source>. Editors <person-group person-group-type="editor">
<name>
<surname>Reber</surname>
<given-names>E.</given-names>
</name>
<name>
<surname>Gerhardt</surname>
<given-names>C.</given-names>
</name>
</person-group> (<publisher-loc>Basingstoke</publisher-loc>: <publisher-name>Palgrave Macmillan</publisher-name>), <fpage>177</fpage>&#x2013;<lpage>213</lpage>. <pub-id pub-id-type="doi">10.1007/978-3-319-97325-8_6</pub-id> </citation>
</ref>
<ref id="B84">
<citation citation-type="journal">
<person-group person-group-type="author">
<name>
<surname>Svensson</surname>
<given-names>M. S.</given-names>
</name>
<name>
<surname>Luff</surname>
<given-names>P.</given-names>
</name>
<name>
<surname>Heath</surname>
<given-names>C.</given-names>
</name>
</person-group> (<year>2009</year>). <article-title>Embedding Instruction in Practice: Contingency and Collaboration during Surgical Training</article-title>. <source>Sociol. Health Illness</source> <volume>31</volume>, <fpage>889</fpage>&#x2013;<lpage>906</lpage>. <pub-id pub-id-type="doi">10.1111/j.1467-9566.2009.01195.x</pub-id> </citation>
</ref>
</ref-list>
</back>
</article>